Captcha: Love 'em or Hate 'em

It has become de rigueur to use Captcha forms on sites. Most of the ones you encounter show an image whose text you have to read and type into a box.

They are needed mainly because of a class of Internet scum known as spammers. These people(???) are paid to try to post adverts for everything from PPI claims to Viagra on websites. 99% of it is automated using what are called bots. A bot is a robotic bit of code that crawls the web looking for sites with comment fields, web forms or account registrations. It then tries either to post spam directly into them or to create an account from which to post spam content.

A Captcha aims to thwart these bots by showing an image (which bots cannot read, or hopefully cannot read) and requiring the text in it to be entered. But for the average user these are almost always awkward: the images are unclear and confusing. I would love to know the human success rate for CAPTCHA and reCAPTCHA, but that data is not easy to come by.

I am not a fan of image-based CAPTCHAs; I find my own success rate is somewhere below 50%. So I tend to avoid them wherever possible on sites I build (of course some clients require them as mandatory, which is fair enough and their choice). So I started looking for alternatives.

I tend to use two alternatives:

Text Captchas. The user is asked a question that may take two or three readings to understand but is logical, and it defeats bots by relying on the kind of common-sense reasoning that simple pattern-matching algorithms cannot handle. The classic is "What colour is blue?" The answer is so obvious (you enter "blue") but no ordinary bot would make sense of it. Unless a bot has a very advanced artificial intelligence that can read the question and understand what the answer is, the bot will be defeated. It is interesting to note from my logs that this method has only been beaten twice across the two websites where I have implemented it, and in both cases I am certain it was a human entering the response. How am I sure? Well, I add a second method to help defeat bots.
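For the curious, the check behind a Text Captcha can be very small. What follows is only a rough sketch in Python, not the code I actually run: the QUESTIONS table, the normalise helper and the check_text_captcha function are all names made up for illustration, and a real site would want a much larger and regularly changing set of questions.

```python
# Hypothetical question bank: each question maps to the answers we accept.
QUESTIONS = {
    "What colour is blue?": {"blue"},
}


def normalise(answer: str) -> str:
    """Lower-case the answer and collapse any extra whitespace."""
    return " ".join(answer.lower().split())


def check_text_captcha(question: str, answer: str) -> bool:
    """Return True only if the answer matches one accepted for the question."""
    accepted = QUESTIONS.get(question, set())
    return normalise(answer) in accepted
```

Normalising case and whitespace means a human who types " Blue " is not rejected for the sake of a stray space or capital letter.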

The Honeypot field. Bots read the HTML of a page and look for form input elements (for the uninitiated, a form element is one of those boxes, pull-down lists, buttons or checkboxes on forms on webpages) and will complete every one they find. The Honeypot adds a field that is hidden from human readers in the web browser. You and I do not see the form element, it is not mandatory and we don't have to fill it in. But a bot sees it, thinks "oh goody, another field I can fill in", enters either garbage (usually) or more considered text in it, then submits the form. Ha ha, gotcha. All the code has to do is test whether that field has been completed and, if so, reject the form (and ideally block the IP of the sender). Pretty simple, and an old idea that still works. Some caveats though: this mechanism might also trip up screen readers used by blind or visually impaired users, but it is still better than image CAPTCHAs (although reCAPTCHA does include an audio version of what needs to be entered). You also need to change the field name occasionally, especially if a bot ever gets through.
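The server-side test for a honeypot is about as short as code gets. This is a minimal sketch in Python and rests on a few assumptions: the submitted form arrives as a dictionary of field names to values, the hidden field is called "website_url" (a made-up name, pick your own and change it now and then), and the field itself is hidden from people with CSS in the page.

```python
# Hypothetical honeypot field name; change it occasionally.
HONEYPOT_FIELD = "website_url"


def is_probable_bot(form: dict) -> bool:
    """Return True if the hidden honeypot field came back with anything in it."""
    value = form.get(HONEYPOT_FIELD) or ""
    return bool(str(value).strip())
```

A human never sees the field, so it comes back empty or missing and the form is accepted. Anything else in it is a strong hint that a bot filled in the form, and the submission can be rejected.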

So if a submission gets past both the Text Captcha and the honeypot field and it is still spam, you have what is called a human being (fickle buggers at the best of times).
