What are CAPTCHAs and why do we need them?

Published on August 31, 2023

With the proliferation of the internet came the menace of bots. CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) were introduced as a countermeasure to differentiate between genuine human users and automated scripts (bots). This article explores how CAPTCHAs work, the difficulties in designing an effective CAPTCHA, and how they must evolve to stay one step ahead of bots while protecting our privacy.

What do CAPTCHAs protect us from?
Types of CAPTCHAs
How do CAPTCHAs work?
Criticisms
The future of CAPTCHAs
Summary

What do CAPTCHAs protect us from?

CAPTCHAs are a vital part of internet security, protecting against:

Brute force attacks: Without CAPTCHAs, bots can repeatedly attempt to log in to websites, cycling through countless combinations of usernames and passwords until they gain access.
Form spam: Bots can submit forms on websites, such as contact forms or comment sections, with spam content. CAPTCHAs prevent this by requiring a human-like interaction before form submission.
Web scraping: Some bots are designed to scrape or steal content from websites. CAPTCHAs can deter these bots, protecting website content.
Fake sign-ups and account creation: On many platforms, creating accounts en masse is useful for spamming or other malicious activities. CAPTCHAs ensure that every account creation requires a human verification step, making mass account creation inefficient for malicious actors.
Protecting application resources: Bots can repeatedly access a website, consuming significant server resources and slowing down the site or even causing it to crash. By serving as a first line of defense, CAPTCHAs help ensure that only genuine users consume these resources.

CAPTCHAs serve as gatekeepers on the web, filtering out automated threats while allowing genuine human users to proceed. They play an indispensable role in protecting online platforms from many potential threats and abuses.

Types of CAPTCHAs

There are many types of CAPTCHAs, each relying on a specific interaction that’s meant to be easy for a human to perform but difficult for a bot.


Text-based CAPTCHAs: These display distorted letters and numbers you must identify and type out. The distortions are made in a way that machines find hard to recognize, but humans can decipher with relative ease.
Image CAPTCHAs: You are presented with a series of images and asked to select those that match a specific description (for example, “Select all images with traffic lights”).
Math CAPTCHAs: These show simple math problems that you have to solve, like basic addition or subtraction.
Time CAPTCHAs: These challenges are as simple as reading the time on an analog clock.
Interactive CAPTCHAs: Tasks like dragging and dropping items or following a simple instruction, for example, “Slide to the right.”
Behavioral CAPTCHAs: These look at behavior such as mouse movement and past activity to detect bot-like behavior from page load.

In addition, audio CAPTCHAs are typically provided to help the visually impaired solve the challenge. You listen to a series of spoken letters or numbers and then type them out.

Moreover, a relatively new type of CAPTCHA, sometimes called a “cryptographic CAPTCHA” and based on proof of work, has been developed in recent years. With such a mechanism, the browser is given one or more computational challenges of adjustable difficulty and must provide the answers before it can proceed.

For example, the leading zero challenge requires your computer to find an input value that, when hashed, produces an output with a specific number of leading zeros.
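A minimal Python sketch of a leading-zero challenge follows. It assumes SHA-256 and counts leading zero bits; the function names, the challenge format, and the difficulty value are illustrative assumptions, not the scheme used by any particular CAPTCHA product.

```python
import hashlib
import os


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # 8 - bit_length() is the number of leading zero bits in this byte.
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: try nonces until the hash has enough leading zero bits."""
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce
        nonce += 1


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's answer."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= difficulty


challenge = os.urandom(16)               # random per-request challenge (assumed format)
nonce = solve(challenge, difficulty=12)  # about 2**12 hash attempts on average
print(nonce, verify(challenge, nonce, difficulty=12))
```

The asymmetry is the point of the design: the client may need thousands of hash attempts to find a valid nonce, while the server checks the answer with one.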

Recently, mCaptcha and Friendly Captcha have emerged in this space. However, relying only on computational challenges is a risky strategy. While these challenges are unintrusive, they depend on your device’s computing ability. If your device is slow, you might have to wait many seconds for the challenges to complete. On the other hand, powerful servers used by a spammer can solve these challenges quickly.
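To make this trade-off concrete: a challenge that requires d leading zero bits takes about 2^d hash attempts on average. The Python sketch below estimates the resulting wait for two devices. The hash rates are hypothetical figures chosen only for illustration, not measurements of any real device.

```python
def expected_seconds(difficulty_bits: int, hashes_per_second: float) -> float:
    """Average solve time: each attempt succeeds with probability 2**-difficulty_bits."""
    expected_attempts = 2 ** difficulty_bits
    return expected_attempts / hashes_per_second


# Hypothetical hash rates, assumed for illustration only.
SLOW_PHONE = 50_000          # hashes per second
SPAMMER_SERVER = 50_000_000  # hashes per second

for bits in (16, 20, 24):
    phone = expected_seconds(bits, SLOW_PHONE)
    server = expected_seconds(bits, SPAMMER_SERVER)
    print(f"{bits} bits: phone {phone:.2f}s, server {server:.4f}s")
```

Under these assumed rates the phone always waits 1,000 times longer than the server, so any difficulty high enough to inconvenience the spammer is far worse for the ordinary user.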

This illustrates the conundrum posed by CAPTCHAs: developers need to devise challenges that are difficult and costly for attackers while still being relatively simple for ordinary users.

How do CAPTCHAs work?

Irrespective of the type of CAPTCHA being served, one rule always holds: the front-end client must know nothing about the solution to the CAPTCHA itself, otherwise it would be too easy for an automated solver. A server will typically send a challenge, the front end will provide the mechanism to enter the answer, and a server will validate the answer from the client.

For example, in a text-based CAPTCHA, the server generates a distorted image and keeps the answer to itself, the front end displays the image and collects your input, and the server checks that input against the stored answer.
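The server side of a text-based CAPTCHA flow can be sketched in Python. This is a simplified illustration under stated assumptions: the in-memory store, the function names, and the five-character answer are hypothetical, and the step that renders the answer as a distorted image is omitted.

```python
import hmac
import secrets
import string

# Server-side store mapping challenge IDs to answers (hypothetical, in memory).
_pending: dict[str, str] = {}


def create_challenge() -> tuple[str, str]:
    """Server: generate an answer and return (challenge_id, answer).

    A real server would send the client only the ID and a distorted image
    of the answer, never the answer text itself.
    """
    alphabet = string.ascii_uppercase + string.digits
    answer = "".join(secrets.choice(alphabet) for _ in range(5))
    challenge_id = secrets.token_urlsafe(16)
    _pending[challenge_id] = answer
    return challenge_id, answer


def verify_answer(challenge_id: str, user_input: str) -> bool:
    """Server: check the client's input; each challenge can be used only once."""
    expected = _pending.pop(challenge_id, None)
    if expected is None:
        return False  # unknown or already used challenge
    submitted = user_input.strip().upper()
    # Constant-time comparison on bytes avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), submitted.encode())


challenge_id, answer = create_challenge()
print(verify_answer(challenge_id, answer))  # correct answer on the first attempt
print(verify_answer(challenge_id, answer))  # replaying the same challenge is rejected
```

Removing the challenge from the store on the first attempt, whether or not the answer was right, means a bot cannot keep guessing against the same image.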

Automated solvers can step into this flow and translate the CAPTCHA image the server generates into text input. This is why text-based CAPTCHAs have evolved to use more and more difficult patterns, often making them difficult even for humans and inaccessible to those who are visually impaired.

Criticisms

While CAPTCHAs are crucial for online security, they are not without criticism.

Some say that CAPTCHAs are of little use in this era with machine learning and human solver services acting as a bridge between automated bots and CAPTCHA-protected websites. Such solver services employ real people to manually solve CAPTCHAs that a computer finds hard to decipher.

However, such critics neglect the fact that CAPTCHAs are still doing their job by making it harder for attackers to spam a service. Even if a CAPTCHA can’t entirely prevent bots from spamming a service, it does make it much more difficult, and that is often enough to reduce abuse.

Others argue CAPTCHAs can hamper user experience, especially if they’re too challenging. There are also accessibility concerns, since some CAPTCHAs can be difficult for users with visual impairments. As is often the case, it’s a balance between ensuring your privacy, protecting your security, and offering a user-friendly experience.


In recent times, initiatives have focused on maximizing user experience at the expense of privacy by using browsing history to determine if you’re an authentic human being or a bot. A real person is likely to have activity on many different websites over the course of a day and has likely hit CAPTCHA systems before. This history gives systems such as hCaptcha or reCAPTCHA the ability to determine whether online behavior is authentic or not before you even load a page. This is why you can often simply click a checkbox instead of solving a real challenge.

While convenient, these challenges often compromise your privacy. These services inevitably know your browsing behavior and the sites you’ve visited, which is a concern.

On the other hand, systems such as mCaptcha and Friendly Captcha offer more privacy but compromise security since proof-of-work systems only add a cost to the action and typically will not be effective at preventing bots from accessing your site or posting spam.

The future of CAPTCHAs

The landscape of machine learning and AI is rapidly evolving. What is challenging for computers today might become trivial tomorrow as models become more sophisticated.

We clearly need to move towards usable CAPTCHA systems that respect user privacy and secure sites from the majority of bot and spam activity.

First, we can help improve usability and accessibility by minimizing the number of CAPTCHAs for real users across the web. This is possible by leveraging protocols such as Privacy Pass, which allow good users who have already completed a CAPTCHA elsewhere to skip a CAPTCHA on another website. This is performed without knowledge of previously visited websites. So why is it not more popular? Unfortunately, current implementations of this protocol require browser extensions, which are not available across all browsers and not installable by all users.

Second, we need systems that counter the threat of AI whilst not making CAPTCHAs prohibitively hard for humans. A recent paper (Searles et al., 2023) showed that bots are already more accurate than humans in solving many of the leading CAPTCHA systems.

One solution is to design CAPTCHA challenges that exploit the current limitations of AI, such as “contextual” CAPTCHAs. These are difficult for machines to solve (as of the time this article was written) since they require world knowledge and complex systems to decode, so they present an opportunity to stay ahead in this ever-evolving cat-and-mouse game. Their difficulty resides in the following properties:

They require multi-modal reasoning to solve: The CAPTCHA isn’t just about object recognition. It combines object recognition with contextual reasoning.

They are dynamic and varied: There can be numerous variations of questions and image combinations, making it difficult for a model to train specifically against such CAPTCHAs.

They tap into world knowledge: This approach leans on general world knowledge and common sense, areas where machines can still falter compared to humans.

For example, instead of asking you to pick out static images, CAPTCHAs could create simple interactive challenges that require reasoning, such as “Drag the moon below the cloud” on a canvas where various objects (like stars, sun, birds, etc.) are present, or “Drag the milk to the fridge.”

Another example is to present a very short story (a few lines) and ask a question based on it. For example:

Story: Andy went to the orchard and picked 3 apples. He ate 1 and gave 2 away.

Question: “How many apples did Andy pick?” or “How many apples did Andy eat?”
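Because such stories follow a simple pattern, many variations can be generated from a template. The Python sketch below is a hypothetical illustration: the names, items, and number ranges are assumptions, and a real system would need far more varied templates to stop a model from training against it.

```python
import random

NAMES = ["Andy", "Maria", "Chen"]     # illustrative values
ITEMS = ["apples", "pears", "plums"]  # illustrative values


def make_story_challenge(rng: random.Random) -> tuple[str, str, int]:
    """Return (story, question, answer) built from a fixed template."""
    name = rng.choice(NAMES)
    item = rng.choice(ITEMS)
    picked = rng.randint(3, 9)
    eaten = rng.randint(1, picked - 1)
    given = picked - eaten  # always at least 1
    story = (
        f"{name} went to the orchard and picked {picked} {item}, "
        f"ate {eaten} and gave {given} away."
    )
    # Ask about a randomly chosen quantity so the answer is not always the same number.
    verb, answer = rng.choice([("pick", picked), ("eat", eaten), ("give away", given)])
    question = f"How many {item} did {name} {verb}?"
    return story, question, answer


story, question, answer = make_story_challenge(random.Random())
print(story)
print(question)
```

As with any CAPTCHA, only the story and question would be sent to the client; the answer stays on the server for validation.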

Or one can show a series of images and ask a question based on common sense or contextual knowledge. For example, you could show four images, one of them a baby, and then ask, “Which one can typically speak when they grow up?”

Humans can easily recognize that the answer is the baby. But understanding the context and applying commonsense reasoning can be challenging for a machine, even if it identifies all objects correctly.

Contextual CAPTCHAs are a very interesting future research area, but they also present several challenges:

Cultural bias: What’s considered “common sense” in one culture might be unfamiliar in another.
Language: For story-based CAPTCHAs, one needs to invest time to ensure that content is internationalized for non-English speakers. For example, in the “Drag the milk to the fridge” challenge, the word “milk” would need to be translated into other languages to be used across the world.
Challenge generation: Creating challenges like these is not a trivial problem, and better solutions may require significant investments in time to get right.

In addition, it’s inevitable that someday, AI systems capable of multi-modal reasoning will be developed, and they may be better at solving these types of CAPTCHAs.

Summary

CAPTCHA systems continue to serve as a vital frontline defense against bot activity and spam attacks across the internet. While their presence is ubiquitous and enduring, the landscape is evolving. Recent breakthroughs in machine learning technologies, coupled with the emergence of CAPTCHA-solving services that employ human solvers, have begun to erode traditional CAPTCHA systems’ efficacy. Nonetheless, most attacks websites face are less sophisticated, and CAPTCHA systems remain a highly effective barrier.

As we move forward, the challenge lies not only in maintaining the robustness of CAPTCHA systems against increasingly sophisticated attacks but also in ensuring these systems are user-friendly, accessible to individuals with disabilities, respectful of user privacy, and free of undue friction or inconvenience for genuine users.

The future of CAPTCHA, therefore, calls for thoughtful innovation and rigorous research. It demands the development of new systems that can adeptly balance security with usability, accessibility, and privacy — embracing a holistic approach that evolves in tandem with the shifting tactics of malicious actors. As the digital world continues to grow and transform, CAPTCHA systems will undoubtedly need to adapt and innovate to maintain their role as a cornerstone of online security.

FAQs

What are CAPTCHAs and why do we need them?

A CAPTCHA test is designed to determine whether an online user is a human and not a bot. CAPTCHA is an acronym that stands for "Completely Automated Public Turing test to tell Computers and Humans Apart." Users often encounter CAPTCHA and reCAPTCHA tests on the internet.

What is CAPTCHA and why do we need it?

CAPTCHA helps protect you from spam and password guessing by asking you to complete a simple test that proves you are human and not a computer trying to break into a password-protected account.

What is the original purpose of CAPTCHA?

CAPTCHA was introduced as a visual interface feature, or code, to stop automated computer programs, known as bots and spiders, from gaining access to websites.

What are examples of CAPTCHA?

Text-based CAPTCHAs show letters or numbers that are designed to be difficult for computers to recognize but easy for humans to decipher. Examples include early versions of Google's reCAPTCHA, which featured distorted letters and numbers, and math CAPTCHAs, which ask you to solve simple arithmetic problems.

What are the pros and cons of using CAPTCHA?

CAPTCHAs protect web applications from harmful access by bots and spammers. However, the extra security comes at a cost to accessibility and usability: the small image and audio puzzles can be a hurdle for some human users.

Can CAPTCHA be hacked?

CAPTCHAs help prevent bots, including malicious ones, from accessing sensitive sections of a site or generating spam messages, but they can be defeated by automated solvers and human solver services.

Why does it keep asking for CAPTCHA?

This usually means the system cannot tell that you are a legitimate user, often because your browser is not sending enough data for it to decide. The system then suspects a bot or spam and presents a CAPTCHA.


Does CAPTCHA look at your history?

CAPTCHA does not directly check the user's web history. However, some CAPTCHA implementations may use browser fingerprints or other data available to the website to determine the user's identity and assess their risk level.

Is CAPTCHA owned by Google?

No. CAPTCHA is a general technique; reCAPTCHA is one CAPTCHA system, and it is owned by Google.

Is CAPTCHA a security risk?

In addition to the issues associated with user frustration and disengagement, CAPTCHA technology can also contribute to client-side website attacks.

How to answer CAPTCHA correctly?

Recognize patterns in the images presented, such as common shapes or colors. Some CAPTCHAs rely on specific visual elements that can guide your selections. Zoom in on the image to reveal finer details that might be crucial for accurate identification. This helps in distinguishing between similar-looking elements.

What triggers CAPTCHA?

Some common triggers include:

IP tracking: A user's IP has been identified as a bot.
Resource loading: A user doesn't load styles, banners, or images.
Sign in: The user isn't signed in to Google/Gmail when accessing the site.

Why not use CAPTCHA?

Critics argue that traditional CAPTCHAs are siloed, inaccessible, not privacy compliant, not secure, and not user friendly, leaving you and your users exposed to bot threats.

What is the real purpose of CAPTCHA?

CAPTCHA stands for the Completely Automated Public Turing test to tell Computers and Humans Apart. CAPTCHAs are tools you can use to differentiate between real users and automated users, such as bots. CAPTCHAs provide challenges that are difficult for computers to perform but relatively easy for humans.

Why are CAPTCHAs necessary?

CAPTCHA prevents spam in website comment sections and on blogs. Many spammers bombard comment sections with links to increase search engine rankings.

Can you avoid CAPTCHA?

One way you can prevent CAPTCHAs is to choose a VPN that offers dedicated IPs. When you get a dedicated IP, no one else has access to the IP. This prevents CAPTCHAs because it removes the red flag of your IP constantly changing and performing multiple simultaneous requests.

What information does CAPTCHA collect?

The types of data reCAPTCHA collects include behavior (such as scrolling on a page, moving the mouse, clicking on links, time spent completing forms, and typing patterns), browser history, and CSS information.
