(Not a native English speaker)
There are rumours that CAPTCHAs are actually us humans feeding the ML, but how do the servers detect my small mistakes if I’m the one who’s training them?
If the CAPTCHA answers were already written by the humans behind the systems, why wouldn’t they just feed those to the ML instead of relying on us?
A CAPTCHA is a computer-generated challenge designed to be hard for computers to solve but easy for humans. reCAPTCHA, a project that started at Carnegie Mellon University and is now owned by Google, pioneered using CAPTCHA answers to produce labeled data, first to digitize scanned text and later to build training sets for machine learning. It is not just a rumor, it is their entire business idea. The concept is that you are given two challenges. They already know the answer to one of them based on previous answers, but they do not yet know the answer to the second one, and you do not know which is which. If you answer the known one correctly, they record your answer to the second challenge in their database. Once enough people have been given that second challenge, they compare the answers, and the most common one is considered correct. So your mistakes are caught on the challenge they already know, and your answer to the other one only counts as one vote among many.
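To make the two-challenge idea concrete, here is a minimal sketch in Python. It is not reCAPTCHA's actual code; the names (`KNOWN_ANSWERS`, `submit`, `consensus`), the example words, and the three-vote threshold are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical data: one challenge whose answer is already known.
KNOWN_ANSWERS = {"img_control": "morning"}

# Unknown challenge id -> tally of answers from users who passed the control.
votes = defaultdict(Counter)


def submit(control_id, control_answer, unknown_id, unknown_answer):
    """Check the control challenge; if it is right, record the other answer as a vote."""
    passed = control_answer.strip().lower() == KNOWN_ANSWERS[control_id]
    if passed:
        votes[unknown_id][unknown_answer.strip().lower()] += 1
    return passed


def consensus(unknown_id, min_votes=3):
    """Return the most common answer once enough trusted votes exist, else None."""
    tally = votes[unknown_id]
    if sum(tally.values()) < min_votes:
        return None  # not enough trusted answers yet
    answer, _count = tally.most_common(1)[0]
    return answer
```

A user who gets the control word wrong is rejected and their guess for the unknown word is thrown away, so a single sloppy answer never reaches the dataset on its own.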
Let’s say you have 100,000 photos taken of intersections. You start by taking 500 and manually marking which ones have pedestrians, cars, bikes, buses, traffic lights, etc. When someone requests a captcha, you send them 16 photos, some you’ve already marked and some you haven’t. If they get the ones you already know correct, you let them proceed, and you keep track of their answers for the ones you don’t already know.
Each unknown photo gets sent to multiple people, and if a lot of people who correctly identified the known images also think this one is a bicycle, you can add it to your list of known bicycles.
A few billion captchas later, you have 100k confidently classified images with which to train or test the image recognition algorithms of self-driving cars.
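The photo-grid version can be sketched the same way. This is only an illustration under assumed rules: the photo ids, the `grade` function, the five-vote minimum, and the 80% agreement cutoff are all hypothetical, not how any real service is known to work.

```python
# Photo id -> does it contain a bicycle? (the manually marked starter set)
known = {"photo_001": True, "photo_002": False}

# Unknown photo id -> [yes votes, total votes] from users who passed.
pending = {}


def grade(selected, shown, min_votes=5, agreement=0.8):
    """Pass the user if every known photo is answered correctly, then
    count their clicks on the unknown photos as votes."""
    passed = all((p in selected) == known[p] for p in shown if p in known)
    if not passed:
        return False  # answers from failed attempts are ignored

    for p in shown:
        if p in known:
            continue
        tally = pending.setdefault(p, [0, 0])
        tally[0] += int(p in selected)
        tally[1] += 1
        yes, total = tally
        if total >= min_votes:
            share = yes / total
            if share >= agreement:        # strong agreement: it is a bicycle
                known[p] = True
                del pending[p]
            elif share <= 1 - agreement:  # strong agreement: it is not
                known[p] = False
                del pending[p]
    return True
```

Once a photo is promoted into `known`, it can be used to check future users, which is how a starter set of 500 can grow toward the full 100,000.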
The joke is that there is no machine learning: if you don’t identify the pedestrian in time, the car will run him over.