The ACLU recently posted the results of an experiment they ran with Amazon’s face recognition engine (called Rekognition, presumably to help SEO). I have been wondering why so many organizations are going after Amazon when there are dozens of other vendors providing face recognition as a service. I suspect Amazon has gotten this negative attention because they have been actively promoting the system to law enforcement for watch list applications.
Basically, they found a public database of 25,000 criminal mugshots and then used Amazon’s Rekognition to search it with photos of the 535 members of Congress, all readily available on the Internet. This is somewhat representative of Congress going to a baseball game and getting checked against a 25,000-person watch list.
Example Congressional face images used in the ACLU experiment
In fact, the ACLU’s experiment was conducted under better conditions than you would see in a real-life application: the congressional photos are well lit and well composed compared to what you would get from ordinary people walking into a stadium. So it’s reasonable to think these results are better than what you would get in a real-world scenario.
They found 28 false matches, and also found people of color to be more likely to be falsely matched. I have no doubt that they very intentionally used Congressional photos for probes because they wanted to provoke a reaction. And sure enough, it worked! Congress is now calling on Amazon to testify.
But we want to talk about whether this result is as bad as it sounds. They tried to match 535 faces, and ideally would have gotten no matches in the criminal database. (Please, no jokes. I’ve read the same joke 1,000 times now.) They got 28 false matches. That works out to roughly a 5% false match rate. In case it isn’t obvious, that’s terrible. In a 100,000-person stadium, a 5% false match rate means the police would detain 5,000 people. They would probably have to rent a gym just to process them all.
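For anyone who wants to check the arithmetic, here is a minimal Python sketch. The function names are my own, made up for illustration; the only inputs are the numbers quoted above (28 false matches out of 535 probes, and a 100,000-person stadium).

```python
def false_match_rate(false_matches: int, probes: int) -> float:
    """Fraction of probe photos that were wrongly matched to the watch list."""
    return false_matches / probes


def expected_false_matches(crowd_size: int, rate: float) -> float:
    """Expected number of innocent people flagged in a crowd at a given rate."""
    return crowd_size * rate


# Numbers from the ACLU experiment as described in this post.
aclu_rate = false_match_rate(28, 535)
print(f"False match rate: {aclu_rate:.1%}")

# The rounded 5% figure applied to a 100,000-person stadium.
stadium_flagged = expected_false_matches(100_000, 0.05)
print(f"People flagged at 5%: {round(stadium_flagged):,}")
```

Using the unrounded rate instead of the rounded 5% pushes the stadium number a bit higher, so the 5,000 figure is if anything a slight understatement.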
But what a lot of these articles don’t mention is that any watch list application is going to have a fair number of false positives. That’s because of something called the paradox of the false positive. If you have a watch list at a stadium (for example), the vast majority of people going into the stadium are not going to be on the watch list. What this means, paradoxically, is that the matches that you do make are probably incorrect. It’s a very difficult problem, because you are trying to find something that doesn’t happen very often (medical tests for rare conditions have the same problem).
If you have a 100,000-person stadium and a face recognition system that is correct 99.99% of the time, you will still have 10 false matches. There is just no getting around this. In a medical test for a rare condition, doctors will often scare something like 999 people with a false test result saying they have a rare disease for every 1 person who actually has the disease. Doctors know about this and are careful to offer advice and counsel to their patients on how to deal with the test results. Positive results normally lead to additional testing or secondary verification.
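To make the paradox concrete, here is a rough Python sketch. It reads "correct 99.99% of the time" as a 0.01% false match rate, which is my interpretation. The number of watch-listed people actually in the crowd (5) and the assumption that the system catches every one of them are purely hypothetical values picked for illustration, not figures from the ACLU or Amazon.

```python
def watch_list_outcomes(crowd_size, on_list, false_match_rate, true_match_rate):
    """Return (expected false matches, expected true matches, share of matches that are real)."""
    not_on_list = crowd_size - on_list
    false_matches = not_on_list * false_match_rate
    true_matches = on_list * true_match_rate
    total = false_matches + true_matches
    # Share of all alarms that point at someone genuinely on the list.
    share_real = true_matches / total if total else 0.0
    return false_matches, true_matches, share_real


false_matches, true_matches, share_real = watch_list_outcomes(
    crowd_size=100_000,
    on_list=5,                # hypothetical: 5 watch-listed people attend
    false_match_rate=0.0001,  # "correct 99.99% of the time"
    true_match_rate=1.0,      # hypothetical: the system never misses
)
print(f"Expected false matches: {false_matches:.1f}")
print(f"Expected true matches:  {true_matches:.1f}")
print(f"Share of matches that are real: {share_real:.0%}")
```

Even with these generous assumptions, about two out of every three alarms point at the wrong person. Put fewer watch-listed people in the crowd, or use a less accurate system, and that share gets worse quickly.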
But does that apply to a police watch list at a stadium? Can we detain/inconvenience/accidentally arrest 999 people in order to find 1 guy? That’s for society to decide, but personally I agree with the ACLU that this goes against the principles this country was founded on. But in any case, as long as we try to use face recognition technology for these types of watch list applications, false matches are just going to be part of the picture.