Blink Identity - High throughput, privacy preserving identification service.


Everything You Always Wanted To Know About Face Recognition (but were afraid to ask)

What is Face Recognition?

Face recognition is the biometric identification of a person by comparing a live face image with a stored, or ‘enrolled’, image of that person. Every biometric identification system has two steps.

The first is enrollment, where a person submits an image and supporting information to make an identity claim. (I am Mary. This is my picture.) The image is processed and converted to a mathematical template, which is a unique numerical representation of the image. The important things to know about templates are that a face image will always match a specific template, but a template cannot be turned back into an image.

The second step is matching. A person submits an image and it is converted into a template and compared against stored templates. This enables the software to make a confident ‘match’ or ‘no-match’ determination.
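Here is a minimal sketch of the two steps, under some loud assumptions: the template is a toy list of three numbers (real templates are much longer and come from a feature extractor that is not shown here), similarity is measured with cosine similarity, and the 0.8 threshold is arbitrary. The names `enroll`, `match` and `cosine_similarity` are made up for illustration and are not any vendor's API.

```python
import math

# Hypothetical in-memory template store: name -> template (list of floats).
enrolled = {}

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def enroll(name, template):
    """Step 1: store the template that backs an identity claim."""
    enrolled[name] = list(template)

def match(name, probe_template, threshold=0.8):
    """Step 2: compare a new template with the stored one.

    Returns (is_match, similarity_score).
    """
    score = cosine_similarity(enrolled[name], probe_template)
    return score >= threshold, score

enroll("Mary", [0.9, 0.1, 0.4])

# A new image of Mary yields a template close to the enrolled one.
same_person, same_score = match("Mary", [0.88, 0.12, 0.41])

# A different face yields a template that points somewhere else.
other_person, other_score = match("Mary", [0.1, 0.9, -0.3])

print(same_person, other_person)
```

Only the list of numbers is stored and compared; the original image is not needed once the template exists.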

Identification vs Verification

Most people are familiar with using their face to unlock their phone. This is a process called verification. Larger systems do identification, which is actually a much harder technical problem.

Verification is when you verify a person is who they claim to be. (I am Mary. Go check this image against your stored image of Mary and verify I am telling the truth.) It’s also called 1:1 matching and, computationally speaking, it’s a straightforward process.

Identification is when the system has to identify an unknown person. (Here is a picture. Who is it?) In this process the system compares the image to all enrolled images (the gallery) which results in a match or a no-match decision based upon a list of possible matching candidates. It’s also called 1:N matching. It’s a much more difficult process than verification.
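The difference between 1:1 and 1:N matching can be sketched in a few lines. This is an illustration only: the gallery holds toy three-number templates, the similarity measure is cosine similarity, the 0.95 threshold is arbitrary, and `verify` and `identify` are hypothetical function names.

```python
import math

def similarity(a, b):
    """Cosine similarity between two equal-length template vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical gallery of enrolled templates (toy values).
gallery = {
    "Mary": [0.9, 0.1, 0.4],
    "Bob": [0.1, 0.9, 0.3],
    "Ana": [0.3, 0.2, 0.9],
}

def verify(claimed_name, probe, threshold=0.95):
    """1:1 matching: compare the probe with one stored template."""
    return similarity(gallery[claimed_name], probe) >= threshold

def identify(probe, threshold=0.95):
    """1:N matching: score the probe against every enrolled template.

    Returns the best candidate's name, or None for a no-match.
    """
    candidates = sorted(
        ((similarity(template, probe), name) for name, template in gallery.items()),
        reverse=True,
    )
    best_score, best_name = candidates[0]
    return best_name if best_score >= threshold else None

probe = [0.88, 0.12, 0.41]

print(verify("Mary", probe))         # "I am Mary": one comparison
print(verify("Bob", probe))          # a false claim: still one comparison
print(identify(probe))               # "Who is this?": one comparison per enrolled person
print(identify([-0.5, -0.5, -0.5]))  # someone who is not in the gallery
```

Verification makes a single comparison, while identification makes one per enrolled person, so every additional person in the gallery is another chance for a wrong candidate to score above the threshold.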

We use the term “face recognition” all the time because it’s familiar, but technically it’s called “face identification”.

Errors in Biometrics & Thresholds

Biometric identification is probabilistic. When a photograph is compared to a person, no biometric identification system can ever be 100% certain of a match. There are really complicated mathematical reasons for this and I may write about them later because I think the math is really interesting, but probably not because most people don’t like math. But think about it: when you look at a person and a photograph, you can’t be 100% sure that they are the same person either. People are really good at recognizing familiar faces, but we aren’t very good at recognizing strangers. Biometric identification systems are actually better than humans at recognizing the faces of unfamiliar people.

So it’s not 100%. When two images are compared, the system assigns the pair a similarity score. If the system assigns a 12% similarity score, we are pretty confident that it is not a match. And if the system assigns a 99.6% similarity score, then we are pretty confident that it is a match. The dividing line between matches and non-matches is called the threshold. You have to draw the line somewhere and it’s never going to be perfect.

There are two kinds of errors in any classification system. A false non-match is the failure to match a person who is enrolled in the system. A false match means a person is incorrectly matched to a different person in the gallery. These two types of errors trade off against each other: for a given system, as you move the threshold to reduce one kind of error, you increase the other. It’s just how the math works.
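The trade-off is easy to see by counting both kinds of error at a few thresholds. The scores below are made up for illustration and do not come from any real system; `error_rates` is a hypothetical helper name.

```python
# Made-up similarity scores, for illustration only.
genuine_scores = [0.95, 0.91, 0.88, 0.97, 0.72, 0.99, 0.85, 0.93]   # same person
impostor_scores = [0.12, 0.35, 0.48, 0.22, 0.81, 0.30, 0.55, 0.10]  # different people

def error_rates(genuine, impostor, threshold):
    """Return (false_non_match_rate, false_match_rate) at a threshold.

    A genuine comparison scoring below the threshold is a false non-match.
    An impostor comparison scoring at or above it is a false match.
    """
    false_non_matches = sum(1 for s in genuine if s < threshold)
    false_matches = sum(1 for s in impostor if s >= threshold)
    return false_non_matches / len(genuine), false_matches / len(impostor)

for threshold in (0.5, 0.8, 0.9):
    fnmr, fmr = error_rates(genuine_scores, impostor_scores, threshold)
    print(f"threshold={threshold}: false non-match rate={fnmr}, false match rate={fmr}")
```

With these numbers, raising the threshold from 0.5 to 0.9 takes the false match rate from 0.25 down to 0.0 while the false non-match rate climbs from 0.0 to 0.375.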

Typically, the threshold is adjusted based on the application. If the system needs to be very secure, you can increase the threshold. You will be more certain that every match is correct, but you will get more false non-matches. That means some people are going to have to try a few times to get through the system which is annoying. For an application where the security isn’t quite so important, you can lower the threshold. That means you don’t have the same level of certainty for each match but you are very unlikely to miss a true match and inconvenience a customer.
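One way to picture this tuning, as a sketch rather than how any particular product does it: decide how many false matches the application can tolerate, then pick the lowest threshold that stays within that budget. The scores are made up, and `pick_threshold` and its 0.01 step size are assumptions for the example.

```python
# Made-up impostor scores (comparisons between different people).
impostor_scores = [0.12, 0.35, 0.48, 0.22, 0.81, 0.30, 0.55, 0.10]

def false_match_rate(impostor, threshold):
    """Fraction of impostor comparisons that would be accepted as matches."""
    return sum(1 for s in impostor if s >= threshold) / len(impostor)

def pick_threshold(impostor, max_false_match_rate):
    """Return the lowest threshold, in steps of 0.01, whose false match
    rate is at or below the target, or None if no step qualifies."""
    for step in range(101):
        threshold = step / 100
        if false_match_rate(impostor, threshold) <= max_false_match_rate:
            return threshold
    return None

# A very secure application tolerates no false matches in this sample.
secure = pick_threshold(impostor_scores, 0.0)

# A convenience-first application accepts a higher false match rate.
convenient = pick_threshold(impostor_scores, 0.25)

print(secure, convenient)
```

Choosing the lowest threshold that meets the false match target keeps false non-matches, and so the number of annoyed customers, as low as that target allows.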

Other Factors in Accuracy

With modern biometric identification systems, the matching algorithms from various vendors are all very good. Ten years ago, the performance of face matching algorithms was all over the map and vendors spent a lot of time promoting their algorithms as X% better than competing algorithms. These days, the algorithm is a commodity and false non-matches are mostly caused by human factors. If you are looking down at your phone as you walk past our system, it can’t get an image of your face. Face matching won’t work without an image of the face. Pose, illumination and expression (PIE, cute eh?) are the three primary factors for accuracy in face recognition. False matches in modern systems are extremely rare.

Identification documents issued by governments, such as passports, visas and driver’s licenses, generally require a plain background and prohibit smiling, hats and sunglasses. This is because earlier face recognition systems had trouble with these things. But advances in the field have made the technology more robust, and these factors are no longer as important as they used to be. People can be easily fooled by changes in hair style, different makeup or a hat, but computers see in a fundamentally different way from people and so these things don’t generally fool them. This advancement, like nearly all advancements in face recognition, is the work of neural network technology and the adaptive capabilities involved in machine learning.

Reliability

This may surprise you, but computers are almost always better than humans at recognizing faces. Partly, this is because people aren’t really that good at identifying strangers. Computers ‘see’ in a fundamentally different way than people. Also, consider the job of a bouncer checking the driver’s license of each person entering a bar. It’s a repetitive, tedious process and usually done in poor lighting conditions. Humans are terrible at this sort of task. We get bored easily and find it difficult to focus for long periods of time. At scale, a computer will always outmatch human performance because computers are really good at doing a large number of boring, repetitive tasks. Computers don’t always win, though. As image quality degrades, humans start to have an edge over computers - biologically, we excel at working with very limited data and drawing conclusions. But this advantage only applies to faces that we know really well - spouse, close friends, celebrities, etc.