Former author of one of the top 5 facial recognition servers in the world for multiple years running, here's what's going on: the industry has solved this issue, but the potential clients are seeking the lowest bidder and picking the newer companies, the nepotistically created, well-connected firms that aren't really players, and those companies have terrible implementations. This is not a case of the technology not being there yet; we solved these racial bias issues 10 years ago. But new companies with new training sets and new ML engineers who don't know any of the industry's history are now landing contracts with terrible-quality models and well-connected sales channels.
This study finds a higher rate of correct identification for black people than for other ethnic groups, whereas a few years ago the problem seemed to be that the software was less effective at identifying black people.
Do you have some insight about why this reversal might have occurred?
To build a high quality facial recognition system, the training set needs to include every possible combination of ethnicity, and for each of those it also needs variations of daylight, dappled light, partial obscuring, and night-time illumination, across every season, variations of expression and face angle, variations of weather and distance, variations of things placed on a person's face, and then all kinds of variations of video compression. All these face image variations in the training set enable the trained model to find and track the features that persist through them. In truth it requires hundreds of millions of facial images to create an accurate facial recognition system. Most new companies, and many that have been around for respectable periods, do not realize how much data collection, annotation, and additional variation creation a high quality FR training set requires. The company I worked at spent 20 years collecting laser scans of real people to then create an augmented real-person data set with several hundred million faces.
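As a rough illustration of why the data requirement explodes into the hundreds of millions, here is a hypothetical sketch of how those variation axes multiply. The axis names and counts below are invented for illustration, not taken from any real pipeline:

```python
import itertools

# Hypothetical variation axes a training set would need to cover
# (names and counts are illustrative, not from any real system).
lighting = ["daylight", "dappled", "night", "indoor"]
season = ["spring", "summer", "autumn", "winter"]
angle = ["frontal", "quarter", "profile"]
occlusion = ["none", "glasses", "mask", "hat"]
compression = ["raw", "h264_low", "h264_high"]

# Every combination of every axis: 4 * 4 * 3 * 4 * 3 = 576 variants per face.
variants = list(itertools.product(lighting, season, angle, occlusion, compression))
per_identity = len(variants)
print(per_identity)  # 576

# With, say, 500,000 distinct identities, the set already reaches
# hundreds of millions of images.
print(per_identity * 500_000)  # 288000000
```

Even this toy example omits axes the comment mentions (weather, distance, expression), each of which multiplies the total again.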
Your problem is evident in your question. What does “black” mean? It’s entirely subjective. Dwayne Johnson? Liv Tyler? Nelson Mandela? Barack Obama? Mariah Carey?
This is a semantic issue. Ethnic groups are constructs. A system which misidentifies people identifies all people poorly.
It doesn’t track across regions either. People labelled “white” by law in some countries (Brazil, South Africa, etc) would be classified as “black” elsewhere.
In England, the example here, we do not classify people the way the US does, with its history of “one drop” politics. Many British people considered “white” are “black” in the US.
There is no scientifically valid way of defining who counts as “black” so any discussion of tuning a system based on this definition is a disaster.
Even the people commenting are talking about different groups based on their own culture and prejudices.
I believe the term "Black" in reference to a person is only used in journalism when discussing facial recognition. There is no "Black" in the facial recognition industry; there really is no identification of ethnicity in facial recognition. It is all just variations of human appearance on an unbroken spectrum. The natural and ever-present population of mixed-race people basically destroys any sense of "race" or "ethnicity" within the software. The ONLY time race and ethnicity are included in facial recognition discussions is when some group trains an algorithm with biased data, creating a biased trained model. That is a human failure to understand the data they trained on, not grasping the missing critical data and its impact on the trained model's use. The technology itself operated exactly as designed; it was literally humans not understanding the subtle nature of what they were doing that is the issue.
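The "biased data in, biased model out" failure described above is usually catchable before training with a simple coverage audit. A minimal sketch, with invented cluster labels, counts, and an arbitrary 10% threshold (none of this is from any real vendor's pipeline):

```python
from collections import Counter

# Hypothetical annotation counts per appearance cluster in a training set;
# the cluster names and numbers are invented for illustration.
samples = (["cluster_a"] * 90_000) + (["cluster_b"] * 8_000) + (["cluster_c"] * 2_000)

counts = Counter(samples)
total = sum(counts.values())

# Flag any cluster that falls below 10% of the data (arbitrary threshold);
# underrepresented clusters are where the trained model will underperform.
underrepresented = {k: v / total for k, v in counts.items() if v / total < 0.10}
print(underrepresented)  # {'cluster_b': 0.08, 'cluster_c': 0.02}
```

The point of the sketch is that the skew lives entirely in the data, where it is trivially measurable, rather than in the matching algorithm itself.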
> Many British people considered “white” are “black” in the US.
I'm also British, can you give an example of that? A minor celebrity/TV personality say?
To the extent you have a point, though, I think it's irrelevant anyway: they paused the programme because they found that, according to whatever definition they measured with, it had that skew.
How recently? We had a home security camera and every time our (Black) son walked up to the door, the camera would classify him as an “animal”. This was as recently as 2022
In the other direction, my camera regularly identifies cats, crows, and shadows as people. I think recognition in security cameras has a very long way to go.
Frankly, I'm skeptical, but I'm willing to be convinced by reputable evidence.