Before facial recognition tech can be fair, it needs to be diverse
As facial recognition software spreads, it brings the challenge of diversity along with it. So far, programs identify the faces of white men far more accurately than those of black women, for example. A new IBM project aims to change that. Diversity in Faces is a data set of a million faces pulled from public domain pictures on Flickr. It gives computers a lot more to look at and process, and it introduces a way to better measure diversity in faces. John R. Smith is an IBM fellow and lead scientist of Diversity in Faces. He tells host Jed Kim that there’s nothing else like this.
The following is an edited transcript of their conversation.
John R. Smith: This is a first effort we are aware of which is an annotated data set specifically aimed at studying the diversity of faces. There are many data sets that have been developed which are used for training. When we looked at those, we saw tremendous skews — so not balanced in gender, not balanced in skin color, not balanced in age. So the data sets that are out there are just not what we really need to understand this.
Jed Kim: When you say diversity, how are you thinking of diversity when it comes to faces? Because I think a lot of people are going to hear that term and they’re going to think of it along race and gender lines.
Smith: Yes, we do mean that. But we mean more than that. So what we’re aiming at here is to get one level beneath that. Because we need the computer to be able to work in a space of quantitative measures, more objective measures of the face — things like craniofacial features, measures of distances and areas and ratios. We have annotations of facial contrast, facial symmetry. This is the way it can work to ensure we have a better coverage of the full space of diversity of the data that we’re using for training and so on.
Kim: So let’s say I am a research scientist outside of IBM. Can you give me an example of one of the ways I might use this?
Smith: Yes. So one of the things that we’ve done with this data set release is we implemented 10 facial coding schemes. And we went back to some of the strongest work in this field, the most-cited work that was aimed at characterizing human faces in different ways. We would love to see other researchers develop method No. 11, or 11 through 20, to have sort of this core set of images which is reasonably large. It gives us a common foundation and a basis for which we can compare different ideas, different computer vision methods so [that] we can really get to the core of the challenge, which is how do we measure diversity of the whole spectrum of human faces?
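To make the idea of "distances and areas and ratios" concrete, here is a minimal Python sketch of how such measures could be computed from a handful of facial landmark points. It is an illustration only, not IBM's method or any of the 10 coding schemes in the data set: the landmark names, the coordinates, and the functions `facial_ratios` and `symmetry_offset` are all hypothetical.

```python
import math

# Hypothetical 2D landmark positions in pixels. The names and values are
# made up for illustration and do not come from Diversity in Faces.
landmarks = {
    "left_eye": (30.0, 40.0),
    "right_eye": (70.0, 40.0),
    "nose_tip": (50.0, 60.0),
    "forehead": (50.0, 10.0),
    "chin": (50.0, 100.0),
}


def facial_ratios(pts):
    """Return a scale-independent ratio built from two landmark distances."""
    eye_span = math.dist(pts["left_eye"], pts["right_eye"])
    face_height = math.dist(pts["forehead"], pts["chin"])
    return {"eye_span_to_face_height": eye_span / face_height}


def symmetry_offset(pts):
    """Return 0.0 when both eyes are equally far from the nose midline.

    Larger values mean the eyes sit less symmetrically around the
    vertical line through the nose tip (an assumed, simplified measure).
    """
    mid_x = pts["nose_tip"][0]
    left = mid_x - pts["left_eye"][0]
    right = pts["right_eye"][0] - mid_x
    return abs(left - right) / (left + right)


ratios = facial_ratios(landmarks)
offset = symmetry_offset(landmarks)
print(ratios, offset)
```

Because ratios divide one distance by another, they do not change when a photo is resized, which is one reason measures like these are easier to compare across images than raw pixel distances.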