Understanding Gender and Racial Bias in AI, Part 2

December 7, 2020

This is Part 2 of my four-part series about gender and racial bias in artificial intelligence (AI). In Part 1, I discussed gender and racial bias as it relates to voice recognition for voice user interfaces (VUIs), as well as facial recognition—both of which are functions of artificial intelligence—and how to eliminate such biases from our user experiences. Now, in Part 2, I’ll focus on how such biases influence what people like and, thus, impact the creation of user personas—often whitewashing design—and describe how to overcome these impacts.

Don Norman and Jakob Nielsen call upon us to take “a user-centered approach to all aspects of the end-user’s interaction with a company, its services, and its products.” We accomplish this through empathy: the ability to think or say, “I hear you and can understand how that might feel.” But what happens when we overidentify with our users? We introduce bias. What if our users are nothing like us? We might rely on stereotypes. In an increasingly diverse and divided world, it’s all too easy to find ourselves on one side of a fence or the other, either overidentifying with our users or not identifying with them at all.

How do algorithmic bias, our design tools, and bad habits contribute to the whitewashing of design? The everyday tools we use to navigate our daily lives and our design work drive us toward creating design solutions that are similar to those we already know and like. As a tech community that is largely driven by white people, we are constantly served images of white people. These white faces and stories end up in our personas and user-experience maps and drive our design decision making. Such bias will persist unless we acknowledge this is happening and stop the whitewashing of our design deliverables and our design solutions.

What We Like

Research findings on attraction theory are old news. Like attracts like. We form friendships based on our perception of similar personality characteristics. We maintain friendships because of similarities in age, race, gender, and social status. [1] We marry people who look like us. We like dogs that look like us. [2] We buy cars that look like us—but only from the front, as minivan drivers should be relieved to hear. With our lives moving online more than ever, our interactions are not always with real people. We’re becoming more reliant on algorithms that have voices—and, in some cases, faces. We want robots and virtual assistants to look like us, too. [3]

Some of what we like is innate, and some is culturally programmed. We often perceive physically attractive people as possessing “more socially desirable personality traits.” [4] Even among six- to 12-year-old boys, “perceived physical attractiveness is systematically related to social acceptance.” [5] This is known as the “what’s beautiful is good” phenomenon, and it’s dangerous because what we define as beautiful is “overwhelmingly defined as white.” [6] In the US, white Americans are the dominant cultural group: the majority shareholders of power, the ultimate ingroup. [7]

By 2004, “almost a hundred research studies [had] documented people’s tendency to automatically associate positive characteristics with their ingroups.” [8] “White Americans, on average, show strong implicit preference for their own group and relative bias against African Americans” and other minority racial or ethnic groups. [9] Research indicates that some African Americans implicitly favor white Americans over their own ingroup. [10] All racial and ethnic groups implicitly prefer lighter-complexioned groups. [11] We prefer youth over age. [12] We prefer higher status over lower status. [13] All of this leads to bias where we least want to see it—in courtrooms, in classrooms, [14] in elections, in our social-media feeds, in our digital assistants, and in our design tools.

Let’s take me as an example. When researching bias in AI, I recently read this statistic: “Women drive 70–80 percent of all consumer purchasing, through a combination of their buying power and influence.” I thought, Heck, yeah, we do! And people should recognize the power of NPR-listening, Rachel Maddow-watching, Elizabeth Warren supporters like me. Then I thought, Whoa, I just did the thing I’m telling other people not to do. I massively overidentified.

Also, I can’t stand middle-aged white guys who mansplain my job to me. Men like this have participated in focus groups that I have facilitated, and I found myself as uninterested in their opinions as they were in mine. I cannot empathize with this guy. I like to pretend he doesn’t matter—even though the opinions of middle-aged white men can matter more than anyone else’s. As a result, I’ve written my now-conscious bias into past research findings.

In the pre-COVID-19 era of in-person research, we all experienced participants who fit one stereotype or another: The bro who showed up late, carrying his Starbucks. (He had time for coffee, but couldn’t show up on time!?) The businessman with food in his teeth and noxious aftershave. The college student who wouldn’t put her phone down. The grandmother who alternately apologized for being unable to complete a usability-test task and acted like you were torturing her. The middle-aged career woman having a hot flash during an interview. Our reactions to these participants, whether positive or negative, reflect our biases and increase or decrease our empathy, respectively.

Starting with a Google Search

Despite recent debates about the usefulness of personas and articles such as “Why Personas Fail,” many of us still begin our design projects by creating personas. If you’re not using personas, you’re probably responsible for some kind of experience map. Whatever deliverable describes your users, you’ll base all of your downstream design decisions on these fictitious people, so you should expect to face all the same obstacles that I’ll outline next.

Many of us start this process with a Google search: newbies and gurus alike want the latest, greatest template to kick-start their deliverables. As shown in Figure 1, a Google Images search for persona template returns a sea of white faces. A Google All search for persona template results in a seemingly endless list of options. However, a closer inspection of the top five search results [15] indicates a disturbing sameness in the templates.

Figure 1: Search results for persona template in Google Images

Across the sites in the top five results, there are 83 images—63 photographs and 20 avatars. Only 67 of them are unique. Nearly 20% appear in two or three of the top five sites. Most of the repeat images show young, slender, white people or young, slender, Asian women. One of only five Asian people, Nerdy Nina, is the winner of the game Some of These Things Are Like the Others. She appears on three of the five sites in the top search results. The profile that accompanies her image reads like a mashup of Asian-American meets Millennial stereotypes. Nina is a young, slender, pretty-but-not-too-pretty, plaid-and-glasses-wearing software engineer who likes to read. Can you imagine how much research it took to write that? Can you imagine my eyes rolling as I write this?

The cartoons and avatars were majority white—15 of 17 images—and male—11 of 17 images—which is not at all representative of the general population. Two of these—the one black woman and one of the white women—have the kind of body-mass index (BMI) that is generally achievable only in a cartoon. If you’re designing an awareness and support site for eating disorders, these avatars are for you.

Of the 50 unique photographs, only 45 were of human beings. Five were animals, and Team Dog was winning. (There were four dogs and one cat.) We must be losing our minds if we’re writing personas for animals. The cat was white. One of the dogs was white. Two of the dogs were half white. The fourth dog was brown with white spots. How much cultural programming are we soaking up if we can’t even achieve color diversity in personas for pets?

Thirty of the 45 photos of human beings were of white people. That is 60% of the 50 unique photographs, which is lower than white people’s share of the general population. However, if you’re following along, you’ll now realize that there were as many nonhumans (10%) as there were black people (10%) or Asian people (10%). The Hispanic, or Latinx, community was way underrepresented at 4%. There was one Middle Eastern or North African (MENA) woman. Indigenous peoples were not represented at all. On a lighter note, I lost count of all the dudes with beards. Even some of the avatars had beards. When did guys with beards become a market segment of their own?
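The percentages in this audit depend on which denominator you choose, so it can help to write the arithmetic down. The following is a minimal sketch that uses only the counts stated above. The variable names are my own, and treating the repeat share as duplicates divided by all 83 images is one reading of that figure, not a confirmed method.

```python
# Counts taken from the audit described above; no new data is introduced.
TOTAL_IMAGES = 83          # 63 photographs + 20 avatars
UNIQUE_IMAGES = 67         # 50 unique photographs + 17 unique avatars
UNIQUE_PHOTOS = 50
HUMAN_PHOTOS = 45
ANIMAL_PHOTOS = 5
WHITE_PEOPLE_PHOTOS = 30


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumption: "repeats" means images beyond the first copy of each.
repeat_share = share(TOTAL_IMAGES - UNIQUE_IMAGES, TOTAL_IMAGES)

# The same 30 photos give different percentages per denominator.
white_of_all_photos = share(WHITE_PEOPLE_PHOTOS, UNIQUE_PHOTOS)
white_of_human_photos = share(WHITE_PEOPLE_PHOTOS, HUMAN_PHOTOS)
animals_of_all_photos = share(ANIMAL_PHOTOS, UNIQUE_PHOTOS)

print(f"Repeated images: {repeat_share}%")
print(f"White people, of all unique photos: {white_of_all_photos}%")
print(f"White people, of human photos only: {white_of_human_photos}%")
print(f"Animals, of all unique photos: {animals_of_all_photos}%")
```

Measured against the 50 unique photographs, white people make up 60% of the images. Measured against the 45 photos of human beings, the share rises to roughly two thirds. Whichever denominator you use in your own audit, state it alongside the number.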

All of the images seemed to show fully able bodies. Admittedly, this is hard to gauge, but no one was sitting in a wheelchair, using an assistive device, using sign language, or wearing a hearing aid. Very few of the people in the images wore glasses, which is strange because, according to The Vision Council, 164 million Americans wear glasses, and Warby Parker is worth $1.75 billion. None of the images depicted anyone who appeared to be over 50, but roughly 40% of the country was over 45 in 2010.

Overall, these pages and posts did okay on race—if your target audiences are white, black, or Asian-American—but less well on gender. If you’re designing for peak-earners, retirees, or plus-size people—or even just normal-sized people—you’re out of luck. Most of the persona people conformed to white American standards of beauty: they were young, skinny, and light skinned.

Software, Stock Photography, and Name Generators

Many other tools can autogenerate user photos. Some of them perform better on age diversity, while most of them perform worse on racial diversity than the curated persona templates I just described. User Forge is a Web application that can help you build personas. It includes its own random-image generator. I asked it to generate ten images for me. Eight of the ten photos were of white faces: young, skinny women and men over 40. Apparently, it’s okay to be old if you’re a white man.

Some of these photo generators are built into design tools such as Sketch. The demo of the Sketch plug-in TinyFaces shows no people of color. The demo of Content Generator for Sketch automagically generates twenty images, but only one person of color appears among the demo images.

In Adobe XD, the plug-ins This Person Does Not Exist and UI Faces randomly generate images. UI Faces allows you to filter images by age (defaults to 18–40), gender (male or female), emotion (happiness or neutral), and hair color (black, brown, blonde), but not racial identity. I asked each of these tools to generate ten images. [16] The twenty images split evenly on gender, but were 95% white or Asian.

Online testing platforms such as Usertesting.com or its competitor Userlytics won’t tell you the race of your research participants. You’ll have to watch the videos and guess. Without careful screening, you might not get the participants you really need. Worse, you might use income, geography, and education level as proxies for race, which can perpetuate biases.

Stock-photography sites are loaded with young, slim white people and light-skinned people of color who conform to white body standards. Tools such as Baby Name Voyager, which shows popular names by birth year, are full of Alexanders and Annes to the exclusion of Alejandros and Anabellas. You should use Name Generator only as a party game, not for professional design work. Fake Name Generator is a time-saver. It can generate names, home addresses, phone numbers, birth dates, heights, weights, and wacky details such as mother’s maiden name, credit-card numbers, and UPS tracking numbers. While the names themselves are not particularly diverse sounding, they are more inclusive of Latinx people than any of the tools I mentioned earlier. It’s up to you to match a photo to the randomly generated person.

So what should you do if you are designing products and apps that have wide appeal? And what should you do if you are designing for an industry that is well-known for its racial disparities—such as healthcare? Channel Chuck D and fight the power of your own biases and the biases of the tools you’re using.

How to Fight Bias in the Algorithms and in Ourselves

If you’re using any of the tools that I’ve described or other algorithm-driven tools to help you produce user personas, profiles, or UX maps, you must first be aware of the limitations of these tools. Then try to understand how your own habits contribute to a lack of diversity in your work.

Too often, we use a familiar pool of users to answer our research questions or validate our designs. We let our clients or in-house customer-service teams dictate who we can talk to. We get contact cards only for the people with whom they’re comfortable: people who sound and look like them. Or we reach out to our own family-and-friends network via social media and get the same like-minded people from our personal echo chambers. As members of the tech community, which is largely white, we are likely surrounded by other white people. But we’re probably not designing solely for white people. We must force ourselves and our organizations to recruit outside these well-known circles.

You might have to fight for recruitment budgets and, possibly, travel budgets—in a post-COVID-19 future—to meet your users in person, to better understand their world, their joys and sorrows, and their unique needs. Ask for money for new photography or ask research participants and customers to send you selfies and get permission to use them.

If you’re not sure how diverse your customer base really is, ask your analytics and marketing teams for this data. If race or gender is missing from the data, you have two choices: use the demographics from the U.S. Census Bureau or do your own research. If you write your own surveys or questionnaires, include questions about race, gender, age, and ableness, and even BMI if it’s applicable. Do the same when you write participant screeners for qualitative research.
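Once you have screener answers, you can compare the makeup of your participant pool against whatever benchmark you chose. The following is a minimal sketch of that comparison. The group labels, the benchmark shares, and the sample answers are all hypothetical placeholders, not real demographic data; substitute figures from your own analytics or from the U.S. Census Bureau.

```python
from collections import Counter

# Hypothetical benchmark shares, for illustration only. Replace them with
# figures from your own analytics or from U.S. Census Bureau tables.
BENCHMARK = {"group_a": 0.50, "group_b": 0.30, "group_c": 0.20}

# Hypothetical screener answers: one self-reported group per participant.
participants = ["group_a"] * 8 + ["group_b"] * 2


def representation_gaps(answers, benchmark):
    """Return each group's sample share minus its benchmark share."""
    if not answers:
        raise ValueError("No screener answers to compare.")
    counts = Counter(answers)
    total = len(answers)
    return {
        group: round(counts[group] / total - target, 2)
        for group, target in benchmark.items()
    }


gaps = representation_gaps(participants, BENCHMARK)
for group, gap in gaps.items():
    status = "under" if gap < 0 else "at or over"
    print(f"{group}: {gap:+.2f} ({status} benchmark)")
```

A negative gap means a group is underrepresented among your participants relative to the benchmark, which is a signal to recruit outside your usual circles before you write personas from the data.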

If participants wonder why you’re asking them for these personal details, tell them you want to understand them better; to know what, if anything, is unique about their needs, so you can address those needs and design the right solutions for them. Demonstrate that you care about ingroups and outgroups equally. Their answers can help you overcome whatever stereotypes traditional and social media are feeding you. You’ll design easier-to-use, better-selling solutions that solve real problems, and you’ll deliver applications and products that have broader appeal.

For years, I failed to ask these questions. I gave in to clients, colleagues, and bosses who said it was gauche to ask about racial and gender identity. But guessing is worse. And acting as if these differences don’t exist doesn’t help either. It is essential that we bridge these gaps instead of pretending that they don’t matter to our design work. It’s no longer enough to be an empathetic UX designer. We all must be anti-sexist, anti-racist designers.

Read the rest of this four-part series on UXmatters

Endnotes

[1] Ilmarinen, Ville-Juhani, Jan-Erik Lönnqvist, and Sampo Paunonen. “Similarity-Attraction Effects in Friendship Formation: Honest Platoon-mates Prefer Each Other but Dishonest Do Not.” Personality and Individual Differences, April 2016.

[2] Roy, Michael M., and Nicholas J.S. Christenfeld. “Dogs Still Do Resemble Their Owners.” Psychological Science, Vol. 16, No. 9, August 2005.

[3] Benbasat, Izak, Angelika Dimoka, Paul A. Pavlou, and Lingyun Qiu. “The Role of Demographic Similarity in People’s Decision to Interact with Online Anthropomorphic Recommendation Agents: Evidence from a Functional Magnetic Resonance Imaging (fMRI) Study.” International Journal of Human-Computer Studies, 2020.

[4] Dion, Karen, Ellen Berscheid, and Elaine Walster. “What Is Beautiful Is Good.” (PDF) Journal of Personality and Social Psychology, Vol. 24, No. 3, 1972. Retrieved December 6, 2020.

[5] Kleck, Robert E., Stephen A. Richardson, and Linda Ronald. “Physical Appearance Cues and Interpersonal Attraction in Children.” Child Development, Vol. 45, No. 2, June 1974.

[6] Mok, Theresa A. “Asian Americans and Standards of Attractiveness: What’s in the Eye of the Beholder?” Cultural Diversity and Mental Health, Vol. 4, No. 1, January 1998.

[7] Any group with which a person identifies is, by definition, her or his ingroup.

[8] Dasgupta, Nilanjana. “Implicit Ingroup Favoritism, Outgroup Favoritism, and Their Behavioral Manifestations.” Social Justice Research, Vol. 17, No. 2, May 2004.

[9] Dasgupta, Nilanjana. “Implicit Ingroup Favoritism, Outgroup Favoritism, and Their Behavioral Manifestations.” Social Justice Research, Vol. 17, No. 2, May 2004.

[10] Dasgupta, Nilanjana. “Implicit Ingroup Favoritism, Outgroup Favoritism, and Their Behavioral Manifestations.” Social Justice Research, Vol. 17, No. 2, May 2004.

[11] Dasgupta, Nilanjana. “Implicit Ingroup Favoritism, Outgroup Favoritism, and Their Behavioral Manifestations.” Social Justice Research, Vol. 17, No. 2, May 2004.

[12] Dasgupta, Nilanjana. “Implicit Ingroup Favoritism, Outgroup Favoritism, and Their Behavioral Manifestations.” Social Justice Research, Vol. 17, No. 2, May 2004.

[13] Dasgupta, Nilanjana. “Implicit Ingroup Favoritism, Outgroup Favoritism, and Their Behavioral Manifestations.” Social Justice Research, Vol. 17, No. 2, May 2004.

[14] A group of 68 white, elementary-school teachers listened to and rated a group of white and black students for personality, quality of response to a prompt, and current and future academic abilities. The white teachers uniformly rated white students higher than black students, students who spoke Black English, and students with low physical attractiveness. The researchers concluded that some of these children’s academic failures might be based on their race and dialect rather than their actual performance. (Indicating that no one is immune from cultural biases, the duo who performed this research labeled the nonblack, English dialect that the white students spoke Standard English rather than White English.) See DeMeis, Debra Kanai, and Ralph R. Turner, “Effects of Students’ Race, Physical Attractiveness, and Dialect on Teachers’ Evaluations.” Contemporary Educational Psychology, Vol. 3, No. 1, January 1978.

[15] For the purposes of this evaluation, I looked at the top six Google search results to obtain five complete examples. I rejected result number four, HubSpot’s Buyer Persona Template, because it doesn’t include any images. My evaluation included the following Web sites:

“Xtensio User Persona Template and Examples.” Xtensio, undated. Retrieved November 20, 2020.
Ho Tran, Tony. “5 Essentials for Your User Persona Template (with Examples).” InVision Inside Design, September 30, 2019. Retrieved November 20, 2020.
Downs, Joseph. “20 Must-See User Persona Templates.” Justinmind, July 30, 2020. Retrieved November 20, 2020.
HubSpot. “Buyer Persona Template.” HubSpot, undated. Retrieved November 20, 2020. Because the template included no images, I did not include it in my calculations.
Liu, Doris. “18 Free Excellent User Persona Templates You Can’t Miss Out.” UX Planet, April 22, 2018. Retrieved November 20, 2020.
McCready, Ryan. “20+ User Persona Examples, Templates, and Tips for Targeted Decision-Making.” Venngage, July 25, 2019. Retrieved November 20, 2020.

[16] I used Adobe XD on my Windows notebook computer to test these plug-ins. I do not have an Apple device on which to test the Sketch plug-ins, so I relied on the demos to which I’ve referred.


Sarah Pagliaccio
