There's a simple answer to the AI bias conundrum: More diversity

As we approach the two-year anniversary of ChatGPT and the subsequent “Cambrian explosion” of generative AI applications and tools, it has become apparent that two things can be true at once: The potential for this technology to positively reshape our lives is undeniable, as are the risks of the pervasive bias that permeates these models.

In less than two years, AI has gone from supporting everyday tasks like hailing rideshares and suggesting online purchases to being judge and jury on incredibly meaningful activities like arbitrating insurance, housing, credit and welfare claims. One could argue that the well-known but oft-neglected bias in these models was either annoying or humorous when they recommended glue to make cheese stick to pizza, but that bias becomes indefensible when these models are the gatekeepers for the services that influence our very livelihoods.

So, how can we proactively mitigate AI bias and create less harmful models if the data we train them on is inherently biased? Is it even possible when those who create the models lack the awareness to recognize bias and unintended consequences in all their nuanced forms?

The answer: more women, more minorities, more seniors and more diversity in AI talent.

Early education and exposure

More diversity in AI shouldn’t be a radical or divisive conversation, but in the 30-plus years I’ve spent in STEM, I’ve always been a minority. While the innovation and evolution of the space in that time has been astronomical, the same can’t be said about the diversity of our workforce, particularly across data and analytics.

In fact, the World Economic Forum reported that women make up less than a third (29%) of all STEM workers, despite making up nearly half (49%) of total employment in non-STEM careers. According to the U.S. Bureau of Labor Statistics, black professionals in math and computer science account for only 9%. These woeful statistics have remained relatively flat for 20 years, and the share of women degrades to a meager 12% as you narrow the scope from entry-level positions to the C-suite.

The reality is, we need comprehensive strategies that make STEM more attractive to women and minorities, and this starts in the classroom as early as elementary school. I remember watching a video that the toy company Mattel shared of first or second graders who were given a table of toys to play with. Overwhelmingly, girls chose traditional ‘girl toys,’ such as a doll or ballerina, but ignored other toys, like a race car, as those were for boys. The girls were then shown a video of Ewy Rosqvist, the first woman to win the Argentinian Touring Car Grand Prix, and the girls’ outlook completely changed.

It’s a lesson that representation shapes perception and a reminder that we need to be much more intentional about the subtle messages we give young girls around STEM. We must ensure equal paths for exploration and exposure, both in regular curriculum and through non-profit partners like Data Science for All or the Mark Cuban Foundation’s AI bootcamps. We must also celebrate and amplify the women role models who continue to boldly pioneer this space, like AMD CEO Lisa Su, OpenAI CTO Mira Murati or Joy Buolamwini, who founded The Algorithmic Justice League, so girls can see that in STEM, it isn’t just men behind the wheel.

Data and AI will be the bedrock of nearly every job of the future, from athletes to astronauts, fashion designers to filmmakers. We need to close inequities that limit access to STEM education for minorities and we need to show girls that an education in STEM is literally a doorway to a career in anything.

To mitigate bias, we must first recognize it

Bias infects AI in two prominent ways: Through the vast data sets models are trained on and through the personal logic or judgements of the people who construct them. To truly mitigate this bias, we must first understand and acknowledge its existence and assume that all data is biased and that people’s unconscious bias plays a role.

Look no further than some of the most popular and widely used image generators, like Midjourney, DALL-E and Stable Diffusion. When reporters at The Washington Post prompted these models to depict a ‘beautiful woman,’ the results showed a staggering lack of representation in body types, cultural features and skin tones. Feminine beauty, according to these tools, was overwhelmingly young and European: thin and white.

Just 2% of the images had visible signs of aging and only 9% had dark skin tones. One line from the article was particularly jarring: “However bias originates, The Post’s analysis found that popular image tools struggle to render realistic images of women outside the western ideal.” Further, university researchers have found that ethnic dialect can lead to “covert bias” in identifying a person’s intellect or recommending death sentences.

But what if bias is more subtle? In the late ’80s, I started my career as a business system specialist in Zurich, Switzerland. At that time, as a married woman, I wasn’t legally allowed to have my own bank account, even if I was the primary household earner. If a model is trained on vast troves of women’s historical credit data, there’s a point in some geographies before which that data simply doesn’t exist. Overlay this with the months or even years some women are away from the workforce for maternity leave or childcare responsibilities. How are developers aware of those potential discrepancies, and how do they compensate for those gaps in employment or credit history? Synthetic data enabled by gen AI may be one way to address this, but only if model builders and data professionals have the awareness to consider these problems.

That’s why it’s imperative that a diverse representation of women not only has a seat at the AI table, but also an active voice in constructing, training and overseeing these models. This simply can’t be left to happenstance or to the ethical and moral standards of a few select technologists who historically have represented only a sliver of the richer global population.

More diversity: A no-brainer

Given the rapid race for profits and the tendrils of bias rooted in our digital libraries and lived experiences, it’s unlikely we’ll ever fully vanquish it from our AI innovation. But that can’t mean inaction or ignorance is acceptable. More diversity in STEM and more diversity of talent intimately involved in the AI process will undoubtedly mean more accurate, inclusive models — and that’s something we will all benefit from.

Cindi Howson is chief data strategy officer at ThoughtSpot and a former Gartner Research VP.
