A new Stanford study suggests AI still has a bias problem

By Mark Sullivan

March 16, 2022

A new report from the Stanford Institute for Human-Centered Artificial Intelligence (HAI) describes the rapid maturation of the AI industry in 2021, but also contains some sobering news about AI bias.

Natural language processing (NLP) models continued growing larger over the past year, according to the report. While that’s yielded impressive gains in language skill, it’s also failed to completely rid the AI of the nagging problems of toxicity and bias.

The AI Index 2022 Annual Report” measures and evaluates the yearly progress of AI by tracking it from numerous angles including R&D, ethics, and policy and government. Here are the biggest takeaways.

NLP leaps forward

Some of the most significant developments in AI over the past few years have occurred in the performance of natural language models–that is, neural networks trained to read, generate, and reason about language. Starting with the breakthrough BERT model developed by Google researchers in 2018, a steady stream of progressively larger language models, using progressively larger training data sets, has continued to see impressive (sometimes shocking) performance gains. NLP models now range into the hundreds of billions of parameters (connection points in a neural network where computations are run on input data), and the best ones exceed human levels of language comprehension and speech generation.

Such language models have always been prone to learning the wrong lessons from biases in their training data. According to the AI Index report, that problem has persisted as the models have gained more parameters.

One way researchers test generative language models for toxicity is by asking them leading questions such as “boys are bad because . . . (fill in the blank).” They try to “elicit” toxicity from the model. “A 280-billion parameter model developed in 2021 shows a 29% increase in elicited toxicity over a 117 million parameter model developed in 2018,” the researchers write. The 280-billion parameter model is the Gopher model developed by DeepMind, a subsidiary of Google’s parent company Alphabet. The 117-parameter model refers to the first version of the GPT generative language model developed by OpenAI.

DeepMind itself has acknowledged the need to explore the ethical implications of such a huge model, and has released a research paper that does just that.

AI Index co-director Jack Clark (currently co-founder of the AI company Anthropic, formerly of OpenAI) tells Fast Company that the AI industry is currently engaged in a debate over whether it’s best to remove toxicity and bias by more careful curation of the training data, or by increasing the size of the training data set to the point where the “good” training data pushes the bad content to the margins.

As tech companies large and small hurry to make big language models available through APIs or as cloud-based services, “it becomes critical to understand how the shortcomings of these models will affect safe and ethical deployment,” the researchers write.

Eyes on ethics

There are also signs that AI companies are not ignoring the bias and ethics challenges of the technology. Researchers with industry affiliations contributed 71% more publications year-over-year in 2021 at fairness-focused industry conferences, the report says.

“Algorithmic fairness and bias have shifted from being primarily an academic pursuit to becoming firmly entrenched as a mainstream research topic with wide-ranging implications,” the researchers write.

Managing bias isn’t the only challenge facing language models. While they exhibit better-than-human reading comprehension, they struggle with “abductive reasoning and inference,” which refers to inferring from a set of facts the most plausible explanation for something. (Example: If I leave my car door unlocked and return to find my stereo missing, I might infer from the available facts that a thief had been there.)

But even here the models are improving rapidly. Humans outperformed AI in abductive reasoning benchmarks by 9 points in 2019, the authors write, but by 2021 the gap had shrunk to one point.

Big models, big money

In a wider sense, the AI industry is maturing rapidly.

“It looks like we’re reaching that phase of the industrial revolution where the line on the graph is going up and to the right,” says Clark. “We’re at a place where the research is really good, the models are useful and obviously relevant, and it’s globalizing–it’s everywhere.”

Clark says he is struck by the high levels of investment pouring into the space. Venture capital investment in the tech industry exploded last year, and investment played a major role. CB Insights reported that VC funding hit $621 billion in 2021, an increase of 111% over the previous year. Private investment in AI in 2021 totaled around $93.5 billion, the AI Index report shows, which is more than double AI investment in 2020.

The investment is becoming more focused, too, another sign that the space is maturing. While the amount of investment increased, the number of newly-funded AI companies continued to drop last year, from 1,051 companies in 2019 and 762 companies in 2020 to 746 companies in 2021.

AI in general is becoming less expensive, more accessible, and higher performing. The cost to train an image classification model has decreased nearly three-fold, the report states. And the fastest possible training times for mature AI models in general has dropped by a factor of 27.

Apolitical, for now

Even though the U.S. and China are cast as combatants in a new geopolitical state actor competition–increasingly in the area of AI–researchers from the two countries worked together more than ever in 2021.

The U.S. and China had the by far the highest number of cross-country collaborations in AI publications of any other two countries from 2010 to 2021. And the pace is accelerating: American and Chinese collaborations increased five times within that same time window.

Clark says top AI researchers who did their post-graduate education in the U.S. or Canada (such as at the University of Toronto) over the past five years have now scattered to start their own companies or open their own research labs around the world. Many of these people are Chinese, and they don’t forget their colleagues from university days. They continue communicating with them and working with them on new research. It’s also true that big U.S. companies such as Microsoft have research facilities in China, and big Chinese tech companies such as Huawei and Alibaba have facilities in the U.S.

Such collaboration and cross-pollination could continue, unless the governments of countries like the U.S. and China begin to interfere for geopolitical or national security reasons, which Clark believes is quite possible.

While AI faces real technical and ethical challenges going forward, Clark sums up by pointing out how quickly the technology has advanced from the realm of research to a becoming a real industry.

“Five years ago, we were all sitting around talking about how some AI system had just beat somebody at GO, and now we have natural language models that say give me any piece of text and I will generate a block of text from it and it will be good,” he says. “That’s pretty amazing.”

 

(37)