March 29, 2024

Tyna Woods

Technology does the job

OpenAI’s DALL-E 2 is a new illustration of AI bias

You may have seen some weird and whimsical pictures floating around the internet recently. There’s a Shiba Inu dog wearing a beret and black turtleneck. And a sea otter in the style of “Girl with a Pearl Earring” by the Dutch painter Vermeer. And a bowl of soup that looks like a monster knitted out of wool.

These pictures weren’t drawn by any human illustrator. Instead, they were created by DALL-E 2, a new AI system that can turn textual descriptions into images. Just write down what you want to see, and the AI draws it for you — with vivid detail, high resolution, and, arguably, real creativity.

Sam Altman, the CEO of OpenAI — the company that created DALL-E 2 — called it “the most delightful thing to play with we’ve created so far … and fun in a way I haven’t felt from technology in a while.”

That’s totally true: DALL-E 2 is delightful and fun! But like many fun things, it’s also very risky.

A couple of the creative images generated by DALL-E 2.
Courtesy of OpenAI

There are the obvious risks — that people could use this type of AI to make everything from pornography to political deepfakes, or the possibility that it’ll eventually put some human illustrators out of work. But there is also the risk that DALL-E 2 — like so many other cutting-edge AI systems — will reinforce harmful stereotypes and biases, and in doing so, accentuate some of our social problems.

How DALL-E 2 reinforces stereotypes — and what to do about it

As is typical for AI systems, DALL-E 2 has inherited biases from the corpus of data used to train it: millions of images scraped off the internet and their corresponding captions. That means for all the delightful images that DALL-E 2 has produced, it’s also capable of generating a lot of images that are not delightful.

For example, here’s what the AI gives you if you ask it for an image of lawyers:

Courtesy of OpenAI

Meanwhile, here’s the AI’s output when you ask for a flight attendant:

Courtesy of OpenAI

OpenAI is well aware that DALL-E 2 generates results exhibiting gender and racial bias. In fact, the examples above are from the company’s own “Risks and Limitations” document, which you’ll find if you scroll to the bottom of the main DALL-E 2 webpage.

OpenAI researchers made some attempts to resolve bias and fairness problems. But they couldn’t really root out these problems in an effective way because different solutions result in different trade-offs.

For example, the researchers wanted to filter out sexual content from the training data because that could lead to disproportionate harm to women. But they found that when they tried to filter that out, DALL-E 2 generated fewer images of women in general. That’s no good, because it leads to another kind of harm to women: erasure.

OpenAI is far from the only artificial intelligence company dealing with bias problems and trade-offs. It’s a challenge for the entire AI community.

“Bias is a huge industry-wide problem that no one has a great, foolproof answer to,” Miles Brundage, the head of policy research at OpenAI, told me. “So a lot of the work right now is just being transparent and upfront with users about the remaining limitations.”

Why release a biased AI model?

In February, before DALL-E 2 was released, OpenAI invited 23 external researchers to “red team” it — engineering-speak for trying to find as many flaws and vulnerabilities in it as possible, so the system could be improved. One of the main suggestions the red team made was to limit the initial release to only trusted users.

To its credit, OpenAI adopted this suggestion. For now, only about 400 people (a mix of OpenAI’s employees and board members, plus hand-picked academics and creatives) get to use DALL-E 2, and only for non-commercial purposes.

That’s a change from how OpenAI chose to deploy GPT-3, a text generator hailed for its potential to enhance our creativity. Given a phrase or two written by a human, it can add on more phrases that sound uncannily human-like. But it’s shown bias against certain groups, like Muslims, whom it disproportionately associates with violence and terrorism. OpenAI knew about the bias problems but released the model anyway to a limited group of vetted developers and companies, who could use GPT-3 for commercial purposes.

Last year, I asked Sandhini Agarwal, a researcher on OpenAI’s policy team, whether it makes sense that GPT-3 was being probed for bias by scholars even as it was released to some commercial actors. She said that going forward, “That’s a good thing for us to think about. You’re right that, so far, our strategy has been to have it happen in parallel. And maybe that should change for future models.”

The fact that the deployment approach has changed for DALL-E 2 seems like a positive step. Yet, as DALL-E 2’s “Risks and Limitations” document acknowledges, “even if the Preview itself is not directly harmful, its demonstration of the potential of this technology could motivate various actors to increase their investment in related technologies and tactics.”

And you’ve got to wonder: Is that acceleration a good thing, at this stage? Do we really want to be building and launching these models now, knowing it can spur others to release their versions even quicker?

Some experts argue that since we know there are problems with the models and we don’t know how to solve them, we should give AI ethics research time to catch up to the advances and address some of the problems, before continuing to build and release new tech.

Helen Ngo, an affiliated researcher with the Stanford Institute for Human-Centered AI, says one thing we desperately need is standard metrics for bias. A bit of work has been done on measuring, say, how likely certain attributes are to be associated with certain groups. “But it’s super understudied,” Ngo said. “We haven’t really put together industry standards or norms yet on how to go about measuring these issues” — never mind solving them.

OpenAI’s Brundage told me that letting a limited group of users play around with an AI model allows researchers to learn more about the issues that would crop up in the real world. “There’s a lot you can’t predict, so it’s valuable to get in contact with reality,” he said.

That’s true enough, but since we already know about many of the problems that repeatedly arise in AI, it’s not clear that this is a strong enough justification for launching the model now, even in a limited way.

The problem of misaligned incentives in the AI industry

Brundage also noted another motivation at OpenAI: competition. “Some of the researchers internally were excited to get this out in the world because they were seeing that others were catching up,” he said.

That spirit of competition is a natural impulse for anyone involved in creating transformative tech. It’s also to be expected in any organization that aims to make a profit. Being first out of the gate is rewarded, and those who finish second are rarely remembered in Silicon Valley.

As the team at Anthropic, an AI safety and research company, put it in a recent paper, “The economic incentives to build such models, and the prestige incentives to announce them, are quite strong.”

But it’s easy to see how these incentives may be misaligned for producing AI that truly benefits all of humanity. Rather than assuming that other actors will inevitably create and deploy these models, so there’s no point in holding off, we should ask the question: How can we actually change the underlying incentive structure that drives all actors?

The Anthropic team offers several ideas. One of their observations is that over the past few years, a lot of the splashiest AI research has been migrating from academia to industry. To run large-scale AI experiments these days, you need a ton of computing power — more than 300,000 times what you needed a decade ago — as well as top technical talent. That’s both expensive and scarce, and the resulting cost is often prohibitive in an academic setting.

So one solution would be to give more resources to academic researchers; since they don’t have a profit incentive to commercially deploy their models quickly the same way industry researchers do, they can serve as a counterweight. Specifically, countries could develop national research clouds to give academics access to free, or at least cheap, computing power; there’s already an existing example of this in Compute Canada, which coordinates access to powerful computing resources for Canadian researchers.

The Anthropic team also recommends exploring regulation that would change the incentives. “To do this,” they write, “there will be a combination of soft regulation (e.g., the creation of voluntary best practices by industry, academia, civil society, and government), and hard regulation (e.g., transferring these best practices into standards and legislation).”

Although some good new norms have been adopted voluntarily within the AI community in recent years — like publishing “model cards,” which document a model’s risks, as OpenAI did for DALL-E 2 — the community hasn’t yet created repeatable standards that make it clear how developers should measure and mitigate those risks.

“This lack of standards makes it both more challenging to deploy systems, as developers may need to determine their own policies for deployment, and it also makes deployments inherently risky, as there’s less shared knowledge about what ‘safe’ deployments look like,” the Anthropic team writes. “We are, in a sense, building the plane as it is taking off.”