We’ve been hearing a lot about generative AI and the amazingly useful content that it can produce. But while it’s easy to get wooed by the impressive content AIs can already output, it’s crucial that we also keep an eye on the problems they bring.
For starters, there’s the matter of copyright and ethics: the data AI was trained on was produced by someone, so how are they compensated? Spoiler alert: almost no content creator is getting paid. There’s also the matter of factuality, and the way AI tends to make things up with the utmost confidence. And then there are the biases.
Machines Learning From Us
Any AI algorithm can only be as good as the data it is trained on. It’s a bit like making wine: no matter how good your process is, if the grapes are bad, the wine is going to be bad too. You can refine the process all you like, but you can never really press the flaws out of bad grapes.
The same goes for training an AI. Society is riddled with biases, and datasets (be they books, images, or any other type of data) are unlikely to be free of them. So can artificial intelligence possess biases too?
AI learns from data, and if that data comes tinted with human bias, the AI will adopt it. Let’s take another analogy: consider a sponge soaking up water from a puddle. If the puddle’s dirty, the sponge isn’t absorbing pure water.
Imagine a photo recognition tool trained primarily on photos of light-skinned individuals. Chances are, it will struggle to identify darker-skinned people. This is not a hypothetical example; it’s exactly what is happening with algorithms right now. Many AI tools have shown biases against certain genders, races, and other groups because they were trained on limited or skewed data sets.
Most of the time, it’s not because the underlying AI algorithm is inherently biased (although algorithms can also exacerbate existing biases). It’s because its learning material was skewed. This happens in all forms of media. For instance, research on AI in gaming has shown that biases are prevalent there as well. Even ChatGPT, the poster child of generative AI, displays various biases.
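To make the “skewed learning material” point concrete, here’s a minimal sketch in Python. Everything in it — the groups, the single numeric “feature”, the numbers — is invented purely for illustration: a toy classifier learns one decision rule from a training set dominated by one group, and its accuracy ends up noticeably worse on the group it rarely saw.

```python
import random

random.seed(0)

# Toy, invented data: each "photo" is a single numeric feature plus a label.
# Group "A" dominates the training set; group "B" is under-represented,
# and its features are distributed slightly differently.
def make_photo(group):
    label = random.choice(["cat", "dog"])
    base = 0.0 if label == "cat" else 1.0
    shift = 0.0 if group == "A" else 0.35  # group B "looks" a bit different
    feature = base + shift + random.gauss(0, 0.3)
    return {"group": group, "feature": feature, "label": label}

train = [make_photo("A") for _ in range(950)] + [make_photo("B") for _ in range(50)]
test = [make_photo("A") for _ in range(500)] + [make_photo("B") for _ in range(500)]

# "Training": pick the single threshold that minimises errors on the
# training set -- which inevitably means fitting the majority group.
def errors(threshold, photos):
    return sum((p["feature"] > threshold) != (p["label"] == "dog") for p in photos)

threshold = min((t / 100 for t in range(-50, 200)), key=lambda t: errors(t, train))

# Evaluation: accuracy is worse for the group the model rarely saw.
for group in ("A", "B"):
    subset = [p for p in test if p["group"] == group]
    accuracy = 1 - errors(threshold, subset) / len(subset)
    print(f"group {group}: accuracy {accuracy:.1%}")
```

Nothing in the algorithm “hates” group B; the gap appears simply because the model optimised for the examples it was given.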
Multiple types of bias
So the data can be biased. But that’s not the only process through which humans can “leak” biases into algorithms.
Bots like ChatGPT are “taught” with the aid of a technique called reinforcement learning from human feedback (RLHF). This technique is a bit like training a dog: you give the algorithm “treats” when it does well, and when it does something you don’t want it to do, you tell it off. It’s a way of shaping behavior by having human supervisors guide the algorithm, particularly when it comes to interpreting complex topics.
This process has an enormous influence on how the AI ends up behaving. In fact, in a recent podcast, OpenAI CEO Sam Altman said this is the bias that concerns him the most.
“The bias I’m most nervous about is the bias of the human feedback raters.” When asked, “Is there something to be said about the employees of a company affecting the bias of the system?” Altman responded by saying, “One hundred percent,” noting the importance of avoiding the “groupthink” bubbles in San Francisco (where OpenAI is based) and in the field of AI.
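To see how rater preferences can end up baked into a bot’s behavior, here’s a deliberately oversimplified sketch — a toy bandit-style loop, nothing like OpenAI’s actual pipeline, with the response styles and rater preferences entirely made up. The bot tries different styles of answer, “raters” give a thumbs up or down, and the bot drifts toward whatever the raters happened to reward.

```python
import random

random.seed(1)

# Toy "chatbot" that can answer a prompt in one of three styles.
styles = ["cautious", "confident", "dismissive"]
scores = {s: 0.0 for s in styles}  # total reward received per style
counts = {s: 0 for s in styles}    # how often each style was tried

# Hypothetical rater pool that systematically rewards confident-sounding
# answers -- this stands in for the bias of the human feedback raters.
def rater_feedback(style):
    preference = {"cautious": 0.5, "confident": 0.9, "dismissive": 0.1}
    return 1 if random.random() < preference[style] else 0  # thumbs up / down

def pick_style(explore=0.1):
    # Mostly exploit what raters liked so far, occasionally explore.
    if random.random() < explore or not any(counts.values()):
        return random.choice(styles)
    return max(styles, key=lambda s: scores[s] / max(counts[s], 1))

# "Training loop": reinforcement from human feedback, heavily simplified.
for _ in range(2000):
    style = pick_style()
    reward = rater_feedback(style)
    counts[style] += 1
    scores[style] += reward

# The bot now overwhelmingly prefers whatever the raters rewarded.
for s in styles:
    rate = scores[s] / max(counts[s], 1)
    print(f"{s:10s} chosen {counts[s]:4d} times, average rating {rate:.2f}")
```

If the raters share a worldview, the bot learns that worldview — which is exactly the “groupthink” risk Altman describes.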
Other algorithms are trained with different methods, and some are less vulnerable to bias at this stage. But even those can fall prey to a kind of tunnel vision, producing very similar kinds of output at the cost of diversity and variety.
“We are essentially projecting a single worldview out into the world, instead of representing diverse kinds of cultures or visual identities,” said Sasha Luccioni, a research scientist at AI startup Hugging Face who co-authored a study of bias in text-to-image generative AI models.
Tackling bias in AI
You might think, “So, a software bot gets it wrong once in a while. Big deal, right?” But it’s not as simple as an occasional error. These biases can lead to critical issues. Imagine an AI hiring system that prefers male candidates over female ones because it was trained on past hiring data in which mostly men were hired. Or consider a loan application algorithm that discriminates based on an applicant’s ethnicity because of historical lending data. The stakes, as you can see, are high, and the more we come to rely on algorithms, the higher they get.
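For what it’s worth, this kind of skew is something auditors can actually measure. Below is a small sketch of one common check — all applicants and decisions here are invented — which compares a model’s selection rates across groups and computes a disparate-impact ratio; a value far below roughly 0.8 is usually treated as a red flag under the “four-fifths” rule of thumb.

```python
from collections import Counter

# Hypothetical screening decisions produced by a hiring model.
# Each record: (applicant_group, model_said_hire)
decisions = [
    ("men",   True),  ("men",   True),  ("men",   False), ("men",   True),
    ("men",   True),  ("men",   False), ("men",   True),  ("men",   True),
    ("women", False), ("women", True),  ("women", False), ("women", False),
    ("women", False), ("women", True),  ("women", False), ("women", False),
]

totals = Counter(group for group, _ in decisions)
hires = Counter(group for group, hired in decisions if hired)

# Selection rate per group: share of applicants the model would hire.
rates = {g: hires[g] / totals[g] for g in totals}
for g, r in rates.items():
    print(f"{g}: selected {r:.0%} of applicants")

# Disparate-impact ratio: selection rate of the least-favoured group
# divided by that of the most-favoured group.
ratio = min(rates.values()) / max(rates.values())
print(f"disparate-impact ratio: {ratio:.2f}")
```

Checks like this don’t fix the underlying data, but they make the problem visible — which is where the fixes described next begin.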
The good news is that we’re aware of the problem — and being aware is the first step to finding a solution. Researchers are now focusing on creating more inclusive data sets. They’re examining AI algorithms with a magnifying glass, pinpointing sources of bias, and rectifying them. Groups worldwide are pushing for ‘ethical AI’, championing systems that serve all of humanity fairly.
But hard, stringent regulation is lacking so far. As is so often the case, technology evolves faster than we can regulate it, and the emphasis has largely been on self-monitoring and self-correction. Yet it’s essential that we, as a society, don’t just wing it, but come up with some actual solutions. AI is poised to play an ever bigger role in modern life, and it has never mattered more that we figure out how to keep it from causing more problems than it solves.
So, does AI have biases? The answer is yes, but not necessarily inherently. The biases mostly come from the data it learns from, and that data is a product of our actions and histories. Humans are biased, and we tend to perpetuate this.
Ultimately, AI is a tool—a mirror reflecting back at us. If we provide it with a clear and unbiased reflection, it will amplify our best qualities. But if we feed it our flaws and biases, it will magnify those as well.
Bringing positive change won’t be easy. But understanding the issue and advocating for change is the first step. In time, we can ensure that the AI of the future is as fair, unbiased, and beneficial as possible.
We’d be wise to do so.