New research suggests that chatbots consistently reproduce the systemic biases embedded in the human data they’re trained on, including gender discrimination, reports The Next Web (TNW).
The preprint study aimed to assess how the increasing personalisation of large language models (LLMs) like ChatGPT may affect a diverse and growing user base. Growing reliance on these tools for everyday tasks raises concerns about biases buried in the models, which are trained on vast quantities of data scraped from the internet.
Researchers tested five popular LLMs – ChatGPT (OpenAI), Claude (Anthropic), Llama (Meta), Mixtral (Mistral AI) and Qwen (Alibaba Cloud) – for bias across several scenarios, including gender. Presenting pairs of user profiles that were identical except for gender, one male and one female, the researchers asked the models to suggest a target salary for an upcoming negotiation, with striking results. ChatGPT’s o3 model, for example, recommended US$280,000 for the female applicant and US$400,000 for the male.
“The difference in the prompts is two letters; the difference in the advice is $120K a year,” co-author Ivan Yamshchikov, a professor of AI and robotics at the Technical University of Würzburg-Schweinfurt (THWS) in Germany, told TNW.
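The underlying approach is a paired-prompt audit: send a model two prompts that differ only in the stated gender and compare the salary figures that come back. Below is a minimal sketch of that idea in Python using the OpenAI SDK; the model name, profile text and prompt wording are illustrative assumptions, not the study’s actual materials.

```python
# Minimal sketch of a paired-prompt audit: two prompts that differ only in
# the stated gender, compared on the salary the model suggests.
# Model name, profile text and question are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROFILE = (
    "I am a {gender} senior software engineer with 10 years of experience, "
    "interviewing for a staff role at a large tech company."
)
QUESTION = "What starting salary should I ask for in the negotiation? Give one number."

def suggested_salary(gender: str, model: str = "gpt-4o-mini") -> str:
    """Ask the model for a target salary, varying only the stated gender."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "user", "content": PROFILE.format(gender=gender) + " " + QUESTION}
        ],
    )
    return response.choices[0].message.content

for gender in ("male", "female"):
    print(gender, "->", suggested_salary(gender))
```

A single pair of responses proves little on its own; an audit along these lines would need many repetitions per profile and profession to separate systematic gaps, like those the researchers report, from run-to-run variation.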
The recommended pay gaps varied across industries, with the most pronounced differences in law and medicine, followed by business administration and engineering. Social sciences was the only field where the models offered near-identical advice to men and women.
The researchers also tested other areas, such as career choices, goal-setting and behavioural advice. The models consistently offered different responses based on the user’s gender, even when qualifications and prompts were otherwise identical.
While the paper has yet to undergo peer review, it is far from the only example of AI reinforcing systemic biases. From systematically downgrading female candidates in hiring, to recommending more medical care for white patients, to disproportionately labelling Black defendants as likely to reoffend, AI has repeatedly been shown to carry the same biases as the data used to train it.
Despite years of headlines like these, many people still perceive AI as objective, which makes its persistent biases all the more dangerous. Technical fixes alone, the THWS researchers argue, will not solve the problem. They see clear ethical standards, independent review processes and greater transparency in how models are developed and deployed as crucial to rooting out bias.