A recent study found that artificial intelligence models give varied and often contradictory answers to questions on polarizing topics. The research, presented at the 2024 ACM Fairness, Accountability, and Transparency (FAccT) conference, was conducted by scientists from Carnegie Mellon University, the University of Amsterdam, and the AI-focused startup Hugging Face. The study tested several open text-analyzing models, including Meta's Llama 3, on their responses to questions about LGBTQ+ rights, social welfare, surrogacy, and other controversial topics.
The researchers found that the models often give inconsistent answers, reflecting the biases in the data they were trained on. "Our experiments showed significant differences in how models from different regions handle sensitive issues," noted Giada Pistilli, principal ethicist at Hugging Face and a co-author of the study. "Our research shows that the values embedded in the models can vary greatly depending on culture and language."
Text-analyzing models, like all generative AI models, are statistical probability machines. Drawing on a vast number of examples, they predict which output is most likely to fit a given context. If the examples are biased, the models will be biased too, and that bias will show up in their responses.
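To make the point about skewed examples concrete, here is a minimal sketch in Python. It is not how the models in the study work internally; it is a toy word-pair counter, and the corpus, function names, and numbers are all invented for illustration. It only shows that a predictor built from counts reproduces whatever imbalance its examples contain.

```python
from collections import Counter, defaultdict


def train_bigram_counts(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probabilities(counts, prev):
    """Turn the raw counts for one context word into probabilities."""
    total = sum(counts[prev].values())
    if total == 0:
        return {}  # context word never seen during training
    return {word: n / total for word, n in counts[prev].items()}


# Hypothetical, deliberately skewed toy corpus (not data from the study).
corpus = [
    "policy is good",
    "policy is good",
    "policy is good",
    "policy is bad",
]

counts = train_bigram_counts(corpus)
probs = next_word_probabilities(counts, "is")
print(probs)  # {'good': 0.75, 'bad': 0.25}
```

Because three of the four toy sentences end in "good", the predictor assigns that word a probability of 0.75. Nothing in the code prefers one word over the other; the preference comes entirely from the examples.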
The study tested five models (Mistral 7B from Mistral, Command-R from Cohere, Qwen from Alibaba, Gemma from Google, and Llama 3 from Meta) using a dataset containing questions and statements on topics such as immigration, LGBTQ+ rights, and disability rights. The models were given questions and statements in different languages, including English, French, Turkish, and German, to identify linguistic biases.
Questions about LGBTQ+ rights elicited the highest number of refusals, meaning cases in which the models declined to answer. Immigration, social welfare, and disability rights also led to a significant number of refusals.
Some models are more likely than others to refuse to answer sensitive questions. For example, Qwen had four times as many refusals as Mistral. Pistilli believes this difference stems from the distinct approaches to model development at Alibaba and Mistral. "These refusals are influenced by both the explicit and implicit values of the models, as well as the decisions made by the organizations developing these models," she explained. "Our research highlights the importance of considering cultural differences when using AI models."
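A comparison of refusals across models comes down to simple counting. The sketch below is one way such a tally could be computed in Python; it is not the study's code. The model names, records, and `refusal_rates` function are hypothetical, and it assumes each response has already been labeled as a refusal or not.

```python
from collections import defaultdict


def refusal_rates(records):
    """Return {model: fraction of prompts the model refused to answer}."""
    asked = defaultdict(int)
    refused = defaultdict(int)
    for model, _topic, was_refused in records:
        asked[model] += 1
        if was_refused:
            refused[model] += 1
    return {model: refused[model] / asked[model] for model in asked}


# Hypothetical labeled responses: (model, topic, refused?).
# These are invented values, not results from the study.
records = [
    ("model_a", "immigration", True),
    ("model_a", "social welfare", True),
    ("model_a", "disability rights", False),
    ("model_a", "surrogacy", False),
    ("model_b", "immigration", True),
    ("model_b", "social welfare", False),
    ("model_b", "disability rights", False),
    ("model_b", "surrogacy", False),
]

rates = refusal_rates(records)
ratio = rates["model_a"] / rates["model_b"]
print(rates)  # {'model_a': 0.5, 'model_b': 0.25}
print(ratio)  # 2.0
```

In this invented example, model_a refuses twice as often as model_b. The hard part in practice is the step the sketch takes for granted: deciding which responses count as refusals in the first place.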
In some cases, political pressure can influence the models' responses. A BBC report published in September indicated that the chatbot Ernie from Chinese company Baidu avoids questions on controversial topics such as Tibetan oppression and the Tiananmen Square events. In China, the internet regulator requires that generative AI services reflect "core socialist values."
Moreover, differences in model responses may reflect the biases of annotators, the people who label the data for training the models. These annotators can introduce their own biases into the annotation process, which then affects the models' responses.
The study found that different AI models express opposing views on topics such as asylum for immigrants in Germany and LGBTQ+ rights in Italy. For example, when asked to evaluate a statement about the rights of Turkish citizens in Germany, the models diverged: Command-R said the statement was not true, Gemma refused to answer, and Llama 3 agreed with it.
Pistilli emphasized the importance of understanding the cultural differences inherent in AI models and urged researchers to thoroughly test their models before deploying them. She highlighted the need for comprehensive assessments of the social impact of models, going beyond traditional statistical metrics. This will help create fairer and more effective AI models.