These new AI benchmarks could help make models less biased
1 min read
Summary
Stanford University researchers have published a paper proposing eight new benchmarks to reduce bias in AI.
Current measures used to assess bias, such as Anthropic’s DiscrimEval, focus on a model’s responses to decision-making scenarios involving varied demographic information and analyse them for discriminatory patterns.
However, the Stanford team said models such as Google’s Gemma-2 9b and OpenAI’s GPT-4 score highly on these measures but often display inaccuracies or inappropriate content.
The proposed benchmarks look at two dimensions: descriptive, which involves objective questions and metrics, and normative, which are more subjective value-based assessments.
It comes as concerns grow surrounding AI and ethics, with some arguing that machines will never be truly objective.