Summary

  • Stanford University researchers have published a paper proposing eight new benchmarks for measuring bias in AI models, intended to help developers make models less biased.
  • Current measures of bias, such as Anthropic’s DiscrimEval, present a model with decision-making scenarios that vary demographic details and analyse its responses for discriminatory patterns.
  • However, the Stanford team said models such as Google’s Gemma-2 9b and OpenAI’s GPT-4 score highly on these measures yet can still produce inaccurate or inappropriate outputs.
  • The proposed benchmarks cover two dimensions: descriptive, which involves objective questions and metrics, and normative, which involves more subjective, value-based assessments.
  • The paper comes amid growing concern over AI and ethics, with some arguing that machines will never be truly objective.

By Scott J Mulligan
