Summary

  • Patronus AI has launched an MLLM-as-a-judge called Judge-Image, an AI tool that evaluates systems which interpret images and produce text, with the aim of detecting and mitigating hallucinations and other reliability issues.
  • Etsy has already implemented this technology to check the accuracy of captions for product images on its site.
  • Patronus chose Google’s Gemini model over OpenAI’s GPT-4V as the underlying model for its MLLM judge.
  • The company offers a free option that lets users try the platform up to a certain limit; beyond that, customers pay for evaluator usage or can choose enterprise options.
  • The company plans to expand into audio evaluation in the future.

By Michael Nuñez
