Summary

  • The Allen Institute for AI (AI2) has released OLMoTrace, an open-source tool that lets users trace large language model (LLM) outputs back to their training data.
  • Developed in response to enterprise concerns about the lack of transparency in AI decision-making, the tool provides a direct link between model outputs and the original training data.
  • OLMoTrace differs from existing approaches such as source citations or retrieval-augmented generation (RAG): instead, it matches long, unique text sequences in model outputs against specific documents in the training datasets.
  • The tool can be used to fact-check model outputs, debug models, and demonstrate compliance in highly regulated industries, as well as to build trust in AI systems.
  • AI2 believes OLMoTrace could become an essential component of AI stacks as algorithmic transparency becomes mandated in regulated industries.
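The matching approach described above can be illustrated with a small sketch: find long, exact token spans shared between a model's output and documents in a corpus. This is a toy illustration only; the function name, parameters, and greedy span-extension strategy are assumptions for demonstration, and the real tool works over an indexed full-scale training corpus rather than a naive scan.

```python
def find_matches(output_tokens, corpus_docs, min_len=6):
    """Return (start, end, doc_id) spans where output_tokens[start:end]
    appears verbatim in corpus_docs[doc_id].

    Toy sketch only: scans each document linearly; a production system
    would use a precomputed index over the training data.
    """
    matches = []
    n = len(output_tokens)
    for doc_id, doc_tokens in enumerate(corpus_docs):
        doc_text = " ".join(doc_tokens)
        start = 0
        while start + min_len <= n:
            end = start + min_len
            if " ".join(output_tokens[start:end]) in doc_text:
                # Greedily extend the exact match as far as possible.
                while end < n and " ".join(output_tokens[start:end + 1]) in doc_text:
                    end += 1
                matches.append((start, end, doc_id))
                start = end
            else:
                start += 1
    return matches
```

A matched span can then be highlighted in the model's response and linked to the source document, which is the kind of output-to-training-data connection the tool surfaces.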

By Sean Michael Kerner