Anthropic can now track the bizarre inner workings of a large language model
Summary
A research scientist at AI firm Anthropic has revealed more about how large language models (LLMs) work through a technique called circuit tracing, which allows an LLM's decision-making process to be tracked.
The findings show that different components of an LLM operate independently of any particular language, with the choice of output language applied only once the answer has been decided.
It was also discovered that LLMs can look ahead while computing an answer, contradicting the assumption that they simply generate text one word at a time.
Furthermore, the analysis showed how LLMs can be induced to hallucinate, the circumstances under which they do so, and why they occasionally offer incorrect information.
The work provides further evidence of the need for more in-depth study of how LLMs work and perform, and of how they affect the environments in which they are used.
The full research papers can be accessed here and here.