What Anthropic Researchers Found After Reading Claude’s ‘Mind’ Surprised Them
Summary
AI needs to be better understood by humans as it edges closer to surpassing human intelligence
New research from Anthropic helps pull back the curtain on the inner workings of AI models
In two new papers, the company demonstrates that it can now trace how large language models link concepts together to drive decision-making. It has used this technique to analyse how the models behave on certain tasks, with some surprising results
Most large language models can reply in multiple languages, but the researchers found that the models first represent concepts using language-independent features and only then select a language in which to respond
The team also discovered that, when prompted to generate the next line of a poem, the models choose a rhyming word for the end of the line first and work backwards from there, suggesting the models carry out a kind of long-term planning that is not yet fully understood