Summary

  • AI firm Anthropic recently attempted to showcase progress toward “artificial general intelligence” (AGI) with an experiment testing whether its AI agent Claude could play the Pokémon video game.
  • However, the experiment produced mixed results, leading observers to conclude that AGI is still some way off.
  • On the one hand, Claude 3.7 Sonnet, Anthropic’s latest model, made progress in the game, such as collecting multiple in-game Gym Badges, more quickly than previous iterations of Claude.
  • This was because Claude 3.7 had “improved reasoning capabilities” that allowed it to “plan ahead, remember its objectives and adapt when initial strategies fail”.
  • On the other hand, Claude struggled to make consistent progress, often getting stuck and having to revisit towns or talk to the same unhelpful NPCs, prompting observers to question whether AI is ready to surpass humans just yet.

By Kyle Orland
