Nvidia’s new Llama-3.1 Nemotron Ultra outperforms DeepSeek R1 at half the size

Nvidia has released its open-source Llama-3.1-Nemotron-Ultra-253B large language model (LLM), which is designed to support advanced reasoning, instruction following and AI assistant workflows.
The company customised the model’s architecture through a neural architecture search (NAS) process, producing a network with skipped attention layers, fused feedforward networks (FFNs) and variable FFN compression ratios to improve its efficiency.
After a multi-phase post-training pipeline that included reinforcement learning, the Nemotron Ultra LLM outperformed Meta’s Llama 4 models and the rival DeepSeek R1 model in many benchmark categories.
Released under the Nvidia Open Model Licence, Llama-3.1-Nemotron-Ultra-253B is now available for commercial use.
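
To make the architectural changes concrete, the sketch below counts weights in a toy transformer stack where some blocks skip attention and each block has its own FFN compression ratio. It is an illustration only, not Nvidia's implementation: the names `BlockConfig`, `block_params` and `stack_params`, the toy sizes and the per-block settings are all assumptions, the blocks are simplified to plain multi-head attention plus a two-matrix FFN, and FFN fusion is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockConfig:
    """Hypothetical per-block setting that a NAS procedure might choose."""
    has_attention: bool  # False means the attention layer is skipped
    ffn_ratio: float     # FFN hidden width as a fraction of the baseline


def block_params(cfg: BlockConfig, d_model: int, base_ffn_dim: int) -> int:
    """Approximate weight count for one block (biases and norms ignored)."""
    # Simplified attention: Q, K, V and output projections, each d_model x d_model.
    attn = 4 * d_model * d_model if cfg.has_attention else 0
    # Simplified FFN: up and down projections with a compressed hidden width.
    ffn_dim = int(base_ffn_dim * cfg.ffn_ratio)
    ffn = 2 * d_model * ffn_dim
    return attn + ffn


def stack_params(blocks, d_model: int, base_ffn_dim: int) -> int:
    """Total approximate weight count across all blocks."""
    return sum(block_params(b, d_model, base_ffn_dim) for b in blocks)


# Toy sizes, chosen only to keep the arithmetic readable.
D_MODEL, BASE_FFN = 1024, 4096

# Baseline: every block keeps attention and a full-width FFN.
uniform = [BlockConfig(True, 1.0)] * 4

# Made-up heterogeneous stack: two blocks skip attention, two shrink the FFN.
searched = [
    BlockConfig(True, 1.0),
    BlockConfig(False, 0.5),
    BlockConfig(True, 0.25),
    BlockConfig(False, 1.0),
]

if __name__ == "__main__":
    base = stack_params(uniform, D_MODEL, BASE_FFN)
    slim = stack_params(searched, D_MODEL, BASE_FFN)
    print(f"uniform stack:  {base:,} weights")
    print(f"searched stack: {slim:,} weights ({slim / base:.1%} of baseline)")
```

In this toy setting the heterogeneous stack keeps 62.5% of the baseline weights. The figure says nothing about the real model; it only shows how per-block choices of this kind add up to a smaller network.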
