Nvidia CEO Jensen Huang said last year that we are now entering the era of physical AI. While the company continues to offer LLMs for software use cases, Nvidia is increasingly positioning itself as a provider of AI models for fully AI-powered systems, including agentic AI in the physical world.
At CES 2026, Nvidia announced a series of new models designed to push AI agents beyond chat interfaces and into physical environments.
Nvidia launched Cosmos Reason 2, the latest version of its vision language model designed for embodied reasoning. Cosmos Reason 1, released last year, introduced a two-dimensional ontology for embodied reasoning and currently leads Hugging Face’s leaderboard for physical reasoning on video.
Cosmos Reason 2 builds on the same ontology while giving businesses more flexibility to customize applications and allowing physical agents to plan their next actions, in the same way that software agents reason through digital workflows.
Nvidia also released a new version of Cosmos Transfer, a model that lets developers generate training simulations for robots.
Other vision language models, such as Google’s PaliGemma and Mistral’s Pixtral Large, can process visual input, but not all commercially available VLMs support reasoning.
“Robotics is at an inflection point. We are moving from specialized robots limited to single tasks to generalist systems,” said Kari Briski, Nvidia’s vice president for generative AI software, during a briefing with reporters. She was referring to robots that combine broad foundational knowledge with deep task-specific skills. “These new robots combine broad fundamental knowledge with deep skills for complex tasks.”
She added that Cosmos Reason 2 “enhances the reasoning skills that robots need to navigate the unpredictable physical world.”
Briski noted that Nvidia’s roadmap follows “the same asset model across all of our open models.”
“To create specialized AI agents, a digital workforce, or the physical embodiment of AI in robots and autonomous vehicles requires more than just a model,” Briski said. “First, AI needs computational resources to train and simulate the world around it. Data is the fuel that allows AI to learn and improve and we contribute to the world’s largest collection of open and diverse datasets, going beyond just opening model weights. Open libraries and training scripts give developers the tools to create AI specifically for their applications, and we publish blueprints and examples to help deploy AI as model systems.”
The company now offers open models across these branches: Cosmos for physical AI, the Gr00t open-reasoning vision-language-action (VLA) model for robotics, and its Nemotron models for agentic AI.
Nvidia argues that open models in different branches of AI form a shared enterprise ecosystem that feeds data, training and reasoning to agents in the digital and physical worlds.
Briski said Nvidia plans to continue expanding its open models, including its Nemotron family, beyond reasoning to include new RAG and embedding models that make information more easily accessible to agents. The company released Nemotron 3, the latest version of its agentic reasoning models, in December.
Nvidia announced three new additions to the Nemotron family: Nemotron Speech, Nemotron RAG and Nemotron Safety.
In a blog post, Nvidia said Nemotron Speech offers “low-latency, real-time speech recognition for live captions and voice AI applications” and is 10 times faster than other speech models.
Nemotron RAG is actually composed of two models: an embedding model and a reranking model, both of which can understand images, giving data agents richer multimodal information to draw on.
“Nemotron RAG is at the top of what we call MMTEB, or the Massive Multilingual Text Embedding Benchmark, with strong multilingual performance while using less memory and computing power, so the models are ideally suited for systems that need to process a large number of queries very quickly and at low latency,” Briski said.
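The two-model split Briski describes follows the standard retrieve-then-rerank pattern: a fast embedding model narrows a large corpus to a handful of candidates, and a slower, finer-grained reranking model orders those survivors. The sketch below illustrates that pattern only; the toy bag-of-words scoring functions stand in for real models, and none of the names reflect Nemotron RAG's actual API.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real pipeline would call an
    # embedding model here (e.g. the Nemotron RAG embedding model).
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    # Stage 1: cheap similarity search over the whole corpus, keep top-k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def rerank(query, candidates):
    # Stage 2: re-score only the candidates with a finer-grained relevance
    # function; in production this is the reranking model.
    q_terms = set(query.lower().split())
    overlap = lambda d: len(q_terms & set(d.lower().split())) / len(q_terms)
    return sorted(candidates, key=overlap, reverse=True)

docs = [
    "Robots use vision language models to reason about scenes.",
    "The cafeteria menu changes every Tuesday.",
    "Embedding models map text to vectors for retrieval.",
]
query = "vision language models for robots"
top = rerank(query, retrieve(query, docs))
print(top[0])
```

Keeping the expensive reranking step off the full corpus is what makes the split suitable for the high-throughput, low-latency workloads Briski mentions.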
Nemotron Safety detects sensitive data so that AI agents do not accidentally leak personally identifiable information.
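Filters of this kind typically sit between the agent and its output channel, redacting anything sensitive before a reply leaves the system. A minimal, hypothetical sketch with regular expressions standing in for a learned safety model (the patterns and names are illustrative, not Nemotron Safety's actual method):

```python
import re

# Illustrative patterns only; a learned safety model detects PII far more
# robustly than a short regex list can.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text):
    # Replace every match of a known PII pattern before the agent replies.
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()} REDACTED]", text)
    return text

print(redact("Contact jane.doe@example.com or 555-867-5309."))
```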