Nvidia’s Cosmos-Transfer1 is a diffusion-based conditional world model for multimodal controllable world generation.
Photo Credit: Nvidia

It can be used to create training environments for physical AI, such as robots and autonomous vehicles.

Nvidia released a new artificial intelligence (AI) model last week that can be used to train robots in simulation. Dubbed Cosmos-Transfer1, the new world generation model is aimed at AI-powered robotics hardware, also known as physical AI. The company has released the model as open source under a permissive licence, and interested individuals can download it from popular online repositories. The Santa Clara-based tech giant highlighted that the main advantage of the latest AI model is that users will have granular control over the generated simulations.
Simulation-based robotics training has gained momentum in recent times due to advances in generative AI technology. This specific branch of robotics deals with hardware that uses an AI for its brain. Essentially, the training method exposes the machine's brain to various simulated real-world scenarios so that it can handle a wider range of tasks. This is a big improvement over current factory robots, which are designed to complete a single task.
Nvidia's Cosmos-Transfer1 is part of the company's Cosmos Transfer world foundation models (WFMs), which ingest structured video inputs such as segmentation maps, depth maps, and lidar scans to generate photorealistic video outputs. These outputs can then be used as simulated environments to train physical AI.
In a paper published on the preprint server arXiv, the company stated that this model offers greater customisation than its predecessors. It enables varying the weight of different conditional inputs based on spatial location. Essentially, this gives developers fine-grained control over the generated world. Another advantage of the model is real-time world generation, which helps make training sessions faster and more diverse.
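To make the idea of location-dependent weighting concrete, here is a minimal sketch in plain Python. It is not Nvidia's implementation: the function name `blend_controls`, the tiny 2x2 grids, and the rule of rescaling the weights wherever they sum to more than 1 are all assumptions made for illustration. Each control input (for example, depth or segmentation) contributes a feature map, and a separate weight map decides how strongly it applies at each location.

```python
from typing import Dict, List

Grid = List[List[float]]  # a tiny H x W map standing in for one frame of features


def blend_controls(features: Dict[str, Grid], weights: Dict[str, Grid]) -> Grid:
    """Weighted sum of control features, computed separately at every location.

    Assumption for this sketch: where the weights at a location sum to more
    than 1, they are rescaled to sum to 1.
    """
    names = list(features)
    height = len(features[names[0]])
    width = len(features[names[0]][0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = sum(weights[n][y][x] for n in names)
            scale = 1.0 / total if total > 1.0 else 1.0
            out[y][x] = sum(
                scale * weights[n][y][x] * features[n][y][x] for n in names
            )
    return out


# Hypothetical feature maps: depth features are all 1.0, segmentation all 3.0.
features = {
    "depth": [[1.0, 1.0], [1.0, 1.0]],
    "segmentation": [[3.0, 3.0], [3.0, 3.0]],
}

# Depth controls the left column only, segmentation the right column only.
split_weights = {
    "depth": [[1.0, 0.0], [1.0, 0.0]],
    "segmentation": [[0.0, 1.0], [0.0, 1.0]],
}
split_result = blend_controls(features, split_weights)  # [[1.0, 3.0], [1.0, 3.0]]

# Both controls at full weight everywhere: weights are rescaled to 0.5 each.
full_weights = {
    "depth": [[1.0, 1.0], [1.0, 1.0]],
    "segmentation": [[1.0, 1.0], [1.0, 1.0]],
}
full_result = blend_controls(features, full_weights)  # [[2.0, 2.0], [2.0, 2.0]]
```

In the first case, each region of the frame follows a different control input. In the second, both inputs share influence equally across the frame.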
Coming to model specifics, Cosmos-Transfer1 is a diffusion-based model with seven billion parameters. It is designed for video denoising in the latent space and can be modulated by a control branch. The model accepts text and video as input and uses both to generate a photorealistic output video. It supports four types of control input video: Canny edge, blurred RGB, segmentation mask, and depth map.
The AI model has been tested on Nvidia's Blackwell and Hopper series chipsets, with inference run on the Linux operating system. The tech giant has made the AI model available under the Nvidia Open Model License Agreement, which allows both academic and commercial usage.
Nvidia's Cosmos-Transfer1 AI model can be downloaded from the company's GitHub and Hugging Face listings. Another AI model with 14 billion parameters is expected to be released soon.