Apple Researchers Introduce Matrix3D, a Unified AI Model That Can Turn 2D Photos Into 3D Objects

Matrix3D can perform several photogrammetry subtasks, including pose estimation, depth prediction, and novel view synthesis.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 14 May 2025 17:41 IST
Highlights
  • Matrix3D utilises a multimodal diffusion transformer (DiT)
  • The model was developed in partnership with Nanjing University and HKUST
  • It is an open-source model available for download on GitHub

Researchers said that Matrix3D was trained using the masked learning technique

Photo Credit: Reuters

Apple researchers have released a new artificial intelligence (AI) model that can generate 3D views from multiple 2D images. The model, dubbed Matrix3D, was developed by the company's Machine Learning team in collaboration with Nanjing University and the Hong Kong University of Science and Technology (HKUST). The Cupertino-based tech giant has made the AI model available to the open community, and it can be downloaded via Apple's listing on GitHub. With Matrix3D, the researchers have unified the 3D generation pipeline into a single model, with the aim of reducing the risk of errors introduced when separate components are chained together.

Apple's Matrix3D Innovates Multi-Task Photogrammetry

In a post, the tech giant detailed the research that went into the development of Matrix3D. While several 3D rendering models already exist, this one stands out by unifying the pipeline used to create 3D views. Instead of relying on multiple models and components, a single model performs several photogrammetry subtasks, such as pose estimation, depth prediction, and novel view synthesis.

Photogrammetry is the technique of obtaining accurate measurements and 3D information about physical objects and environments by analysing images. It is commonly used to create maps, 3D models, and measurements from 2D images taken from different angles.


The researchers have also published a paper about the new model on the preprint server arXiv. According to the researchers, Matrix3D is based on a multimodal diffusion transformer (DiT) architecture. It can integrate data across multiple modalities, such as image data, camera parameters, and depth maps.


In the paper, Apple researchers highlight that the model was trained using a masked learning strategy, in which part of the image is obstructed and the AI model is trained to predict the pixels that fill the gap.
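To make the idea concrete, here is a minimal Python sketch of masked learning on a toy grayscale image. It is not Apple's training code: the function names `mask_patches` and `masked_mse`, the patch size, and the masking ratio are all assumptions chosen for illustration. The sketch hides random square patches and then scores a prediction only on the hidden pixels, which is the general shape of a masked reconstruction objective.

```python
import random


def mask_patches(image, patch, ratio, seed=0):
    """Hide a random subset of patch x patch squares in a 2D grayscale image.

    Hypothetical helper for illustration only.
    Returns (masked_image, mask); mask[r][c] is True where a pixel was hidden.
    """
    rows, cols = len(image), len(image[0])
    # Top-left corner of every non-overlapping patch.
    corners = [(r, c) for r in range(0, rows, patch) for c in range(0, cols, patch)]
    rng = random.Random(seed)
    hidden = rng.sample(corners, round(len(corners) * ratio))

    masked = [row[:] for row in image]
    mask = [[False] * cols for _ in range(rows)]
    for r0, c0 in hidden:
        for r in range(r0, min(r0 + patch, rows)):
            for c in range(c0, min(c0 + patch, cols)):
                masked[r][c] = 0.0  # obstruct the pixel
                mask[r][c] = True
    return masked, mask


def masked_mse(prediction, target, mask):
    """Mean squared error over the hidden pixels only (0.0 if none are hidden)."""
    errors = [
        (p - t) ** 2
        for prow, trow, mrow in zip(prediction, target, mask)
        for p, t, m in zip(prow, trow, mrow)
        if m
    ]
    return sum(errors) / len(errors) if errors else 0.0


# Toy 4x4 "image": hide half of its 2x2 patches.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
masked, mask = mask_patches(image, patch=2, ratio=0.5)

# A model would be trained to push this loss towards zero on the hidden region.
loss_if_untrained = masked_mse(masked, image, mask)
loss_if_perfect = masked_mse(image, image, mask)
```

In a real training loop, the prediction would come from the model rather than from the masked input, and the loss would be minimised over many images.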

The researchers found that the model can generate an entire 3D object or scene view from just three images taken from different angles. While the dataset used to train the model was not disclosed, the model itself is available to download, modify, and redistribute via a permissive Apple licence on the company's GitHub listing.

 
