OpenAI's o3 Model Claims Human-Level Intelligence on Benchmark, But It Might Not Be That Smart

OpenAI’s o3 AI model scored 85 percent on the ARC-AGI benchmark, matching the average human score.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 3 January 2025 18:33 IST
Highlights
  • The previous best score by an AI model was 55 percent
  • OpenAI has not shared details about the model architecture
  • The ARC-AGI test includes a series of pattern-based IQ questions

The o3 AI model is currently available in early access to external testers

OpenAI unveiled the reasoning-focused o3 series of artificial intelligence (AI) models last month. During a live stream, the company shared the model's benchmark scores from internal testing. While all of the shared scores were impressive and highlighted the improved capabilities of the successor to o1, one stood out. On the ARC-AGI benchmark, the large language model (LLM) scored 85 percent, beating the previous best score of 55 percent by 30 percentage points. Interestingly, this score is also on par with the average human score on the test.

OpenAI Scores 85 Percent on ARC-AGI Benchmark

However, does o3's high score on the test mean that its intelligence is equal to that of an average human? This would be easier to answer if the AI model were publicly released and we could test it ourselves. Since OpenAI has not disclosed anything about the model's architecture, training techniques, or datasets, it is difficult to claim anything conclusively.

There are certain things we do know about the AI firm's reasoning-focused models that can help us understand what to expect from OpenAI's upcoming LLM. First, the o-series models so far do not appear to involve a major overhaul in architecture or framework; instead, they are fine-tuned to showcase enhanced capabilities.

For instance, with the o1 series of AI models, developers used a technique called test-time compute. With this, the models were given additional processing time to spend on a question and a workspace to test theories and correct any mistakes. Similarly, the GPT-4o model was just a fine-tuned version of GPT-4.
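The idea behind test-time compute can be illustrated with a toy sketch. This is not OpenAI's method, which has not been disclosed; the function name `solve_with_budget`, the `drafts` sequence, and the `check` function are all hypothetical stand-ins for a model drafting answers and verifying them in its workspace.

```python
from typing import Callable, Iterable, Optional, Tuple


def solve_with_budget(
    drafts: Iterable[int],
    check: Callable[[int], bool],
    budget: int,
) -> Tuple[Optional[int], int]:
    """Try draft answers one by one until one passes the check
    or the attempt budget runs out.

    Returns (answer or None, number of attempts used).
    """
    used = 0
    for candidate in drafts:
        if used >= budget:
            break  # out of "thinking time"
        used += 1
        if check(candidate):
            return candidate, used
    return None, used


# Hypothetical question: which whole number squared gives 49?
def is_root_of_49(x: int) -> bool:
    return x * x == 49


# A small budget gives up before reaching the answer.
print(solve_with_budget(range(51), is_root_of_49, budget=3))
# A larger budget leaves room to discard wrong drafts and find it.
print(solve_with_budget(range(51), is_root_of_49, budget=10))
```

The same solver fails with three attempts and succeeds with ten, which is the rough intuition: nothing about the underlying "model" changes, only how long it is allowed to work on the question.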

It is unlikely that the company would have made major changes to the architecture with the o3 model, given that it is also rumoured to be working on the GPT-5 AI model, which could be launched later this year.

The ARC-AGI (Abstraction and Reasoning Corpus for Artificial General Intelligence) benchmark features a series of grid-based pattern recognition questions that require reasoning and spatial understanding to solve. In principle, a model could be trained to handle such questions with a large, high-quality dataset focused on reasoning and aptitude-based logic.

However, if this were that simple, older AI models would have scored high on the test as well. Notably, the previous highest score was 55 percent, as opposed to o3's 85 percent. This suggests that the developers have added new refinement techniques and algorithms to enhance the reasoning capabilities of the model. The full extent of these changes cannot be stated unless OpenAI officially reveals the technical details.
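To give a sense of what a grid-based pattern question looks like, here is a simplified sketch in the spirit of ARC-AGI. The grids, the two candidate rules, and the `infer_rule` helper are invented for illustration and are not taken from the actual benchmark, whose puzzles are far more varied.

```python
from typing import Callable, Dict, List, Optional, Tuple

Grid = List[List[int]]  # each number stands for a cell colour


def mirror(grid: Grid) -> Grid:
    """Flip every row left to right."""
    return [row[::-1] for row in grid]


def transpose(grid: Grid) -> Grid:
    """Swap rows and columns."""
    return [list(col) for col in zip(*grid)]


# Hypothetical, tiny set of rules a solver might consider.
CANDIDATE_RULES: Dict[str, Callable[[Grid], Grid]] = {
    "mirror": mirror,
    "transpose": transpose,
}


def infer_rule(examples: List[Tuple[Grid, Grid]]) -> Optional[str]:
    """Return the first rule that explains every example pair."""
    for name, rule in CANDIDATE_RULES.items():
        if all(rule(inp) == out for inp, out in examples):
            return name
    return None


# Two worked examples: input grid and expected output grid.
examples = [
    ([[1, 0, 0], [0, 2, 0]], [[0, 0, 1], [0, 2, 0]]),
    ([[3, 3, 0], [0, 0, 4]], [[0, 3, 3], [4, 0, 0]]),
]
test_input = [[5, 0], [0, 6]]

rule_name = infer_rule(examples)
if rule_name is not None:
    print(rule_name, CANDIDATE_RULES[rule_name](test_input))
```

Here the solver only has to pick between two known rules. In the real test, the rule behind each puzzle is new and must be worked out from a handful of examples, which is why a high score is hard to reach with memorised patterns alone.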

That being said, it is unlikely that the o3 AI model has reached AGI or human-level intelligence. First, if that were the case, it would mark the end of the company's partnership with Microsoft, which is slated to end once OpenAI's models reach AGI. Second, many AI experts, including Geoffrey Hinton, the godfather of AI, have repeatedly said that we are multiple years away from reaching AGI.

Finally, AGI is such a big accomplishment that if OpenAI had reached that milestone, it would have said so explicitly instead of dropping subtle hints. What is far more likely is that OpenAI has found a way to improve the pattern-based reasoning capabilities of the o3 model (either by adding enough sample data or by tweaking the training methods), as also highlighted in a PTI report.

However, this improvement is likely very isolated and does not mean an increase in the overall intelligence level of the model.

 
