DeepSeek-OCR Open-Source AI Model Changes How AI Models Read and Process Plain Text

DeepSeek-OCR AI model brings a new approach to compressing long context text via optical 2D mapping.

Written by Akash Dutta, Edited by Ketan Pratap | Updated: 21 October 2025 17:21 IST
Highlights
  • The DeepSeek model is currently available on GitHub
  • Within 24 hours of release, it received over 6,000 stars
  • The model turns text into pixels to improve its context memory

[Image: DeepSeek-OCR can compress a 1,000-word article into 100 visual tokens. Photo credit: Reuters]

DeepSeek on Monday released a new open-source artificial intelligence (AI) model that changes how AI models analyse and process plain text. Dubbed DeepSeek-OCR, it uses optical 2D mapping to convert text into pixels, compressing long context into a more manageable size. The AI startup claims that large language models (LLMs) process pixels more efficiently than text, and that the compression lets them capture more relevant information when generating a response. The new approach is also said to produce more accurate results than traditional methods.

DeepSeek-OCR Introduces Novel Technique to Process Text

Based on optical character recognition (OCR) technology, the latest DeepSeek AI model uses a new method to process information. It first converts plain text into images, and then analyses those images to generate responses. The premise is that by reading text as an image, the model can compress and store large chunks of a document in a form that is easier for it to remember and reason over.

At its core, the model introduces “Context Optical Compression,” an approach that turns long pages of text into images and then lets the model convert those images into a highly condensed “vision token” representation, which is much smaller than the usual text-token representation. To illustrate the compression, the developers say that a 1,000-word article could be processed with just 100 vision tokens.
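
The arithmetic behind that claim can be sketched roughly as follows. The figure of about 1.3 text tokens per English word is an assumed rule of thumb, not a number from DeepSeek, and the function name is hypothetical.

```python
# Rough compression-ratio arithmetic for the 1,000-word example.
# ASSUMPTION: ~1.3 text tokens per English word (a common rule of thumb,
# not a figure from DeepSeek); real tokenisers vary.
TOKENS_PER_WORD = 1.3


def compression_ratio(words: int, vision_tokens: int,
                      tokens_per_word: float = TOKENS_PER_WORD) -> float:
    """Return estimated text tokens divided by vision tokens."""
    if vision_tokens <= 0:
        raise ValueError("vision_tokens must be positive")
    return (words * tokens_per_word) / vision_tokens


if __name__ == "__main__":
    ratio = compression_ratio(words=1000, vision_tokens=100)
    print(f"Estimated compression: about {ratio:.0f}x")
```

Under that assumption, the 1,000-word article would need on the order of ten times fewer tokens as an image than as text. The exact ratio depends on the tokeniser and the language of the document.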

The model works in stages. First, a document image is captured. Then a vision encoder, a custom module built by the researchers, analyses the image and breaks it into smaller patches. These patches are then compressed into a smaller number of vision tokens. Finally, a decoder takes the vision tokens and reconstructs the textual meaning.
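
The patch and compression steps can be illustrated with a toy sketch. This is not DeepSeek's encoder: the patch size, the merge factor, the use of simple averaging, and both function names are illustrative assumptions, and the decoder step is left out entirely.

```python
# Toy sketch of the patch -> compress stage. NOT DeepSeek's encoder:
# patch size, merge factor, and mean-pooling are illustrative assumptions.
from typing import List

Image = List[List[float]]  # greyscale pixels, rows x columns


def split_into_patches(image: Image, patch: int) -> List[float]:
    """Cut the image into patch x patch tiles; summarise each tile by its mean."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    features = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tile = [image[r][c]
                    for r in range(top, top + patch)
                    for c in range(left, left + patch)]
            features.append(sum(tile) / len(tile))
    return features


def compress(features: List[float], factor: int) -> List[float]:
    """Merge every `factor` consecutive patch features into one vision token."""
    return [sum(features[i:i + factor]) / len(features[i:i + factor])
            for i in range(0, len(features), factor)]


if __name__ == "__main__":
    # A 64 x 64 checkerboard stands in for a scanned page.
    page = [[float((r + c) % 2) for c in range(64)] for r in range(64)]
    patches = split_into_patches(page, patch=16)
    tokens = compress(patches, factor=4)
    print(len(patches), "patches ->", len(tokens), "vision tokens")
```

A real encoder would produce a learned feature vector per patch rather than a single average, but the shape of the computation is the same: many patches go in, and far fewer tokens come out.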

Because the AI model is working with far fewer tokens, the downstream language model (or reasoning module) has less memory burden and can handle longer content or bigger documents.
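
To give a sense of scale, here is a minimal sketch of how a shorter token sequence reduces cost. It assumes a standard transformer, in which self-attention compares every pair of tokens and the key-value cache holds one entry per token. The token counts and the function name are hypothetical, and nothing here is taken from DeepSeek's own measurements.

```python
# Relative cost of a shorter context, ASSUMING a standard transformer:
# self-attention compares every token pair (quadratic in length) and the
# key-value cache stores one entry per token (linear in length).
def relative_savings(text_tokens: int, vision_tokens: int) -> dict:
    """Return how many times cheaper the shorter sequence is."""
    if text_tokens <= 0 or vision_tokens <= 0:
        raise ValueError("token counts must be positive")
    linear = text_tokens / vision_tokens
    return {"kv_cache": linear, "attention_pairs": linear ** 2}


if __name__ == "__main__":
    # Hypothetical counts: 1,000 text tokens replaced by 100 vision tokens.
    savings = relative_savings(text_tokens=1000, vision_tokens=100)
    print(savings)
```

In this simplified picture, a tenfold reduction in tokens cuts cache memory by about ten times and the number of attention comparisons by about a hundred times. Actual savings depend on the model architecture.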

Andrej Karpathy, Co-Founder of OpenAI and former Director of AI at Tesla, praised DeepSeek-OCR for its novel implementation of vision tokens. He said that the approach could lead to higher efficiency and has the potential for bidirectional attention. He also said that this method could lead to the elimination of the tokeniser, which would make models more efficient.

For those who want to try out DeepSeek-OCR, the model is currently hosted on GitHub, where it received more than 6,700 stars in just 24 hours. The model is available under the permissive MIT licence for both academic and commercial use.

 

© Copyright Red Pixels Ventures Limited 2025. All rights reserved.