DeepSeek-OCR Open-Source AI Model Changes How AI Models Read and Process Plain Text

The DeepSeek-OCR AI model brings a new approach to compressing long-context text via optical 2D mapping.

DeepSeek-OCR can compress a 1,000-word article into 100 visual tokens

Highlights
  • The DeepSeek model is currently available on GitHub
  • Within 24 hours of release, it received over 6K stars
  • The model turns text into pixels to improve its context memory

On Monday, DeepSeek released a new open-source artificial intelligence (AI) model that changes how these systems analyse and process plain text. Dubbed DeepSeek-OCR, it uses 2D mapping to convert text into pixels, compressing long context into a more manageable size. The AI startup claims that large language models (LLMs) process pixels more efficiently than text, and that the compression lets them capture more relevant information when generating a response. The new approach is also said to produce more accurate results than traditional methods.

DeepSeek-OCR Introduces Novel Technique to Process Text

Based on optical character recognition (OCR) technology, the latest DeepSeek AI model uses a new method to process information. It first converts plain text into images, and then analyses the content to generate responses. The promise is that reading text from an image also lets the model compress and store large chunks of a document in a form that is easier to remember and reason over.

At its core, the model introduces “Context Optical Compression,” an approach that renders long pages of text as images and then has the model convert those images into a highly condensed “vision token” representation, which is much smaller than the usual text-token representation. To illustrate the scale of the compression, the makers say that a 1,000-word article could be processed with just 100 vision tokens.
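As a rough back-of-the-envelope check on that claim, the short Python sketch below compares the two representations. The figure of about 1.3 text tokens per English word is an assumption for illustration only (it is not from DeepSeek, and real tokenisers vary), and the function name is hypothetical.

```python
def compression_summary(words: int, vision_tokens: int, tokens_per_word: float = 1.3) -> dict:
    """Compare an estimated text-token count against a vision-token count.

    tokens_per_word is an assumed average, not a DeepSeek figure.
    """
    text_tokens = words * tokens_per_word
    return {
        "estimated_text_tokens": text_tokens,
        "vision_tokens": vision_tokens,
        "words_per_vision_token": words / vision_tokens,
        "compression_ratio": text_tokens / vision_tokens,
    }


# The article's example: a 1,000-word article in 100 vision tokens.
summary = compression_summary(words=1000, vision_tokens=100)
for key, value in summary.items():
    print(f"{key}: {value:.1f}")
```

Under that assumption, each vision token would stand in for about 10 words, or roughly 13 text tokens.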

The way the model works is also interesting. First, a document image is captured. Next, a vision encoder, a custom module built by the researchers, analyses the image and breaks it into smaller patches. The patch information is then compressed into a smaller number of vision tokens. Finally, a decoder takes these vision tokens and reconstructs the textual meaning.
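The patch-then-compress steps can be pictured with a toy Python sketch. This is not DeepSeek's encoder: the real module is a learned neural network, whereas the stand-in below simply averages pixel blocks, and the patch size, pooling factor, and function names are all assumptions chosen for illustration. The decoder step is not sketched.

```python
from typing import List

Grid = List[List[float]]


def block_mean(grid: Grid, size: int) -> Grid:
    """Average each non-overlapping size x size block of a 2D grid."""
    rows, cols = len(grid), len(grid[0])
    if rows % size or cols % size:
        raise ValueError("grid dimensions must be divisible by the block size")
    out: Grid = []
    for r in range(0, rows, size):
        out_row = []
        for c in range(0, cols, size):
            total = sum(grid[r + i][c + j] for i in range(size) for j in range(size))
            out_row.append(total / (size * size))
        out.append(out_row)
    return out


def encode(image: Grid, patch_size: int = 16, pool: int = 4) -> List[float]:
    """Toy stand-in for a vision encoder: patchify, then merge neighbouring patches."""
    patches = block_mean(image, patch_size)  # one value per image patch
    compressed = block_mean(patches, pool)   # merge pool x pool patches into one token
    return [value for row in compressed for value in row]


# A 128 x 128 checkerboard stands in for a rendered page of text.
image = [[float((r + c) % 2) for c in range(128)] for r in range(128)]
patch_grid = block_mean(image, 16)
tokens = encode(image)
print(len(patch_grid) * len(patch_grid[0]), "patches ->", len(tokens), "vision tokens")
```

Here 64 patches are reduced to 4 tokens, a 16x reduction set purely by the assumed pooling factor.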

Because the AI model works with far fewer tokens, the downstream language model (or reasoning module) carries a smaller memory burden and can handle longer content or bigger documents.
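To see why fewer tokens matter, recall that standard self-attention compares every token with every other token, so the number of pairwise scores grows with the square of the sequence length, while the cached keys and values grow linearly. The Python sketch below applies this to the article's example, assuming a 1,000-word article would otherwise cost about 1,300 text tokens (an illustrative estimate, not a DeepSeek figure).

```python
def attention_pairs(num_tokens: int) -> int:
    """Number of pairwise attention scores for full self-attention."""
    return num_tokens * num_tokens


text_tokens = 1300   # assumed text-token count for a 1,000-word article
vision_tokens = 100  # vision-token count cited for the same article

pair_ratio = attention_pairs(text_tokens) / attention_pairs(vision_tokens)
cache_ratio = text_tokens / vision_tokens  # key/value cache grows linearly

print(f"pairwise attention scores: {pair_ratio:.0f}x fewer")
print(f"cached keys and values:    {cache_ratio:.0f}x fewer")
```

These ratios describe only the attention cost of holding the compressed context; they leave out the cost of running the vision encoder itself.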

Andrej Karpathy, Co-Founder of OpenAI and former Director of AI at Tesla, praised DeepSeek-OCR for its novel implementation of vision tokens. He said that the approach could lead to higher efficiency and has the potential for bidirectional attention. He also said that this method could lead to the elimination of the tokeniser, which would make models more efficient.

For those who want to try out DeepSeek-OCR, the model is currently hosted on GitHub, where it received more than 6,700 stars in just 24 hours. The model is available under the permissive MIT licence for both academic and commercial use.

Akash Dutta
Akash Dutta is a Chief Sub Editor at Gadgets 360. He is particularly interested in the social impact of technological developments and loves reading about emerging fields such as AI, metaverse, and fediverse. In his free time, he can be seen supporting his favourite football club - Chelsea, watching movies and anime, and sharing passionate opinions on food.
© Copyright Red Pixels Ventures Limited 2025. All rights reserved.