Grab Superapp Says AI Models Struggle to Understand Asian Languages

Grab said that it had to develop an in-house AI model due to the unreliability of both proprietary and open-source AI models.

Written by Akash Dutta, Edited by Ketan Pratap | Updated: 4 November 2025 13:41 IST
Highlights
  • Grab has now built a specialised vision LLM for the eKYC process
  • The model is extracting information from user-submitted documents
  • Grab used both online and synthetic datasets to train the model

AI models struggle to understand non-English languages due to limited datasets

Photo Credit: Unsplash/Rohan Solankurkar

Grab, the Singapore-based superapp company, highlighted on Monday that it was forced to develop an in-house artificial intelligence (AI) model for internal use. It is a lightweight vision large language model (LLM) that can scan documents and extract information from them. The company said it decided to develop the model because both proprietary and open-source models were not good at understanding Southeast Asian languages. The company's statement has raised fresh concerns about the accessibility of frontier models from Google, OpenAI, and Anthropic.

AI Models' Struggle With Non-English Languages

In a blog post detailing the architecture and training process of its in-house vision model, Grab highlighted the shortcomings it experienced when it tried to outsource the technology. “While powerful proprietary Large Language Models (LLMs) were an option, they often fell short in understanding SEA languages, produced errors, hallucinations, and had high latency. On the other hand, open-sourced Vision LLMs were more efficient but not accurate enough for production,” the post said.

AI models' struggle with non-English languages is not a new finding. Researchers have pointed it out for years, and AI players have tried to fix the issue. However, despite gaining basic competence in popular non-English languages such as Hindi, Japanese, Spanish (Latin America and Spain), and Chinese, the models have yet to grasp the lexicon well enough to distinguish nuances. So, they might be useful in general conversations, but they fall short for enterprise or research-based needs.

For instance, a paper published earlier this year found that even AI models developed by Chinese companies perform as poorly in Chinese minority languages as Western models do. The issue persists in proprietary models from Google, OpenAI, Meta, and Anthropic, as well as in open-source models.

The reason behind this struggle is the lack of readily available, adequate datasets to train models on these languages. This is one of the reasons major AI companies are partnering with Indian companies and institutions to collect more Indic language datasets. In July, Google teamed up with IIT Bombay to develop Indic language AI speech models. Meta is reportedly paying contractors $55 an hour to train its models in Hindi, and OpenAI has announced a research collaboration with IIT Madras, backed by $500,000 from the ChatGPT maker.

While collecting data this way is expensive, it is still possible to eventually build large enough datasets in prominent Asian and other languages. However, the models will still struggle to gain competence in minority languages, such as the non-scheduled Indian languages. And unless they can learn these languages, accessibility and functionality will always be limited.

© Copyright Red Pixels Ventures Limited 2025. All rights reserved.