Grab Superapp Says AI Models Struggle to Understand Asian Languages

Grab said that it had to develop an in-house AI model due to the unreliability of both proprietary and open-source AI models.

Written by Akash Dutta, Edited by Ketan Pratap | Updated: 4 November 2025 13:41 IST
Highlights
  • Grab has now built a specialised vision LLM for the eKYC process
  • The model extracts information from user-submitted documents
  • Grab used both online and synthetic datasets to train the model

AI models struggle to understand non-English languages due to limited training datasets

Photo Credit: Unsplash/Rohan Solankurkar

Grab, the Singapore-based superapp company, highlighted on Monday that it was forced to develop an in-house artificial intelligence (AI) model for internal use. It is a lightweight vision large language model (LLM) that can scan documents and extract information from them. The company said it decided to develop the model because both proprietary and open-source models were not good at understanding Southeast Asian languages. The company's statement has raised fresh concerns around the accessibility of frontier models from Google, OpenAI, and Anthropic.

AI Models' Struggle With Non-English Languages

In a blog post detailing the architecture and training process of their in-house vision model, Grab highlighted the shortcomings they experienced when they tried to outsource the technology. “While powerful proprietary Large Language Models (LLMs) were an option, they often fell short in understanding SEA languages, produced errors, hallucinations, and had high latency. On the other hand, open-sourced Vision LLMs were more efficient but not accurate enough for production,” the post mentioned.


AI models' struggle with non-English languages is not a new finding. Researchers have pointed it out for years, and AI companies have tried to fix the issue. However, despite gaining basic competence in widely spoken languages such as Hindi, Japanese, Spanish (Latin America and Spain), and Chinese, the models still do not grasp vocabulary well enough to distinguish nuances of meaning. So, they might be useful in general conversations, but they fall short for enterprise or research-based needs.

For instance, a paper published earlier this year found that even AI models developed by Chinese companies perform as poorly on Chinese minority languages as Western models do. The issue persists both in proprietary models from Google, OpenAI, Meta, and Anthropic and in open-source models.


The reason behind this struggle is the lack of readily available, adequate datasets for training models on these languages. This is one of the reasons major AI companies are partnering with Indian companies and institutions to collect more Indic language datasets. In July, Google teamed up with IIT Bombay to develop Indic language AI speech models. Meta is reportedly paying contractors $55 an hour to train its models in the Hindi language, and OpenAI has announced a research collaboration with IIT Madras, backed by $500,000 from the ChatGPT maker.

While collecting data this way is expensive, it is still possible to eventually build large enough datasets for prominent Asian and other languages. However, minority languages, such as the non-scheduled Indian languages, will remain difficult for these models to gain competence in. And unless the models can learn these languages, their accessibility and functionality will always be limited.

 

