India’s Sarvam AI beats Google Gemini and ChatGPT

India finally has an AI model that seems to be world-class, at least for some India-specific tasks. Sarvam AI has come out with an OCR tool called Vision that beats the likes of Gemini and ChatGPT at reading documents in Indian languages, as well as Bulbul V3, which excels at AI voice generation. When it comes to AI models, the spotlight is mostly on the US and China.
India, despite its scale and deep talent pool, has rarely been seen as a source of core AI development. But Bengaluru-based startup Sarvam AI is changing that perception with what it calls “sovereign AI”. The company is creating foundational AI models from scratch in India. This week two of its tools, Sarvam Vision and Bulbul, are generating a lot of buzz, all for the right reasons.
Sarvam Vision is apparently beating bigger and more talked-about AI models such as ChatGPT, Google Gemini and Anthropic Claude on certain benchmarks in optical character recognition (OCR), which is its area of expertise. Its performance is seemingly so good that it is winning praise from users and experts alike. Sarvam AI co-founder Pratyush Kumar recently shared details of the latest achievements of the company’s in-house AI models in a series of posts on X. According to the company, Sarvam Vision has achieved an accuracy score of 84.3 per cent on the olmOCR-Bench.
The score is higher than those of Gemini 3 Pro and recent OCR models such as DeepSeek OCR v2, while ChatGPT ranked significantly lower.








