Sarvam AI Launches Open-Source Models

Sarvam AI launched open-weight AI models (30B & 105B parameters) to reduce reliance on foreign tech.
The 105B model rivals pt-oss 120B & Qwen3-80B, excelling in reasoning, thanks to MoE architecture & multilingual training.
Developers & researchers globally can now access these models, built with Indian languages & safety features, via AIKosh & Hugging Face.

Summarized by AI ⓘ

What is the story about?

Discover Sarvam AI's groundbreaking open-weight models, now available for global use. Learn about their architecture, multilingual capabilities, and how they stack up against prominent AI systems.

New Open-Weight Models

Sarvam AI has launched two substantial foundational AI models, featuring 30 billion and 105 billion parameters respectively. These powerful language models are

now accessible for download under an open-source Apache 2.0 license, facilitated through platforms like AIKosh and Hugging Face. This initiative marks a pivotal moment for Indian AI development, aiming to reduce reliance on foreign technology and foster a more inclusive AI ecosystem. The models were initially showcased at the India-AI Impact Summit 2026, highlighting their advanced capabilities in reasoning and multilingual processing. Sarvam emphasizes that these models were built entirely in-house, leveraging extensive, high-quality datasets and state-of-the-art computing resources. The project benefited from the Indian government's IndiaAI Mission, receiving crucial GPU access and infrastructure support from Yotta and Nvidia, underscoring a collaborative effort in advancing indigenous AI capabilities. The company aims to empower developers and researchers worldwide with these robust tools.

Technical Innovations Revealed

Delving into the architecture, Sarvam's 30B and 105B models employ a sophisticated Mixture-of-Experts (MoE) transformer design. This design selectively activates parameters, leading to significant reductions in computational cost during operation. The 30B model is equipped with a 32,000-token context window, ideal for real-time conversational applications, while the larger 105B model boasts an expansive 128,000-token window, catering to more intricate, multi-step reasoning tasks. For enhanced efficiency, the Sarvam 30B model utilizes Grouped Query Attention (GQA) to optimize KV-cache memory without compromising performance. In contrast, the Sarvam 105B model incorporates DeepSeek-style Multi-head Latent Attention (MLA), further minimizing memory requirements for processing extended contexts. These architectural choices underscore Sarvam's commitment to creating efficient and scalable AI solutions.

Training Data and Multilingual Focus

The foundation of these advanced models lies in their comprehensive training data, which includes a diverse mix of code, general web content, specialized knowledge domains, mathematical datasets, and extensive multilingual resources. A significant portion of the training investment was dedicated to curating a rich multilingual corpus, specifically focusing on the 10 most widely spoken Indian languages. This deliberate emphasis on Indian languages is further supported by a custom-built tokenizer, meticulously trained from scratch to ensure efficient tokenization across all 22 scheduled Indian languages and their 12 distinct scripts. The tokenizer's effectiveness is evident in its superior performance compared to other open-source alternatives, demonstrating a remarkably low fertility score – the average number of tokens needed to represent a word. This advanced tokenization capability is key to the models' strong performance on Indic text.

Performance Benchmarks

Initial evaluations indicate that the Sarvam 105B model shows promising scaling behavior, outperforming the 30B model on various benchmarks during early training phases. When compared against other large language models of similar scale, the 105B model achieves performance comparable to models like pt-oss 120B and Qwen3-Next (80B) in general capabilities. It also exhibits robust performance in agentic reasoning and task completion, surpassing models such as DeepSeek R1, Gemini 2.5 Flash, and o4-mini on the Tau 2 Bench. However, the 105B model may not be the leading performer in code generation, as its results on the SWE-Bench Verified benchmark lagged behind its counterparts. The 30B model shows competitive results against Nemotron 3 Nano 30B, with slight advantages in coding (SWE-Bench Verified) and agentic reasoning (Tau2), though it performs slightly lower on benchmarks like Live Code Bench v6 and BrowseComp. Notably, Sarvam's 30B model achieves significantly higher throughput, delivering 20% to 40% more tokens per second than Qwen3 due to its optimized code and kernels.

Safety and Application

Sarvam AI has meticulously integrated safety features into its models through a supervised fine-tuning process. This involved training on datasets that address both standard and India-specific risk scenarios, incorporating adversarial prompts and 'jailbreak' style inputs identified through automated red-teaming. These challenging prompts were paired with policy-aligned, safe completions to ensure the models respond responsibly. Internally, the 30B model powers the Samvaad conversational agent platform, while the 105B model forms the backbone of the Indus AI assistant, designed for complex reasoning and agentic workflows. Both models are optimized for deployment across a wide array of hardware, including personal devices like laptops, making them versatile for various applications.

Sarvam AI Launches Open-Source Models

Related Stories

New Open-Weight Models

Technical Innovations Revealed

Training Data and Multilingual Focus

Performance Benchmarks

Safety and Application

More stories you might like

Sarvam AI Unveils Open-Weight Models: A Deep Dive into India's Sovereign AI Leap

India's AI Leap: Sarvam Unveils Open-Weight Models & Benchmarks Against Global Peers

India's Sarvam AI Unveils Open-Source Models: A Leap Forward for Local AI Innovation

Google rolls out Gemini in Chrome to India, bringing AI sidebar for smarter browsing

The Hidden AI Tax: Why Your Next Smartphone Will Cost More and Have Less RAM

Gemini AI in Chrome comes to India: How to use

AI's Future: People Power, Data Center Impact, and the Pushback Against Tech Empires

15 Engineers Forge Sarvam's Groundbreaking $10.5 Billion LLM

AI's Indian Embrace: Boosting Productivity and Reshaping the Workforce

Meta's AI Deepfake Dilemma: Oversight Board Urges Overhaul Amidst War Chaos

AI Generated