Young Engineers Build Sarvam's Giant LLM

Sarvam’s 15 young engineers built a 105-billion-parameter LLM.
Rahul Aralikatte led the team, with Sumanth Doddapaneni key to pre-training.
Sarvam offers AI tools for India, including language-specific conversation apps.

Summarized by AI

What is the story about?

Fifteen young engineers built Sarvam's 105-billion-parameter LLM from the ground up. Learn about their specialized roles and the innovative spirit behind the project.

The Genesis of Giants

Sarvam's 105-billion-parameter large language model (LLM) was built by a team of just 15 engineers and researchers. Many of them were early in their careers, in their twenties and early thirties, a sign that innovation does not always require vast teams. The project was led by Rahul Aralikatte, a postdoctoral researcher affiliated with Mila - Quebec AI Institute, which was founded by Turing Award laureate Yoshua Bengio. Aralikatte oversaw data engineering, large-scale pre-training, the evaluation frameworks, and the safety guardrails for the model. He holds 6 US patents and has more than 30 peer-reviewed publications, which together have drawn more than 1,000 citations.

Core Team Contributions

Within the team of 15, several individuals played specific roles in shaping the LLM. Sumanth Doddapaneni, a machine learning researcher at Sarvam and a PhD candidate on leave from IIT Madras, managed the pre-training runs and contributed to data engineering. Mohit Singla and Aashay Sachdeva worked across both data engineering and post-training. Anna Upreti focused on safety and alignment during post-training. Sarvam cofounder Vivek Raghavan affectionately calls the team 'kids' while stressing their work ethic: 'The 105-billion-parameter model was trained completely from scratch by a very small team, about 15 people, most of whom are early in their careers. I sometimes jokingly call them "kids", but they worked day and night to make it happen.' Raghavan also noted the importance of finding individuals capable of being '10x or even 100x engineers.'

Sarvam's Full-Stack Vision

Sarvam is a full-stack generative AI company, spanning core models through end-user applications. At the base, it trains models from scratch and then improves them through fine-tuning and reinforcement learning. Above the core models sit harness and orchestration systems that let the LLMs carry out a range of useful tasks. Its product families include Sarvam for Conversations, for AI-powered communication across India's many languages; Sarvam for Work, for enterprise workflows and the automation of internal business processes; and Sarvam Studio, for creating content that is in Indian languages and culturally resonant. Sarvam also provides APIs that hundreds of companies and individual developers use to build their own AI-powered solutions.

Attracting India's AI Talent

Vivek Raghavan highlighted two factors that draw top-tier talent to Sarvam. First, many people are motivated by the chance to build technology designed for India's needs and challenges. Second, Sarvam offers a much broader scope of work than a large multinational, where an individual might be confined to a tiny part of an extensive system; the smaller setting allows for greater ownership and impact. India's younger generation, particularly those in their twenties, is rapidly emerging as 'AI natives,' mirroring a global trend in which young entrepreneurs and developers are at the forefront of AI startups. This talent is increasingly recognized for driving research that makes AI models more efficient while ensuring they better reflect the country's linguistic and cultural diversity. Such local innovation is seen as crucial for advancing AI capabilities, particularly in areas like general reasoning.

AI Generated