The Genesis of Giants
The creation of Sarvam's 105-billion-parameter large language model (LLM) was an ambitious undertaking, led by a focused team of just 15 engineers and researchers. Many of them were early in their careers, often in their twenties and early thirties, showing that innovation does not always require a vast team. The project was guided by Rahul Aralikatte, a postdoctoral researcher affiliated with the Mila - Quebec AI Institute, the institution led by Turing Award laureate Yoshua Bengio. Aralikatte's leadership covered data engineering, large-scale pre-training, the development of evaluation frameworks, and the implementation of safety guardrails for the model. He holds 6 US patents and has over 30 peer-reviewed publications, which have drawn more than 1,000 citations.
Core Team Contributions
Within this core team of 15, several individuals played pivotal roles in shaping the LLM. Sumanth Doddapaneni, a machine learning researcher at Sarvam and a PhD candidate on leave from IIT Madras, managed the extensive pre-training runs and contributed significantly to data engineering. Mohit Singla and Aashay Sachdeva worked across both the data engineering and post-training phases. Anna Upreti focused on safety and alignment during post-training, a vital component of responsible AI development. Sarvam cofounder Vivek Raghavan praised the team's dedication: 'The 105-billion-parameter model was trained completely from scratch by a very small team, about 15 people, most of whom are early in their careers. I sometimes jokingly call them "kids", but they worked day and night to make it happen.' Raghavan also stressed the value of identifying and nurturing exceptional talent, noting the importance of finding individuals capable of being '10x or even 100x engineers.'
Sarvam's Full-Stack Vision
Sarvam operates as a full-stack Generative AI company whose work spans from core models to end-user applications. At the foundation, it trains models from scratch and then improves them through techniques such as fine-tuning and reinforcement learning. Layered above these core models are harness and orchestration systems that let the LLMs carry out a wide range of useful tasks. On top of this stack, Sarvam develops and deploys distinct product families: Sarvam for Conversations, designed for AI-powered communication tailored to India's diverse linguistic landscape; Sarvam for Work, which targets enterprise workflows and the automation of internal business processes; and Sarvam Studio, a platform for creating content that is not only in Indian languages but also culturally resonant. Sarvam also provides APIs for developers, which hundreds of companies and individual developers use to build their own AI-powered solutions.
Attracting India's AI Talent
Vivek Raghavan highlighted two factors that draw top-tier talent to Sarvam. First, many individuals are deeply motivated by the opportunity to build technology designed specifically for India's needs and challenges. Second, working at Sarvam offers a considerably broader scope of involvement than a large multinational corporation does: at a massive global company, an individual might be confined to a minuscule part of an extensive system, whereas a small team allows for greater ownership and impact. India's younger generation, particularly those in their twenties, is rapidly emerging as 'AI natives,' mirroring a global trend in which young entrepreneurs and developers are at the forefront of creating AI startups. This talent is increasingly recognized for its ability to drive research that makes AI models more efficient while ensuring they more accurately reflect the nation's linguistic and cultural diversity. Such local innovation is seen as crucial for advancing AI capabilities, particularly in areas like general reasoning.