AI struggles with Indian code-mixing

AI transcription tools struggle with Indian code-mixed speech
Humyn Labs finds Amazon and OpenAI models fail at mixed languages
Tools need testing on natural local dialects to improve accuracy

What is the story about?

Discover why AI transcription tools miss the mark with Indian languages and what's needed for them to truly understand how people speak.

The Lingering Language Gap

Recent investigations by Humyn Labs have exposed a significant weakness in current AI transcription services: they cannot accurately process Indian languages as they are actually spoken. The main difficulty arises when speakers mix English words into their native language, a common practice in India known as code-mixing. These systems are built largely around English-centric datasets and internet-derived benchmarks, and they frequently misinterpret mixed-language utterances or fail to transcribe them at all. Manish Agarwal of Humyn Labs traces the problem to the prevailing English-first approach to AI development, combined with a lack of rigorous, independent validation tailored to India's linguistic landscape. As a result, the marketed capabilities of many AI transcription tools do not match their real-world performance on the diverse dialects spoken across the Indian subcontinent, leaving users frustrated and poorly served.

Testing Beyond Claims

The Humyn Labs study sets the advertised performance of AI transcription tools against how they actually behave on authentic Indian speech. Platforms often claim high accuracy rates, but the research found a significant performance gap, especially on code-mixed audio. Among the tools tested, Deepgram Nova-3 showed a commendable ability to capture the intended meaning of what was said, even within complex linguistic structures. Other prominent solutions, including Amazon Transcribe and various OpenAI models, showed considerable shortcomings. Even Sarvam AI, a tool built specifically with Indian languages in mind, had difficulty with the natural blending of English and local languages. This supports the study's central recommendation: for AI technologies to genuinely serve people in India, they must be thoroughly tested on speech that reflects how people actually communicate, with its mix of languages and dialects.