GSMA and Pleias Launch CommonLingua to Address African Language Gap in AI
The GSMA and AI company Pleias have released CommonLingua, an open-source language identification model designed to address the underrepresentation of African languages in AI. Part of the GSMA's initiative 'AI Language Models in Africa, by Africa, for Africa,' the model covers 334 languages, 61 of them African. It aims to identify African languages more accurately than existing systems, which often mislabel them. CommonLingua operates directly on UTF-8 byte sequences, which lets it handle text in different scripts in a consistent way. The model is trained on open-licensed and public domain content and is intended to support digital inclusion and economic opportunities in Africa.
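
To illustrate what working on UTF-8 byte sequences can look like, here is a minimal Python sketch. The helper `to_byte_ids` and the sample strings are hypothetical and chosen only for illustration; this is not CommonLingua's actual API or preprocessing, which the announcement does not describe. The sketch shows only the general idea: text in any script reduces to a sequence of integers between 0 and 255, so a byte-level model needs no script-specific vocabulary.

```python
def to_byte_ids(text: str) -> list[int]:
    """Encode text as UTF-8 and return its byte values (each in 0..255)."""
    return list(text.encode("utf-8"))


# Hypothetical sample greetings in three different scripts.
samples = {
    "Swahili (Latin script)": "Habari",
    "Amharic (Ethiopic script)": "ሰላም",
    "Arabic (Arabic script)": "سلام",
}

for label, text in samples.items():
    ids = to_byte_ids(text)
    # Character count and byte count differ for non-Latin scripts,
    # but every value still falls in the same 0..255 range.
    print(f"{label}: {len(text)} chars -> {len(ids)} bytes: {ids}")
```

The Latin-script sample uses one byte per character, while the Ethiopic and Arabic samples use several bytes per character, yet all three end up in the same input space.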