
Abu Dhabi — Inception, a G42 company specializing in AI-native products, and Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), the world’s first AI-focused graduate research university, through its Institute of Foundation Models and in collaboration with Cerebras, today announced the launch of SHERKALA, a revolutionary Kazakh Large Language Model (LLM) designed to empower over 13 million Kazakh speakers with the potential of generative AI.
SHERKALA is an 8-billion-parameter model that is adaptively trained on 45 billion words, primarily focusing on Kazakh while also including English, Russian, and Turkish. SHERKALA leverages Llama 3.1 and adapts it for Kazakh, with a 25% tokenizer expansion to make Kazakh understanding and generation more efficient. The model was trained on Condor Galaxy, one of the world’s most powerful AI supercomputers for training and inferencing, built by G42 and Cerebras.
“The launch of SHERKALA reinforces our commitment to addressing the needs of underserved linguistic communities through advanced AI technologies. In collaboration with MBZUAI, we are proud to introduce a model that empowers Kazakh speakers and redefines the LLM landscape with scalable, efficient, and inclusive AI solutions. With JAIS tailored for Arabic speakers, NANDA for Hindi speakers, and now SHERKALA expanding access for Kazakh speakers, we continue to drive AI inclusiveness, ensuring underserved languages are fully represented in the AI ecosystem. This milestone brings us closer to a more equitable future where technology amplifies every voice,” said Dr. Andrew Jackson, CEO of Inception, a G42 company.
SHERKALA sets a new benchmark for Kazakh LLMs by excelling in Kazakh understanding and generative evaluations. The model surpasses larger counterparts through efficient token generation and state-of-the-art conversational capabilities, tested against human-curated queries on Kazakh culture, history, and knowledge. It is the best-performing open-source Kazakh-focused model of its size and outshines 70-billion-parameter models in generative capability.
“At MBZUAI, we are thrilled to collaborate with Inception on the development of SHERKALA, a ground-breaking Kazakh LLM. This partnership reflects our shared vision of creating impactful AI solutions for underserved markets. Building upon the success of previous LLMs, SHERKALA represents a significant leap forward in democratizing AI access, preserving linguistic heritage, and empowering communities to thrive in the digital era. Together with Inception, we are transforming the LLM landscape, setting a precedent for innovative, inclusive, and responsible AI development,” said Professor Preslav Nakov, Department Chair and Professor of Natural Language Processing at MBZUAI.