KU Completes Google-Funded Trilingual AI Translation Project for Nepali, Tamang and English

Techpana

Asar 3, 2083 BS, 13:39

Kathmandu. The ‘Google Trilingual Machine Translation (TMT)’ project developed by the Information and Language Processing Research Lab (ILPRL) at Kathmandu University was formally launched on Monday.

The project aims to enable seamless machine translation between Nepali, Tamang and English. It was selected in 2024 among the top 33 global projects under Google’s Society-Centered AI Academic Research Awards.

Led by researcher Bal Krishna Bal, the project was developed over a period of around one and a half years and is designed for both academic and community-level use.

According to Bal, the system supports translation across five key domains: agriculture, health, education, mass communication and tourism. A dataset of over 100,000 “gold standard” parallel sentences has been prepared across the three languages, with 20,000 sentences contributed from each domain.

Technically, the system was built by fine-tuning Meta AI’s NLLB-200 (No Language Left Behind) model, which supports 200 languages, using Nepali and Tamang datasets. It enables translation in six directions, including Nepali to Tamang, Tamang to Nepali, and English to Tamang.
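The figure of six directions follows from the three languages: each language can be a source, and each of the other two a target. As a minimal illustrative sketch (not the project's code; the function name and language labels are assumptions made for this example), the pairs can be listed as follows:

```python
from itertools import permutations

# The three languages named in the article. These are plain labels chosen
# for illustration, not the language codes used by the project or by NLLB-200.
LANGUAGES = ["Nepali", "Tamang", "English"]


def translation_directions(languages):
    """Return every ordered (source, target) pair of distinct languages."""
    return list(permutations(languages, 2))


if __name__ == "__main__":
    directions = translation_directions(LANGUAGES)
    for source, target in directions:
        print(f"{source} to {target}")
    print(f"Total directions: {len(directions)}")
```

With three languages there are 3 × 2 = 6 ordered pairs, which matches the six directions reported for the system.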

The platform can be accessed at tmt.ilprl.ku.edu.np, where users can input text and translate it into any of the three languages. Users can also rate translations, while linguists can verify and improve dataset accuracy through feedback.

The system also includes browser extensions for Google Chrome and Firefox, allowing users to translate web content, including government websites, into Tamang. It further supports document translation and offers human editing options to improve accuracy.

A crowd-sourcing feature allows users to contribute translations through a computer-assisted translation (CAT) tool.

Researchers say the initiative could play a key role in reducing language barriers in public services, particularly in provinces where Tamang and Nepal Bhasa are being considered for official use alongside Nepali.

The project’s long-term goal is to have Tamang added to Google Translate as the fourth language of Nepal on the platform, alongside Nepali, Maithili and Nepal Bhasa (Newari). The team is coordinating with Google’s technical team on dataset integration.

Future updates are expected to include legal and judicial terminology, support for the Tamang script, expansion to other minority languages, and voice-based translation features including speech-to-text and text-to-speech.

The project has been developed with contributions from researchers and linguists including Bal Krishna Bal, Balaram Prasain, Dr. Prakash Poudyal, Amrit Yonjan Tamang, and Indra Tamang from Tribhuvan University’s Central Department of Linguistics.

Last updated: Asar 3, 2083 BS, 13:39