Nine South Asian languages are now on Google’s AI chatbot Bard

The languages are Bengali (spoken in Bangladesh and India), Gujarati, Hindi (spoken in India and Nepal), Kannada, Malayalam, Marathi, Tamil (spoken in India and Sri Lanka), Telugu and Urdu (spoken in India and Pakistan).

Arul Louis Jul 20, 2023

Google has added nine South Asian languages to its artificial intelligence-based chatbot Bard, enabling it to answer queries and create texts in those languages. It can also read the texts aloud in those languages, and users can write or speak their questions or instructions in them.

The languages are Bengali (spoken in Bangladesh and India), Gujarati, Hindi (spoken in India and Nepal), Kannada, Malayalam, Marathi, Tamil (spoken in India and Sri Lanka), Telugu and Urdu (spoken in India and Pakistan).

Bard, which Google calls “an early experiment”, can now handle 40 languages from around the world. 

With journalist Archana Adalja, I tried out Bard in four languages: Tamil, Hindi, Gujarati and Marathi. We gave Bard the simple task of writing in the four languages about “computer software development” and reviewed the output. In all four languages, the voiceover was clear and well enunciated. In Hindi, Tamil and Marathi, it came up with three versions each, two of them outlining the phases of programme development: planning, designing, coding, testing and maintenance. The other version was on how to get training in software development.

The main articles in Hindi and Marathi also had a section on the teamwork required for software development. The two versions of the article on software development in Tamil were written with a literary flair, using Tamil words for technical terms like computer (kanini) and software (menporul).

The first time the query was run through Bard, it inserted the word “rewarding” in Roman script followed by Tamil qualifiers when it wanted to say a software career would be rewarding. When the instruction was rerun, it dropped the English word.

However, in the version on how to become a software developer, it switched to colloquial Tamil and transliterated the English words “computer” and “software”.

Adalja, who reviewed the Hindi, Marathi and Gujarati output, said that the Hindi versions were near perfect, as were the Marathi versions, which also had a good flow. (The Hindi versions used English computer terminology transliterated into Devanagari script, while the Marathi versions used “sanganak”, the Marathi word for computer, but lapsed into transliterated English technical jargon.)

She said that the Gujarati versions, however, were about the skills needed to become a software developer and how to develop those skills and get trained. She noted two issues in the main Gujarati version: one was a gender error in a verb and the other was an incomplete sentence.

When the Gujarati words “sangananyantra” for computer and “prakriyasamagri” for software were used in the queries in place of the English words transliterated into Gujarati script, Adalja noted that Bard went awry and produced articles on Gujarati literature instead.

(SAM)
