Tag Archives: Speech recognition

M3 P3 The Race to Save Indigenous Languages, Using Automatic Speech Recognition

This article covers the work of Michael Running Wolf, a clinical instructor of computer science at Northeastern University’s Khoury College of Computer Sciences, who is developing methods for documenting and maintaining Indigenous languages through automatic speech recognition software. This work is a precursor to his long-term goal of giving Indigenous youth a way to learn their languages through technological immersion, using technologies such as virtual reality or augmented reality.

Part of the difficulty of developing automatic speech recognition for Indigenous languages is that the field of computational linguistics has devoted relatively little research to them. An additional challenge is that many Indigenous languages are “polysynthetic,” meaning that their words can contain many morphemes, the smallest meaningful units of language. As Michael Running Wolf points out, “polysynthetic languages often have very long words – words that can mean an entire sentence, or denote a sentence’s worth of meaning.”

 

https://news.northeastern.edu/2021/10/08/protecting-indigenous-languages-using-automatic-speech-recognition/

Module 3 Post 1 – Indigenous Language Speech Recognition

Te reo Māori Speech Recognition: A Story of Community, Trust and Sovereignty

The work that the Māori people have done over the years to preserve their language is truly an inspiration. The Māori, along with Hawaiians, have led the way in Indigenous language revitalization for a very long time. This project is another example of that leadership, and it continues to inspire many people working in Indigenous language revitalization.

Speech recognition software

Te Hiku Media, a charitable media organization collectively belonging to the Far North iwi of Ngāti Kuri, Te Aupouri, Ngai Takoto, Te Rārawa and Ngāti Kahu, has adapted existing open-source speech recognition software to understand the Māori language, te reo Māori. This type of work is essential to developing virtual worlds where people can learn Indigenous languages. For example, if a virtual person in a metaverse-style environment were programmed to understand an Indigenous language through speech recognition software, and could in turn reply in that language, a learner could practice speaking in a virtual world as much as they wanted.

Data Sovereignty

Data sovereignty is another very important topic touched on in this video. The Kaitiakitanga License is a license that Te Hiku Media is developing to protect its data. The goal is for only Māori-led organizations and initiatives to have access to the data, at least initially. A portion of any profits made from the data would also be returned to the communities from which the data came.