Task 3 (Extra): Voice to Text Task Taiwanese Version

Because Taiwanese is a dying language, I was quite surprised that the voice-to-text application I found had a Taiwanese option. I was curious about how accurate it would be and what form its output would take. Taiwanese is mostly spoken and has very little written text. Mair goes into far more detail about this in their article, and I’ll quote the relevant section below.

“If we were to set out to write pure, unadulterated (with as little unnecessary Mandarin admixture as possible) spoken vernacular Taiwanese in characters, well over 25% of the morphemes in a running text would be lacking characters, approximately another 25% would be written with arbitrarily chosen (but more or less conventionally accepted) homophones or near-homophones and concocted special characters, perhaps another 10% would be written with extremely rare but correctly identified benzi, leaving roughly 40% of the morphemes being written with the “correct” characters. In reality, more colloquial styles of Taiwanese would undoubtedly have fewer than 40% of their morphemes written with characters that everyone could agree were the right ones” (Mair 2003).

At any rate, there is no commonly agreed-upon way to write Taiwanese, so I set out to see what this application would do. After a five-minute unscripted narrative, the application automatically translated the Taiwanese into Traditional Chinese text, with far more accuracy than the application’s English option. While the English version had more errors than correct text, the Traditional Chinese text translated from Taiwanese had approximately 10-20% errors at a glance, and most of them were near-homophones, such as “ga4-yi1” (“like” in Taiwanese) turning into “ga1-yyi1” (the Taiwanese pronunciation of Chiayi, a city in Taiwan). Note that the numbers in the romanization I’ve provided are for tones.
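My 10-20% figure is only an eyeball estimate. One way it could be measured more carefully is with a character error rate: the number of character insertions, deletions, and substitutions needed to turn the application's output into a hand-corrected transcript, divided by the length of the corrected transcript. The sketch below is a minimal illustration under that assumption, not something I actually ran on my recording. The function names and the sample sentences are invented for the example.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert an extra character
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance divided by the length of the reference transcript."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical sentences, not taken from my recording:
reference = "我明天要去吃拉麵"   # what was (hypothetically) said
hypothesis = "我明天要去吃辣麵"  # what the app (hypothetically) produced
print(character_error_rate(reference, hypothesis))  # one wrong character out of eight
```

Counting by character rather than by word suits Chinese text, where each character is one syllable and there are no spaces between words.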

Another major difference between the English task and this Taiwanese activity is that, perhaps because each character is one syllable, I was able to cover the exact same topics as in the English version far faster, finishing at three minutes rather than five, so I ended up giving more details of my Sunday plans in the remaining time. This makes me more curious about the speed efficiency of languages, and whether Mandarin, for example, can convey the same number of ideas in far less time than other languages such as English or Spanish. Interestingly, I also ended up talking about food in my extra time, which may be due to subconscious priming via language, as the typical Taiwanese greeting akin to hello is asking whether someone has eaten yet.

See below for the text output from the Yating application when I spoke in Taiwanese.

00:01

這個app很有趣我要,用英語的時候突然,看到這上面有台語的option就是我現在在地的話也直接換,go一出來我覺得這個很有趣,

00:23

用一樣的話說,我這個禮拜的事情好了,我拜託帶一堆遊樂場,差不多三百五十個學生,去之前我們要準備的時候是辛苦的,

00:43

很多學生都不付錢的話,能夠直接出銷貨,結果

00:56

開始就是,然後最後要一直拖我一直拖拖到最後才付錢,所以我沒取消要去遊樂園的事,結果到時候就是我們去的時候都很乖,我們這對學生

01:24

然後他也覺得很有一筆的也是,所以

01:35

我是覺得我,原本我跟老師說這些功夫下次不要再做這件事情,結果學生這麼好玩,我自己也覺得,其實這次的經驗不錯,我們這對老師就開始想說阿,不要考慮要再準備這種事情,所以這次看到是以後不會再

02:05

做這件事情失敗,然後我給我的時候是我,我有一些朋友跟一座深桌遊,不多點結果差不多九點去公園吃個晚餐,然後人生的桌遊是叫做spirit island這個桌遊就是北國的人要來

02:35

知道是神明的東西,結果我們就是要把這些知名的人打回去,跟普通的桌遊不一樣,因為大部分的桌遊人公視是要殖民的,那種事,是要看錢我去找不同的地方,然後去把它將來,結果這是拍

03:08

然後今天禮拜的時候,我現在是打算說,中午要去吃

03:19

一間拉麵你們在吃午餐,順便回到我家,然後我有一個朋友,他在我,我要回去的時候順便載,然後也可以順便幫他把一些東西到應該搬來我們這裡,這個朋友就是我今天有一個新的這個朋友,所以這個朋友要這麼好心帶我,我是覺得很感恩,所以明天我可能就是早上的時候,差不多十點,我是用1個裝備出門,然後去我們附近這家店的時候,我家嘉義市一碗拉麵

04:14

然後一碗,所以是很好的,然後我吃完以後,我的話,我跟我然後我會在我的手機、電腦和我的steam然後我就是在寫功課比學生的作業然後,跟我朋友要從他爸媽去的,時候他還,會載我我明天的行程是這樣

Reference

Mair, V. H. (2003). How to Forget Your Mother Tongue and Remember Your National Language. pinyin.info. https://www.pinyin.info/readings/mair/taiwanese.html

1.4 – Defining Terms

Pre-Reading Definitions

text – a method of communication via writing/words

technology – artificial objects/techniques that enhance human ability and/or efficiency

OED Definition and Etymology

text

“The wording of anything written or printed; the structure formed by the words in their order; the very words, phrases, and sentences as written.”

Same root as textile/texture, woven, style

technology

“The branch of knowledge dealing with the mechanical arts and applied sciences; the study of this.”

Root, technologia, “treatise on the liberal arts…” “systematic treatment of grammar.”

Ngram

Both text and technology show the same trend: two early periods of heavy usage, then centuries of relative inactivity, then an increase in usage in the mid to late 20th century. Usage of the word technology spiked in 1505 and 1536, then the word went relatively unused until 1950. Meanwhile, the spikes for the word text lag slightly behind those for technology, at 1533 and 1579, followed by a slow increase in usage until a sudden jump in 1973.

Also of note is that there have been two time periods where technology was used more often than text: a few years around 1505 and between 1970 and 2000.

Questions

Analyzing etymology is a typical technique in, for example, science class, to help students understand vocabulary. Words such as subscript, subduct, biomimicry, and bioluminescence may seem intimidating to a student in junior science until we analyze the parts of the word to glean its meaning. That said, I find that in the case of text, the etymology was far more surprising, and while it makes sense in hindsight, it does not provide a lot of new information on the modern usage of the word. Prior to this activity I wouldn’t have associated text with textile, and though there is a logic in associating the two terms (weaving words to create text, weaving threads to create cloth), I feel that because we see the word “text” so often, there is no need to glean its meaning from its roots, unlike unfamiliar science words such as subscript, subduct, biomimicry, and bioluminescence.