Task 3: Voice to Text

For Week 3 of the course, we are tasked with sharing a five-minute unscripted anecdote using a voice-to-text tool. In doing so, I’ve decided to describe the events of my day on January 25th using the built-in dictation functionality in my phone’s notes app:

OK I’m trading this audio notes using the text to speech feature of my notes app on my iPhone. As part of my task for a E tech 540. I thought I’d share a bit about my day and what I got done at work so today I got up around 7 AM to get ready to go to work. In that time I took a shower fed my cats got dressed and pack my bag which would’ve received in my first task. I left the apartment by 745 and Scott do the bus stop shortly after it took me about 40 minutes to get to the office today which was pretty good considering considering that it takes me usually 45 minutes to an hour. once I got to work, I check my email checked my Teams messages, and instead of drafting a few responses. Once I finish that I went to go make myself a coffee. I usually drink my coffee black but lately I’ve been taking it with oat milk. And then once I finish making my coffee, I got ready for a meeting with a Subject matter expert to review a script for a video tutorial for one of our library. Search tools. That meeting took about 3030 to 40 minutes. Once I finish that I got some editing and voiceover work done on a virtual tour for one of our maker spaces. And then I went for lunch around that time my school put it out an announcement that we will be closing early due to snow storm. After lunch, I finish responding to a few emails and started doing a bit of research for another project I’m working on. I then got up from my desk to go make a tea and noticed at that point the snow is looking pretty bad. Around three i left the office for the early closure. And then walked over to the bus stop that I usually take to get home. The snow was pretty bad and the traffic is moving pretty slowly as the road conditions are pretty bad. 
I’m at such a took me a bit longer to get home and I think it was about an hour and a half but once I got home I put on some dinner FaceTime some friends watched a bit of TV and then got some schoolwork done and I’m about to pack my lunch for tomorrow and start getting ready for bed. And I think that’s all I have to share about my day, and I also think I’m hitting the five minute mark. If you made it this far thanks for reading.

Reflection

How does the text deviate from conventions of written English?

While transcribing my unscripted message into a written format, it became evident how closely my verbal composition follows my train of thought compared with my written English. This was somewhat expected, given that the task called for an unscripted and informal message. However, the spontaneity of my word choice and narrative structure runs counter to the intentionality and conciseness of written text, as characterized by Gnanadesikan (2011, p. 5).

In addition to the narrative structure, some of the messaging is distorted or lost in the conversion from voice-to-text, given that my phone didn’t always pick up on my pauses between sentences and paragraphs, the cadence of my speech, rising and falling inflections on certain words, choice of words, and so on. Save for a few periods and commas automatically added by the software, the text largely deviates from the established grammatical rules of written English.

What is “wrong” in the text? What is “right”?

It’s fascinating to reflect on the distance between the two systems at play here to encode my story into written form. One is writing as a system of codified rules and symbols (Gnanadesikan, 2011). The other is the technology that translates my speech into the written word (in my case, the microphone and voice-to-text software on my phone) and brings it into what Haas (2013) describes as “the material world” (p. 3).

As I read through the text, I notice some obvious grammatical issues, including a lack of paragraph breaks, missing punctuation between sentences, and missing capitalization (e.g., for personal pronouns). While some punctuation was automatically added by my phone, it seemed that for best results, it was largely up to me to dictate the punctuation manually. I rarely use voice-to-text to write, and as such neglected to do this in the moment, as it didn’t feel natural for me to communicate verbally in this way. The resulting output is a text riddled with unbroken sentences and improper pacing.

Some word choices were also distorted in the text. Reading through, I noticed that my phone replaced several words with similar-sounding words (which had no meaning in the context of my story) or with words in the wrong tense (e.g., “pack” in place of “packed”). Acronyms were also not captured (e.g., “ETEC” became “E tech”), likely because I did not spell out the acronym letter by letter.

Otherwise, I thought the software on my phone did a reasonable job of capturing the content of my speech. At best, and despite its flaws, the text might achieve its functional purpose of communication, which Schmandt-Besserat and Erard (2007) suggest is one of the primary goals of written languages. There may be sufficient context clues within the text that a reader could compensate for the grammatical and word errors and construct some meaning. Yet, as Gnanadesikan (2011) suggests, writing is “a process of translating time into space” (p. 3), and its visibility is not a prerequisite for its interpretation. For someone accessing the text using a screen reader, which reads back text in a linear way, writing becomes much more dependent on time and sequence. I look forward to further exploring the materiality of language that Haas (2013) refers to and how it affects or informs culture, systems of power, and how language is accessed.

What are the most common “mistakes” in the text and why do you consider them “mistakes”?

As described above, the most common mistakes that I observed were missing punctuation, incorrect tense, and incorrect word choice. I consider these “mistakes” based on my prior knowledge of established grammar and language rules, my understanding of English vocabulary, and the original meaning of the words I consciously selected while composing the message.

What if you had “scripted” the story? What difference might that have made?

Had the above text been pre-composed in a written format before I recited it into my phone, I believe that the story would be more precise, organized, and engaging to read. This could be attributed to the notion that written language builds upon or enhances oral communication (Ong, 2002, p. 9). Furthermore, I find myself in a culture where oral language is not the primary mode of communication (Ong, 2002), given that I depend on and use written language on a daily basis to communicate with others.

As such, the choice and number of words used to relay my story would likely look different. Words and whole sentences would be taken out, reordered, or consolidated into something more concise and accurate. Additionally, the text would be more grammatically correct, as I would likely be more conscious to write in any punctuation and paragraph breaks before reading it out loud. The end result would be a written text with refined pacing and clarity.

In what ways does oral storytelling differ from written storytelling?

As Gnanadesikan (2011) suggests, some qualities of oral speech do not always translate to the written word, including “intonation and emotional content” (p. 3). This was evident to me while composing my voice-to-text story, as I noticed that any changes in pitch I made to emphasize something or denote a question were not captured, immediately disconnecting the content of the text from its emotional context. I could, of course, attempt to bridge the emotional gap between my speech and text by adding bold or italics to emphasize certain words, or a question mark at the end of certain sentences.

References

Gnanadesikan, A. E. (2011). The first IT revolution. In The writing revolution: Cuneiform to the Internet (pp. 1-12). John Wiley & Sons. https://doi.org/10.1002/9781444304671.ch1

Haas, C. (2013). The technology question. In Writing technology: Studies on the materiality of literacy (pp. 3-23). Routledge. https://doi.org/10.4324/9780203811238

Ong, W. J. (2002). Chapter 1: The orality of language. In Orality and literacy: The technologizing of the word (pp. 5-16). Routledge. https://doi.org/10.4324/9780203426258

Schmandt-Besserat, D., & Erard, M. (2007). Origins and forms of writing. In C. Bazerman (Ed.), Handbook of research on writing: History, society, school, individual, text (pp. 7-26). Routledge. https://doi.org/10.4324/9781410616470
