Voice to Text

My Story

This is a story about my new house 10 years ago my parents moved this neighbourhood and it was love at first sight and I was bummed that it wasn’t the ones my parents had purchaseda few years later it did go on the market but unfortunately I was in the middle of my undergrad and couldn’t possibly buy it myself at that time jump ahead to the summer of 2017 when the wildfires hit the Cariboo chilcotin this neighbourhood was the first in the northern Caribou area to burn three houses were lost two of which are right across the street from my new house I was there for most of it I’m at my parents house my dad and I were fighting bushfires grass fires root fires and watering our neighbours gardens it was a very strange time because while we could legally be home during the evacuation order we could not return if we left the property otherwise we would be charged with insanity dad and I especially took care of this house because it was my dream home whether or not I would ever get to live in it jump ahead 2 after the fires when I met the previous owners and jokingly said to them that if they wanted to move away for retirement to let me know because I I would love to buy their houseturns out they were planning to retire sooner rather than later and they gave us a timeline of about 5 years which was shortened to a year which was shortened to a matter of months we got to go inside for the first time in October and it was eerie because it’s floor plan was identical to every dreamhome I ever built for myself in The Sims miraculously my partner and I were able to raise the down payment and actually get a mortgage and now we’re here and it’s crazy and I’m so overwhelmingly I know it’s corny but I’m so overwhelmingly blessed I can stand on my porch and if I was a good spitter and the wind was blowing in the right direction I would be able to spit on the the tree that is half burnt so not only am I lucky that I get to live in this beautiful home with its beautiful views I’m lucky 
that this house is even still standingand part of that luck is self-made because I stayed when so many others left the Caribbean I helped protect this house and others in this neighbourhood PS I was raised with fire training and my dad was certified by the US Navy to fight fires and had been a bushfire volunteer for decades I understand that people with less fire knowledge felt safer following the evacuation order and leaving for Prince George in Kamloops during the 2017 wildfires

– Recorded using Google Keyboard’s Voice-to-Text button

Analysis & Reflection

You likely noticed that my text has a glaring deviation from conventionally written English: there is no punctuation. It’s like reading something by James Joyce (albeit with far more “and”s). It is utterly without any markers to signify pacing… or emphasis. These are things that Google’s voice-to-text algorithm cannot interpret (yet).

It got some other things “wrong” too. Contrary to grammar rules, numbers under 10 were written as digits rather than spelled out. Sometimes it recorded “2” instead of “to”, or “in” instead of “and”. Occasionally there is no space between words, which was a result of the tool stopping while I was speaking. However, it almost always recorded “it’s” and “its” correctly, so it seems to have the ability to interpret context. It also correctly capitalized place names (except for “chilcotin”) and acronyms (“PS” and “US”).

The most common mistake was misspelling Cariboo. The algorithm actually got it correct once, when it was followed by its sister region, Chilcotin. Unsurprisingly, when mentioned alone it would record “Caribou” (the animal), and when followed by an “and” it would record “Caribbean” (likely due to how I breathed between the two words).

Dictating to my phone fluctuated between convenient and cumbersome, with very little in between. In some ways, it took less time to tell my story than it would have to write it down. It was essentially running thought-to-speech. Ironic, since I would have crafted a much smoother narrative had I scripted ahead of time. I often edit as I write, correcting errors (such as my stuttered “I I”) and reading passages aloud to myself to feel their flow. Having a script would also have saved me from the frequent interruptions to my thought process! More than once I needed to repeat a large chunk of the story because the tool had simply stopped listening.

Voice-to-text can be a useful technology for those who, whether by ability or circumstance, cannot write. I sometimes use it to blindly send a text when I cannot look at or physically use my phone. A student could use it to record their own stories and other written assignments. Depending on their educator’s expectations, they may need to do a great deal of editing afterward. Whether voice-to-text technology is being used for convenience or capability, however, it is not the same as oral storytelling.

Oral storytelling has many advantages that make it different from writing. Storytellers can use non-verbal communication, such as gestures, expressions, and movement, to captivate and entertain their audience. Like Christine De Luca, they can vary their intonation, raise or lower their volume, and even switch dialects. Writers must literally spell out these things. For example, Brian Jacques wrote a variety of English dialects in his children’s series Redwall. Google’s voice-to-text technology is not capable of translating these oral storytelling techniques (yet). Because it means dictating to a device that cannot interpret emotion or give its own responses as an audience, voice-to-text is more akin to writing with a magical pencil than to oral storytelling.

In 2015, I was in TRU’s Indigenous Literature course, and we explored oral storytelling a great deal. Much of an oral storyteller’s skill stems from their ability to read and respond to their audience. For example, in Netflix’s Locke and Key, the three siblings recall how their father told them the same bedtime story. He paid attention to their reactions and tailored the story to each individual child. In this way, oral stories are more flexible than written stories—where the best you can do is skip ahead, as the Grandfather does while reading the “kissing part” to his Grandson in The Princess Bride. According to Chris Bose, a poet of the N’laka’pamux (pronounced ng-khla-kap-mh) Nation, who made a series of videos for my old Indigenous Lit. course, stories need an audience to come alive.

I miss teaching my students face-to-face. I miss the “dance” of sharing stories with them. In the past two months, I have made several videos to facilitate asynchronous learning. You can still spin a yarn while your audience sits in your imagination, watching you through the lens of a camera. Despite the text being spoken, rather than written, I still find I use the same editing-as-I-go process while filming. A lesson that would take twenty minutes in person takes me several hours to film, for a video that is less than ten minutes long. Everything you say has to be measured, concise, and enunciated. Like writing, once it’s published it’s fixed. This is not oral storytelling. It’s not fluid, or adaptable to the reactions of my audience. Instead of feeling recharged by receiving the reciprocal energy, I feel drained by performing for the Void.

8 Comments

  1. Hi Laura,
    I feel your pains of teaching through video! It’s such a bizarre experience that seems to focus on transmission rather than a natural conversation in the classroom. Not being able to adapt to the audience or communicate with body language is a shortcoming of the experience.

    However, video is nice in other ways because it forces you to be well organized from the beginning and really think about the purpose of the message and the audience. The text is shaped to be scripted.

    Do you teach your lessons like a filmed version of you teaching or as a narrated screen capture?

    • Hi Linda,
      I agree that videos force you to be organized! We’re forced by upload-length limits to really consider our students’ attention spans.

      I do a mix of both. If I am teaching anecdotally, I will film myself. But if I am trying to demonstrate something, or go through the details of an assignment, I will use screen-capture. Not sure if you grabbed it yet, but we get Camtasia free through UBC and it has been a lifesaver for editing this. I also do some top-top videos (such as a demo for how to layer stickers).

  2. Hi Laura,
    I’m curious – you said you sometimes send text messages using voice-to-text – does it go better with shorter sentences or messages?!
    I also really like that you left out punctuation altogether, and reflected on the fact that voice-to-text cannot yet interpret pauses and phrases on its own. Trying to put punctuation commands into my speaking was a very distracting process, and meant that I wasn’t quite talking and wasn’t quite writing… just hovering in the uncomfortable grey area in between.
    Mentioning audience was also a good idea, as I found just speaking non-stop for 5 minutes without prompts or responses quite challenging. “stories need an audience to come alive” and apparently a conversation does too.
    Thanks for all these great idea prompts! Lots left to muse 😀

    • Hi Jamie! You’re welcome 🙂
      I definitely only use voice-to-text for shorter messages! Things like “on my way”, or “off train, be there in 5” when I’m trying to watch where I’m walking. Longer sentences would probably work out as well as my story.
      That’s really interesting that you changed your speech in order to punctuate. Honestly, it didn’t even cross my mind! Did it ever put the word for the punctuation, or did it always get it right? I wonder how it would handle a phrase like “in the Triassic Period”!

  3. Hi Laura,
    I can really appreciate your need for an audience. There is definitely an energy that comes with storytelling where you can feed off the emotions of yourself and your listener. This experience was neither true oral storytelling nor a true writing experience, but a good way to learn to appreciate the affordances of oral language. Voice to text does have some uses, especially for students who cannot write or type (sometimes the keyboard search or the spelling is too much). Even misspelled and lacking punctuation is better than none. Have you had a student reluctant to try voice to text? Some of our students refuse and try to rely on scribing. I wonder if just a voice recording might be more reliable. This whole exercise did leave me with an appreciation for those conventions that allow translation to written language, however.

    • Hi Rebecca,
      You make a very good point about how some students are challenged by using a keyboard. None of my students use voice-to-text, but I’m not sure if it’s due to reluctance or uncertainty. Every time they log onto the computers in the lab, the first thing to pop up is the “Read and Write” program, which is supposed to help with just this. However, mine do what some of yours do: dictate to me or an EA. To be fair, however, I do not have headsets with mics in easy-to-access locations (they’re on the wishlist for when I get a budget).

      I wonder how well voice-to-speech would work in a language that doesn’t use a phonetic alphabet, like Mandarin. Would the speaker have to be that much more precise in their articulation?

  4. Hi Laura (catching up on the student blogs).
    Really great story. I too had the same problem with punctuation. I used the Word dictation function. After I did the assignment I then dictated with punctuation (much like I have done at work when dictating for an assistant to type something). That worked. It is a very different exercise – although I do find it easy to dictate with punctuation – it seems to be a process that works for me.

    I also appreciated your discussion on oral storytelling. It is such a different process – with tone, timing etc. Also, I think the issue of having an audience is different from speaking to a neutral tape or dictation machine.

    This voice to speech is to me a very important future system. And the current frameworks will have to be improved to be effectively used in educational learning systems. (I am taking Constructivist Learning right now, where the topic of such digital speech is important.)

    Pat McLean

    • Hi Pat! No worries, I need to do that as well! It’s nice to be reminded of things I worked on earlier in the course, now as we near the end. 🙂

      It’s interesting that you had an easier time dictating punctuation. I wonder if my blatant disregard for it is because I’m a phonetic speller. I don’t think of punctuation so much as I am mindful of the pacing I want to encapsulate in my writing, something my fingers know how to mark by reflex, but whose oral terms I may forget. Are you a phonetic speller as well, or do you see the words as you say them (and thus the written structure)?

      Voice to speech seems to be one of the most likely routes for text-creation if it is to evolve “beyond” the pen. I signed up for Constructivist Learning next January, how is it???

      -Laura
