Voice to Text
Below is the unedited text produced from speaking into my phone using the dictation feature within a Google Doc. My “story” is about my first experience with speech to text technology.
This is my first attempt using speech to text technology I definitely have been aware that it’s been around for a while and I’ve heard of how it’s been implemented and teaching and learning. I’ve definitely heard of teachers using it with Yale students as a way just to get their ideas into text form and the benefits that it’s caused for the student. One of the things that I find interesting about this process at the moment is that it actually feels quite stressful because as I’m speaking I can see the words being being written across the page and I should have a heightened awareness of the mistakes that I making as I’m speaking. In an odd way this kind of creates an extra pressure that just for hitting it simply into the computer would not I am also aware of where this text is going to end up it’s going to add up on a public blog and generally we want we want what we say and what is written to be a good reflection of who we are. I think that not having this being a scripted assignment it’s definitely makes it harder just a free flow talk and even if you do imagine that you’re talking to a friend. Even though I’m taking the time to think I think as I go and I’m realizing that this will probably be something that I had it out when it comes time to actually actually edit it so one of the things that I’m noticing as I’m typing this is that a few times the dictation is just stopped and I’m not sure why that’s happening and then I have to start it again I guess it’s something that I would have to explore a little bit later on. One of the things I’m also noticing is that as I’m going the volume of text is quite large and also sort of impressed with what I’m seeing in terms of accuracy. I do however see that there’s some form of auto correct making changes as as it goes again seems to be quite high. Just stopped again for some reason so I’m restarting I’m just going to press the dictation button and I’m starting again continuing where I left off I should say I don’t know. 
I don’t think this is going to be the most interesting thing for anyone to read. One of the changes I would do next time when I’m recording my voice to text I will definitely have some kind of outline but I then again I don’t know if that counts as being script because then you’re writing it down I think the. I have. Right now I’m laughing because I told her to take her. And it did for a second and it deleted it. So now it seems to of stopped working as well cause I’m starting to receive WhatsApp messages and I don’t know if that’s affecting what’s what’s happening on the screen here or not but I did stop again for some reason. I can definitely see uses in terms of casual language in terms of typing emails I think they can see this is quite high so I am I could easily do this in a Google dock as they currently only doing and copy and paste it into an email so I definitely do see the value in us for sure. I have about one left and I am starting to you know feel that the time crunch of trying to keep this going and I’m sorry if you are reading this because it really is just me having streamStream of consciousness thought. All right folks looks like we reach the five minute mark so it is time to draw this if we’re generous we can call it a story to America to the end. Thanks for reading.
How does the text deviate from conventions of written English?
Although I believe the technology is capable of handling both, my lack of knowledge and experience with dictation within a Google Doc led to missing paragraph breaks and punctuation errors.
While not always true, the language of speech is less formal when it is not scripted, and in speaking and exchanging ideas we do not need to be as formal (if not recorded). Written work, since it is permanent to a degree, demands a level of formality, as it is a representation of ourselves.
What is “wrong” in the text? What is “right”?
There are a lot of blue squiggly lines in my text, telling me that it contains a lot of grammar errors. The software added some words I did not say, partly because I either mumbled or started a new thought (which in writing I would have edited out). Some words were recorded as something else; for example, the software wrote “EAL” as “Yale”. The meanings of the two words are quite different, and the substitution shifts the meaning of my text entirely.
I said “ummmm” a lot, but the software did not record it. This is not accurate, so possibly wrong, but I like this feature and would deem it right by the conventions of written language as opposed to spoken language. Since the goal is written text, I would say this feature is correct for the intended form.
What was right is that for the most part, the software was able to accurately record the words I spoke.
What are the most common “mistakes” in the text and why do you consider them “mistakes”?
The most common mistakes in the text seem to deal with punctuation. In spoken word, there are cues to pick up on such as tone, inflection and pace which allow the audience to know where ideas begin and end. In written language these same cues are absent and without them, the text becomes an intimidating blob.
I would view the lack of punctuation in the text as mistakes in the sense that it does not present a good representation of myself or my ideas. Again I recognize that the software may be capable of doing this and the fault may entirely be user error.
There are also parts of the text that are not meant to be there (extra words added as I was thinking, which I would have erased if allowed to).
This text would present as a rough draft to most people and they would likely infer that I am lazy or apathetic if I presented this dictated text as a finished product. They would also likely make judgements on my level of education based on the errors present in my writing. Gnanadesikan (2011) notes this very real value judgment society places upon our writing as it is associated with education and thus with status (p. 5). The fact that writing is “a process of translating time into space” ensures a higher degree of pressure on our written words which can be revisited and judged at any time as our words occupy space even when we do not (Gnanadesikan, 2011, p. 3). As such I would like to believe that an audience would not judge me as harshly if they heard the dictated text delivered in person verbally.
For future classes I wonder if it would be helpful to have a recording of the spoken word to compare to the text (not just the writing)?
What if you had “scripted” the story? What difference might that have made?
If I had scripted the story there would have been more creative elements present, infused with author’s craft techniques. I may have tried to position it as a battle between me and technology and describe more emotional details. I could have included thoughts and dialogue, and explored more in terms of setting and plot. Obviously, none of this happened. These kinds of thoughts don’t just pour out of my mind at will; I need to plan them and reflect on them. I need to revisit what I’ve written and take the time to edit what I have. I would not feel the same pressure as I did watching the little lines on the audio recording bar move and the clock ticking down. My mind was elsewhere and focussed on other things, not the story itself. The story from the dictation is more a documentation of the task as it unfolded than a story I planned. That said, I had planned many thoughts while thinking about the assignment, but when it came time to record my mind went blank, so I spoke about what was happening in the present.
In what ways does oral storytelling differ from written storytelling?
Oral storytelling allows for a performance aspect to take place. You can bring characters to life through your voice (tone, inflection) and other elements of characterization (gestures if the performance is live). There is an added dimension of interpersonality that exists. The story becomes more communal between the storyteller and the audience. There is a mostly silent negotiation over the perception of characters (how the storyteller brings them to life and how the listener perceives them). In reading a story, the task of bringing characters to life is more solely the responsibility of the reader and under the reader’s control. The writer is not present when the reader reads the story, so the creative control is relinquished. This is not necessarily true with an oral story.
Works Cited
Gnanadesikan, A. E. (2011). “The First IT Revolution.” In The writing revolution: Cuneiform to the internet (Vol. 25, pp. 1-10). John Wiley & Sons.
Hi Chris,
I have enjoyed reading your reflection, particularly the section on pauses in your speech as well as the possibilities of prerecording a scripted story.
I used MS Word’s dictate feature to record my 5-minute story. English is not my first language, and I mispronounced a couple of words, which led to spelling mistakes in my text. Also, when I was pausing, I was making the sound “aaa”, which is a typical interjection in the Ukrainian language. It made me wonder whether bilingual speakers control interjections, unlike other parts of speech, or use them subconsciously. It has been interesting to read that your device did not pick up on your “ummm” words and sounds.
In regard to your reflection on the scripted story, Gnanadesikan (2011) says in her article that “Written down, words remain on the page like butterflies stuck onto boards with pins. They can be examined, analyzed, and dissected. They can be pointed to and discussed. Spoken words, by contrast, are inherently ephemeral. So written language seems more real to us than spoken language” (p. 4). Do you agree with this statement?
Hi Nataliia,
Thank you for taking the time to comment on my post.
I’ve always considered “ummm” as a thinking word. Something uttered when I am processing my thoughts. I often wonder why I can’t just be silent in that instance. But, it seems to be something I’ve become conditioned to do as have many of the students I teach. It certainly feels subconscious.
Great question and quote. I can certainly agree that spoken language is more fluid and responsive than written words. The written words, as Gnanadesikan (2011) points out in the quote, are to be “examined, analyzed, and dissected”, but they are not alive; they cannot respond or interact with the reader (p. 4). They can, however, be revisited and be used as fact. The spoken word, being ephemeral (if not recorded), offers no physical proof of the utterance, which in turn makes it feel less secure, less permanent. So, yes, I do agree with the statement, as often what we term real is that which can be proven.
How about you? Do you agree with Gnanadesikan’s quote?