In order to complete this task, I used Speechnotes.
Voice to Text
There once was a boy who had a sister that would always for some reason mess up his room while his parents were out of the house. No matter what the boy did or how nicely yes the sister would always be sure to make a mess of his real. What his parents would come home the sister went tell the parents that he had got mad and messed up his room which would cause the boy to get in trouble. The boy I always knew that he had to find a way to prove to his parents that he did not mess up his room. He decided to get a video camera and hide it in his room to prove to his parents that it was his sister. He set up the video camera hidden behind a blanket in the corner of his room. As his parents laughed he went and press record. It was only a matter of time before his sister would come into to mess up his bed and throw around all of his things. As the sister came in she through over his comforter and dumped out his backpack. As she was not looking you moved the video camera from behind the blanket. He said to his sister one day I’m going to prove that you are the one messing up my room and it is not me. Is sister locked and said that there is no way that he can prove that it is her. As she was saying this he went behind her with the camera and said say hi to Mom and Dad. The sister turned around and was shocked. The only thing she could think to do was wave back and say hi Mom and Dad. When their parents came home the brother showed the parents the video to finally prove that it was not him messing up his room but it was his sister.
Analysis
What is wrong with the text, what are the most common mistakes, and why?
There are a few things wrong with the above text compared to the story that I spoke. I have edited the story below to show the errors: corrections and missing words appear in square brackets directly after the misrecognized text.
There once was a boy who had a sister that would always for some reason mess up his room while his parents were out of the house. No matter what the boy did or how nicely yes [nice he was] the sister would always be sure to make a mess of his real [bed]. What [When] his parents would come home the sister went [would] tell the parents that he had got mad and messed up his room which would cause the boy to get in trouble. The boy I always knew that he had to find a way to prove to his parents that he did not mess up his room. He decided to get a video camera and hide it in his room to prove to his parents that it was his sister. He set up the video camera hidden behind a blanket in the corner of his room. As his parents laughed [left] he went and press [pressed] record. It was only a matter of time before his sister would come into [his room] to mess up his bed and throw around all of his things. As the sister came in she through over [threw off] his comforter and dumped out his backpack. As she was not looking you [he] moved the video camera from behind the blanket. He said to his sister “one day I’m going to prove that you are the one messing up my room and it is not me”. Is [His] sister locked [looked] and said that there is no way that he can prove that it is her. As she was saying this he went behind her with the camera and said “say hi to Mom and Dad”. The sister turned around and was shocked. The only thing she could think to do was wave back and say “hi Mom and Dad”. When their parents came home the brother showed the parents the video to finally prove that it was not him messing up his room but it was his sister.
Most of the mistakes occurred near the beginning of a sentence. I think this has a lot to do with how I personally speak: I normally do not put a lot of “power” into the start and end of my sentences, so my microphone may not pick up my voice correctly. I also noticed a few mistakes where a word was replaced by a similar-sounding one (for example, “locked” for “looked”), which again could be due to the volume of my voice and the sensitivity of my microphone. I felt the need to focus on enunciating my words, knowing that Speechnotes was listening and trying to convert my voice to text. One obvious omission is proper punctuation, as Speechnotes has no way of knowing where a natural pause in a sentence calls for a comma. Furthermore, I had to manually press a button to insert a period and start a new sentence. Overall, though, I believe Speechnotes did a great job of converting my voice to text, and I believe that many of the errors were caused by how I spoke into my microphone.
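A comparison like this one can also be done programmatically rather than by eye. The following is a minimal sketch, assuming Python's standard difflib module and using one sentence from the story as sample input. The function name word_errors is hypothetical and is not part of Speechnotes or any other tool.

```python
import difflib


def word_errors(transcribed: str, intended: str):
    """Return (transcribed, intended) pairs for every span where the texts differ."""
    a = transcribed.split()
    b = intended.split()
    # Compare the two texts word by word instead of character by character.
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    errors = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A pair may have an empty side if words were dropped or inserted.
            errors.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return errors


# Sample input: one sentence as transcribed and as it was actually spoken.
transcribed = "Is sister locked and said that there is no way"
intended = "His sister looked and said that there is no way"

for wrong, right in word_errors(transcribed, intended):
    print(f"{wrong!r} -> {right!r}")
```

For this sentence the sketch reports two substitutions, “Is” for “His” and “locked” for “looked”. Counting where such pairs fall in each sentence would be one way to check the impression that errors cluster near the start.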
What if you had “scripted” the story? What difference might that have made?
I believe that scripting the story would have made a large difference in reducing the errors. I would have been more confident in the story I was telling and therefore louder and clearer. Reading allows a person to think carefully about what they are saying without having to worry about finding the proper words and language to express themselves. Essentially, they are sharing the expressions of the writing and not their own. As I was coming up with this story in Speechnotes, I would often pause slightly or second-guess my wording, which caused issues for the program. If I had instead read the story sentence by sentence in a fluid way, I believe Speechnotes would have picked it up better. This would ultimately improve the output of Speechnotes or of any voice-to-text software.
Written vs. oral storytelling + other thoughts
Oral discussions and stories are less precise and more casual than written pieces (Gnanadesikan, 2011), so the tone and expression used have a lot of influence on the story itself, and a voice-to-text program may miss them. The language used in a conversation can also be shaped by the people taking part, through dialect or jargon. Body language also has a big influence on oral storytelling, as we use our bodies to help explain the story. In a way, our body expression helps recreate the story for the person we are telling it to. This creates a large issue, as expression and body language do not translate well through voice to text.
As an example, if I told a story about an exciting event that had happened to me, the tone of my voice would express that excitement before I even described the event. This aspect of speech would be completely missed by today’s technology, as it outputs only the words and not the expression of the voice. The words output by the program would lose their meaning without proper punctuation, which is also missing. In the example above, an exclamation mark would be used to showcase excitement or surprise. Without the exclamation mark at the end of the sentence, the meaning simply would not be there. Although you can say “exclamation mark” to add the proper punctuation in these programs, doing so is simply not natural.
I believe that once voice-to-text software is able to properly detect and output expression, it will make a huge difference for those who use this software due to accessibility needs.
References
Gnanadesikan, A. E. (2011). The writing revolution: Cuneiform to the internet (1st ed.). Wiley-Blackwell.
Sam Charles (He/Him/His)
June 13, 2022 — 8:23 am
Hi Joseph,
As you point out, voice-to-text technology definitely has challenges when it comes to interpreting cadence, inflection, punctuation, and tone. Many people also have trouble understanding others, even when they speak the same language. Sarcasm and intonation produce meaning on their own that can’t easily be expressed through written words. It is one of the reasons that written/visual language evolves, as Gretchen McCulloch and Helen Zaltzman discuss in the “Allusionist” podcast.
It takes a special kind of writer, and some creative use of punctuation, to write in a manner that duplicates how one speaks. With that said, I agree with Gnanadesikan (2011) when she says: “Spoken words, by contrast, are inherently ephemeral. So written language seems more real to us than spoken language” (p. 4).
Reference
Gnanadesikan, A. E. (2011). The writing revolution: Cuneiform to the internet (1st ed.). Wiley-Blackwell.