Task 3: Voice-to-Text Task

When I saw the description for the assignment, I was very intrigued and excited by it! Coming from a COGS/Linguistics background, I love exploring the intersection of language and technology!

The assignment reminded me of a conversation I had with someone I went on a date with. Below is a recollection of that conversation and what I found insightful and relevant to the questions being asked in our analysis. The text was transcribed via the voice-to-text software Speechnotes.

In LING100, a concept that was really emphasized was the difference between descriptive and prescriptive grammar. Descriptive grammar studies and describes language as it is actually used, including “standard” and “nonstandard” usage; prescriptive grammar, on the other hand, specifies how language “should” be used, and is often seen in academic settings that aim to teach how to use language “properly”. I wanted to point out the distinction, as I will be using the prescriptive perspective to analyze how the transcript deviates from written English conventions.

Testing testing 1 2 3 so last week I went on a date with someone they are cleared to say slam Poet we it was funny day we decided to meet up in a park on Commercial Drive I got my picnic blanket and some snacks and we just sat in the Sun and chatted for a while my experience with slam poetry is very limited I've been to some performances before hosted by UBC slam Poetry Club that was when benny's Bagels was still around and that was the venue they usually hosted of these Open Mic that so it was really interesting talking to them about accessibility and slam poetry since they are I think immunocompromised and with a lot of venues not adhering to covid measures it's very difficult for them to be able to go to these events feeling safe and it was something that I didn't realize like or put too much thought into how unaccessible these events were and so they talked about how there were some some poetry events that were hosted on now that things are opening back up they're going back in person and so when I asked them about their experience with performing slam poetry virtually in like commercial spaces they didn't really have a very good experience I think mostly because of technical issues for example like you can't really hear people feedback as immediately there's some latency or free sample you can't really see people's Expressions when you performing and I think that also affects like you know how how you how you feel while performing so they have been mostly trying to convert their some poetry soap forms of like we're like more oral forms of poetry into more written forms in zeins and Chad books I found that really interesting because of such a big difference in Media and they provided very interesting perspective because with like sin poetry lot of it is just the emotional aspect of it while performing but also so much information is lost when you convey it into a written form free samples of cadence you know the way you stressing the words your internation your 
infection stuff lost when you try to put that on paper though they've also given me a different perspective on like while they were trying to make their poetry into more readable format there was from ways of expressing that for example free sample like the way you put the lines on paper like I guess like formation structure punctuation could be one so very much I felt that you really needed to work with the limitations of the medium which was something I never really considered until they were talking about like yeah this transition of putting something that is mostly spoken and sort of like formless and that way into something that you can see and read on paper

Deviations

  • Punctuation is the first visually notable deviation: the transcript has none to mark pauses in speech or the beginnings and ends of sentences. I realized that Speechnotes requires you to enter punctuation manually by saying “period” or “comma”.
  • Capitalization is sometimes correctly identified, mostly for proper nouns. The software correctly capitalized “Commercial Drive”, “UBC”, and “Poetry Club”, but overgeneralized for “Sun”, “Bagels”, “Open Mic”, “Expressions”, “Chad books”, and “Media”. There were also instances where it didn’t capitalize properly, for example “benny’s” and “covid”. Of course, since there are no marked periods, the start of the next sentence is never capitalized.
  • Spelling/Accuracy of transcription was mostly good, with nothing too ridiculous or unreasonable; the software captured the majority of what I had enunciated properly, even my own mistakes (!!) such as mispronouncing “inaccessible” as “unaccessible”. It was also interesting to see that it transcribed my “testing testing one two three” into the numeral form “1 2 3”. There were also instances where it misspelled what I had meant, such as “free sample” (for example), “Chad books” (chapbooks), “zeins” (zines), and “internation” (intonation). Sometimes it didn’t complete the entire word, as in “hosted on(line)”. On the other hand, it captured contractions correctly, like “I’ve” and “didn’t”, as well as possessives such as “benny’s”.

I think a lot of these small mistakes can be attributed to questionable quality of articulation/enunciation or to spelling conventions. For example:

  • Chad books vs chapbooks: the [d] and [p] sounds are both plosives, where the airflow from the lungs is suddenly interrupted by closure of the mouth. The [p] is a voiceless plosive, so it sounds very wispy and is sometimes misheard even by the human ear.
  • zeins (/ziːənz/) vs zines (/ˈziːnz/): the difference in pronunciation is very difficult to hear clearly in passing conversation. Also, English spelling conventions are weird, as in the famous example of “ghoti” as an alternative spelling of “fish”.

If I had a script prepared beforehand to present to the voice-to-text software, the first noticeable difference would be the lack of filler words like “like”, “yeah”, and “I think”, which are often meaningless sounds used to mark pauses or hesitation, or to stall for time. I think the script would feel more structured and have a better flow, rather than feeling like I was rambling along and saying the same sentence in a different way. I would feel more confident and clear when I “deliver my speech”, which might result in a decrease in the above-mentioned mistakes, especially for spelling and accuracy of transcription.

 


Reflection

Lastly, I have many thoughts about oral storytelling and how it differs from written storytelling in the context of slam poetry. These thoughts were formulated after reflecting more deeply on the conversation I had with my date and their experiences.

There are two aspects I wanted to explore, in relation to the process and completion of the assignment exercise: the transition of in-person slam poetry events to online virtual events, and the translation of spoken word poetry into written poetry form.

Transition from IRL to the Web

A poetry slam is a competitive event where spoken word poetry is performed in front of a live audience. Often, it involves cheering, snapping of fingers, and other forms of participation. The reaction and feedback create a relationship and interaction between the audience and the performer, in addition to the one between the poet and their own poem by virtue of speaking (in) their own voice.

During the pandemic, many of these open mic spaces transitioned online, yet the experience was drastically different. Besides the technical issues of the interface (e.g. muted microphones, audio issues, video-call latency), an integral aspect of audience participation was lost. There was no immediate way for the poet to gain feedback as they were performing, no body language to show engagement, no eye contact to show “yes I am paying attention”, no snapping of fingers to show appreciation. It was difficult to use the storytelling process of slam poetry to create a connection with the audience.

This was somewhat like the process of speaking to Speechnotes “as if simply talking to a friend and telling them an anecdote”, as per the assignment instructions. It did not feel like talking to a friend at all.
There was no feedback from my “friend”, no head nods, no eye contact, nothing to show that they were interested in listening to what I had to say. There were no questions being asked about the details of my date, no gossip about whether or not I thought they were cute, nor any back and forth to draw out more information in a mutual, synergetic interaction of the “storytelling” process.

Translating the Spoken onto Paper

Now that many restrictions have lifted, more events are returning to in-person spaces with little to no safety practices in place, which makes these events unsafe and inaccessible to many in the immunocompromised and disabled community. This was one main challenge that my date faced, as they felt unsafe attending poetry slams in person in general. Therefore, they were trying to make more “readable” versions of poetry to put in zines and chapbooks.

Yet many nuances got lost in translation: intonation and cadence of the voice, rhythm of the words, stresses on the syllables. How can one accurately translate all of these sonic expressions into written form?

Nevertheless, in the process of making their spoken word poems more readable, they started experimenting with line breaks, spacing, and even changing the line lengths to accelerate or slow down the tempo of the poem. Though it was a frustrating translation process, it also pushed them to be more experimental and innovative with the ways one can use words on paper.

This made me reflect on Walter J. Ong’s lecture and what he says about the emergence of new mediums and how they reinforce and change the old mediums. In this case, literacy in both mediums is necessary to use them together and overcome the limitations of the medium in which you are working.

To conclude, I found this assignment really engaging, especially being able to think about “language as technology” as situated in my everyday life and to “deliberately” experience it first-hand in the way I had over the weekend. This made me reflect on the diversity of mediums we now have, and the necessity of being “literate” in them in order to keep up with the changing landscape of language and technology. At times, it feels so easy to swim fluidly in and between the different types of mediums, yet at other times, it feels like the ocean in between is too vast and wide!

 

References

Ong, W. (n.d.). Oral cultures and early writing [Video]. YouTube. https://www.youtube.com/watch?v=uvF30zFImuo&t=65s&ab_channel=AbeAboud

Wikipedia contributors. (2022, May 15). Ghoti. Wikipedia. Retrieved June 3, 2022, from https://en.wikipedia.org/wiki/Ghoti

 
