Task 11: Detain/Release

I approached the exercise with the mindset of an impartial judge. I first noticed the lack of information behind the prosecution’s recommendation. Did the defendants have prior convictions? Were the prosecutors influenced by the risk assessment? What was the algorithm behind the assessment based on? On a side note, the use of colour was influential UX-wise; all that red and yellow was definitely eye-catching, and I was curious whether visuals play a role when data is presented in real life, but I digress.

After reviewing two cases, my plan was to base detention on the following criteria, in order of importance (a rough points-based sketch follows the list):

  • Severity of the crime (assault justified detention vs. less “dangerous” crimes such as drug possession); my concern was safety to the public.
  • Risk: Violence (I would detain on red and yellow if paired with another violence indicator)
  • Risk: Flight, if red and paired with a “dangerous” crime
  • Defendant’s comment and their age. I wanted to hear their voice, and age could provide a clue about priors when paired with the type of crime; for example, a robbery charge against someone over 30 might indicate that it is not their first.
  • Prosecution’s comment.
  • Risk: Commit a crime
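
To make the rubric concrete, here is a minimal Python sketch of the points-based version I wished I had. The weights, flag values and thresholds are entirely my own invention for illustration; the simulation never exposed a scoring formula.

```python
# Hypothetical points version of my rubric. Weights and thresholds are
# invented for illustration; the simulation exposed no scoring formula.
RISK_POINTS = {"green": 0, "yellow": 1, "red": 2}

def detain_score(charge_severity, violence, flight, commit, dangerous_charge):
    """Higher score = lean toward detain.

    charge_severity: 0 (minor) to 2 (violent), my own subjective scale.
    violence, flight, commit: 'green' | 'yellow' | 'red' assessment flags.
    dangerous_charge: True if I judged the charge itself "dangerous".
    """
    score = 3 * charge_severity                # severity weighted heaviest
    score += 2 * RISK_POINTS[violence]         # violence flag next
    if flight == "red" and dangerous_charge:   # flight counts only when paired
        score += 2
    score += RISK_POINTS[commit]               # commit-a-crime flag last
    return score

# An assault charge (severity 2) with a red violence flag:
print(detain_score(2, "red", "yellow", "green", True))  # 10
```

Even a toy formula like this would at least have made my inconsistencies visible from one case to the next.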

I recognized my own biases: gender and race. My studies in race inequality and systemic discrimination might lead me to be more lenient with people of colour and women. I also realized that my perceptions of “dangerous” were subjective, as were the indicators that I was using and how I was pairing them. With all this in mind, I tried to be as fair as possible and used this method for the next 5 cases.

This made me uneasy: without an actual points value I was pretty much just winging it and feeling more unsure as I went along. I questioned whether I was trying to avoid detaining and/or deliberately opposing the prosecution’s recommendation, and whether I should reassess my method. Then I got hit with the news clipping, and this changed everything. Suddenly my release “mistakes” began influencing me and I became less conservative with the DETAIN button; Risk of Flight and the prosecution’s recommendation moved to the top of my list. It reminded me of the Crime Machine podcast, as it was no longer about the cases themselves, but about MY Fear/Jail Capacity numbers, my reputation and the burden on the public.

For the last 10 defendants I shifted again and relied almost solely on the prosecution’s recommendation and the risk assessment, despite my instincts. I noticed my mindset had changed: I found comfort in relinquishing my responsibilities and being backed by the data; the weight of the implications and outcomes was no longer just on my shoulders. What struck me is how easily I converted and how I was able to talk myself into it and justify it with “well, they must know better”.

After the exercise I felt unsettled by the decisions I had made and the consequences they might have had were this not a simulation. The two cases that stuck with me the most: I released a mother arrested for drug possession and a woman arrested for reckless driving, mainly based on the prosecution’s recommendation and the risk assessment. Did I act on misinformation based on more misinformation? Were my biases the same ones that were used to create the assessment and that also influenced the prosecution? Additionally, my definition of “dangerous” to the public no longer made sense: reckless driving was definitely dangerous to the public, and I can’t bear to overthink the drug possession arrest of a mother.

Dr. Cathy O’Neil defined “Weapons of Math Destruction” as having the following properties:

  • Widespread and important
  • Mysterious in their scoring mechanism
  • Destructive

She noted that recidivism algorithms are limited to particular types of data. For a real bail hearing risk assessment, the algorithm might only focus on a person’s record of arrests and convictions and their responses to a questionnaire about crime in their neighborhood and/or in their family, inputs she called “proxies for race and class”. Risk assessment scores are then used by judges to determine bail, sentences, parole, etc. O’Neil believes that these scores should be used to help defendants, not against them.

To challenge these types of assessments, ProPublica followed up on over 7,000 people arrested in Broward County, Florida, in 2013 and 2014 to see how many were charged with new crimes over the next two years, comparing the outcomes against their COMPAS risk assessment scores. They found the assessments unreliable: only 20 percent of the people predicted to commit violent crimes went on to do so. They also showed that these same risk assessments falsely flagged black defendants as future criminals at almost twice the rate of white defendants.
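
ProPublica published the underlying data (github.com/propublica/compas-analysis), so the basic check can be sketched in a few lines. This is a rough illustration rather than their exact methodology, and I am assuming the column names from their public CSV.

```python
# Rough sketch of ProPublica's false-positive check against the public
# COMPAS dataset; file and column names assumed from their published CSV.
import pandas as pd

df = pd.read_csv("compas-scores-two-years.csv")

# ProPublica treated decile scores above 4 as "higher risk".
df["flagged_high_risk"] = df["decile_score"] > 4

# False positive rate: flagged as higher risk but not charged with a new
# crime within two years, broken out by race.
for race, group in df.groupby("race"):
    non_reoffenders = group[group["two_year_recid"] == 0]
    fpr = non_reoffenders["flagged_high_risk"].mean()
    print(f"{race}: false positive rate = {fpr:.0%}")
```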

This was definitely a startling task. In 15 minutes, my idealistic convictions succumbed to the pressure of getting those cases out with infrastructure approval and a complete disregard for the impact on defendants and their dependants. And all this based on flawed and concealed information. The expression Amanda Levendowski used in the Machine Bias podcast, discussing the Enron emails, to describe biases that get encoded now echoes in my mind: “Garbage in, garbage out.”

References:

McRaney, D. (n.d.). Machine bias (rebroadcast). In You Are Not So Smart. https://soundcloud.com/youarenotsosmart/140-machine-bias-rebroadcast

O’Neil, C. (2017, July 16). How can we stop algorithms telling lies? The Observer. https://www.theguardian.com/technology/2017/jul/16/how-can-we-stop-algorithms-telling-lies

Garbage in, garbage out (computer science). (2021, October 7). In Wikipedia. https://en.wikipedia.org/wiki/Garbage_in,_garbage_out

Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016, May 23). Machine bias. ProPublica. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

Adams, S. (2001). Bad data [Comic strip]. United Feature Syndicate. www.dilbert.com

Task 10: Attention Economy

I initially approached the game with techniques from the Dark Patterns reading in mind, as well as an eye for UI faux pas:

  • Misleading use of generally recognized and intuitive symbols and colours.
    • Underlines and different colour text that are not links
    • Use of green instead of red for negative affirmations
  • Confusing wording (must, should, can, etc.)
  • Double negatives (if you do not…)
  • Hiding commands that would expedite the process (deselect/select)
  • Disruptive pop-ups with useless commands (timer and Help; “close” looks like a trademark; the number of people ahead of me starts at 430 and increases by one when I click “Help”)
  • Conflicting directives (Upload/Download in the same area)
  • Confusing layout (selection boxes are inconsistently above or below the image)
  • Forcing me to disclose personal information and upload a photo
    • Including making it mandatory to select an option that I don’t like
  • Inconsistent, inoperable and time consuming data entry fields (Age vs. birthday, Mrs. vs. Gender, door number picker)
  • Forcing me to accept their cookies and spurious Terms & Conditions

I was moving at a good pace until the image identification pages. That’s when my approach went from analytical to game mode, and that was a big mistake! I started playing like I would an escape room, trying to use logic when selecting the images: dissecting semantics, clicking everywhere looking for hidden links. I tried so many combinations to get past those four repeating screens, and failed:

    • “Glasses” is plural, therefore a single pane is just “glass”
    • “a” vs. no preceding article.
    • Not selecting items that had other objects in the image.
    • Not to mention a deep dive into the existential meaning of “light” (that almost broke me). 

All throughout, the timer and the moving 1-2-3-4 were a constant distraction, almost antagonizing me. I finally vowed to quit at the 30-minute mark, and I did. Five minutes into writing up my experience, I went back to the game and tried one more option: selecting all the images on every set, and voilà. I am still not sure if that’s what actually worked or if the game was playing me and was going to let me pass regardless after a certain number of attempts. To be honest, I don’t care to find out if it means one more second on that site.

Well played, BAGAAR: 37 minutes of absolute frustration achieved!

 

Task 9: Network (Golden Record Quiz)

The Golden Record data in Palladio was not initially clear to me; I first needed to learn how to decipher the data by reading other Palladio user guides.

I played with the facets (curator vs. track vs. community) to generate tables and graphs that would give me an overview of the most-selected songs and help me find connections. This did not add much insight, as I lacked any context behind the song selections. I then arranged the data to display weighted track nodes (biggest on the bottom) and compared my selections with those of my peers.
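
The degree comparison I was doing by eye can be reproduced outside Palladio. Below is a small Python sketch, assuming a simple two-column export (curator, track); the file and column names are mine, not Palladio’s actual export format.

```python
# A track's "weight" in the network view is just its degree: how many
# curators selected it. File and column names are assumed.
import pandas as pd
import networkx as nx

edges = pd.read_csv("golden_record_choices.csv")  # columns: curator, track

G = nx.Graph()
G.add_edges_from(zip(edges["curator"], edges["track"]))

# Rank tracks by how many curators selected them.
for track in sorted(edges["track"].unique(), key=G.degree, reverse=True):
    print(f"{track}: chosen by {G.degree(track)} curators")
```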

I thought I could find a pattern or a link across more than one song. Without intel on why curators chose the songs, the idea was that perhaps I could identify those who had used the same criteria as I had by the degree of connectivity of the tracks. However, that would presume that their rationale was the same as mine, and I had no corroborating information to support this. I based my track selection on feeling, itself intangible and immeasurable. I admitted that my selection method was flawed, as it was based solely on my perception and backed by no measurable data.

The main problem with the Golden Record Palladio data is that, beyond node and edge information and the limited ability to play with facets, the data revealed “who” (curators) and “what” (track selection) but not “why” (criteria). In the real world, this small sample could not be considered representative, as it lacks values that would permit a proper analysis. Even if used as a predictive model, it would fail at its primary task, which was to choose songs that can, as much as possible, encapsulate the human race.

Two strands of thinking tie together here. One is that the algorithm creators (code writers), even if they strive for inclusiveness, objectivity and neutrality, build into their creations their own perspectives and values. The other is that the datasets to which algorithms are applied have their own limits and deficiencies. Even datasets with billions of pieces of information do not capture the fullness of people’s lives and the diversity of their experiences. Moreover, the datasets themselves are imperfect because they do not contain inputs from everyone or a representative sample of everyone… creating a flawed, logic-driven society and that as the process evolves – that is, as algorithms begin to write the algorithms – humans may get left out of the loop, letting “the robots decide.” (Rainie & Anderson, 2017)

This implies that one cannot justify or prove that the song selection is inclusive of all cultural, socio-economic and political factors, or even whether those factors played a part. Further, assumptions and links to song choice cannot be made without a clear profile of the curators themselves, because biases cannot otherwise be identified.

I was intrigued by the possibility of adding more data into Palladio, integrating values like curator gender, class, etc. Additionally, the Golden Record Curation Data Gathering “Quiz” might include multiple-choice questions explaining song selection, to be used as facets (a sketch of the merged data follows the example). For example:

Which of the following did you mostly base your song selection on?

      1. Personal preference
      2. Cultural representation
      3. Song popularity
      4. Other
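
As a rough sketch of what that integration might look like, the survey answer could be merged into the network data as a new facet. The file layout and column names below are hypothetical.

```python
# Hypothetical sketch: fold survey answers into the Palladio data as a
# new facet. File and column names are invented for illustration.
import pandas as pd

choices = pd.read_csv("golden_record_choices.csv")  # curator, track
survey = pd.read_csv("curation_survey.csv")         # curator, rationale

# One merged table; "rationale" becomes a facet Palladio can filter on.
merged = choices.merge(survey, on="curator", how="left")
merged.to_csv("golden_record_with_rationale.csv", index=False)
```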

Even then, these values would be biased by my own thought process and what I assume are plausible choices.

If algorithmic bias is merely a data problem, the often-touted solution is to de-bias the data pipeline. However, data “fixes” such as re-sampling or re-weighting the training distribution are costly and hinge on (1) knowing a priori what sensitive features are responsible for the undesirable bias and (2) having comprehensive labels for protected attributes and all proxy variables. (Hooker, 2021)

The original tracks for the Golden Record were chosen by renowned astrophysicist Carl Sagan and his team of first-world, successful and educated astronomers, sound engineers, musicologists, record executives, journalists and artists. Were they qualified to make such an important decision on behalf of mankind? Probably. However, much like the Palladio data, the selection was very exclusive from the start.

 

References

Conroy, M. (2021). Networks, maps, and time: Visualizing historical networks using Palladio. Digital Humanities Quarterly, 15(1). http://www.digitalhumanities.org/dhq/vol/15/1/000534/000534.html

Rainie, L., & Anderson, J. (2017, February 8). Code-dependent: Pros and cons of the algorithm age. Pew Research Center: Internet, Science & Tech. https://www.pewresearch.org/internet/2017/02/08/code-dependent-pros-and-cons-of-the-algorithm-age/

Hooker, S. (2021). Moving beyond “algorithmic bias is a data problem.” Patterns, 2(4), 100241. https://doi.org/10.1016/j.patter.2021.100241

Voyager Golden Record. (2018, November 18). In Wikipedia. https://en.wikipedia.org/wiki/Voyager_Golden_Record
