Upstream Bias in Educational Data: Lessons from the Golden Record Exercise
The rapid advancement of AI represents one of the most significant developments in writing and reading technology in recent history. Just as the printing press revolutionized access to written knowledge in the 15th century, AI technologies are fundamentally transforming how information is created, accessed, and interpreted in educational settings. The United Nations Human Rights Council (2024) notes that “Generative artificial intelligence systems are newer forms of artificial intelligence… designed to produce diverse outputs on the basis of extensive training data sets, neural networks, deep learning architecture and user prompts” (para. 11), enabling the creation of text, images, and other content that was previously only possible through human effort.
It’s tempting to focus only on the upside of this new technology, the way it addresses needs for speed, scale, and personalization, but doing so overlooks the challenges it poses to equity and representation. As noted by the UN Special Rapporteur, “technology is never neutral – it reflects the values and interests of those who influence its design and use, and is fundamentally shaped by the same structures of inequality that operate in society” (UN Human Rights Council, 2024, para. 7). My experience with the Golden Record visualization exercise provides a revealing case study of how modern writing/reading technology systems can inadvertently perpetuate patterns of exclusion, even with seemingly neutral parameters.
The Power Dynamics of Data Curation
An experience I had in this course enabled me to see how data collection, under the guise of neutrality, can actually reinforce existing patterns of exclusion. What seemed like a simple engagement with a survey and the resulting visualization revealed deeper concerns about who gets represented in data systems.
The seemingly neutral, objective process of collecting and categorizing data contains inherent power dynamics that determine which perspectives are validated and which are excluded. For the Golden Record assignment, I listened to the podcast episode and picked 10 tracks that appear on the Golden Record, though not 10 strictly musical tracks. Among my picks were “United Nations Greetings / Whale Songs” and “Sounds of Earth – Voyager Golden Record,” both listed on the podcast page (the instructions did tell us to listen to the podcast, which I did). What I did not do was visit the hyperlinked YouTube page containing the 27 tracks from which we were supposed to make our selections. Moreover, the track names on the podcast page differed from those on the YouTube page.
My two non-music tracks were not represented on the survey, and I wasn’t sure whether I was choosing the right songs because some carried different names than they did on the YouTube page. I didn’t want to spend time cross-referencing the songs on the podcast page against those on the YouTube page; I had already made my choices and submitted my assignment. When it came time to complete the survey, I could select at most eight tracks, because two of my choices were excluded, and a third got lost in the shuffle: I couldn’t identify it among the YouTube titles.
But then again, was it my problem, or was it an inflexible system that couldn’t accommodate different interpretations? This small-scale experience mirrors a much larger problem in AI systems: what doesn’t get counted in the data simply doesn’t exist in the resulting model. Just as my legitimate non-musical selections disappeared from the visualization because they didn’t fit predetermined categories, entire perspectives and knowledge systems from marginalized communities are systematically excluded from AI training data. The survey couldn’t accommodate my non-standard inputs—similarly, AI systems trained on limited or biased datasets cannot represent knowledge they’ve never been exposed to. This isn’t just about personal inconvenience; it’s about who gets to contribute to our collective digital knowledge and whose perspectives are systematically erased.
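To make the mechanism concrete, the sketch below (written in Python purely for illustration; it is not the actual survey software, and the function name, track list, and data are hypothetical) shows how a collection step with a fixed list of recognized categories silently discards anything it does not recognize, so the downstream visualization never knows the dropped inputs existed.

```python
# A minimal sketch, assuming a survey that only recognizes a fixed track list.
# Not the actual survey tool; names and data are hypothetical.

VALID_TRACKS = {
    "Johnny B. Goode",
    "Flowing Streams",
    "Melancholy Blues",
    # ... the rest of the 27 recognized track titles
}

def collect_responses(raw_selections):
    """Keep only selections that match a recognized category.

    Anything else is dropped without a trace, so the dataset handed to the
    visualization looks complete even though it is not.
    """
    kept = [track for track in raw_selections if track in VALID_TRACKS]
    dropped = [track for track in raw_selections if track not in VALID_TRACKS]
    # 'dropped' is never logged or shown to the respondent.
    return kept

my_picks = [
    "Johnny B. Goode",
    "United Nations Greetings / Whale Songs",   # excluded: not a "musical track"
    "Sounds of Earth - Voyager Golden Record",  # excluded: not a "musical track"
]
print(collect_responses(my_picks))  # -> ['Johnny B. Goode']
```

The point is not the code itself but where the loss happens: before any analysis or model ever runs.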
My experience with the Golden Record visualization reflects what Reiss (2021) identifies as the power dynamics embedded in digital data systems. The survey functioned as a tracking system that collected specific inputs but excluded others – it literally couldn’t account for my non-musical selections, even though they were legitimate parts of the Golden Record. This illuminates how data collection systems make assumptions about what “counts” as valid input (in this case, musical tracks only), enforcing a kind of standardization that flattens innovation and variation. As Reiss explains, such digital systems, including AI in education, evolved to address legitimate challenges like providing individualized feedback at scale, yet they simultaneously reinforce specific modes of knowledge at the expense of others. In my case, the system’s inability to accommodate legitimate diversity in interpretation meant that my non-musical selections became what the assignment instructions describe as “data that is missing, assumed, or misinterpreted.”
In the data visualization produced from the survey results, I was underrepresented: most respondents contributed 10 choices, while I contributed only 7. This small instance illustrates what Baker and Hawn (2022) describe as “representational bias” in educational data systems, though they examine the phenomenon on a far more consequential scale. Their research into AI systems such as automated essay scoring reveals how these technologies may systematically disadvantage certain student populations, and their analysis demonstrates the critical importance of addressing bias as these tools increasingly determine how students’ writing is evaluated and valued.
The UN Special Rapporteur’s report highlights how data collection bias operates: “If particular groups are over- or underrepresented in the training sets, including along racial and ethnic lines, algorithmic bias can result” (UN Human Rights Council, 2024, para. 13). Such exclusion is a form of systemic discrimination that occurs before an algorithm ever makes its first prediction or assessment. This upstream perspective is particularly valuable because it redirects attention to earlier stages of the AI development process. While much discourse on AI bias focuses on downstream effects (the outputs and predictions an algorithm produces), the more fundamental issue is that certain possibilities are excluded from consideration before the algorithm begins its work. When the inputs themselves aren’t collected or recognized as valid, no amount of algorithmic fine-tuning can recover the missing perspectives. This form of invisibility is especially pernicious because those affected may never know they’ve been disadvantaged: the system presents itself as comprehensive despite significant blind spots in its foundational data.
Educational Consequences of Data Collection Bias
According to Teachflow (2023), AI tools are changing education by offering students personalized learning, easing administrative work for teachers, and providing quick feedback on student work. While these tools open new ways to teach reading and writing, there are important concerns about how bias may be built into their data and design.
In educational settings, the consequences of biased data collection manifest in several concerning ways. Predictive analytics tools deployed in schools “often rate racial minorities as less likely to succeed academically and in their careers, because of algorithm design and data choices” (UN Human Rights Council, 2024, para. 44). The report further notes that educators may subsequently “steer students from marginalized racial and ethnic groups away from educational and career choices that would maximize their potential and offer the best opportunities to break cycles of exclusion” (para. 44).
When schools adopt AI systems built on incomplete or unrepresentative data, I’m concerned they do more than just maintain the status quo of inequality. These systems can actually intensify biases and make disparities worse for students from underrepresented groups. A single assessment algorithm can affect thousands of students’ educational trajectories, with life-altering consequences for individuals from marginalized communities.
The Challenge of Algorithmic Opacity
Compounding the issue is what the UN Special Rapporteur terms the “‘black box’ problem” (UN Human Rights Council, 2024, para. 21) of AI systems. As algorithms incorporate new patterns from the data they process, “individuals relying on the algorithm may no longer be able to ‘look under the hood’ and pinpoint the criteria that the algorithm has used to produce certain outcomes” (para. 21). As the UN report emphasizes, this opacity creates situations where discrimination becomes “increasingly difficult to identify, prove and remedy” (UN Human Rights Council, 2024, para. 21), particularly when algorithmic systems leave minimal evidence of how decisions were reached.
For students and educators alike, this creates a troubling scenario where consequential decisions are determined by systems whose reasoning processes remain inscrutable. Perhaps the most concerning aspect is that students themselves may never know why they were classified a certain way.
We created AI writing systems to meet our need for efficiency and scale, but these inventions are now shaping literacy practices in ways that can amplify existing inequalities. My Golden Record experience offered just a glimpse of what happens when systems exclude certain inputs – imagine this same pattern playing out in AI writing tools evaluating millions of students’ work, potentially misrepresenting entire communities through skewed algorithmic judgments.
Solutions/Mitigation Strategies
Addressing bias in educational data systems requires deliberate intervention at several levels. Baker and Hawn (2022) advocate for “representation-focused validation approaches” that examine whether algorithmic systems produce equivalent outcomes across different groups. While much of the discourse on AI bias focuses on technical solutions, the human element remains crucial in mitigating these issues. According to Teachflow (2023), “providing professional development opportunities for educators is crucial to ensure they are equipped with the necessary skills and knowledge to effectively integrate AI tools into their teaching practices.” When educators develop algorithmic literacy – the ability to critically assess AI systems’ capabilities, limitations, and potential biases – they can design learning experiences that reveal these issues to students.

Indeed, the Golden Record exercise appears deliberately structured to illustrate the limits of data collection, creating a pedagogical moment that exposed how systems can exclude valid inputs. This instructional approach mirrors what Teachflow advocates: training that extends beyond technical skills to include critical examination of how data collection methods may encode existing inequalities. By designing experiences that let students personally encounter upstream bias in data collection, educators can cultivate the critical awareness needed to identify and address these issues in consequential, real-world AI applications.
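As one concrete illustration of what a representation-focused check might look like in practice, the sketch below compares a simple outcome rate across groups before a scoring tool’s results are trusted. It is a minimal sketch under assumed data: the group names, labels, records, and threshold are hypothetical, and it illustrates the general idea of examining outcomes across groups rather than Baker and Hawn’s specific procedure.

```python
# A minimal sketch of checking whether an automated scorer produces
# comparable outcome rates across groups. Data, labels, and the 0.1
# threshold are hypothetical; this illustrates the general idea only.
from collections import defaultdict

def outcome_rates_by_group(records, positive_label="proficient"):
    """records: iterable of (group, predicted_label) pairs."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, label in records:
        totals[group] += 1
        if label == positive_label:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}

predictions = [
    ("group_a", "proficient"), ("group_a", "proficient"), ("group_a", "developing"),
    ("group_b", "developing"), ("group_b", "developing"), ("group_b", "proficient"),
]

rates = outcome_rates_by_group(predictions)
print(rates)  # -> {'group_a': 0.666..., 'group_b': 0.333...}

# A large gap is a prompt to audit the training data and the model
# before the tool informs real decisions about students.
gap = max(rates.values()) - min(rates.values())
if gap > 0.1:
    print(f"Warning: outcome-rate gap of {gap:.2f} across groups")
```

A check like this cannot repair upstream exclusion on its own, but it can surface the kind of gap that signals a need for the human review and consultation discussed below.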
Addressing these challenges requires fundamental reconsideration of how educational data is collected, processed, and deployed. The UN Special Rapporteur recommends that stakeholders “ensure meaningful and effective consultation with those from marginalized racial and ethnic groups, professionals from relevant societal domains and those with expertise in systemic racism in the design, development and deployment of artificial intelligence products” (UN Human Rights Council, 2024, para. 69).
Additionally, the report calls for “human rights due diligence assessments at all stages of artificial intelligence product design, development and deployment” (para. 69), emphasizing that upstream interventions are essential to preventing downstream harms. The core challenge involves questioning basic assumptions about data collection practices rather than merely addressing algorithmic outputs after biased data has already shaped the system.
References
Baker, R. S., & Hawn, A. (2022). Algorithmic bias in education. International Journal of Artificial Intelligence in Education, 32, 1052–1092. https://doi.org/10.1007/s40593-021-00285-9
Reiss, M. J. (2021). The use of AI in education: Practicalities and ethical considerations. London Review of Education, 19(1), 1–14. https://doi.org/10.14324/LRE.19.1.05
Teachflow. (2023, June 30). Assessing the reliability and bias of AI in education. https://teachflow.ai/assessing-the-reliability-and-bias-of-ai-in-education/
United Nations Human Rights Council. (2024). Contemporary forms of racism, racial discrimination, xenophobia and related intolerance (A/HRC/56/68). https://documents-dds-ny.un.org/doc/UNDOC/GEN/G22/336/38/PDF/G2233638.pdf