Why Do We Grade?

I hate grading. I happily give feedback to students, but I hate grading. This is especially true with the (false) precision of the 100-point grade scale here at UBC (in contrast to my undergraduate experience with the somewhat coarser grades of A, A/B, B, B/C, etc.).

Do I at least think that a student who receives an 87 on my exam has better achieved the learning objectives than a student with an 83?

No.

But am I confident that if I rewrote the exam and they took it again, they would get those same scores?

Again, no.

But surely, I am at least certain that if I regraded those exams, the scores wouldn’t change.

Still no.

When I was a student, I viewed grades as objective measures of my learning. Now that I’m the one giving grades, they feel especially arbitrary. More alarming though, I see firsthand how grades warp students’ motivations in a manner that actively impedes their learning.

[Quick terminology note: Since I’m an American at a Canadian university, I’ll clarify that I’m using the American English meaning of grading, which I believe encompasses the Canadian English usage of both grading and marking. Also, I write (prepare) an exam, but my students take the exam.]

So why do we grade?

I think the main reason we grade is that it doesn’t occur to us to do otherwise. Universities have been giving letter grades for around 100 years, and most of us have received letter grades throughout our education. It is simply the default.

At the institutional level, practices have grown dependent on grades. We use grades to rank and sort students for scholarships and awards. Accrediting agencies may expect specific grading practices. Prospective employers and graduate admissions committees use grades as a measure of students’ skills and abilities.  

At the course level, we justify grading in a few ways. Grades are presented as indicators of achievement, allowing comparisons among students. Nominally, we also consider grades to be feedback about student performance. More practically, instructors use grades (and the weighting of individual assignments and exams) to communicate the relative importance of tasks and to motivate student performance or effort.

What I dislike about grading

Arbitrariness

While the opening of this essay is flippant, it conveys the arbitrary nature of grades. For a given student, their grade will be influenced by my views on what is important in our discipline, how I choose to assess, and even my mood while I am grading. Even if I were perfectly consistent, learning is incredibly complex. I simply cannot adequately or reliably characterize a student’s learning with a number or letter. In a well-designed course, grades may give a reasonable approximation of how well a student achieved our (arbitrary) learning objectives, but never with the accuracy and precision implied by a 100-point scale.

Difficulty recognizing improvement

Learning is a continuous process of improvement. However, we typically calculate final grades as a weighted average of all grades in a course. This arbitrarily rewards consistency and makes it difficult to recognize improvement. I teach a writing-intensive course where students develop a research proposal in stages over the term, with each stage an individual assignment. As a thought exercise, assume the assignments account for 66.7% of the grade and the final research proposal 33.3%. Imagine three students:

  • “Strong Student” is already excellent at scientific writing. The TAs and I give targeted feedback for improvement, but it is ignored because the student is happy with their existing level of performance.
    • Grade – Assignments: 90%, Final Proposal: 90%, Final Grade: 90%.
  • “The Fader” also already has excellent writing skills. They start the term strongly but ignore feedback and their effort steadily decreases over time.
    • Grade – Assignments: 77%, Final Proposal: 65%, Final Grade: 73%.
  • “The Learner” has no experience with scientific writing. Their early assignments score very low, but they carefully incorporate feedback and steadily improve their abilities with each assignment. They produce an excellent final proposal.
    • Grade – Assignments: 65%, Final Proposal: 90%, Final Grade: 73%.
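
For readers who want to check the arithmetic, here is a minimal sketch of the weighted average behind those three final grades. The function and variable names are my own illustration, not part of any real gradebook, and the "change" figure is simply the final proposal score minus the assignment score; it is not a proposed grading scheme.

```python
# Weights from the thought exercise: assignments 66.7%, final proposal 33.3%.
ASSIGNMENT_WEIGHT = 0.667
PROPOSAL_WEIGHT = 0.333


def final_grade(assignments: float, proposal: float) -> float:
    """Return the weighted average of the two components, in percent."""
    return ASSIGNMENT_WEIGHT * assignments + PROPOSAL_WEIGHT * proposal


# (assignment average, final proposal score) for each hypothetical student.
students = {
    "Strong Student": (90, 90),
    "The Fader": (77, 65),
    "The Learner": (65, 90),
}

for name, (assignments, proposal) in students.items():
    grade = final_grade(assignments, proposal)
    # Illustrative only: how far the final proposal moved from the assignments.
    change = proposal - assignments
    print(f"{name}: final grade {grade:.0f}%, change {change:+d} points")
```

The Fader and The Learner both round to 73%, even though one dropped 12 points over the term and the other climbed 25. The weighted average cannot tell those two trajectories apart.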

Do those grades represent the students’ learning? No. The Strong Student and The Fader learned little (though the Strong Student at least put in consistent effort). The Learner clearly gained the most in the course, and in my opinion deserves the highest grade among the three students. I could devise a system to somehow weight improvement, but that would be complicated and risks being perceived as unfair by students.

Discouraging curiosity and risk taking

Innate interest in learning is expressed through curiosity. Grades force students to focus on what we as instructors think is most important, rather than what the students find interesting or exciting. This kills curiosity.

Failure is a core part of learning. If we never fail, we probably aren’t undertaking worthwhile learning; we are just playing it safe with something we can easily achieve. With grades averaged over a course, there is a significant penalty for students taking a risk (with a project or assignment) and coming up short. This encourages them to constrain their creativity and ideas and instead pursue precisely what they think the instructor wants to see.

Warping motivation

Humans have innate motivation to learn. When I watch my kids play, I see their actions driven by nothing more than curiosity. They want to understand how something works or to observe the results of certain actions. They are experimenting and learning, without grades.

Based on the work of Noel Entwistle (1988) and others, student approaches to learning may be classified as surface, deep, or strategic.

  • A surface learner focuses just on the information necessary to complete a task, often memorizing disjoint facts or procedures without regard to context or connections.  
  • A deep learner fully engages with the course, making connections among topics and working to thoughtfully integrate new information with their existing knowledge. Deep learners ask insightful questions and connect ideas to other courses and their personal experience. They show clear interest in learning about the topic, even if they won’t be tested on a particular aspect.
  • A strategic learner works to simultaneously maximize their grade and minimize their effort. This student might ask, “Will this be on the test?”, and then immediately tune out if the answer is “no.” Strategic learners are assessment-focused and are uninterested in work that does not directly influence their grade.

We obviously want our students to be deep learners, but grades lead to strategic learners. When an intense focus on grades (especially for high-achieving students) combines with limited time and multiple classes, strategic learning is unfortunately a very rational strategy.

Cheating is the most self-defeating approach to learning and often an act of desperation under pressure. Misconduct isn’t limited to a classroom context (as perennial data manipulation and plagiarism scandals show), but a student who cheats is telling us (in that instance, at least) that grades possess much greater value than learning.

Limited (and ignored) feedback

Theoretically, grades are feedback. In practice though, can a single number or letter give actionable feedback about a student’s strengths and areas for improvement? No. What is the learning value then if there is no guidance for improvement?

Most of us give some form of supplemental feedback in addition to grades. For example, we might explain why points were deducted on a test question or give suggestions for revising an essay. While this approach seems reasonable, it isn’t as helpful as it appears. In the late 1980s, Ruth Butler began testing the effects of students receiving numerical grades alone, written feedback alone, or a numerical grade with written feedback. Her studies, and others since, have found that written feedback alone (without a grade) best maintains student interest and motivation and facilitates improvement (Butler, 1988). When we give students a grade and feedback, the feedback is often ignored.

No self-evaluation

Traditional grading often precludes student self-reflection and self-evaluation. We aim to prepare students for their careers and the “real world”. Although I do receive some external feedback in my career (e.g., teaching evaluations, peer review comments), most of the evaluation and feedback comes from me. I re-read and revise everything I write, even e-mails. At the end of each term, I reflect on my courses and write down ideas for improvement.

Are we teaching our students how to critically evaluate their own work? Or are we simply hoping that they read through their paper at least once before they click “submit”?

Conclusion

The longer I teach, the greater my distaste for grades becomes. I’m certainly not the first person with this thought. Alfie Kohn has been promoting these ideas for years (e.g., Kohn, 2011). In higher education, a rebellion against traditional grading has been gaining traction under the umbrella of “Ungrading.” A recently published book, Ungrading: Why Rating Students Undermines Learning (and What to Do Instead), edited by Susan D. Blum (2020), has spurred a number of Twitter and hallway discussions.

Because of institutional constraints, it is rarely feasible to dispense with grades altogether. However, a number of thoughtful instructors and teachers have been developing alternative approaches to traditional grading. I have been on this journey myself, ungrading my first course in Spring 2020. Over the next several posts, I will explain some approaches to ungrading and reflect on my own classroom experience.

References

Blum, S. D. (Ed.). (2020). Ungrading: Why rating students undermines learning (and what to do instead). West Virginia University Press.

Butler, R. (1988). Enhancing and undermining intrinsic motivation: The effects of task-involving and ego-involving evaluation on interest and performance. British Journal of Educational Psychology, 58(1), 1–14.

Entwistle, N. (1988). Motivation and learning strategies. Educational & Child Psychology, 5(3), 5–20.

Kohn, A. (2011). The case against grades. Educational Leadership, 69(3), 28–33.

One comment

  1. I think self-evaluation is very meaningful, which is a helpful way to evaluate how much we grasp. As for a student, the motivation of learning is getting a high score, helpful for my graduate application. However, it does not mean I really learned a lot of valuable things through the course. To a great extent, interest is our best teacher.
