{"id":352,"date":"2013-01-23T21:32:00","date_gmt":"2013-01-24T04:32:00","guid":{"rendered":"https:\/\/blogs.ubc.ca\/chendricks\/?p=352"},"modified":"2013-04-21T17:52:59","modified_gmt":"2013-04-22T00:52:59","slug":"problems-with-grading-rubrics-for-complex-assignments","status":"publish","type":"post","link":"https:\/\/blogs.ubc.ca\/chendricks\/2013\/01\/23\/problems-with-grading-rubrics-for-complex-assignments\/","title":{"rendered":"Problems with grading rubrics for complex assignments"},"content":{"rendered":"<p><span style=\"color: #333333;\">In<\/span> <a title=\"The value of peer review for effective feedback\" href=\"https:\/\/blogs.ubc.ca\/chendricks\/2012\/11\/01\/the-value-of-peer-review-for-effective-feedback\/\" target=\"_blank\"><span style=\"text-decoration: underline;\">an earlier post<\/span><\/a> <span style=\"color: #333333;\">I discussed a paper by D. Royce Sadler on how peer marking could be a means for students to learn how to become better assessors themselves, of their own and others&#8217; work. This could not only allow them to become more self-regulated learners, but also fulfill roles outside of the university in which they will need to evaluate the work of others. 
In that essay Sadler argues against giving students preset marking criteria for evaluating their own work or that of other students (when that work is complex, such as an essay), because:<\/span><\/p>\n<ol>\n<li><span style=\"color: #333333;\">&#8220;Quality&#8221; is more of a global concept that can&#8217;t easily be captured by a set of criteria, as it often includes things that can&#8217;t be easily articulated.<\/span><\/li>\n<li><span style=\"color: #333333;\">As Sadler pointed out in a comment to the post noted above, having a set of criteria in advance predisposes students to look for only those things, and yet in any particular complex work there may be other things that are relevant for judging quality.<\/span><\/li>\n<li><span style=\"color: #333333;\">Giving students criteria in advance doesn&#8217;t prepare them for life beyond their university courses, where they won&#8217;t often have such criteria.<\/span><\/li>\n<\/ol>\n<p><span style=\"color: #333333;\">I was skeptical about asking students to evaluate each other&#8217;s work without any criteria to go on, so I decided to read another one of his articles in which this point is argued for more extensively.<\/span><\/p>\n<p><span style=\"color: #333333;\">Here I&#8217;ll give a summary of Sadler&#8217;s book chapter entitled<\/span> <span style=\"color: #800000;\"><strong>&#8220;Transforming Holistic Assessment and Grading into a Vehicle for Complex Learning&#8221;<\/strong> <\/span><span style=\"color: #333333;\">(in <em>Assessment, Learning and Judgement in Higher Education<\/em>, Ed. G. Joughin. 
Dordrecht: Springer, 2009).<\/span> <a title=\"link to Sadler's &quot;Transforming Holistic Assessment and Grading&quot; article\" href=\"http:\/\/link.springer.com\/chapter\/10.1007%2F978-1-4020-8905-3_4\" target=\"_blank\"><span style=\"text-decoration: underline;\">DOI: 10.1007\/978-1-4020-8905-3_4<\/span>.<\/a><\/p>\n<p><span style=\"color: #000000;\"><strong>[Update April 22, 2013] <\/strong>Since the above is behind a paywall, I am attaching here a short article by Sadler <\/span><span style=\"color: #000000;\">that<\/span> <span style=\"color: #000000;\">discusses similar points, and that<\/span><span style=\"color: #000000;\"> I&#8217;ve gotten permission to post (by both Sadler and the publisher):<\/span> <span style=\"text-decoration: underline;\"><a href=\"https:\/\/blogs.ubc.ca\/chendricks\/files\/2013\/01\/Sadler2009-AreWeShortchangingOurStudents.pdf\" target=\"_blank\">Are we short-changing our students? The use of preset criteria in assessment<\/a><\/span><span style=\"color: #000000;\">. <em>TLA Interchange<\/em> 3 (Spring 2009): 1-8. This was a publication from what is now the<\/span> <span style=\"text-decoration: underline;\"><a href=\"http:\/\/www.ed.ac.uk\/iad\" target=\"_blank\">Institute for Academic Development<\/a><\/span> <span style=\"color: #000000;\">at the University of Edinburgh, but these newsletters are no longer online. <\/span><\/p>\n<p><span style=\"color: #333333;\"><em><strong>Note: this is a long post!<\/strong> That&#8217;s because it&#8217;s a complicated article, and I want to ensure that I&#8217;ve got all the arguments down before commenting.<\/em><\/span><\/p>\n<p><span style=\"color: #333333;\"><!--more--><\/span><\/p>\n<p><span style=\"color: #333333;\">Sadler distinguishes between two kinds of assessment: <strong>analytic grading<\/strong> and <strong>holistic grading<\/strong>. 
One of the main arguments of the essay is that analytic grading has significant problems when used for certain kinds of assignments, enough to suggest we should not be using it in those contexts. The other part of the argument is that we should be using peer assessment to help students learn how to use holistic methods in evaluating their own and others&#8217; works.<\/span><\/p>\n<div title=\"Page 1\">\n<div>\n<div>\n<p><span style=\"color: #333333;\">The kinds of assignments Sadler is focused on, the ones where analytic grading is problematic, are &#8220;divergent&#8221; tasks: these could have multiple responses that are quite different but still of high quality, and they &#8220;provide opportunities for learners to demonstrate sophisticated cognitive abilities, integration of knowledge, complex problem solving, critical reasoning, original thinking, and innovation&#8221; (47). Those are precisely the kind of assignments I often give in both Philosophy and Arts One courses, when I ask students to write essays.<\/span><\/p>\n<p><span style=\"color: #800000;\"><strong>Analytic and holistic grading<\/strong><\/span><\/p>\n<p><span style=\"color: #333333;\">One engages in <strong>analytic grading<\/strong> when one evaluates work using separate judgments on various criteria (whether given by the instructor, negotiated with students, or devised by students themselves). The judgments on each criterion are &#8220;combined using a rule or formula, and converted to a grade&#8221; (45). Clearly this would be the sort of thing one does when using a rubric that has points attached to each part of the rubric and in which the final grade is determined by adding up the points.<\/span><\/p>\n<p><span style=\"color: #333333;\">On a personal note, I have resisted going this route. 
I have used rubrics extensively, but mainly for the purposes of providing students knowledge in advance of the sorts of things they need to try to put into their essays, and to enable me to organize my comments so they can see which sorts of things they need to work on most (given the prevalence of comments in each category). I have also used rubrics as a check to help with fairness&#8211;it helps me make sure I don&#8217;t overlook one category in someone&#8217;s paper, while focusing on it in another. I feel like it helps me be more consistent.<\/span><\/p>\n<p><span style=\"color: #333333;\">However, I have refused to go the route of assigning marks or points to each category and adding up a grade that way. In fact, I have explicitly said on my rubrics that students are not to think of the rubrics and categories as providing some formula out of which they could or I could calculate a grade. I have said that marking essays is too complicated for that sort of thing.<\/span><\/p>\n<p><span style=\"color: #333333;\">For reference, and in case anyone is interested,\u00a0<\/span><span style=\"color: #333333;\"><strong>here is the latest iteration of the grading rubric I use for philosophy essays:<\/strong><\/span> <a href=\"https:\/\/blogs.ubc.ca\/chendricks\/files\/2013\/01\/PprRubricLtrs-S12.pdf\"><strong><\/strong><span style=\"text-decoration: underline;\">HendricksMarkingRubric-Jan2012<\/span><\/a><\/p>\n<p><span style=\"color: #333333;\">Sadler notes later in the essay, however, that analytic grading could also take place using a rubric without specific points or weights assigned, where an assessor picks a single &#8220;cell&#8221; in the rubric for each criterion or standard that best fits the work (52). That isn&#8217;t quite what I do, either. 
I actually tie each of my comments, as much as possible, to one of the &#8220;cells&#8221; in the rubric, so as to say, e.g., here the essay is doing something in the B-range for &#8220;structure.&#8221; But I don&#8217;t assign a single mark or cell for each criterion to the essay.<\/span><\/p>\n<p><span style=\"color: #333333;\">\u00a0<\/span><\/p>\n<p><span style=\"color: #333333;\"><strong>Holistic grading<\/strong>, on the other hand, occurs when an instructor judges a work as a whole and provides a &#8220;global judgment.&#8221;<\/span><\/p>\n<div title=\"Page 2\">\n<div>\n<div>\n<blockquote><p><span style=\"color: #333333;\">Although the teacher may note specific features that stand out while appraising, arriving directly at a global judgment is foremost. Reflection on that judgment gives rise to an explanation, which necessarily refers to criteria. (46)<\/span><\/p><\/blockquote>\n<p><span style=\"color: #333333;\">In holistic grading, then, the criteria come afterwards, as it were, when one explains to oneself and the student the judgment made. As Sadler puts it, holistic grading can be characterized as &#8220;impressionistic or intuitive&#8221; (46). To summarize the difference between holistic and analytic grading, Sadler says:<\/span><\/p>\n<div title=\"Page 4\">\n<div>\n<div>\n<blockquote><p><span style=\"color: #333333;\">Holistic grading involves appraising student works as integrated entities; analytic grading requires criterion-by-criterion judgments. (48)<\/span><\/p><\/blockquote>\n<\/div>\n<\/div>\n<\/div>\n<p><span style=\"color: #333333;\">Reflecting on my own practice again, using &#8220;impressionistic or intuitive&#8221; judgments is what I used to do before using rubrics. Or rather, I still do it while using rubrics, but less. I would read an essay, give comments, and at the end find myself thinking that the essay as a whole deserved a certain grade. I was uncomfortable with this, though&#8211;where was that judgment coming from? 
Now I still do that sort of thing, but check it with the rubric&#8211;how many aspects of the essay are in the &#8220;A&#8221; range according to the rubric, how many in the &#8220;B&#8221; range, etc., and does this roughly correspond to the grade I&#8217;ve just impressionistically determined? This isn&#8217;t a formulaic sort of activity, as I don&#8217;t actually count and add, but it serves as a kind of check for me to make sure I&#8217;ve thought about all aspects of the essay (or rather, at least those on the rubric) before coming up with a grade.<\/span><\/p>\n<p><span style=\"color: #333333;\">Sadler points out later in the essay that there isn&#8217;t any reason to be uncomfortable with impressionistic judgments. This sort of holistic process is &#8220;rational, normal and professional&#8221; (59), as it is how judgment of complex works does and must work. Of course, one must have significant experience of various kinds of work in a genre, and works of various quality, to be able to come to such judgments well, as an expert. More on this below.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #800000;\"><strong>The supposed value of analytic grading<\/strong><\/span><\/p>\n<p><span style=\"color: #333333;\">Sadler notes that analytic grading schemes have gained in popularity, and they &#8220;introduce formal structure into the grading process, ostensibly to make it more objective and thus reduce the likelihood of favouritism or arbitrariness&#8221; (48). 
He lists the various aspects of the rationale many have for analytic grading, including improving consistency and objectivity, making the grading process transparent to students, encouraging students &#8220;to attend to the assessment criteria during development of their work,&#8221; and providing feedback &#8220;more efficiently, with less need for the teacher to write extensive comments&#8221; (50-51).<\/span><\/p>\n<p><span style=\"color: #333333;\">These reflect why I moved to using grading rubrics, except that I&#8217;d add: helping students see what they need to improve. Students can get lost in comments, so having a rubric organizes feedback and pinpoints certain things they need to do next time (e.g., be sure to have an introduction to your essay with a clear thesis statement).<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #800000;\"><strong>The problem with analytic grading<\/strong><\/span><\/p>\n<p><span style=\"color: #333333;\">Despite the supposed benefits listed above, Sadler argues that analytic grading schemes &#8220;can, and for some student works do, lead to deficient or distorted grading decisions,&#8221; or\u00a0<strong>grading anomalies<\/strong> (51). He focuses on two such anomalies.<\/span><\/p>\n<p><span style=\"color: #333333;\">I&#8217;ll combine the two anomalies somewhat here; both have to do with a mismatch between what an instructor thinks of a work globally and what sort of judgment would be suggested by using an analytic grading method. This is, actually, the first anomaly: it can be the case that when one finishes reading an essay (for example), one has the sense that it is a truly excellent one, but using the rubric shows that the essay falls short in a number of ways and therefore wouldn&#8217;t appear so excellent using the rubric alone. 
The opposite can, of course, occur as well.<\/span><\/p>\n<p><span style=\"color: #333333;\">The second problem can occur when one finds that the above issue is due to a criterion being missing from one&#8217;s list. This seems like it could be easy to fix, right? Just add a new criterion to the rubric. But to do so and judge that work on the new criterion is problematic: it &#8220;would breach the implicit contract between teacher and student that only specified criteria will be used&#8221; (54).<\/span><\/p>\n<p><span style=\"color: #333333;\">These problems occur for several reasons:<\/span><\/p>\n<ul>\n<li><span style=\"color: #333333;\">There may be a significant amount of knowledge that goes beyond what can be expressed in words (here he cites Polanyi, 1962) (53).<\/span><\/li>\n<li><span style=\"color: #333333;\">Experts may process information to come up with judgments in complex assessment scenarios in ways that &#8220;do not necessarily map neatly onto explicit sets of specified criteria, or simple rules for combination&#8221; (here he cites Sadler, 1981) (53).<\/span><\/li>\n<li><span style=\"color: #333333;\">When specifying a set of criteria for assessing certain kinds of works, one has to choose from a larger set&#8211;there are many, many criteria that could be used for each kind of work, and to use them all would be unwieldy (if one could even specify them all, which might not be possible) (54).<\/span><\/li>\n<\/ul>\n<p><span style=\"color: #333333;\">I have experienced both of these problems, and have done what Sadler says some instructors do as a response to the first problem: trust the holistic impression and fudge the use of the rubric to fit it. For the second problem, my response has been to simply note the reasons for the holistic judgment in separate comments on the essay, rather than relying on the rubric alone. 
This works, because I have explicitly stated on the rubric that it is not to be used to mechanically determine a mark, and that it can&#8217;t possibly cover all aspects of judgments on quality (my rubric states at the top, among other things: &#8220;Note that the statements below are not exhaustive for what may occur in each category, but serve as common examples&#8221;).<\/span><\/p>\n<p><span style=\"text-decoration: underline; color: #333333;\">The irony of analytic grading<\/span><\/p>\n<p><span style=\"color: #333333;\">Sadler notes that analytic grading schemes are often used to make the grading process more transparent, yet the anomalies above are often hidden from students, so they get the impression they are getting the real story when they are not (55).<\/span><\/p>\n<p><span style=\"color: #333333;\">Now, of course, if one tells students in advance that the rubric isn&#8217;t the full story, and that some of the grading process remains subjective, due to the nature of having experience in the field and knowing what counts as good work, then this particular problem doesn&#8217;t seem so bad. But Sadler goes further than this remedy, which I have already implemented. And it keeps the values of disclosure and openness intact.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #800000;\"><strong>Holistic grading and peer assessment<br \/>\n<\/strong><\/span><\/p>\n<p><strong><\/strong><span style=\"color: #333333;\">He isn&#8217;t suggesting we go back to simply judging works impressionistically and leaving students without a lot of guidance as to how we got to those judgments. 
Indeed, he supports a <strong>combination of holistic and analytic grading:<\/strong><\/span><\/p>\n<div title=\"Page 13\">\n<div>\n<div>\n<blockquote><p><span style=\"color: #333333;\">To advocate that a teacher should grade solely by making global judgments without reference to any criteria is as inappropriate as requiring all grades to be compiled from components according to set rules. Experienced assessors routinely alternate between the two approaches in order to produce what they consider to be the most valid grade. (57)<\/span><\/p><\/blockquote>\n<\/div>\n<\/div>\n<\/div>\n<p><span style=\"color: #333333;\">But it&#8217;s more than that&#8211;<strong>we need to also &#8220;induct students into the art of making appraisals&#8221; themselves<\/strong> (56). To do so is to start &#8220;learners on the path towards becoming connoisseurs&#8221; (56), where connoisseurs or experts are able to recognize quality in particular cases even without being able to give a general definition of quality for those kinds of works, or without being able to give a set of criteria for quality that applies to all such works.<\/span><\/p>\n<p><span style=\"text-decoration: underline; color: #333333;\"><span style=\"text-decoration: underline;\">How to help students become connoisseurs<\/span><\/span><\/p>\n<p><span style=\"color: #333333;\">Clearly, peer evaluation and feedback is key. 
Three aspects of such activities are highlighted by Sadler: (1) students need to be exposed to a variety of works in the same genre as what they&#8217;ll be producing; (2) they need exposure to works in a wide range of quality; (3) they need exposure to responses to a variety of &#8220;assessment tasks&#8221; (57).<\/span><\/p>\n<p><span style=\"color: #333333;\">Sadler notes that students, as well as instructors, should be using both holistic approaches and analytic approaches to evaluation, focusing on the holistic assessments first and &#8220;only afterwards formulating valid reasons for them&#8221; (57). I assume this means formulating valid reasons that appeal to criteria that attach to those particular works, since as noted above, experts may not be able to formulate a set of criteria for all such works. This sounds right, as Sadler later goes on to discuss how students and instructors can come up with new criteria to add to their working set as they review more works (58). These new criteria can be shared amongst the class, he notes, but &#8220;not with a view to assembling a master list,&#8221; because one should help students to see the limitations in trying to develop general sets of criteria (58).<\/span><\/p>\n<p><span style=\"color: #333333;\">In another interesting move, Sadler suggests that <strong>a large amount of class time could be devoted to peer assessment activities<\/strong>. Students could be asked to do formative responses to particular tasks related to course content, and much of the class meeting time could be devoted to students reading and commenting on each other&#8217;s works. As Sadler puts it:<\/span><\/p>\n<div title=\"Page 15\">\n<div>\n<div>\n<blockquote><p><span style=\"color: #333333;\">In this way, student engagement with the substance of the course takes place through a sequence of produce and appraise rather than study and learn activities. 
(59)<\/span><\/p><\/blockquote>\n<p><span style=\"color: #333333;\">In the remaining section of the paper, Sadler discusses obstacles to implementing his suggestions, and ways to get around them. I won&#8217;t discuss those here, in the interest of not extending this blog post too much further.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #800000;\"><strong>My thoughts<\/strong><\/span><\/p>\n<p><span style=\"color: #333333;\">I must admit I am warming to the idea of not providing a set of criteria for essays in advance as if they were the <em>only<\/em> things I look for when grading. Still, I already state that the things on my rubric are not exhaustive, so I&#8217;m moving in that direction already. And Sadler notes in this article that &#8220;certain criteria may always be relevant&#8221; to a genre of works (59). He cites things like grammar, paragraph organization and logical development as examples for written essays. I like to think that the things I&#8217;ve put on my rubric are things that are &#8220;always relevant,&#8221; but I guess I&#8217;d need to think about that further. Is it absolutely critical that essays have a clear thesis statement at the end of the intro, and a conclusion that rounds out the essay (for example)? Could there be an A+ essay that doesn&#8217;t have these but is truly excellent in other ways?<\/span><\/p>\n<p><span style=\"color: #333333;\">What I haven&#8217;t been doing is working on helping students to become connoisseurs themselves. I do have <em>some<\/em> peer feedback in my philosophy courses, but usually students only do it once or twice, which may not be enough to really move them along this path (unless they get a lot in other courses as well, which I am not sure of). And I don&#8217;t encourage them to come up with their own criteria for quality, necessarily, but rather to use the rubric I&#8217;ve provided (at least in 1st and 2nd year courses). 
I guess I think they need guidance in the early years&#8230;how can they know what is a good philosophy essay if this is their first philosophy course? I am still unsure about that one.<\/span><\/p>\n<p><span style=\"color: #333333;\">Perhaps I could give them a pared-down rubric, with just those things I do really think are always relevant, and then encourage them to come up with other standards or criteria and share them with the rest of the class, and talk about how complex assessment really is. I could also talk about peer assessment as a way to help them learn to see quality themselves. And I am very intrigued by the idea of having more peer assessment in class, using formative (ungraded) assignments.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><strong><span style=\"color: #800000;\">What do you<\/span><span style=\"color: #800000;\"> think?<\/span><\/strong><\/p>\n<p><span style=\"color: #333333;\">Rubrics are popular; I have heard of their value in multiple professional development workshops. Do you think they might be stifling in the ways noted above? Is there anything in Sadler&#8217;s article you agree\/disagree with? If you use rubrics yourself, do you think they&#8217;re valuable in ways not yet mentioned here?<\/span><\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div title=\"Page 9\">\n<div>\n<div>\n<div title=\"Page 9\">\n<div>\n<div>\n<p><span style=\"text-decoration: underline; color: #333333;\">Works cited<\/span><\/p>\n<div title=\"Page 19\">\n<div>\n<div>\n<p><span style=\"color: #333333;\">Polanyi, M. (1962). <em>Personal knowledge.<\/em> London: Routledge and Kegan Paul.<\/span><\/p>\n<\/div>\n<\/div>\n<\/div>\n<div title=\"Page 19\">\n<div>\n<div>\n<p><span style=\"color: #333333;\">Sadler, D. R. (1981). Intuitive data processing as a potential source of bias in naturalistic evaluations. 
<em>Educational Evaluation and Policy Analysis<\/em>, 3(4), 25\u201331.<\/span><\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In an earlier post I discussed a paper by D. Royce Sadler on how peer marking could be a means for students to learn how to become better assessors themselves, of their own and others&#8217; work. This could not only allow them to become more self-regulated learners, but also fulfill roles outside of the university [&hellip;]<\/p>\n","protected":false},"author":665,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[460452,699867,460443],"tags":[460454,460462,1230],"class_list":["post-352","post","type-post","status-publish","format-standard","hentry","category-markinggrading","category-peer-assessment-feedback","category-scholarship-of-teaching-and-learning","tag-marking-rubrics","tag-research-reviews","tag-writing"],"_links":{"self":[{"href":"https:\/\/blogs.ubc.ca\/chendricks\/wp-json\/wp\/v2\/posts\/352","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.ubc.ca\/chendricks\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.ubc.ca\/chendricks\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.ubc.ca\/chendricks\/wp-json\/wp\/v2\/users\/665"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.ubc.ca\/chendricks\/wp-json\/wp\/v2\/comments?post=352"}],"version-history":[{"count":28,"href":"https:\/\/blogs.ubc.ca\/chendricks\/wp-json\/wp\/v2\/posts\/352\/revisions"}],"predecessor-version":[{"id":549,"href":"https:\/\/blogs.ubc.ca\/chendricks\/wp-json\/wp\/v2\/posts\/352\/revisions\/549"}],"wp:attachment":[{"href":"https:\/\/blogs.ubc.ca\/chendricks\/wp-json\/wp\/v2\/media?parent=352"}],"wp:term":[{"taxonom
y":"category","embeddable":true,"href":"https:\/\/blogs.ubc.ca\/chendricks\/wp-json\/wp\/v2\/categories?post=352"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.ubc.ca\/chendricks\/wp-json\/wp\/v2\/tags?post=352"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}