Measuring cognitive sophistication of tasks or test questions

The challenge

So … you are working to change the classwork, homework or assessments for your students so they involve more sophisticated thinking. Maybe you want them to critique a reading. Maybe you want them to make judgements about complicated situations. Maybe you want them to apply skills in a setting they haven’t yet seen.

How would you compare student capabilities before implementing these initiatives to after? You can’t simply carry out pre-post testing because the tests will look very different. You could develop the new learning tasks, make tests that are aligned with those tasks, then assess students using those tests BEFORE implementing the new learning tasks. But that (a) is not really fair to students and (b) would take a whole extra term (or year) to implement.

Evaluating the impact of such an innovation is a common challenge we face as we move from traditional teaching practices to more evidence-oriented ways of facilitating learning. One possible solution is to NOT compare students’ test results, but to measure the sophistication of the assessment itself. Then, if students are (for example) doing OK on the new assessment AND the assessment can be shown to be more sophisticated, success can be reported!

BUT – the challenge is measuring the sophistication of learning (or assessment) tasks. It is usually very difficult to get consensus, even among close associates, on whether one task is “more” or “less” sophisticated than another. The rest of this post outlines one approach to “formalizing” the measurement of task sophistication.

One possible solution

The “Bloom’s Dichotomous Key” or BDK was developed as part of the EOAS Flexible Learning project in Fall 2014 as a means of judging whether a task or test question causes students to engage in higher or lower order cognitive skills. It isn’t about “difficulty”, because there can be difficult lower order (e.g. memory-based) tasks and easy synthesis or creative tasks.

This effort was based on work done by Casagrand and Semsar in the Dep’t of Integrative Physiology at U. of Colorado, Boulder. We adapted their key for use in geoscience and refined it through repeated application by a science teaching and learning fellow and a graduate teaching assistant.

The PDF linked below provides a one-page flow chart for applying the key. It is “dichotomous” because the Bloom’s level is arrived at by repeatedly answering yes/no questions about what students are being caused to do. The other two pages provide notes and guidelines plus a simplified flowchart figure. The tool is not officially published, but results have been employed as data for several presentations and workshops, both peer-reviewed and not.

See the three-page PDF here: bdk-geoscience.
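To make the “dichotomous” idea concrete, here is a minimal sketch (in Python) of how such a key operates: each yes/no answer either settles the level or moves the rater to the next fork. The questions and level wordings below are invented placeholders for illustration only – the real questions are the ones in the PDF above.

```python
# Illustrative sketch of a dichotomous key. The questions and level names
# are invented placeholders, NOT the actual BDK questions from the PDF.

def ask(question: str) -> bool:
    """Ask the rater a yes/no question at one fork of the key."""
    return input(question + " (y/n): ").strip().lower().startswith("y")

def classify_task() -> str:
    """Walk one task through successive yes/no forks to a Bloom's level."""
    if ask("Can students answer purely from memory?"):
        return "Remember (lower order)"
    if ask("Are students mainly restating or explaining given ideas?"):
        return "Understand (lower order)"
    if ask("Are students using a known procedure in a familiar setting?"):
        return "Apply"
    if ask("Are students breaking a situation into parts or relationships?"):
        return "Analyze"
    if ask("Are students judging quality or defending a position?"):
        return "Evaluate"
    return "Create (higher order)"

if __name__ == "__main__":
    print("Bloom's level:", classify_task())
```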

P.S. – it does take practice to use this tool. To ensure inter-rater reliability, we recommend having at least two or three people work through a set of tasks, compare results, and then argue about differences before relying on any one person’s conclusions. Also, when comparing BDK results for two tasks, it is safer to rely on one person’s ratings of both tasks than to compare ratings made by two different people.
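If you want a number for how well two raters agree, Cohen’s kappa is one standard chance-corrected agreement statistic. This is our suggestion, not part of the BDK itself, and the ratings in the example are invented:

```python
from collections import Counter

def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of tasks where the two raters match.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[lvl] * freq_b[lvl] for lvl in freq_a) / n**2
    return (p_o - p_e) / (1 - p_e)

# Invented example: two raters' Bloom's levels for six exam questions.
a = ["Remember", "Apply", "Analyze", "Apply", "Evaluate", "Remember"]
b = ["Remember", "Apply", "Apply",   "Apply", "Evaluate", "Understand"]
print(f"kappa = {cohens_kappa(a, b):.2f}")  # prints kappa = 0.56
```

A kappa near 1 means the raters are applying the key consistently; a value near 0 means their agreement is no better than chance and more calibration discussion is needed.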

Important note

It is important to realize that the BDK is not for measuring the DIFFICULTY of a task. It is perfectly possible to have a very difficult “recall” (lower order cognitive skill) task and a rather easy “judgement” task (higher order cognitive skill). Difficulty has to be measured by observing performance, and there are many reasons why any task might be either difficult or easy. The BDK is all about determining the Bloom’s level of a task, not its difficulty.