Category Archives: evaluation use

Does Evaluation Contribute to the Public Good?

In September I was honoured to give the initial keynote address at the 2017 Australasian Evaluation Society meeting in Canberra. I am thankful for the opportunity and for the warm response my keynote received.

I expressed my pessimism, maybe even cynicism, about the extent to which evaluation has contributed to the public good, by which I mean the well-being of all people, globally, manifested in things such as food security, healthcare, education, clean water, and adequate housing. I also offered some hopeful suggestions about how evaluation as a practice might do better in its contribution to the public good.

This talk has been translated into French and published in La Vigie de l’évaluation; it can be accessed here. It will soon be published in English, and I will post a link when it is available.

I also appreciate the media coverage this talk received in The Mandarin, an independent online newspaper devoted to government policy and practice in Australia. Click here to read that story, “Whoever Heard of an Independent Evaluation Keynote Tell It Like It Is?”


middle schools: pros and cons

A short radio interview this morning in response to a local school district decision on middle schools. As with some other interventions (DARE is a good example), the evidence for middle schools isn’t strong, although there is still a commitment to the idea. While that’s not necessarily a bad thing, one would hope that arguments for or against a middle school model are explicit, nuanced, and responsive to local community and school needs.


getting the most from formative evaluation

While the distinction between formative and summative evaluation is often drawn too sharply, there is a real distinction. For formative evaluation to really be formative, there needs to be a steady flow of evaluative feedback that allows for corrections along the way… that is, to do whatever is being done better, both in the present and into the future.

Compare two approaches to formative evaluation ~ real time evaluation and digital portfolios of student learning.

Real Time Evaluation

An evaluation approach that captures this is “real time evaluation,” an idea that appears often in humanitarian relief efforts. With a disastrous situation that demands rapid alleviation of suffering comes the need for quick information about whether that suffering is indeed being alleviated and, if it isn’t, what might be done to alleviate it. RTE emphasizes timely evaluation feedback to strengthen program design and implementation. Some common features are:

  • RTE takes place during implementation
  • is iterative
  • time-frame is short: days, not weeks
  • relies on secondary sources of information, but also field visits
  • uses internal ‘consultants’
  • emphasis on process and immediate lesson-learning
  • ‘quick and dirty’ results enable quick program changes

Digital Portfolios of Student Learning

While traditional report cards have long been the mainstay in reporting student learning, technologies that allow for ongoing feedback about what and how students are learning are now common. Digital portfolios are collections of evidence managed by users and shared electronically, often on the web but increasingly through other social media platforms. One example is Fresh Grade, an app that facilitates documenting and sharing learning activities and outcomes. Common features of digital portfolios are:

  • user driven (usually students, but also increasingly teachers)
  • shared digitally
  • ongoing representation of learning
  • includes direct evidence
  • keyed to stated criteria and standards
  • modifiable into an end product for summative evaluation

What can we learn from these examples?

RTE is often done under difficult circumstances with limited ability to collect data firsthand and thus is content with ‘quick and dirty’ results. Disaster situations make it onerous to be in the field, so evaluation relies on reports from the field (observations of aid workers, sector staff in the area, and so on). On the other hand, classrooms and other educational settings are easy to access, but the data about learning activities and outcomes are similar to reports from the field. Digital portfolios and especially the real time apps (like Fresh Grade) provide immediate evidence of what is going on and what is being accomplished. Apps allow students and teachers to create and share information on an ongoing basis, and also permit editing and adding to the record over time. If we think about an individual student’s learning as a model for a program, perhaps this technology has something to offer formative program evaluation.

RTE could use an app or web-based platform (most are available for smartphones and tablets, and a number of web-based tools might serve this purpose: Evernote, Google Drive, Three Ring) so that those on the ground could provide data about what is happening by sending photographs, interviews, observations, documents, and so on to evaluators who are unable to collect data firsthand. Connectivity may be an issue in some situations, but even an erratic connection would allow for varied and compelling data to be shared. In non-emergency situations this wouldn’t be a problem. Technology that allows for sharing information easily and often may increase the likelihood that adjustments can be made and thus that the purpose of formative evaluation is realized.


Participation in Humanitarian Evaluation


Chris Morris guest blogged on the Impact Ready blog about some findings from his research on the role of evaluation in accountability in interventions and programs in crisis-affected communities. In this post he focuses specifically on the lack of participation by local communities in evaluations that are meant to provide accountability to those most affected. Click here to read the whole post.

alternatives to standardized testing

In educational evaluation the global educational reform movement (GERM) has privileged common indicators of student learning outcomes (used in turn for other evaluation purposes like teacher evaluation, even if not a sound practice). There are many reasons why standardized tests have become the norm and are reified as the only fair and legitimate way to know how students and schools are doing. There is plenty of literature that debunks that idea.

However, the narrative of standardized testing as a necessary and legitimate means of judging the quality of learning and schooling is powerful and political. In a short commentary for NPR, reporter Anya Kamenetz nicely summarizes reasonable alternatives, and these are excellent talking points when confronted with the question, “If not standardized tests, what then?” You can read the article, but in summary:

1) use some sort of matrix sampling (a good idea from NAEP)

2) consider ongoing embedded assessments (this is usually computer based testing)

3) think about what you want to know and it will require multiple measures (in other words, knowing scores in a few subject areas will never be enough, and maybe there are things worth knowing beyond the obvious)

4) start considering novel approaches to assessment, like game based assessment and the not so novel use of portfolios or narrative evaluations

5) think differently about what it means to judge a school and that means looking at more than just narrow student outcomes (school inspections are a notion worth revisiting).

learning to be an evaluator ~ many ways, many contexts

Many of us naturally think about learning to do evaluation within the context of degree programs, professional development workshops, and sometimes on-the-job training. In so doing, education in evaluation is seen as more limited than is practically the case. Because evaluation is perhaps one of the most common forms of thinking (whether it is done well or not), there is a dizzying array of contexts in which people learn to make judgements about what good is.

Yesterday, hundreds of young people gathered in rural North Carolina to demonstrate their evaluation skills… in dairy cow judging.

participants are scored based on how well they apply dairy cattle evaluation skills learned in the classroom. Each team evaluated six classes of dairy cattle and defend reasoning for evaluation to a panel of judges

While members of Future Farmers of America may do cow judging in preparation for careers as dairy farmers, historically the evaluation skills demonstrated were key to selecting the best, most productive, and healthiest herd, upon which the farmer’s livelihood depended.

The new symbolic evaluation

Recently I bought two new cars, and have been on a vacation where I stayed in numerous hotels, flew various airlines, and took a cruise. In almost every instance I was asked to do an evaluation of the product and/or service. As an evaluator, it is too easy to assume that requests for evaluation are genuine, and companies want feedback to improve their product/service. I’ve come to understand these requests are primarily symbolic.

First, the cars. Buying an expensive car, I was told that the company would send an evaluation form, but was also asked to give all 5’s on the five-point Likert scale since the company considered anything less than that a failure. My salesperson, who was very good, told me he would be investigated if he received anything less than a perfect score. I thought he did a really good job; I was happy. But everything wasn’t perfect, and in a true evaluation I would likely have made some suggestions for how the experience could have been improved for me. I didn’t though–I gave the experience a perfect score.

UPDATE on this product:
Recently this car required servicing and I experienced pretty poor customer service. Answering the customer service satisfaction survey honestly unleashed a stream of email and telephone responses from the dealer.

Since December 11, 2010 you have been the owner of a 2011 BMW 550i purchased through XXX BMW and it seems that we have certainly missed the mark with your most recent service experience. We received a score from you of 78.6%. Our goal is 100% and we as a business are very interested in what we can do to get there.

If you don’t mind giving me a quick call it would be very appreciated so that I can express our expectations of our staff. Us not keeping you well informed is not the XXX BMW way and unacceptable by our standards. If you could possibly email me or call me in the future before filling out these surveys if anything was not 100% it would be appreciated as well. Our goal is to exceed your expectations with every interaction.

Looking forward to hearing from you so that we can turn your ownership experience around.

I received this email and a phone message, even though I indicated on the survey I did not wish to be contacted… I considered my responses clear and straightforward. But as the email above indicates, clear and straightforward feedback was less the issue than getting me right with how these surveys ‘ought’ to be done. So, even though I had nothing else to say, I emailed the customer service agent back, and received yet another email back. The evaluative process began to feel like harassment.

Buying a modestly priced car, I also received an evaluation form. This one I completed honestly, and there were a number of issues (none fatal flaws) that, if addressed, would have improved the experience for me. No one at this car company told me I should give perfect scores, but the results of my honest evaluation might suggest they should have. A few days after completing the evaluation of this car buying experience I received a phone call from the person who handles the financial part of buying the car. I was driving in my new car at the time, and loved that I could answer the call directly from my car–that’s cool! The person calling me began chastising me for the responses I gave on the evaluation, demanding to know more about why I wasn’t perfectly satisfied with the experience. The fact of the matter is that this person was pushy about purchasing extras like replacement insurance and extra warranties–I thought it a hard sell and wondered if there were commissions involved. This is the feedback I provided. I reiterated to the finance person how I felt about the experience, but she continued to harangue me, eventually reaching the point of yelling at me. At this point, I terminated the call, which I could conveniently do by pressing a button on the dashboard screen in my car! Formative feedback was not what this person or car company wanted.

My experience after a month of travel that involved planes, buses, ships, and hotels was pretty similar to the car experiences. An invitation to evaluate services/products wasn’t entirely genuine, but it was important for these companies to look as if they cared about customer satisfaction. I admit I likely take evaluation more seriously than your average person, but still I am struck by the integration of largely symbolic evaluation in these corporate cultures. Evaluating has become standard operating procedure, but the practice has not matured to using the findings of those evaluations in truly formative and summative ways.