Category Archives: Evaluation methods

Dogs and evaluation

One of the things I look forward to in February is the Westminster Dog Show, a mega spectacle of dog owners, trainers, handlers and, of course, dogs. I like dogs; I have two of them, and they are the very best dogs I know.

It is instructive to look beyond the crazy-haired, hairless, majestic, and cuter-than-ever faces to how, after many rounds of judging, one dog is named best in show. Investigating evaluation systems distinct from our own contexts of application provides a moment for reflection and perhaps learning.

Here’s what the WKC says about judging:

Each breed’s parent club creates a STANDARD, a written description of the ideal specimen of that breed. Generally relating form to function, i.e., the original function that the dog was bred to perform, most standards describe general appearance, movement, temperament, and specific physical traits such as height and weight, coat, colors, eye color and shape, ear shape and placement, feet, tail, and more. Some standards can be very specific, some can be rather general and leave much room for individual interpretation by judges. This results in the sport’s subjective basis: one judge, applying his or her interpretation of the standard, giving his or her opinion of the best dog on that particular day. Standards are written, maintained and owned by the parent clubs of each breed.

So come February there are successive rounds of dog judging leading to that final moment, naming the best dog. First, there are competitions within breeds to determine the best lab or poodle (standard, miniature, and toy) and so on. To make the next round of judging manageable, breeds are then grouped into seven groups: sporting, non-sporting, hound, working, terrier, toy, and herding. This grouping is really a matter of convenience; the groups make sense, but they are not mutually exclusive. For example, terriers could be working dogs if hunting down vermin were considered work, or sporting dogs if hunting vermin were considered sport, and clearly at least some of the terriers are small enough to be considered part of the toy group.

The grouping makes sense, but it isn’t a key feature of the evaluation process, because in this round of judging dogs are not compared to one another but to their own breed standard. For example, in the non-sporting group the Bichon Frise is not compared to the Dalmatian even though they are in the same group. The judge makes a judgement about each dog in relation to its breed standard, declaring that one dog (actually four at this round) is the best example of its breed. So if the Bichon wins the group, it means the Bichon is an excellent Bichon and the Dalmatian is a less excellent Dalmatian.

The last round of judging to find the best dog looks at the best in each of the groups. Within the groups, and at this culminating stage, what continues to be notable is the variation: the dogs don’t appear to be ones that belong together. The judge again compares each dog to its breed standard, choosing the one that best meets the standard for its own breed.

So, dog judging is a criterion-based evaluation system. This example reminds me of the protest thrown in the way of evaluation: “That’s comparing apples and oranges!” with the implication that doing so is unfair or even impossible. The WKC show illustrates that this is a false protest: we can judge, given an apple and an orange, which is the better fruit, but doing so requires a clear articulation of what makes an apple a good apple and what makes an orange a good orange. I tell my students that we make such evaluative judgements all the time, even when we are deciding whether to buy apples or oranges.

Grading each dog is a separate evaluation, and then the grades, not the dogs, are compared to one another. This grading and comparison isn’t made explicit; it happens in the mind of the judge, presumably an experienced evaluator. But conceivably this stage of the evaluation could be made explicit. That it isn’t reflects the confidence the evaluation procedure places in the evaluator’s expert knowledge: we don’t expect the judgement to be explicit because we don’t need it to be in order to trust the judge/evaluator. It is a criterion-based evaluation system that includes considerable exercise of expert judgement (criterion-based evaluation meets connoisseurship, perhaps).
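The grade-then-compare logic can be sketched in a few lines of code. Everything below is a toy illustration: the criteria, numbers, and scoring scheme are invented for the example and bear no relation to actual WKC breed standards.

```python
# A minimal sketch of criterion-based judging: each dog is graded against
# its OWN breed standard, and only the grades are compared.
# Breed names are real; the criteria and values are invented for illustration.

STANDARDS = {
    "Bichon Frise": {"height_in": (9.5, 11.5), "temperament": "merry"},
    "Dalmatian": {"height_in": (19.0, 24.0), "temperament": "outgoing"},
}

def grade(entry):
    """Score one dog against its own breed standard (criterion-based)."""
    std = STANDARDS[entry["breed"]]
    score = 0.0
    lo, hi = std["height_in"]
    if lo <= entry["height_in"] <= hi:
        score += 1.0
    if entry["temperament"] == std["temperament"]:
        score += 1.0
    return score

def best_in_group(entries):
    """Compare the grades, not the dogs: the winner is simply the dog
    whose fit to its own standard is highest."""
    return max(entries, key=grade)
```

Note that `best_in_group` never compares a Bichon’s height to a Dalmatian’s; each grade already encodes how well that dog matches its own standard, so comparing grades is an apples-to-apples comparison even across breeds.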

And the winner is…

Youth participatory evaluation ~ resources

A good stop for resources on YPE is Act for Youth. YPE is described as:

an approach that engages young people in evaluating the programs, organizations, and systems designed to serve them. Through YPE, young people conduct research on issues and experiences that affect their lives, developing knowledge about their community that can be shared and put to use. There are different models of YPE: some are completely driven by youth, while others are conducted in partnership with adults.

A list of resources points the reader to other literature on YPE.

youth focused evaluation meets MSC meets video

The Video Girls for Change project uses the Most Significant Change approach to evaluate programming for girls in various developing countries, and uses video summaries to communicate the findings from the evaluations.

The Most Significant Change technique is a form of participatory monitoring and evaluation that directly involves the voices and perspectives of beneficiaries. Essentially the Most Significant Change process involves the collection of stories of significant change from the field, followed by the systematic selection of the most significant of these stories by panels of designated community members and other stakeholders.

Participatory video is an accessible, flexible medium for recording community stories of change. With InsightShare’s games and exercises and experiential learning approach participants can rapidly learn video skills, allowing people to tell their Most Significant Change stories in a familiar context and to someone they trust. The process itself is fun, direct and the results can be played and reviewed immediately. It also helps to avoid situations where project staff or external evaluators speak on behalf of communities, allowing intended beneficiaries to speak for themselves.

When participatory video and the Most Significant Change technique are skilfully brought together, the human stories behind development projects can be captured in an accessible form, even for those with low levels of literacy. These combined methodologies promote peer-to-peer learning, collective reflection, triangulation and wide distribution of these important stories. Participatory video allows for everyone to get involved, contribute to, feel, and respond to other people’s stories, and can strengthen community ties and identification with developmental objectives.

Meta-evaluation example

Meta-evaluation is obviously a good idea that too often is forgone because of limited resources (especially when the evaluation itself is underfunded). Large, high-profile, and expensive evaluations of equally large, high-profile, and expensive initiatives are more likely to incorporate meta-evaluation to assure that the evaluation meets accepted quality standards and that evaluative judgements and conclusions are trustworthy. One such example is the meta-evaluation of the evaluation of the Paris Declaration on Aid Effectiveness, which received the 2012 Outstanding Evaluation Award from the American Evaluation Association. The evaluation looks at how the principles of aid effectiveness have been put into practice by international development partners and the results this is having in developing countries. The international joint evaluation includes a synthesis report, 21 country evaluations, 7 donor studies, and several thematic reviews.

The meta-evaluation, conducted by M. Q. Patton, identifies strengths, weaknesses, and lessons learned from this large-scale evaluation.

e-learning about Developmental Evaluation

A good opportunity to explore this hot topic in evaluation…

Free, but registration required. For more information go to

UNICEF, Claremont Graduate University and IOCE, under the EvalPartners initiative, with support from The Rockefeller Foundation and in partnership with UN Women, are pleased to announce a new introductory e-Learning programme on Development Evaluation.
The e-learning is composed of the following three courses:

Equity-focused evaluations
National Evaluation Capacity Development for Country-led Monitoring and Evaluation Systems
Emerging Practices in Development Evaluation

The instructors are 33 world-level specialists, including:
International experts, including Michael Quinn Patton, Michael Bamberger, Jim Rugh, David Fetterman, Patricia Rogers, Stewart Donaldson, Donna Mertens, Jennifer Greene, Bob Williams, Martin Reynolds, Saville Kushner and Hallie Preskill
Senior representatives of the international community, including Caroline Heider, Belen Sanz, Indran Naidoo, Fred Carden, Hans Lundgren, and Marco Segone
Senior managers responsible for country-led M&E systems, including Sivagnanasothy Velayuthan and Diego Dorado
Leaders from the Global South and BRIC countries, including Zenda Ofir and Alexey Kuzmin

The e-learning is free and open to all interested evaluators. You may attend virtually from your personal or work computer anywhere in the world. The course includes on-line lectures, reading material and tests. Participants will have the opportunity to engage in an on-line forum and, on successful completion of the e-Learning course, will be able to print out a certificate of virtual attendance.

some ideas for stakeholder engagement

While this document from the IBM Center for the Business of Government, Using Online Tools to Engage The Public, focuses on strategies for governments to engage the public in issues, it offers a number of strategies and ideas that make sense for evaluators thinking about how to engage stakeholders in the evaluation process. The discussion focuses on the use of online strategies, a particularly useful focus when evaluators are working across sites and geographic areas.

There are 10 strategies in all: 4 for collaboration, 2 for surveying perceptions and opinions, and 4 for prioritizing ideas offered, complete with specific software suggestions and an illustrative example of each.

Evaluation as covert operation… GAO goes all Jack Bauer!

One of the things I like about evaluators is the ingenuity they demonstrate in finding ways to answer evaluation questions. The recently released GAO evaluation of the Energy Star program is delicious… and subversive. How else to evaluate the Energy Star program than to submit bogus products for review and see what happens? That’s what the GAO did, and the results are funny, and not so funny. In just eleven days the GAO received Energy Star approval for their “room air cleaner.” Also approved were a gasoline-powered alarm clock, a metal roof panel, and a light commercial HVAC unit. Take-away message: the Energy Star program is a cheap and easy seal of approval that probably means nothing in many cases… think twice before paying extra for that Energy Star seal!