ABSTRACT

Action learning has benefits at individual, organisational and social levels, but identifying these benefits, and learning what works well and what does not, is both important and challenging.

What is evaluation for?

Evaluation is about learning and the consequent taking of appropriate action. It has been noted (1) that “We cannot regard truth as a goal of inquiry. The purpose of inquiry is to achieve agreement among human beings about what to do, to bring consensus on the end to be achieved and the means to be used to achieve those ends. Inquiry that does not achieve coordination of behaviour is not inquiry but simply wordplay”. Likewise, a survey identifying learning from evaluation studies (2) concluded that “Evaluation can only ever provide good quality information to inform decision-making. It is unlikely to supply ready-made answers because the results will need to be interpreted as part of a process of discussion and judgement, with the views of different stakeholders and the intended outcomes of the activity taken into account”. Evaluation is concerned with how we generalise from the past into the future (3).

Evaluation increasingly takes place in an organisational and social setting where “nothing is clear, and everything keeps changing” (4), because large parts of the system being evaluated are not known or understood, at least at the outset. The notion of a “rational” or straightforward application of evidence in making choices is therefore either flawed or naïve: a whole range of factors impacts on human behaviour, so trying to narrow it down in this way is an enormous over-simplification. Evaluation involves trade-offs, often between competing values, and it calls for judgement; these are normative debates, in which facts and values interact. Evaluation is thus about creating opportunities for stepping back, reflecting, learning and sense-making, and, through these, achieving greater awareness, understanding and action. Accordingly, evaluation activity is itself an intervention.

Problems with evaluation

There is almost universal agreement on the importance and value of evaluation, but it is helpful to distinguish at the outset between two major types:

Summative (or judgemental) evaluation is concerned with justifying the investment in (mostly) financial terms and with assessing the overall outcomes – the measurable impact or contribution – and so is more likely to be valued by funders and budget-holders, to rely on “hard” data and to require quick answers. The major concern will be to review costs and benefits and so ensure value for the investment made.

Formative (or developmental) evaluation is concerned with improvement and steering and so serves to reinforce learning. It is focused on process, rather than outcomes, so is more likely to be favoured by those concerned with individual and organisational development, who value the rich information accrued, including the impact of context or setting on learning.

The distinction between the two types is shown in Table 14.1.

The quantitative approach exemplified by summative evaluation may not just be undesirable for certain human activities, but perhaps even impossible (5). To measure anything, an objective yardstick is needed – such as centimetres for length or kilometres for distance. Human activity at work involves a range of complex tasks that are highly context-dependent, so it may well be a fallacy to believe that such activity can be measured objectively against a yardstick to produce hard figures. Measurement of this kind tends to use Likert scales, which ask respondents to rate statements by selecting from a range of possible responses (such as poor, adequate, good and very good) or figures (often −2, −1, 0, +1 and +2). These are intuitive approximations based on subjective criteria, so any translation of the results into figures creates a false impression of objective quantifiability.
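A simple hypothetical illustration of this point (the numbers are invented for the purpose): suppose two participants rate a session at −2 (“poor”) and two rate it at +2 (“very good”). Treating the labels as equal-interval numbers and averaging them gives

\[
\frac{(-2) + (-2) + (+2) + (+2)}{4} = 0
\]

– an apparently “adequate” result that conceals a sharply polarised response and lends a spurious numerical precision to what remain subjective judgements.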

There are two major summative evaluation approaches – the Kirkpatrick Four Levels model (6) and the Return on Investment (ROI) methodology (7).

With the ladder-like or “chain reaction” Kirkpatrick model, evaluation is considered at four levels:

Reaction: This asks to what degree participants reacted favourably to the learning event in terms of their thoughts and feelings. It is the level of the “happy sheet” administered to a captive audience at the end of an event and is the most common form of assessment used. It gives a brief glimpse of how learners intend to apply their learning, but the findings at this level are rarely followed up.

It then becomes slightly more difficult, but not impossible, to collect data.

Learning: This asks to what degree participants acquired the intended knowledge, skills and attitudes, based on their participation in the learning event.

It then becomes even more problematic to collect data on the resulting changes in behaviour.

Behaviour: This asks to what degree participants apply what they have learned from the learning event when back in the work setting.

It then becomes very difficult indeed to collect data.

Outcomes or results: This asks to what degree desired outcomes occur because of the learning event – the effect of changed behaviour on performance.

The movement from reactions to outcomes introduces a significant number of intervening variables (other things happening to the individuals and/or the organisations concerned), which make it difficult to ascribe a simple cause-and-effect relationship. It is thus very difficult to measure learning transfer – what learners attempt to apply when they return to their normal work environment (8). Moreover, the correlation between the levels is weak – a positive result at one level does not necessarily lead to a positive result at the next. Concentrating on the relatively easy-to-assess participant reactions also tends to sideline the contextual factors that might affect an event and its impact.

In the ROI approach, the key feature is the calculation of the monetary value of investing in an activity. The outcomes of an activity are converted into a financial value, enabling a cost-benefit analysis to take place. Those results which cannot be monetised are called “intangibles”. Only financial quantitative data really matters, and intangibles, as evidenced by qualitative data, are relegated to a secondary role. As a result, ROI risks either minimising such results or forcing an essentially hypothetical and subjective financial value on them. Yet such intangibles can clearly lead to significant benefits over time (9). While calculating the costs of a programme is relatively easy, if somewhat time-consuming and disputable, the real challenge lies in defining the gains made (10).
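In broad terms – a sketch of the underlying cost-benefit arithmetic rather than of any one proprietary methodology – the ROI figure is the net monetised benefit of a programme expressed as a percentage of its cost:

\[
\mathrm{ROI}\,(\%) = \frac{\text{monetised programme benefits} - \text{programme costs}}{\text{programme costs}} \times 100
\]

So a programme costing, say, £20,000 that is judged to have produced £50,000 of monetised benefit would show an ROI of 150 per cent. The figure looks precise, but it rests entirely on the prior step of converting outcomes into money – which is exactly where the intangibles are either lost or assigned a hypothetical value.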

Both approaches rest on a supposedly scientific method – set objectives, apply the intervention, isolate the effects of the intervention and measure the results. Yet how appropriate is this where there are multiple variables and where human agency plays a major role in determining outcomes?

There are other problems associated with evaluation (11), including:

Time: How can initiatives that are intended to have much longer-term impacts be evaluated over the short term?

Context: What works in one setting may not necessarily work in another. Factors such as culture, values, timing, priorities and attitudes all play their part.

Complexity: Given the multiple intervening variables, how can the effect of one activity be disentangled from the others, especially when they may overlap?

Value: What counts as “success”? What is valued and by whom? Who are the key stakeholders and what are they seeking to achieve?

Horses for courses: The size and complexity of an evaluation exercise need to be in proportion to the activity being evaluated, and the form the evaluation takes should, in turn, reflect the values underlying the activity itself.

Cost: Conducting evaluation is not cost-free. It will involve additional work, with associated costs, whether sourced internally or externally. Ideally, evaluation and the associated costs should be planned for at the outset, but in practice this rarely happens, and evaluation is usually conducted retrospectively, often as an afterthought.

Politics: Evaluation is a complex and highly political process. Policy decisions are sometimes made despite evidence of what does or does not work in practice. Evaluation can be used to gather data that supports a particular policy direction – policy-based evidence, rather than evidence-based policy. If evaluation work produces an “unacceptable finding” that is politically sensitive, the results may never see the light of day, or may simply be ignored. Decisions are made on much more than evaluation evidence – values, interests, personalities, timing, circumstances and happenstance all play their part.