
Impact evaluation

Impact evaluation assesses the changes that can be attributed to a particular intervention, such as a project, program or policy — both the intended changes and, ideally, the unintended ones. In contrast to outcome monitoring, which examines whether targets have been achieved, impact evaluation is structured to answer the question: how would outcomes such as participants' well-being have changed if the intervention had not been undertaken? This involves counterfactual analysis, that is, 'a comparison between what actually happened and what would have happened in the absence of the intervention.' Impact evaluations seek to answer cause-and-effect questions; in other words, they look for the changes in outcome that are directly attributable to a program. Impact evaluation helps people answer key questions for evidence-based policy making: what works, what doesn't, where, why and for how much? It has received increasing attention in policy making in recent years in the context of both Western and developing countries.
It is an important component of the armory of evaluation tools and approaches, and integral to global efforts to improve the effectiveness of aid delivery and public spending more generally in raising living standards. Originally oriented towards evaluation of social sector programs in developing countries, notably conditional cash transfers, impact evaluation is now increasingly applied in other areas such as agriculture, energy and transport.

Counterfactual analysis enables evaluators to attribute cause and effect between interventions and outcomes. The 'counterfactual' measures what would have happened to beneficiaries in the absence of the intervention, and impact is estimated by comparing counterfactual outcomes to those observed under the intervention. The key challenge in impact evaluation is that the counterfactual cannot be directly observed and must be approximated with reference to a comparison group. There are a range of accepted approaches to determining an appropriate comparison group for counterfactual analysis, using either prospective (ex ante) or retrospective (ex post) evaluation design. Prospective evaluations begin during the design phase of the intervention and involve collection of baseline and end-line data from intervention beneficiaries (the 'treatment group') and non-beneficiaries (the 'comparison group'); they may involve selection of individuals or communities into treatment and comparison groups. Retrospective evaluations are usually conducted after the implementation phase and may exploit existing survey data, although the best evaluations will collect data as close to baseline as possible, to ensure comparability of treatment and comparison groups.

There are five key principles relating to internal validity (study design) and external validity (generalizability) which rigorous impact evaluations should address: confounding factors, selection bias, spillover effects, contamination, and impact heterogeneity.
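The baseline/end-line logic described above can be sketched numerically. The following is a minimal, hypothetical illustration (all numbers invented) of a difference-in-differences estimate, in which the comparison group's change over time stands in for the counterfactual trend:

```python
# Hypothetical baseline and end-line outcome means (e.g. a household
# well-being score); all numbers are illustrative, not real data.
treatment = {"baseline": 50.0, "endline": 65.0}
comparison = {"baseline": 48.0, "endline": 55.0}

def difference_in_differences(treat, comp):
    """Impact = (change in treatment group) - (change in comparison group).
    The comparison group's change proxies what would have happened to
    beneficiaries in the absence of the intervention."""
    treat_change = treat["endline"] - treat["baseline"]
    comp_change = comp["endline"] - comp["baseline"]
    return treat_change - comp_change

print(difference_in_differences(treatment, comparison))  # 8.0
```

Here the treatment group improved by 15 points and the comparison group by 7, so 8 points of the change are attributed to the intervention rather than to the underlying trend.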
Impact evaluation designs are identified by the type of methods used to generate the counterfactual and can be broadly classified into three categories — experimental, quasi-experimental and non-experimental designs — that vary in feasibility, cost, involvement during the design or post-implementation phase of the intervention, and degree of selection bias. White (2006) and Ravallion (2008) discuss alternative impact evaluation approaches.

In experimental evaluations the treatment and comparison groups are selected randomly and isolated both from the intervention and from any other interventions which may affect the outcome of interest. These evaluation designs are referred to as randomized control trials (RCTs), and the comparison group is called a control group. When randomization is implemented over a sufficiently large sample with no contagion by the intervention, the only difference between the treatment and control groups on average is that the latter does not receive the intervention. Random sample surveys, in which the sample for the evaluation is chosen randomly, should not be confused with experimental evaluation designs, which require the random assignment of the treatment.

The experimental approach is often held up as the 'gold standard' of evaluation. It is the only evaluation design which can conclusively account for selection bias in demonstrating a causal relationship between intervention and outcomes. Randomization and isolation from interventions might not be practicable in the realm of social policy, and may be ethically difficult to defend, although there may be opportunities to use natural experiments. Bamberger and White (2007) highlight some of the limitations of applying RCTs to development interventions.
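Why random assignment works can be shown with a small simulation. This is a hypothetical sketch (the population, outcome scale, and true effect of 5.0 are all invented): because assignment to treatment is random, the treatment and control groups are comparable on average, and the simple difference in mean outcomes estimates the average treatment effect:

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible

def simulate_rct(n=1000, true_effect=5.0):
    """Randomly assign n individuals to treatment or control and
    return the difference in mean outcomes between the groups."""
    treatment, control = [], []
    for _ in range(n):
        latent_outcome = random.gauss(100, 10)  # outcome absent intervention
        if random.random() < 0.5:               # random assignment
            treatment.append(latent_outcome + true_effect)
        else:
            control.append(latent_outcome)
    # Under randomization, this difference in means is an unbiased
    # estimate of the average treatment effect.
    return statistics.mean(treatment) - statistics.mean(control)

print(round(simulate_rct(), 1))  # close to the true effect of 5.0
```

With larger samples the estimate converges on the true effect; with selection instead of randomization, the same difference in means would be biased.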
Methodological critiques have been made by Scriven (2008) on account of the biases introduced since social interventions cannot be fully blinded, and Deaton (2009) has pointed out that in practice the analysis of RCTs falls back on the regression-based approaches it seeks to avoid and so is subject to the same potential biases. Other problems include the often heterogeneous and changing contexts of interventions, logistical and practical challenges, difficulties with monitoring service delivery, access to the intervention by the comparison group, and changes in selection criteria and/or the intervention over time. Thus, it is estimated that RCTs are only applicable to 5 percent of development finance.

Quasi-experimental approaches can remove bias arising from selection on observables and, where panel data are available, time-invariant unobservables. Quasi-experimental methods include matching, differencing, instrumental variables and the pipeline approach; they are usually carried out by multivariate regression analysis. If selection characteristics are known and observed, they can be controlled for to remove the bias. Matching involves comparing program participants with non-participants based on observed selection characteristics. Propensity score matching (PSM) uses a statistical model to calculate the probability of participating on the basis of a set of observable characteristics, and matches participants and non-participants with similar probability scores. Regression discontinuity design exploits a decision rule as to who does and does not get the intervention to compare outcomes for those just either side of this cut-off.
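The matching step of PSM can be sketched in a few lines. This is a hypothetical illustration (the scores and outcomes are invented, and in practice the propensity scores would themselves be estimated, e.g. by logistic regression on observed characteristics): each participant is paired with the non-participant whose propensity score is closest, and the outcome gaps are averaged.

```python
# Hypothetical data: (id, propensity score, observed outcome).
participants = [
    ("p1", 0.80, 70.0), ("p2", 0.55, 62.0), ("p3", 0.30, 58.0)]
non_participants = [
    ("n1", 0.78, 64.0), ("n2", 0.52, 60.0),
    ("n3", 0.33, 57.0), ("n4", 0.10, 50.0)]

def psm_effect(treated, untreated):
    """Nearest-neighbour propensity score matching: pair each
    participant with the non-participant whose score is closest,
    then average the participant-minus-match outcome gaps."""
    gaps = []
    for _, score, outcome in treated:
        match = min(untreated, key=lambda u: abs(u[1] - score))
        gaps.append(outcome - match[2])
    return sum(gaps) / len(gaps)

print(round(psm_effect(participants, non_participants), 2))  # 3.0
```

Here p1 matches n1, p2 matches n2 and p3 matches n3, giving gaps of 6, 2 and 1 and an estimated effect of 3.0; the unmatched n4, with a very different participation probability, never enters the comparison.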
