Measuring Human Development: What About the Politics of Evaluation?
Rogla, Jennifer (2016). 'Measuring Human Development: What About the Politics of Evaluation?' Paper presented at the annual conference of the HDCA, Tokyo 2016.
How do we understand the politics that lie at the heart of tough decisions between measurement trade-offs when evaluating human development? We know there is no one-size-fits-all approach to development, so how do we decide what to measure given the diverse array of metrics? In their book Mis-Measuring Our Lives: Why GDP Doesn’t Add Up (2010), Joseph Stiglitz, Amartya Sen, and Jean-Paul Fitoussi take a striking stance given their prominence as economists: there are many – too many – problems with measuring a country’s development via GDP, and alternative proposals are needed. As part of a broader special commission convened by then French President Nicolas Sarkozy, their goal was to understand how we empirically do, and normatively should, measure economic performance and social progress. The commission focused on the problems with GDP as a measurement tool for human development, on how we should measure the broader concept they propose of “quality of life,” and on the measurement of environmental indicators and sustainable development. Given previous work by Roodman on the dangers of using macro-measures such as GDP when studying development effectiveness, the book was an excellent step toward expanding our metrics to include the voices and experiences of those affected by development initiatives.
Their contribution raises a question highly relevant for political scientists: what are the politics of human development evaluation? Mis-Measuring Our Lives concedes early on that researcher incentives are at play: it is simply easier to use GDP as a measure of development. So how are researchers’ incentives shaping our measurements – new and old – and our subsequent inferences about development successes or failures? To build the momentum needed to change measurement practice, we must first demonstrate the incentives in the field that have allowed these poor measurements to persist and, in turn, to distort our policy recommendations. Without understanding this facet of development measurement, progress on new indicators risks stalling in the politics of evaluation.
This project therefore has two main elements. The first asks: what incentives do different actors within the development field have to develop and/or use new measurements? Because data decisions are driven by politics, we need to know whether researchers and agencies have incentives to change their measurements. As Sarkozy points out in the foreword to Mis-Measuring Our Lives, development researchers have debated metrics for a long time; we know our indicators have limitations, but we keep using them anyway because it is simply easier. The definition of development remains contested in the field, journals still publish studies based on economic indicators, and donors continue to fund projects that use old metrics – if they evaluate at all. Moreover, the normative implications of changing measurements remain unclear and promise a large ethical debate. Ultimately, those who study and evaluate development are motivated to use the concepts that will get their work published, ensure job security, or further relationships that sustain diplomatic ties or a continuous flow of donor funds. The old approach is still acceptable on these fronts and requires far less time, money, and controversy than creating new metrics. Given these perverse incentives, changing development measurements may not be an attractive venture. Nevertheless, some organizations, such as the UNDP, have pushed forward new measurements over the past few decades, like the Human Development Index and the Multidimensional Poverty Index. Why? What are their incentives? Why do we see innovation in these organizations and not in others?
The second part of this project examines actual development evaluations to determine whether the way their sponsors measure outcomes is biasing subsequent policy recommendations. The global foreign aid regime provides a timely context for such a study: the OECD is attempting to create and enforce new norms for aid transparency via the “naming and shaming” technique popular in human rights. It publishes progress reports based on the Busan Principles created at its most recent High Level Forum on Aid Effectiveness in order to publicize who is increasing transparency and who is not. Nevertheless, compliance is by and large voluntary. Aside from the occasional domestic law, there are no international laws or other binding requirements that force donors to evaluate the results of their aid projects or to publicize those outcomes. Donors are thus choosing which projects to evaluate and which evaluation reports to publish. I propose a study of publicly available foreign aid project evaluations published by various states, NGOs, and IGOs, comparing across evaluations the following variables: indicators used for measurement, donor organization type (multilateral, bilateral, NGO), timeframe of evaluation, aid sector, overall outcome of the program (goals met or not), and evaluating agency (donor, third party, recipient). The aim is to determine whether systematic biases are present in published data that could affect future policy decisions. For example, policymakers might be basing foreign aid policy on one type of organization’s results that are not representative of all project types or implementation processes.
The results of this study have significant implications for the developed and developing worlds alike: they could ultimately yield better – and sustainable – measures of development and aid effectiveness, as well as concrete policy recommendations for structuring development projects. Without first acknowledging researchers’ incentives, and then showing how those incentives directly produce misleading outcomes, we may have little luck changing how we measure development, no matter how innovative the new measures may be. This study also sheds light on the dangers of analyzing development evaluations without asking whether they are unrepresentative in terms of which project reports get published at all. If the data scholars use to make broad claims about project effectiveness are not representative of all the development projects out there, our causal claims may be severely biased.