Statistical and data literacy in policy-making
Abstract
This introduction offers conceptual reflections to frame the special stream on statistical and data literacy in policy-making. It discusses the relevance of the use of statistics and data in politics and highlights their impact on policy-making. It underlines the need for and identifies key meanings of statistical and data literacy in policy-making. It also highlights how statistical and data literacy in policy-making is specific. Finally, it presents the individual contributions to the special stream that originate from the ISI World Statistics Congress 2021 Invited Paper Session on ‘Statistical and Data Literacy in Policy-Making’. The session was co-organised together with the Director of the IASE’s International Statistical Literacy Project (ISLP), Reija Helenius, to whom we are extremely grateful for linking our activities to the ISLP.
1.Setting the scene for the special stream𝟏
In the shadow of post-factual contested politics, interest-driven political reasoning and partisan communication, legitimate policy-making depends more than ever on the capacity of political actors to make decisions based on transparent and accessible evidence. In line with the principles of evidence-informed policy, reliable information thus plays an indispensable role in democratic politics and decision-making. Within it, statistics and data (see note 2) are key manifestations of quantitative evidence about the state and society. They are hence essential tools of evidence-informed policy-making.
With the ‘evidence turn’ in policy-making [1, 2, 3, 4, 5, 6, 7, 8], statistics and data have become instruments of collective political action. They measure social reality, offer insights into progress in development and have the potential to identify correlations and trends relevant to informing policy design. As such, they are used to arrive at, explain and justify policy choices. This link between statistics, data and politics turns quantitative evidence into a central policy instrument and makes its use a specific feature of evidence-informed policy-making.
Such use of statistics and data in democratic policy-making is subject to the principles of legitimacy, transparency and accountability, but also to preference-building and negotiation. Within policy-making, statistics and data become subject to scrutiny. The political power that emerges from their use is open to contestation over the choice and quality of the evidence at hand. In this way, the use of statistics and data supports deliberation in political processes and fosters participatory structures between data producers and users that engage in promoting quantitative evidence as opportunity structures. The former do so to support evidence-informed policy-making, the latter to define and defend policy priorities. Through these systemic interlinkages, the use of statistics and data becomes open to multiple, strategic and value-based considerations that support, defend and/or contest quantification choices as much as political interests [9].
Accompanying these developments, the landscape of data sources and the ways of using statistics and data in policy-making have grown significantly and can easily confuse any non-data scientist engaged in public policy. These developments nurture literacy requirements that impact on and are impacted by political actors’ capacity to select, evaluate and process data. Understanding statistics, working with data, and analysing and arguing through data have therefore become essential challenges for professionals working in public policy analysis, policy preparation, policy-making, decision-making, evaluation and scrutiny. To improve the impact of statistics and data on politics, it is hence important to analyse what constitutes statistical and data literacy in policy-making and how it can be promoted.
For the purpose of this special stream, i.e. to inspire conceptual reflection on the topic, we define ‘actors engaged in policy-making’ to include public officials engaged in policy preparation and implementation as well as political decision-makers. We also pay attention to citizens as principals and recipients of public policies and the addressees of political communication. Despite this broad perspective, we are aware that literacy needs differ depending on the role and position of actors in policy-making and on their contribution to the policy process. Throughout their analysis, the contributions to the special stream reflect aspects of such differentiation. For a further categorisation of statistical and data literacy demands according to actors’ roles and positions see Schüller in this special stream and in her previous work [10].
2.The meaning of statistical and data literacy
Contemporary political and professional life requires distinct types of skills and literacies related to statistics and data. These include among others information and media literacy; communication and visualisation literacy; digital literacy; computational and machine learning literacy; statistical and ethical literacy.
An ever-increasing group of individuals and professionals need to be statistically and data literate. Apart from statisticians, these are – to name only the most obvious ones – policy, business and data analysts; data scientists; data (protection) officers; data architects and database administrators. The skills they require range from data collection and production; to data analysis, interpretation and visualisation, trends analysis and predictive analytics; to development of data strategies, creation of institutional data cultures and monitoring of data protection regulations.
Yet these literacy needs are not only relevant for professionals dealing with statistics and data. Citizens also require statistical and data literacy to assess the validity and accuracy of quantitative evidence used in political and media communication and, ultimately, to judge and control the quality of democratic political decision-making [11, 12]. As Engel correctly concluded, an “enlightened citizenry that is empowered to study evidence-based facts and that has the capacity to manage, analyze and think critically about data is the best remedy for a world that is guided by fake news or oblivious towards facts” [13]. In the same vein, Sharma underlines that “[h]aving a good grasp of social statistics can help citizens deal with a complex array of issues and participate actively in public debates and assert their rights” [14] and Koga recommends that “citizens must be capable of evaluating such statistical information thoroughly before making any decisions” [15].
When it comes to defining statistical and data literacy, a uniform definition is missing [see [14, 15]] and there is a huge body of literature to consult to identify its meaning [see, among many others [12, 14, 18]]. In a narrow understanding, data literacy includes the ability to critically read, analyse, process, interpret and argue with data. A broader perspective captures knowledge, skills and value dimensions and also relates to methodological, technical, and socio-cultural capacities. It embraces ethical concerns about data production and use; the ability to preserve and protect data; to communicate data in a contextualised manner; and to understand the quality of data, data sources, methodologies, analytics, techniques, and technologies applied to analyse data. The latter elements link to digital literacy which is discussed by Schüller in this special stream.
In view of combining aspects of statistical and data literacy, within the special stream, we follow the assessment of Gould who includes data literacy in an augmented definition of statistical literacy to reflect the increased role and relevance of data in everyday life. While he argues that the set of knowledge defining statistical literacy often includes a functional differentiation between “the needs of consumers of statistics from those of producers of statistics” [12], he underlines that such definition “falls far short …of what is required for life in modern democracies” [12]. He, therefore, proposes an extended definition of statistical literacy that includes elements of data literacy, such as “understanding who collects data” and how; “understanding issues of data privacy and ownership”; “create basic descriptive representations of data to answer questions about real-life processes”; and “understanding the importance of the provenance of data” [12]. The contributions to this special stream reflect the extended role of both statistics and data in society and policy-making and relate to Gould’s augmented definition of statistical literacy.
3.Is statistical and data literacy specific in policy-making?
In view of omnipresent and seemingly omnipotent technologies and advances in open data, Giovannini concludes that “the ideal of the ‘fully informed decision maker’ should be a reality” [19]. Yet he also highlights that “[u]nfortunately, this is far from the case. As Einstein put it, ‘information is not knowledge’ and although citizens are bombarded by information on a constant basis, this bombardment does not necessarily bring about knowledge” [19]. The essential facilitator turning information into knowledge for policy-making and beyond is not only open access to statistics and data but, more importantly, statistical and data literacy.
As outlined above and in a previous contribution to this journal [9], the use of statistics and data in policy-making is an essential means of doing politics and an indispensable ingredient of evidence-informed policy. It is a political instrument that can support the emergence and/or preservation of power. In a metrological and post-metrological way, it impacts both the social construction of knowledge that is regarded as relevant in and for politics (in contrast to experience) and governance processes that rely on statistics and data as sources of factual evidence. Both effects are of relevance when looking at why statistical and data literacy in policy-making is important and specific. They show that policy-making requires particular capacities to understand statistics and data in context and certain ‘meta-skills’ to identify the implications of their use in policy-making.
Looking at the knowledge effects of statistics and data in politics, those engaged in policy-making need to be able to identify the side effects of using statistics and data. They need to be sensitive to the fact that statistics and data, as quantitative approximations to reality, represent knowledge constructs themselves (‘knowing through data’) and that these constructs are used to prioritise for the public good instead of merely representing ‘neutral facts’. Therefore, the selection of statistics and data for policy-making requires ethical reflections and justification. Actors engaged in policy-making need to understand that, apart from their descriptive (‘observing through data’) and diagnostic power (‘scrutiny by data’), statistics and data also possess normative power (‘stipulating by data’) in politics. They also need to understand that, within policy-making, statistics and data support political deliberation (‘arguing through data’), motivate and justify political prioritisation (‘agenda-setting by data’) and can serve as dominant political knowledge (‘framing through data’). They need to understand that the use of statistics and data can be influenced by political, strategic and value-based considerations that champion adequate rather than best available quantitative evidence (‘proving through data’). Sensitivity to potential feedback loops between the production of statistics and data and policy development is especially important to understand that quantification is political (‘data for policy’, ‘policy from data’) and that it, therefore, needs to adhere to ethical principles and quality standards. This overall awareness needs to be coupled with the insight that statistics and data can define what they are meant to describe instead of ‘only’ measuring a certain reality (‘understanding data’) and that oversimplification and difficulties in developing common understandings of complex social phenomena impact data collection and selection.
These issues demand interlinkages between and responsiveness of statistics’ production and the policy context, which data shall inform [9, 20]. Actors engaged in policy-making need to acquire the skills and capacity to work critically with these new forms of knowledge. Within this special stream, Schield offers seven questions, with which those engaged in policy-making can scrutinise these usages of statistics and data to better understand their potential knowledge effects.
In terms of governance effects, actors engaged in policy-making require skills to analyse how statistics and data are used by whom for what purpose in what type of policy-making contexts (‘data as evidence’). They need to be able to understand that statistics- and data-based processes impact governance (‘governing by data’, ‘governance of data’), affect who participates in policy-making and that statistics and data directly take over different governance functions (‘collective action through data’, ‘steering by data’, ‘ruling by data’, ‘controlling with data’, ‘advocacy through data’). Moreover, they need to be sensitive to the power dimension and impact of statistics and data in interest-formation, preference-building (‘community-building through data’) and negotiations (‘(ab)using data’) and to the fact that their use is interest-driven and context-dependent (‘data in evidence arenas’). Therefore, special attention is required to the fact that statistics and data have become a target of instrumentalisation and politicisation which can endanger democratic decision-making. As analysed previously in this journal [9], Greece, Tanzania, and the World Bank’s Doing Business indicators are negative examples and cases in point here. Within this special stream, Radermacher offers an in-depth insight into such governance effects and into the desideratum of an overall data culture to emerge, while Sabbati shows how these effects can be mitigated within statistics-based knowledge products to inform policy-making and decision-makers.
The above considerations show that statistical and data literacy for policy-making goes beyond statistical-numerical-technical knowledge, skills and capacities. It is as much about understanding the context in and purpose for which statistics and data are used as it is about accepting the limitations of statistics and data. It requires a particular transparency-of-evidence-commitment and ethical standards to enhance the legitimacy of political decision-making instead of jeopardising it. Acknowledging this demand, institutional practice across levels of governance increasingly underlines the relevance of statistical and data literacy in and for evidence-informed policy-making. Within the European context, for instance, the Joint Research Centre of the European Commission’s 2017 skills map for evidence-informed policy-making [21] explicitly refers to statistical and data literacy under various categories: “Communicating Scientific Knowledge”; “Evidence gap mapping”; “Digital data visualisation”; and “Infographic design” [21]. As specific requirements, the map formulates “Basic skills for interpretation of quantitative and qualitative data” and “Improved understanding of statistical data in various formats and visualisations” [21].
Given the effects and perils of the use of statistics and data in policy-making, the responsibility delegated to actors engaged in politics by their electorates or institutions and the political power exerted by collective political action, statistical and data literacy is an essential prerequisite for 21st-century policy-making. Political actors need to be statistically and data literate to be able to use statistics and data for the public good. They need to understand the potential of and sensitivities arising from the use of statistics and data and they need to have the skills to identify their misuse.
4.What to find in the special stream?
The four contributions to this special stream describe and analyse essential skills required for the use of statistics and data in policy-making as well as potential pitfalls that accompany it. Some contributions also capture the relevance of statistics and data for policy-making and their impact on politics. All contributions highlight the relevance of interests, preferences, motives, values, ethics and goals for the use of statistics and data in policy-making, which opens the discussion to cognitive, affective, and socio-emotional competencies as well as interests and preferences. As a result, all contributions underline the importance of a multi-dimensional skills matrix in which several types of literacy are required to understand and use statistics and data properly. The relevant skills depend on an actor’s function and position within the political process. Therefore, all contributions share a focus on context insofar as they underline the importance of understanding statistics, data and their use in a politically, economically, socially and culturally embedded way. Both the production and use of statistics and data depend on societal demand, social and technical standards, situational circumstances, political practice and individual capacities.
Walter Radermacher analyses “Statistical Awareness Promoting a Data Culture” [22]. Within his contribution, he characterises different existing data cultures; describes value creation from data to knowledge; discusses the appearance of different ‘data worlds’ (e.g. big data, official statistics, academic data); and introduces the term ‘data culture’ to capture a particular social and political climate. In this climate he identifies “a new threat
With this assessment he points to rising difficulties of ‘understanding data’, emerging trends of ‘stipulating by data’ and the blurring lines between data producers and data users. He, moreover, maintains that the proliferation of data and data sources has triggered the production and use of alternative forms of (non-official) data and the emergence of “an approach that relies significantly on decentralised competences and processes” [22] that both render literacy concerns even more pertinent. Identifying such negative potential of datafication of societies, he underlines that statistics within democratic policy-making “help to base decisions on factual arguments, … simplify conflict resolution by eliminating the need to argue about some issues (thus creating space for opinions, valuations and decisions informed by those facts)” [22]. To be able to exert this positive impact, those engaged in policy-making (as well as society in general) need to have sufficient “[k]nowledge about data and knowledge from data” [22], i.e. ‘understanding data’, and the necessary skills to understand the opportunities, benefits, risks, and limitations that characterise statistics. This means that users of statistics should have “a reasonable understanding of what information product they are being provided with, what its quality profile is, what they can (or cannot) use it for, etc.” [22]. According to Radermacher’s assessment, missing statistical and data literacy can distort policy-making, as “[o]verestimation leads to exaggerated expectations and disappointments, underestimation to missed opportunities, risks on both sides” [22]. In view of the instrumentalisation of statistics and data, he identifies even bigger threats resulting from misuses of statistics and immense socio-political damage “if facts are influenced or manipulated with political intentions or if even the impression of arbitrariness is created with so-called ‘alternative facts’” [22].
While acknowledging the relevance of statistical and data literacy in policy-making, he, therefore, highlights that potential misuse can also result from strategic reflections and political interests and hence be intentional rather than accidental. He suggests that “it can be argued that the virus of false and manipulated information flourishes when the statistical literacy of the population is at a low level” [22] and that “an improvement in statistical literacy would be very good for politics, both on the part of the population and on the part of politics itself” [22]. We learn from his contribution that “the aim must be to promote and nurture a culture in which a conscious and experienced approach regarding data and statistics has become the standard” [22]. Within his contribution, he advocates for the establishment of such a data culture that relates to governance, institutions and trust in support of ‘community-building through data’, ‘governance of data’, and ‘governing by data’. He moreover focuses on the role of official statistics and new data sources in politics and for democracy; value creation from data; basic statistical and data literacy skills required in policy-making; policy-makers’ dealings with changes in informational ecosystems and emerging data logics; and the impact of evidence-informed policy-making and missing statistical literacy on data use in politics.
Giulio Sabbati offers a practitioner’s guide to “Statistical and Data Literacy: A Practitioner’s View on Policy-making: How to provide independent, objective and authoritative data and information for policy-making” [23]. Asserting that “the use of statistics has become a political power resource, and access to and understanding of data is becoming more and more important” [23], he reflects on how statisticians and data scientists can “make data talk to a broad group of people, and ensure they are properly understood” [23] in policy-making. In his reflections, he concentrates on the role data scientists “working with or for policy-makers, should play” [23] and “gives a practitioner’s view on data literacy for policy-making
Milo Schield proposes “Statistical Literacy: Seven Simple Questions for Policymakers” [24] and offers a complementary contribution to Sabbati’s suggestion that “[a]sking questions and finding answers about data gives you knowledge of the data itself; it is actually ethical to do that, and ultimately will help you gain trust” [23]. Schield emphasises that statistics are numbers in context and that policy-makers need to be aware of this contextual dimension and the social construction of statistics and data when using them in policy-making. For an informed use, he states that “[i]nformation literacy, data literacy and statistical literacy overlap when they deal with data as evidence in arguments” [24] and suggests that all three forms of literacy are “new areas for many policymakers” [24]. He underlines that analysis and evaluation are common practices that policy-makers are used to, but that statistical data seems to constitute a special case in policy-making. “Most policy-makers are used to facing new situations: situations for which they have little prior experience. To handle such situations, they ask questions. This is how policy makers get the information they need. Unfortunately some policymakers are not used to asking questions about statistics. In order to ask good questions policy makers need to know something about statistics” [24]. An essential component of the necessary knowledge about statistics is to be able to differentiate statistics from numbers. “Numbers are more like book-keeping: arithmetic operations that don’t involve assumptions or choices. Statistics are different – very different. Statistics deal with reality. It is easier to lie, to mislead or to prevaricate with statistics. Statistics – certainly social statistics – involve assumptions and choices” [24]. To effectively use statistics and data as quantitative evidence, those engaged in policy-making thus “need to untangle social statistics from arithmetic numbers” [24]. 
He also recommends getting an understanding of the interests, motives, values, and goals for generating, selecting, presenting and using statistics. Apart from ‘understanding data’, these suggestions reflect elements of ‘framing through data’, ‘arguing through data’, ‘stipulating by data’, and ‘(ab)using data’ as discussed above. To enhance statistical and data literacy in policy-making, he suggests seven essential questions in order to assess the quality, information content and intention of the use of statistics and data: (1) How big? How much? How many? (2) Compared to what? (3) Why not a rate? (4) Per what? (5) How were things defined, counted, and measured? (6) What was taken into account? Is this a crude association? (7) What should have been taken into account? These questions identify seven basic elements that can help policy-makers increase their understanding of statistics and data. Schield’s additional advice is that “[t]he main thing is for policymakers to treat statistics the same way they treat people. People have motives, values and agendas. So do statistics – because they were selected, assembled and presented by people who have motives, values and agendas. Statistics are closer to words than to numbers. Yes, statistics involve numbers, but statistics are numbers in context and the words give the context” [24]. We learn from his contribution that policy-makers “need to evaluate quantitative evidence using the same skills they use in evaluating other evidence. Ask questions!” [24]. Within his contribution, he points to the aspects of data collection, analysis, and interpretation that policy-makers need to grasp in order to understand statistics and data.
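As a hedged illustration of questions (3) ‘Why not a rate?’ and (4) ‘Per what?’, the following minimal sketch compares two hypothetical regions. The region names, case counts and population figures are invented purely for illustration and are not taken from Schield’s contribution; the point is only that a ranking by raw counts can reverse once the counts are expressed per 100,000 inhabitants.

```python
# Hypothetical figures, invented for illustration only (not real data).
regions = {
    "Region A": {"cases": 1200, "population": 4_000_000},
    "Region B": {"cases": 300, "population": 500_000},
}


def rate_per_100k(cases: int, population: int) -> float:
    """Express a raw count as a rate per 100,000 inhabitants."""
    return cases * 100_000 / population


# Question (3): why not a rate? Question (4): per what? Here: per 100,000 inhabitants.
rates = {
    name: rate_per_100k(figures["cases"], figures["population"])
    for name, figures in regions.items()
}

# The region with the most cases is not the region with the highest rate.
highest_count = max(regions, key=lambda name: regions[name]["cases"])
highest_rate = max(rates, key=rates.get)

for name in regions:
    print(f"{name}: {regions[name]['cases']} cases, {rates[name]:.1f} per 100,000")
print(f"Highest count: {highest_count}; highest rate: {highest_rate}")
```

Under these assumed figures, the region with four times as many cases has half the rate of the smaller region, which is the kind of reversal that the questions ‘Compared to what?’ and ‘Per what?’ are meant to surface.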
Katharina Schüller extends the analytical perspective to “Data and AI Literacy for Everyone” [25]. Her contribution helps understand various aspects and layers of data and AI literacy required, among others, for ‘governing by data’, i.e. taking decisions based on data, and for “deal[ing] with data in a conscious and ethically sound manner” [25]. She highlights essential competences for digitisation and points to the problematic delineation between different forms of literacy. Here, she underlines that “[s]tatistical literacy, data literacy, information literacy, and AI literacy are terms that are often mentioned as essential competencies in relation to digitization” [25], even though “there is the danger of definitions being deliberately extended because ‘data’ and ‘AI’ are becoming more and more fashionable” [25]. Her analysis helps us to understand various aspects of ‘understanding data’ in terms of literacy requirements of society as a whole when it comes to modern information technologies and digital competences. She defines data and artificial intelligence (AI) literacy as a future skill and new element of ‘governing by data’ that goes beyond a set of technical skills in order “to promote autonomy in a modern world shaped by data and its application as well as new technologies like AI
Notes
2 For the purpose of this special stream and to reflect the practice of political decision-making, the terms ‘data’ and ‘statistics’ are used together to include the multiverse of metrological instruments that are used as quantitative evidence in policy-making. Such instruments include official statistics on the one hand and indicators, indices, composite indicators, and/or scoreboards prepared by actors other than national statistical offices (NSOs) on the other.
Acknowledgments
The contributions to the special stream originate from the ISI World Statistics Congress 2021 Invited Paper Session on ‘Statistical and Data Literacy in Policy-Making’. The session was co-organised together with the Director of the IASE’s International Statistical Literacy Project (ISLP), Reija Helenius, to whom we are extremely grateful for linking our activities to the ISLP.
References
[1] Baron J. A Brief History of Evidence-Based Policy. The ANNALS of the American Academy of Political and Social Science. (2018); 678(1): 40-50. doi: 10.1177/0002716218763128.
[2] Cairney P. The politics of evidence-based policy making. Basingstoke: Palgrave MacMillan, (2016).
[3] Head BW. Toward More “Evidence-Informed” Policy Making? Public Administration Review. (2015); 76(3): 472-84. doi: 10.1111/puar.12475.
[4] Greenhalgh T, Russell J. Evidence-based policymaking: a critique. Perspectives in Biology and Medicine. (2009); 52(2): 304-318. doi: 10.1353/pbm.0.0085.
[5] OECD. Building Capacity for Evidence-Informed Policy-Making: Lessons from Country Experiences. Paris: OECD Publishing, (2020).
[6] Parkhurst J. The politics of evidence: from evidence-based policy to the good governance of evidence. Taylor & Francis, (2017).
[7] Solesbury W. Evidence based policy: Whence it came and where it’s going. London: ESRC UK Centre for Evidence Based Policy and Practice, (2001).
[8] United Nations Department of Economic and Social Affairs. Areas of Work: Evidence-based Policy
[9] Umbach G. Of Numbers, Narratives and Challenges: Data as Evidence in 21st Century Policy-Making, Special Feature on Governing by the Numbers – Statistical Governance. Statistical Journal of the IAOS. (2020); 36(4): 1043-1055. doi: 10.3233/SJI-200735.
[10] Schüller K. Future Skills: a Framework for Data Literacy: Competence Framework and Research Report, Hochschulforum Digitalisierung. Working Paper 53, (2020).
[11] Carmi E, Simeon JY, Lockley E, Pawluczuk A. Data Citizenship: Rethinking Data Literacy in the Age of Disinformation, Misinformation, and Malinformation. Internet Policy Review. (2020); 9(2): 1-22. doi: 10.14763/2020.2.1481.
[12] Gould R. Data Literacy is Statistical Literacy. Statistics Education Research Journal. (2022); 16(1): 22-25. doi: 10.52041/serj.v16i1.209.
[13] Engel J. Statistical Literacy for Active Citizenship: A Call for Data Science Education. Statistics Education Research Journal. (2022); 16(1): 44-49. doi: 10.52041/serj.v16i1.213.
[14] Sharma S. Definitions and models of statistical literacy: a literature review. Open Review of Educational Research. (2017); 4(1): 118-133. doi: 10.1080/23265507.2017.1354313.
[15] Koga S. Characteristics of statistical literacy skills from the perspective of critical thinking. Teaching Statistics. (2022) (early view); 1-9. doi: 10.1111/test.12302.
[16] Frank M, Walker J. Some key challenges for data literacy. The Journal of Community Informatics. (2020); 12(3): 232-235. doi: 10.15353/joci.v12i3.3288.
[17] Schield M. Statistical literacy: a new mission for data producers. Statistical Journal of the IAOS. (2011); 27(3,4): 173-83. doi: 10.3233/SJI-2011-0732.
[18] Gal I. Adults’ statistical literacy: Meanings, components, responsibilities. International Statistical Review. (2002); 70(1): 1-25. doi: 10.1111/j.1751-5823.2002.tb00336.x.
[19] Giovannini E. Statistics and politics in a ‘knowledge society’. Social Indicators Research. (2008); 86(2): 177-200. doi: 10.1007/s11205-007-9137-z.
[20] Malito DV, Umbach G, Bhuta N. The Palgrave Handbook of Indicators in Global Governance. Palgrave Macmillan, (2018).
[21] Joint Research Centre. Skills for Evidence-Informed Policy Making: Continuous Professional Development Framework. Brussels: European Commission, Joint Research Centre; (2017); https://ec.europa.eu/jrc/communities/sites/default/files/10_2017_ec_jrc_skills_map_evidence-informed_policymaking_final.pdf.
[22] Radermacher W. Statistical Awareness Promoting a Data Culture, Special Stream on “Statistical and Data Literacy in Policy-Making”. Statistical Journal of the IAOS. (2022); 38(2).
[23] Sabbati G. Statistical and Data Literacy: A Practitioner’s View on Policy-making: How to provide independent, objective and authoritative data and information for policy-making, Special Stream on “Statistical and Data Literacy in Policy-Making”. Statistical Journal of the IAOS. (2022); 38(2).
[24] Schield M. Statistical Literacy: Seven Simple Questions for Policymakers, Special Stream on “Statistical and Data Literacy in Policy-Making”. Statistical Journal of the IAOS. (2022); 38(2).
[25] Schüller K. Data and AI Literacy for Everyone, Special Stream on “Statistical and Data Literacy in Policy-Making”. Statistical Journal of the IAOS. (2022); 38(2).