New developments in central bank statistics around the world
Abstract
A key lesson from central banks’ experience during the COVID-19 pandemic, as both users and producers of economic and financial data, is the need to broaden their ability to face future shocks that can test the resilience of today’s economies in unexpected ways. This could be achieved by developing higher-frequency, more granular and timelier indicators, leveraging the growing availability of alternative data sources. In particular, increased digitalisation is bringing new types of information that can complement and expand traditional analysis and statistical measurements. Yet a key issue is that reaping the full benefits of such new and alternative data sources can face several important challenges.
1. Introduction
Central banks have an almost unique perspective on official statistics, being at the forefront of both the production and the use of economic and financial data. On the one hand, in a large number of countries they are formally tasked with producing statistics in various domains, especially on the financial system, that are of key relevance for a wide range of economic policymakers. On the other hand, central banks are also in charge of conducting specific policies and, in particular, have to make extensive use of diverse data sources in pursuing their monetary policy and financial stability goals. Both roles demand constant attention to the evolution of the economic and financial environment and to the suitability of existing statistics and analytical tools and products to describe it faithfully.
One important challenge is that this environment is constantly evolving, requiring a continuous adaptation of the official statistical framework. Moreover, while available statistical products and methods are designed to describe what is known to be relevant to decision makers, this knowledge is not fixed in time, since new policy issues constantly emerge. These discontinuities in both the supply of and the demand for statistics can be quite substantial, especially when large and unusual shocks occur that tend to reveal the presence of “dark corners” in economic and financial information.
The vulnerabilities triggering the Great Financial Crisis (GFC) of 2007–09, for example, initially went almost unnoticed by policy makers because of the lack of suitable statistics. However, through swift and globally coordinated action, the most relevant data gaps were singled out and adequate action plans were rapidly designed to address them, especially in the context of the Data Gaps Initiative (DGI) endorsed by the G20 [9]. More than a decade has passed since the GFC, and extensive work has indeed been achieved to close the most pressing data gaps in key areas and strengthen the ability to monitor global economic and financial developments. These improvements were clearly evident when the COVID-19 pandemic struck: policy makers had at their disposal a wealth of better-quality, more comprehensive, flexible and integrated statistics that would have been barely available a few years earlier [14]. The potential of this new information to monitor risks in the financial and non-financial sectors as well as to analyse interconnectedness and cross-border spillovers was underlined in particular during the turmoil observed in financial markets in March 2020, when the COVID-19 crisis escalated [8].
Yet the pandemic has also taught new lessons. One is the sheer speed of developments on the ground in crisis times: it underlines the importance of having high-frequency, well-documented and more timely indicators to support evidence-based policy. This calls for statistical frameworks to become more flexible and granular so as to address evolving users’ needs and help better monitor fragilities, especially in periods of crisis [6]. Another lesson is that the (unexpected) nature of the shock has clearly expanded the range of phenomena, and hence of statistics, that central banks must look at to pursue their tasks. Going forward, the unpredictability of the data needs that can arise when a shock hits the economy calls for being prepared with adequate instruments and arrangements that allow measuring what is relevant when it becomes relevant. A third lesson is that the disruptions caused to the traditional statistical production process, for example due to the suspension of important surveys, have highlighted the need to look at less conventional and still untapped sources of “alternative” information [3]. These sources can be essential to assess the resilience of today’s economies, for instance if they help to measure phenomena that are not well captured by “standard” statistics, and may (though not necessarily) have to be integrated into the official statistical supply.
Reflecting the importance of these issues for central banks, the Irving Fisher Committee on Central Bank Statistics (IFC), an affiliated member of the International Statistical Institute (ISI), devoted a specific session to “New developments in central bank statistics around the world” on the occasion of the 63rd ISI World Statistics Congress held in 2021 [16].
From this perspective, a key message from central banks’ experience is the need to broaden their ability to face future shocks that, like COVID-19, can test the resilience of our economies in unexpected ways. This could be achieved by developing higher-frequency, more granular and timelier indicators, leveraging the growing availability of alternative data sources. In particular, the increased digitalisation of today’s societies is bringing new types of information that can complement and expand traditional analysis and statistical measurements. Yet a key issue is that reaping the full benefits of such new and alternative data sources can face several important challenges.
2. Lessons to be learned from COVID-19
A first important lesson for producers of official statistics is the need for more timely information. The official measurement of real economic activity offered by usual GDP statistics comes only at quarterly frequency (at best, in advanced economies, while a large number of developing countries still rely on annual figures) and often with substantial delay with respect to the period of interest. Policy makers thus have to rely on tracking other types of qualitative and quantitative indicators to gather real-time signals. Fortunately, a number of statistical techniques have been developed over time to extract timely and reliable signals about economic activity in advance of GDP releases. Such “nowcasting methods” became all the more relevant during the pandemic, as the economic situation evolved at unprecedented speed for quite some time and along dimensions unseen before. For instance, Ginker and Suhoy [11] apply this kind of method to the Israeli economy to develop a monthly index of economic activity. They use a so-called “collapsed dynamic factor model” that first synthesises the main signals embedded in the available monthly series, extracts a limited number of summary factors and then jointly models these factors to obtain a nowcast estimate of quarterly GDP growth.
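To illustrate the general idea, the sketch below condenses a monthly panel into a few common factors and maps them to quarterly GDP growth. It is a deliberately simplified, static variant (principal components plus a regression bridge) rather than the collapsed dynamic factor model of Ginker and Suhoy [11], which also models the factors’ dynamics; all data inputs are hypothetical.

```python
# Simplified factor-based GDP nowcast: extract common factors from monthly
# indicators, collapse them to quarterly frequency and bridge to GDP growth.
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

def nowcast_gdp(monthly_indicators: pd.DataFrame,
                quarterly_gdp_growth: pd.Series,
                n_factors: int = 2) -> float:
    """Both inputs are assumed to carry a DatetimeIndex."""
    # Standardise the monthly panel so no single series dominates the factors.
    X = (monthly_indicators - monthly_indicators.mean()) / monthly_indicators.std()
    factors = pd.DataFrame(
        PCA(n_components=n_factors).fit_transform(X.fillna(0.0)), index=X.index)
    # "Collapse" the monthly factors to quarterly frequency by averaging.
    factors_q = factors.resample("QE").mean()  # quarter-end alias ("Q" in pandas < 2.2)
    # Bridge equation: regress published GDP growth on the quarterly factors,
    # then apply the fit to the latest, not yet published, quarter.
    known = quarterly_gdp_growth.index.intersection(factors_q.index)
    bridge = LinearRegression().fit(factors_q.loc[known],
                                    quarterly_gdp_growth.loc[known])
    return float(bridge.predict(factors_q.iloc[[-1]])[0])
```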
To be reliable, the implementation of these techniques requires having at one’s disposal a relatively large number of high-frequency series. This can be a problem for small economies, where the supply of such indicators is often hampered by the resources available. Yet even large economies may not have a sufficient number of suitable high-frequency indicators, for instance because of their limited time span. Moreover, these indicators can also be subject to large disruptions, as was seen in the early stages of the pandemic: there was a pressing need to monitor economic developments on a real-time basis, but the collection of “traditional” high-frequency indicators such as economic surveys was hindered by lockdown restrictions [4].
A telling example of these difficulties relates to the measurement of inflation, one of the most relevant gauges for conducting monetary policy. Indeed, the compilation of consumer price indices (CPIs) during the pandemic suffered from the disruption of the data collection process (reflecting eg the closure of bricks-and-mortar stores) as well as from large swings in consumer behaviour, caused both by the shutdown of entire sectors of activity (eg restaurants) and by dramatic changes in spending preferences (eg less appetite for travelling). These factors posed important challenges to the measurement of inflation dynamics, and hence to the design of adequate policy measures to weather the shock. A key reason is that traditional inflation measures rely on an annual updating of the weights of the consumer basket, which arguably became quickly outdated as the pandemic struck [5, 23].
To address these challenges, Kouvavas et al. [18] have developed an experimental index to measure inflation during the 2020 pandemic. Using retail and services turnover data, they were able to calculate CPIs based on monthly-updated weights so as to take into account high-frequency pandemic-related shifts in consumption patterns. Their estimates show that measured inflation for the euro area would have been slightly higher in 2020 than the headline indicator (by around 0.2 percentage points), and lower in 2021 (reflecting the reversal of pandemic-related spending disruptions following the gradual normalisation of the situation and the lifting of lockdowns). However, and similarly to the challenges referred to above as regards GDP nowcasting exercises, one difficulty was to have adequate and timely data sources to track rapid changes in consumption weights; this difficulty was reinforced by the disturbances in several data collection processes at the height of the pandemic.
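The mechanics of such an index can be sketched as a chained aggregation in which the expenditure shares feeding the price aggregation are refreshed every month rather than once a year. The snippet below is only a stylised illustration of that principle, not the actual method of Kouvavas et al. [18], whose weights are derived from detailed turnover statistics; all inputs are hypothetical.

```python
# Stylised chained price index with monthly-updated expenditure weights.
import pandas as pd

def chained_cpi(price_relatives: pd.DataFrame, weights: pd.DataFrame) -> pd.Series:
    """price_relatives: month-on-month price ratios p_t / p_{t-1}, one column
    per item; weights: monthly expenditure shares per item (rows sum to 1)."""
    # Each monthly link is the weighted mean of item price relatives, using
    # the previous month's updated shares (a chained Laspeyres-type formula).
    links = (price_relatives * weights.shift(1)).sum(axis=1)
    links.iloc[0] = 1.0            # no link in the base period
    return 100 * links.cumprod()   # chain the monthly links, base = 100
```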
The examples above underline the sheer vulnerability of existing statistical production and policy design processes to potential shocks like COVID-19, precisely at times when the smooth functioning of these processes is most needed. One silver lining, however, is that the recent crisis showed that there is ample room to mitigate these difficulties by tapping into alternative data sources, especially those that were less subject to the disruptions associated with the pandemic. Indeed, the data generated by digital activities were not disrupted by stay-at-home orders and were still able to provide a realistic and timely picture of what was going on, especially in comparison with official statistics that were available on a less timely and frequent basis and/or that were affected by production troubles. For example, data from online retail trade platforms as well as from providers of payment services continued to be available, allowing for real-time monitoring of spending patterns and prices. Similarly, a number of alternative sources such as smart meters (ie electronic devices recording electricity consumption), mobility trends derived from smartphone location data, or even air pollution data were used during the pandemic as complementary sources to support high-frequency measurements of real economic activity [7, 21], as sketched below.
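A simple way to combine such heterogeneous sources into a single high-frequency signal, in the spirit of the weekly activity indices cited above [7, 21], is to standardise each series and average them. The sketch below is a bare-bones composite of this kind; the input columns (electricity use, mobility, pollution) are hypothetical placeholders, and real indices use more elaborate weighting and filtering.

```python
# Bare-bones weekly composite activity indicator from alternative data.
import pandas as pd

def weekly_activity_index(weekly: pd.DataFrame) -> pd.Series:
    """weekly: DataFrame of weekly series, eg electricity consumption,
    mobility counts and air pollution readings (hypothetical columns)."""
    # Standardise each series so differences in units (kWh, trips, ug/m3)
    # do not dominate, then average into one composite signal per week.
    z = (weekly - weekly.mean()) / weekly.std()
    return z.mean(axis=1)
```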
Moreover, in addition to providing “hard” alternative data on relevant economic phenomena, digitalisation has also expanded the possibility of considering other types of indicators. This can be the case for “soft” factors, like indicators of confidence among economic agents, which can play an important role in shaping and predicting economic dynamics, even though they may not be on the “traditional” radar screen of statisticians [1]. For instance, Armas and Tuazon [2] have adopted this kind of approach, using freely available data on internet searches to assess investors’ sentiment amid the pandemic and study the response of financial market prices to changes in risk attitudes. They find that this “soft” information provides statistically significant signals that can support the monitoring of daily stock market developments in Asian markets and potentially enhance the design of policy making processes. More generally, social media have become important sources of information providing real-time insights into the behaviour and sentiment of the general public. This information can increasingly be extracted with the development of powerful “big data analytics” (eg text-based analysis, machine learning (ML) tools) that make it possible to decipher the signals collected and turn them into statistical inputs complementing more traditional indicators to support policy.
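As a concrete illustration of such text-based analytics, the sketch below scores short messages with a standard lexicon-based sentiment model (NLTK’s VADER) and averages the scores into a daily indicator. It is a generic example rather than the approach of Armas and Tuazon [2], and the input data frame is hypothetical.

```python
# Turning a stream of short texts into a daily sentiment indicator.
import pandas as pd
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-off lexicon download

def daily_sentiment(posts: pd.DataFrame) -> pd.Series:
    """posts: DataFrame with a DatetimeIndex and a 'text' column."""
    sia = SentimentIntensityAnalyzer()
    # Score each message in [-1, 1] (most negative to most positive) ...
    scores = posts["text"].map(lambda t: sia.polarity_scores(t)["compound"])
    # ... then average within each day to obtain a "soft" indicator that
    # can complement traditional confidence surveys.
    return scores.resample("D").mean()
```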
3. Challenges looking forward
Crises are often an occasion to learn. The COVID-19 pandemic, just like the GFC a decade earlier, unveiled lingering deficiencies as well as newly identified weaknesses in the traditional statistical apparatus. It thus represents an occasion to reflect on how to address such shortcomings, in particular by identifying and filling relevant data gaps and reorganising processes so that the infrastructure underpinning official statistics can be enhanced and better prepared for potential new crises.
As analysed above, one major lesson of the pandemic is that the disruptions caused to data collection exercises, together with the need to better monitor the swift changes that occurred in agents’ preferences and behaviours, required developing innovative strategies to make up for the unavailability (or limited informativeness) of a number of traditional statistical sources. These strategies often involved relying on new and/or unconventional data sources, reflecting several developments: the large data sets produced as a by-product of private actors’ operations, the wealth of administrative registers long maintained by public agencies without being used for statistical purposes, the footprints of increased digitalisation that have emerged in many parts of modern life, and the greater ability to process unstructured data sets such as text with the new techniques available. These varied “alternative” sources proved particularly helpful to statisticians and policy makers during the pandemic, by complementing (or substituting for) conventional data sources in the face of compilation disruptions, providing more timely and/or frequent signals when needed, and offering new insights on phenomena that were not well captured by traditional indicators. For instance, the COVID-19 pandemic has emphasised the need to enhance the measurement of environmental topics (eg climate change) and socioeconomic factors (eg distributional aspects and inequalities as well as financial inclusion issues), gaps that could be addressed in the next phase of the DGI being contemplated by the G20 after 2021 [10].
Yet there were also a number of challenges involved in accessing such alternative data sources, as indeed recognised by a large majority of central banks [13].
First, although these sources are pervasive and provide an increasing amount of information on various aspects of economic and financial activities, their systematic usage for statistical and policy purposes requires an adequate degree of stability. They need to remain available and accessible over the long run in order to justify the methodological and technical investments needed to exploit them. However, the continuity and consistency over time of the output generated from new alternative sources is not always guaranteed. For instance, methods developed in stressed times may not work well in other, “more normal” periods of the economic cycle [17]. Moreover, a certain amount of experience is needed to judge the true quality of the new indicators being developed. The failure of Google Flu Trends provides a good example of these perils: it was initially intended to provide estimates of influenza activity based on Google Search queries but was discontinued in the mid-2010s [20]. Furthermore, one cannot be fully reassured about the longevity of newly developed data sets: they may not pass the test of time if economic agents change their habits and hence their digital footprints. For instance, to what extent should policy makers rely on analysing messages collected by social media, given that the public usage of these media may well change fundamentally (perhaps disappear?) in the future?
Second, the apparent comprehensiveness of new “alternative” data hides at least two drawbacks that must be properly addressed from a statistical methodological perspective before they can be safely put to use. On the one hand, digital data are often generated by “digitalised” agents and activities. This can lead to substantial composition biases, which are hard to assess and may well increase over time. For example, social media content is generated by the subpopulation that actively participates in these exchanges. Similarly, web searches are generated only by the subpopulation interested in the specific topic and able to access the internet. As such, these data will almost never be representative of the whole population of policy interest (not everybody is on Twitter). Hence they can embed significant selection bias, which must be properly understood, if not addressed, so that they can be reliably used [22]. On the other hand, the sheer novelty of new alternative data sets often means that the “meaning” of the information presented is unclear and requires additional efforts to be understood fully. For example, simply counting the number of clicks made for specific web searches does not reveal why these searches were undertaken. These difficulties are reinforced by the velocity and high frequency of alternative sources of information, with data users confronted with often unfavourable signal-to-noise ratios [19].
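One standard way to mitigate the selection bias just described is post-stratification: reweighting the biased sample so that known cells (eg age groups) match their population shares. The sketch below illustrates the idea under strong simplifying assumptions; the cell variable, the ‘sentiment’ outcome and the population shares are all hypothetical.

```python
# Post-stratification reweighting of a non-representative digital sample.
import pandas as pd

def poststratified_mean(sample: pd.DataFrame,
                        pop_shares: dict,
                        cell: str = "age_group",
                        outcome: str = "sentiment") -> float:
    """Reweight each cell so its share matches the known population share."""
    sample_shares = sample[cell].value_counts(normalize=True)
    # Units from under-represented cells get weights above one, and vice versa.
    weights = sample[cell].map(lambda c: pop_shares[c] / sample_shares[c])
    return (sample[outcome] * weights).sum() / weights.sum()
```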
Third, the information content of the data sources ultimately depends on their intended use. Alternative statistics can be used as benchmarks to forecast official statistics, for instance in the case of GDP nowcasts. Yet one may also wish to leverage observed underlying correlations to infer conclusions supporting policy recommendations. This puts a premium on ensuring transparency in the sources used for such inference purposes, given the risk of reaching false conclusions in the presence of unobserved confounding factors. Hence a key public policy issue is how alternative information sources (be they private commercial data sets or public registers that were not initially set up for a statistical purpose) and the data producers located outside of the national statistical systems feature vis-à-vis the Fundamental Principles that have been defined to support the quality of official statistics [24]. Given that many of the new data sets have a global nature, this calls for strengthening their governance at the international level, with a broad focus so as to cover the entire production and use of statistics, including alternative sources [15].
Fourth, there are important challenges posed by integrating new types of data into the infrastructure supporting the production of official statistics. Coping with an avalanche of data in various formats requires adequate IT, skills, and budget. It also calls for having adequate registers, identifiers and aggregation rules so as to transform granular data points into meaningful macroeconomic aggregates. Last but not least, clear data sharing agreements and standards (eg SDMX [12]) are necessary to be able to mobilise various data sources in a coherent way; the snippet below illustrates the kind of interoperability such standards afford.
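To make the interoperability point concrete, the sketch below pulls a series from a public SDMX 2.1 REST web service. It assumes the ECB’s publicly documented endpoint and the daily USD/EUR series key of its EXR dataflow; both reflect the API as commonly documented at the time of writing and may change, so treat the URL and key as illustrative.

```python
# Retrieving a series from an SDMX 2.1 REST web service in SDMX-CSV format.
import io
import requests
import pandas as pd

# ECB public endpoint; the resource path is <dataflow>/<series key>.
URL = "https://sdw-wsrest.ecb.europa.eu/service/data/EXR/D.USD.EUR.SP00.A"
resp = requests.get(URL, params={"format": "csvdata", "startPeriod": "2020-01"})
resp.raise_for_status()

# Because the SDMX-CSV layout is standardised, the same parsing code can
# serve any compliant provider, which is the point of a common standard.
df = pd.read_csv(io.StringIO(resp.text))
print(df[["TIME_PERIOD", "OBS_VALUE"]].head())
```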
4. Conclusion
Central banks’ experience has underlined the potential of alternative data sets to make available the high-frequency, more timely, flexible and granular statistics that are clearly in demand to follow macroeconomic developments and support policy. In particular, the new, unconventional sources of information that have emerged with the digitalisation of our societies show a lot of promise. They can cover many realms of the economic and financial sphere that are still difficult to capture through more traditional data collections. And they are potentially available in nearly real time, facilitating the conduct of economic policy especially in the face of unexpected shocks.
Yet these new data sources can come with huge volumes, multiple formats and high noise-to-signal ratios, making their systematic employment in policy making and statistical production a difficult task. Some of these challenges might be addressed with appropriate engagement rules between public agencies and private data providers; others require further improvements in our statistical and analytical methodologies.
Meeting all these challenges will make the statistical and policymaking communities more sure-footed. Importantly, one needs to realise that what may be considered an information gap at first sight does not necessarily reflect a lack of relevant data, but rather a failure to transform existing indicators into actionable knowledge. This is all the more true in today’s information society, which is well beyond its infancy: multiple and varied data are constantly generated, collected and stored by public and private actors in pursuing their idiosyncratic endeavours. It means that perceived information gaps do not necessarily require new reporting exercises, as they can arguably be filled if statisticians and policy makers have the possibility, and the power, to quickly tap into existing data that could be turned into salient information, for instance to get timelier and higher-frequency measures of common phenomena or to cover new, unexplored statistical domains.
References
[1] Aguilar P, Ghirelli C, Pacce M, Urtasun A. Can news help measure economic sentiment? An application in COVID-19 times, Bank of Spain Documentos de Trabajo, no 2027, (2020).
[2] Armas JC, Tuazon P. Revealing investors’ sentiment amid COVID-19: the Big Data evidence based on internet searches, IFC Bulletin, no 55, (2021).
[3] Biancotti C, Rosolia A, Veronese G, Kirchner R, Mouriaux F. COVID-19 and official statistics: a wakeup call? Bank of Italy Occasional Paper, no 605, April (2021).
[4] Bidarbakhtnia A. Surveys under lockdown: a pandemic lesson, United Nations Economic and Social Commission for Asia and the Pacific (ESCAP), Stats Brief, no 23, April (2020).
[5] Cavallo A. Inflation with Covid consumption baskets, NBER Working Paper, no 27352, (2020).
[6] De Beer B, Tissot B. Implications of COVID-19 for official statistics: a central banking perspective, IFC Working Papers, no 20, November (2020).
[7] Deutsche Bundesbank. A weekly activity index for the German economy, Monthly Report, May (2020), pp. 68-70.
[8] Financial Stability Board. Holistic Review of the March Market Turmoil and COVID-19 pandemic: financial stability impact and policy responses, November (2020).
[9] Financial Stability Board and International Monetary Fund. The financial crisis and information gaps, (2009).
[10] G20 Italian Presidency. Communiqué of the Second G20 Finance Ministers and Central Bank Governors meeting, 7 April (2021).
[11] Ginker T, Suhoy T. Nowcasting and monitoring Israeli real economic activity, IFC Bulletin, no 55, (2021).
[12] IFC. Central banks’ use of the SDMX standard, IFC Report, no 4, (2016).
[13] IFC. Use of big data sources and applications at central banks, IFC Report, no 13, (2021).
[14] IFC. Micro data for the macro world, IFC Bulletin, no 53, (2021).
[15] IFC. Issues in Data Governance, IFC Bulletin, no 54, (2021).
[16] IFC. New developments in central bank statistics around the world, IFC Bulletin, no 55, (2021).
[17] INSEE. “High-frequency” data are especially useful for economic forecasting in periods of devastating crisis, Point de Conjoncture, June (2020), pp. 29-34.
[18] Kouvavas O, Rollo C, Trezzi R. An experimental index to measure inflation in the pandemic, IFC Bulletin, no 55, (2021).
[19] Lane P. Data analysis and monetary policy during the pandemic, presentation at the Central Bank of Ireland webinar, 7 October (2021).
[20] Lazer D, Kennedy R, King G, Vespignani A. The parable of Google Flu: traps in big data analysis, Science, (2014); 343(6176): 1203-1205.
[21] Lewis D, Mertens K, Stock J. Monitoring real activity in real time: the Weekly Economic Index, Federal Reserve Bank of New York Liberty Street Economics, 30 March (2020).
[22] Mehrhoff J. Demystifying big data in official statistics – it’s not rocket science!, IFC Bulletin, no 49, (2020).
[23] Surico P, Hacioglu H, Känzig D. Consumption in the time of COVID-19: evidence from UK transaction data, CEPR Discussion Papers, no 14733, (2020).
[24] United Nations (UN). Fundamental Principles of Official Statistics, resolution adopted by the Economic and Social Council, E/RES/2013/21, 28 October (2013).