The problem of balancing economic accounts has been recognized for a long time. In 1942, Richard Stone et al. proposed a weighted least squares approach (hereafter SCM approach) to balance small economic accounts. This approach has been extended to accommodate reconciliation of large-scale national accounts (NA) systems. The main challenge turned out to be the estimates of the uncertainties of initial NA aggregates. In this study, we try the SCM approach for automatically balancing a large-scale supply-use framework in the Swedish NA. Efforts are made to estimate the uncertainties not only from sampling errors but also from non-sampling errors. The error estimates are used as weights in the balancing procedure. The approach is evaluated through a test run in parallel with a real compilation of the Swedish annual NA. Our study shows that the automatic balancing procedure is feasible to implement in the production environment of Statistics Sweden. Compared with the current mainly manual balancing process, the automatic procedure is faster, cheaper and requires less time from the NA experts. Above all, the method is transparent and new information can easily be accounted for.
The uncertainty associated with national accounts (NA) aggregates, such as the gross domestic product (GDP), is of great interest to decision-makers, researchers, and the public. This information is nevertheless often absent from statistical releases, partly because the compilation of NA is complex and draws on a large number of data sources, which makes it difficult to estimate all uncertainties of the initial estimates. The data sources and possible error sources of NA compilation are well documented in the report [1], which also discusses the difficulties of identifying and quantifying the errors, in particular the non-sampling errors. Paragraph 52 of the report states, “given the current state of the art, it is not possible to calculate objective error margins for national accounts aggregates.”
Furthermore, NA must comply with the restrictions of accounting systems. There are three approaches for calculating GDP in the NA: the expenditure, the production, and the income approach. The estimates from these approaches usually differ, so a post-adjustment (“balancing”) of the estimates is necessary. The balancing is usually done within the supply-use framework, i.e., the supply and use of different products classified by CPA (Classification of Products by Activity). The balancing process is important for the compilation of NA. However, it is not only highly demanding of expertise and time, but also to a large extent manual, which further complicates any attempt to investigate the uncertainties of the balanced NA aggregates.
In 1942, Richard Stone and others ([2], hereafter SCM) proposed a weighted least squares (WLS) approach to balance economic accounts on a limited scale. Other authors, in particular [3, 4], formalized the approach and associated it with a Lagrange multiplier approach with a quadratic loss function. A few applications [5, 6, 7, 8, 9, 10] have been reported. Among others, [7, 8] used the SCM approach to reconcile the US Industry Accounts and distribute the aggregate statistical discrepancy to industries at the Bureau of Economic Analysis (BEA). These articles show that the method is feasible and empirically efficient, although it is still difficult to obtain objective estimates of the uncertainties.
In our work, an SCM balancing approach is investigated within a supply-use (SU) framework in the Swedish NA; the income approach has not been fully implemented in Sweden. The aim is to construct a framework that is workable in the real production environment of NA compilation and balancing. Efforts have been made to estimate the uncertainties arising not only from the sampling errors but also from the non-sampling errors. Those estimates are then used as weights in the balancing procedure. In Section 2, the framework of generalized least squares and a flexible equivalent optimization setting are described. The estimation of the uncertainties of the initial NA estimates is presented in Section 3. The Swedish application of this approach, along with the results, is given in Section 4. A discussion of the results and of other related and future work is given in Section 5.
Following the description in [3, 8], write the estimates in NA as a vector $x$. For the sake of simplicity of the theoretical framework, write the accounting restrictions as $Ax = 0$, where the matrix $A$ consists of known constants. Inequality and other types of constraints can be imposed (see e.g. the discussion of the SU system below and the reference [11]). Assume that $x^0$ is an initial, unbiased estimate of $x$ that does not satisfy the accounting restrictions, with variance-covariance matrix $V$. For a balanced estimate $\hat{x}$, we use the quadratic loss function $(\hat{x} - x^0)' V^{-1} (\hat{x} - x^0)$, and the solution by the WLS approach is given by
\[ \hat{x} = x^0 - V A' (A V A')^{-1} A x^0. \tag{1} \]
Furthermore, under the condition that $V$ is the correct covariance of $x^0$ and that $A$ has full row rank, the variance-covariance of $\hat{x}$ is given by
\[ \hat{V} = V - V A' (A V A')^{-1} A V. \tag{2} \]
Note that the vector $x$ can be large, possibly consisting of thousands of elements. The computing capacity required to handle matrices of this size might be one reason for the limited use of the SCM approach in the past.
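The closed-form WLS solution and its covariance can be illustrated with a minimal NumPy sketch. It assumes that the restrictions are written as A x = 0, that V is a positive definite covariance matrix, and that A has full row rank; the function name `balance_wls` and the three-variable toy account are hypothetical and serve only as an illustration.

```python
import numpy as np


def balance_wls(x0, V, A):
    """Balance x0 under restrictions A @ x = 0 by weighted least squares.

    Returns the balanced vector and its covariance matrix, assuming that V
    is the covariance of x0 and that A has full row rank.
    """
    x0 = np.asarray(x0, dtype=float)
    V = np.asarray(V, dtype=float)
    A = np.asarray(A, dtype=float)
    AV = A @ V                       # shape (restrictions, variables)
    S = AV @ A.T                     # A V A'
    K = np.linalg.solve(S, AV).T     # V A' (A V A')^{-1}, using symmetry of V and S
    x_hat = x0 - K @ (A @ x0)        # distribute the discrepancy according to V
    V_hat = V - K @ AV               # covariance of the balanced estimate
    return x_hat, V_hat


# Hypothetical toy account: supply = use_1 + use_2, i.e. x[0] - x[1] - x[2] = 0.
A = np.array([[1.0, -1.0, -1.0]])
x0 = np.array([100.0, 60.0, 45.0])   # initial discrepancy: 100 - 105 = -5
V = np.diag([2.0, 1.0, 2.0]) ** 2    # standard deviations 2, 1 and 2

x_hat, V_hat = balance_wls(x0, V, A)
print(x_hat)        # balanced values
print(A @ x_hat)    # numerically zero
```

In this toy case the variable with the smallest variance receives the smallest adjustment, and no diagonal element of the balanced covariance exceeds the corresponding initial variance.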
In practice, following [3, 7], a more workable formulation under the SU framework is used in our study. Consider the following notation. For $i = 1, \dots, I$ and $j = 1, \dots, J$, with $I$ the (known) number of industries and $J$ the (known) number of product groups, denote
$y_{ij}$: the output (gross production) of product $j$ from industry $i$.
$u_{ij}$: the intermediate consumption of product $j$ by industry $i$.
$y_{i\cdot}$: the total output (over products) from industry $i$.
$u_{i\cdot}$: the total intermediate consumption by industry $i$.
$f_{kj}$: the $k$-th type of domestic final use of product $j$ (the domestic final uses include Household final consumption expenditure, Government spending, Gross fixed capital formation, and Changes in inventories).
$e_j$: the exports of product $j$.
$m_j$: the imports of product $j$.
$f_{k\cdot}$, $e_{\cdot}$, $m_{\cdot}$: the totals (over products) of the domestic final uses, exports, and imports, respectively.
The corresponding initial estimates are indicated with a superscript 0, and $\sigma_z$ denotes the corresponding uncertainty (which will be defined later) of the variable $z$ (e.g. $y_{ij}$ or $u_{ij}$). The problem is therefore equivalent to minimizing
\[ \sum_{z} \frac{(z - z^0)^2}{\sigma_z^2} \tag{3} \]
over $y_{ij}$, $u_{ij}$, $y_{i\cdot}$, $u_{i\cdot}$, $f_{kj}$, $e_j$, $m_j$, and the totals $f_{k\cdot}$, $e_{\cdot}$, $m_{\cdot}$, under some constraints, where the sum runs over all of these variables. In Eq. (3), $\sigma_z$ is assumed to be positive for those variables that are to be adjusted. For variables with $\sigma_z = 0$, the initial estimates are not changed, by convention. Note also that both the individual variables by product (e.g. $y_{ij}$) and the totals over products (e.g. $y_{i\cdot}$) are included as arguments of the objective Eq. (3). They are adjusted separately since very often they are estimated using different data sources.
The constraints of Eq. (3) include that for each industry $i$, the total output from the industry shall be equal to the sum of the outputs of all products from this industry,
\[ y_{i\cdot} = \sum_{j=1}^{J} y_{ij}. \tag{4} \]
Similar restrictions apply to the intermediate consumption, the domestic final uses, exports and imports, respectively:
\[ u_{i\cdot} = \sum_{j=1}^{J} u_{ij}, \tag{5} \]
\[ f_{k\cdot} = \sum_{j=1}^{J} f_{kj}, \qquad e_{\cdot} = \sum_{j=1}^{J} e_j, \qquad m_{\cdot} = \sum_{j=1}^{J} m_j. \tag{6} \]
The NA accounting requires further that the total supply is equal to the total use for each product $j$ (where $t_j$ denotes the tax rate and $\mathrm{tm}_j$ the trade margin),
\[ (1 + t_j) \sum_{i=1}^{I} y_{ij} + m_j + \mathrm{tm}_j = \sum_{i=1}^{I} u_{ij} + \sum_{k} f_{kj} + e_j. \tag{7} \]
Note that, compared with the theoretical framework of Eqs (1) and (2), it is implicitly assumed in the system of Eqs (3)–(7) that there is no covariance among the initial estimates. This assumption is of course arguable, but might be plausible; the same assumption applies in the application of [7, 8]. Whilst it is difficult, if not impossible, to satisfy all the conditions required for the framework of Eqs (1) and (2), the system of Eqs (3)–(7) is very flexible. The initial estimates can be expressed in both current and constant prices. Besides the restrictions Eqs (4)–(7), it is possible to impose more and other types of restrictions. It is also easy to keep some variables unchanged. For instance, the coefficient $t_j$ in Eq. (7) is included to account for taxes (such as customs duties and value-added taxes, minus subsidies), which are assumed to be proportional to the total output of each product. Owing to its complex definition and compilation, the trade margin $\mathrm{tm}_j$ (including third-party trading) is included in Eq. (7) for the sake of the accounting, but is not changed.
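The structure of such a system can be sketched in Python on a deliberately small example. The sketch below assumes diagonal weights, restrictions written as A z = b, and a reduced set of variables (outputs, intermediate consumption, their industry totals, and a single domestic final use per product; exports, imports and final-use totals are left out for brevity). All names, tax rates, trade margins and numbers are invented for illustration and are not taken from the Swedish application.

```python
import numpy as np


def balance_diag(z0, sigma, A, b):
    """Minimise sum(((z - z0) / sigma)**2) subject to A @ z = b.

    Variables with sigma == 0 are held at their initial values.
    """
    z0 = np.asarray(z0, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    free = sigma > 0
    A_free = A[:, free]
    var = sigma[free] ** 2
    resid = A @ z0 - b                       # violation of each restriction at z0
    S = (A_free * var) @ A_free.T            # A V A' over the adjustable variables
    lam = np.linalg.solve(S, resid)          # Lagrange multipliers
    z = z0.copy()
    z[free] -= var * (A_free.T @ lam)
    return z


I, J = 2, 2                                   # toy sizes, not the 66 x 65 system
tax = np.array([0.10, 0.05])                  # hypothetical tax rates per product
margin = np.array([3.0, -3.0])                # hypothetical trade margins (kept fixed)

# (initial estimate, standard deviation); all values invented for illustration.
init = {
    "y11": (100.0, 2.0), "y12": (0.0, 0.0), "y21": (20.0, 3.0), "y22": (80.0, 4.0),
    "Y1": (102.0, 0.5), "Y2": (99.0, 0.5),
    "u11": (30.0, 6.0), "u12": (25.0, 5.0), "u21": (15.0, 3.0), "u22": (35.0, 7.0),
    "U1": (56.0, 1.0), "U2": (49.0, 1.0),
    "f1": (88.0, 5.0), "f2": (20.0, 2.0),
}
names = list(init)
pos = {name: k for k, name in enumerate(names)}
z0 = np.array([init[n][0] for n in names])
sigma = np.array([init[n][1] for n in names])

rows, rhs = [], []
for i in range(1, I + 1):                     # industry totals of output and of IC
    for total, cell in (("Y", "y"), ("U", "u")):
        row = np.zeros(len(names))
        row[pos[f"{total}{i}"]] = 1.0
        for j in range(1, J + 1):
            row[pos[f"{cell}{i}{j}"]] = -1.0
        rows.append(row)
        rhs.append(0.0)
for j in range(1, J + 1):                     # supply = use for each product
    row = np.zeros(len(names))
    for i in range(1, I + 1):
        row[pos[f"y{i}{j}"]] = 1.0 + tax[j - 1]
        row[pos[f"u{i}{j}"]] = -1.0
    row[pos[f"f{j}"]] = -1.0
    rows.append(row)
    rhs.append(-margin[j - 1])                # fixed trade margin moved to the right-hand side
A, b = np.array(rows), np.array(rhs)

z = balance_diag(z0, sigma, A, b)
for name, before, after in zip(names, z0, z):
    print(f"{name}: {before:8.2f} -> {after:8.2f}")
```

Holding a variable fixed only requires setting its standard deviation to zero, as is done here for `y12`, an output that the first toy industry is assumed not to produce.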
Previous studies on the balancing of NA, in particular of SU or Input-Output tables, such as [12], have not made use of the uncertainties of the initial estimates. In the BEA application in the US [7, 8], the SCM approach is carried out for the reconciliation and redistribution of the statistical discrepancy after a general revision of the NA estimates. In that application, the expenditure-based estimates are considered final and are not changed. Our setting, by contrast, aims to create a framework that is workable in the real production environment of NA compilation and balancing.
Efforts are also made in our study to estimate the uncertainties of the initial estimates with respect to both the sampling errors and the non-sampling errors, because nothing guarantees that sampling errors make up the main part of the total errors (see ). Recall that a solution still exists for the system Eqs (3)–(7) even when one uses trivial weights in Eq. (3). Although such weights can by no means satisfy the conditions needed for Eqs (1) and (2), they are of interest as control groups in the application described in Section 4. In our study, the following four alternatives have been tested: constant weights (CW), neutral weights (NW), weights based on the sampling errors only (SE), and weights based on the total uncertainties (TU).
See Section 3 for a description of the last two alternatives.
3. The estimation of the uncertainties
3.1 The sampling errors
As previously mentioned, it is a huge challenge to estimate the uncertainties of the initial estimates. In our work, the first effort made is to collect the sampling errors of the basis of the NA figures. This is possible because most of the economic surveys are carried out in-house at Statistics Sweden. For example, the output of a product from an industry is obtained from the Structure Business Survey (SBS), together with its sampling error in terms of its standard deviation. After possible adjustments during the NA compilation, the NA estimate of this output may differ from the SBS figure. When the sampling errors alone are used as weights in Eq. (3), the weight of Eq. (8) is obtained from this standard deviation and the NA estimate.
Recall that in our application, with 66 industries and 65 product groups, there will be thousands of NA estimates from tens of different surveys whose sampling errors are needed. This work alone is a big challenge.
3.2 The estimation of the total uncertainties
It is seldom in the NA compilation that the basis of the NA figures is taken directly as the NA estimate. Adjustments are usually necessary to correct mistakes discovered during validation and reconciliation of source data, to account for conceptual differences between the basis and the NA estimates, or to ensure exhaustiveness of the estimates (see ). In recent applications such as [7, 8], attempts are made to classify inputs into predetermined categories based on data sources or expert judgment, with these categories and their associated values used to estimate uncertainties. In our study, direct expert judgment is applied to the non-sampling errors by a panel consisting of subject-matter experts, NA experts, and methodologists. Furthermore, the non-sampling errors are divided into different error sources, including specification, frame, non-response, measurement, data processing, and model errors, following [13]. Although the classification of the non-sampling errors can be debated, dividing the non-sampling error into different error sources should support a more objective judgment of the total non-sampling error. Those non-sampling errors are expressed, as for the sampling errors, as relative standard deviations. The weights for the total uncertainties are obtained by adding the sampling and non-sampling errors (see [8, 14]), as given in Eq. (9).
In Eq. (9) the summation is over all the non-sampling error sources.
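How relative standard deviations from several error sources might be combined into one weight can be sketched as follows. The sketch assumes independent error sources, so that variances rather than standard deviations are added (a root-sum-of-squares rule); this combination rule, the function name `total_uncertainty` and all numbers are assumptions made for illustration and may differ from the exact form of Eq. (9).

```python
import math


def total_uncertainty(estimate, rel_sampling, rel_nonsampling):
    """Combine relative standard deviations into an absolute uncertainty.

    Assumes independent error sources, so that the relative variances add.
    """
    rel_total_sq = rel_sampling ** 2 + sum(r ** 2 for r in rel_nonsampling.values())
    return abs(estimate) * math.sqrt(rel_total_sq)


# Hypothetical expert judgments, as relative standard deviations per error source.
nonsampling = {
    "specification": 0.00, "frame": 0.01, "non-response": 0.02,
    "measurement": 0.02, "data processing": 0.00, "model": 0.00,
}
sigma = total_uncertainty(estimate=1000.0, rel_sampling=0.03, rel_nonsampling=nonsampling)
print(round(sigma, 2))
```

With an empty dictionary of non-sampling sources, the function falls back to the sampling error alone, which corresponds to the sampling-error-only weights.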
3.3 The estimates of the uncertainties: An example
Table 1 (Row SE) below shows our estimates of the uncertainties from the sampling errors in the output of all CPA product groups from the NACE industry G46 (Industry for wholesale trade except for motor vehicles and motorcycles). The estimates are mainly from the SBS.
It can be seen that the sampling uncertainties differ considerably between CPA product groups. They range from 1.0 percent for CPA G45T47 (Wholesale and retail trade and repair services) to 29.1 percent for CPA N80T82 (Security and investigation services; services to buildings and landscape; office administrative, office support and other business support services), which is very difficult to cover by survey sampling. It is this information that we try to take advantage of in the SCM balancing approach. Some product groups have zero uncertainty estimates, which means that the NACE industry cannot produce those products. Such items are omitted from the objective Eq. (3) and kept unaltered. Recall that the total output from NACE G46 is included in Eq. (3) separately; the estimate of its sampling uncertainty is 0.2 percent.
Analogously, the total uncertainties are reported in Table 1 (Row TU). It can be seen that the relative magnitude of the uncertainties among different CPA groups has changed, which affects the results of the balancing. The total uncertainty in the total output of NACE G46 becomes 0.7 percent.
All the necessary estimates of the uncertainties in the initial estimates have thus been obtained. They are not only used as input to the balancing approach, but are also very useful on their own. They can be utilized to judge the data quality during the manual balancing and to prioritize among the data sources whose quality has to be improved.
4. A Swedish application
4.1 The initial estimates
The SCM balancing approach was tested under an SU framework in parallel with the real compilation of the Swedish Annual NA 2014 during April–June 2016 at Statistics Sweden. The existing balancing process is carried out for 400 product groups. The first stage consists of manual balancing and lasts for about two months. The second stage consists of a final, mechanical balancing using the RAS method [15], mainly applied to intermediate consumption. In our application, there are 66 industries and 65 products, which is the same level of detail as that released by Statistics Sweden according to the Eurostat requirement. The initial estimates are obtained from several time points in the real compilation process. The discrepancies to be balanced (the total supply minus the total use) are shown in Table 2, in millions of Swedish kronor (MSEK). By Test Round 1, all major basic data for the SU table had been collected into the system, while by Test Round 2 a lot of analyses (of e.g. movements from previous years, the implied productivities, and comparisons of the intermediate consumption and final outputs) and manual balancing had been carried out. Between Test Rounds 2 and 3, further analysis and balancing had been carried out and the discrepancies for the detailed product groups had been reduced. The actual compilation was almost finished at the end of June 2016; however, minor adjustments continued to take place until September 2016, when it was released.
Table 2. Discrepancy to be balanced (total supply minus total use), by test round
|Test round|Date|Discrepancy (MSEK)|
|1|9 May 2016|59,974|
|2|2 June 2016|1,731|
|3|20 June 2016|4,548|
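For comparison with the SCM approach, the RAS method used in the second stage of the existing process can be sketched as an iterative biproportional scaling. The sketch below is a generic textbook version that assumes a strictly positive matrix and target row and column totals with the same grand total; the function name `ras` and the 2 x 2 example are hypothetical and do not describe the production implementation at Statistics Sweden.

```python
import numpy as np


def ras(M0, row_totals, col_totals, tol=1e-10, max_iter=1000):
    """Scale rows and columns of M0 alternately until both sets of totals are met."""
    M = np.asarray(M0, dtype=float).copy()
    row_totals = np.asarray(row_totals, dtype=float)
    col_totals = np.asarray(col_totals, dtype=float)
    for _ in range(max_iter):
        M *= (row_totals / M.sum(axis=1))[:, None]    # row scaling (the "R" step)
        M *= (col_totals / M.sum(axis=0))[None, :]    # column scaling (the "S" step)
        if np.max(np.abs(M.sum(axis=1) - row_totals)) < tol:
            break                                     # rows still fit after column step
    return M


# Hypothetical intermediate-consumption block with new row and column totals.
M0 = np.array([[10.0, 20.0], [30.0, 40.0]])
M = ras(M0, row_totals=[35.0, 65.0], col_totals=[45.0, 55.0])
print(M.sum(axis=1), M.sum(axis=0))
```

Unlike the WLS formulation, this procedure takes the margins as given and uses no information about the reliability of the individual cells.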
4.2 The balanced results with discussion
The mathematical programming package SAS/OR is used to perform the optimization (Eqs (3)–(7)). There are around 4,600 variables and the run time of the optimization is less than one second. For the sake of space, only some summarized results are reported below; others are available from the authors upon request.
Table 3 shows the adjustments by the SCM balancing approach with different weights in Test Round 3 and the actual balancing, summarized to NA aggregates on the use side, Intermediate consumption (IC), Household final consumption expenditure (HFCE), Government spending (G), Gross fixed capital formation (GFCF), Changes in inventories (CI), and Exports; and on the supply side, Gross value of output (GVO), Imports, and Taxes (taxes minus subsidies).
It is known that WLS with constant weights reduces to ordinary least squares, so that all input variables receive approximately equal adjustments regardless of their magnitude, whilst with neutral weights the adjustments are approximately proportional to the squares of their magnitudes (Columns CW and NW in Table 3). It can also be seen that in the late stage of the actual balancing procedure, adjustments are basically made only to the Intermediate consumption and the Gross value of output, which are considered to be unreliable based on experience and convention in the NA department at Statistics Sweden. However, it is natural and sensible that the more unreliable the initial NA estimates are, the more they should be adjusted. A close examination of the uncertainties that we estimated (not reported) shows that although the estimates of the Intermediate consumption by CPA product group are highly uncertain, the IC totals by industry are rather reliable, which implies that the IC should not be adjusted too much. At the same time, there are high uncertainties in the estimates of HFCE and Exports (in particular of services), GFCF (in particular in manufacturing), and CI. Bigger adjustments to these estimates in the balancing may therefore be justified. Note that in Test Round 2, the actual balancing instead makes smaller adjustments to HFCE, CI and Exports (Table 4) than the SCM approach.
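The behaviour of the two trivial weighting schemes can be checked on a single restriction, where the statements hold exactly rather than approximately. The sketch below assumes that "neutral weights" means a standard deviation equal to the initial estimate itself; this interpretation, the function name `wls_adjustments` and the numbers are assumptions made for illustration.

```python
import numpy as np


def wls_adjustments(x0, sigma, a):
    """Adjustments x_hat - x0 for one restriction a @ x = 0 and diagonal weights."""
    var = sigma ** 2
    lam = (a @ x0) / (a @ (var * a))     # Lagrange multiplier of the restriction
    return -lam * var * a


a = np.array([1.0, -1.0, -1.0, -1.0])      # supply minus three uses
x0 = np.array([1000.0, 900.0, 90.0, 9.0])  # hypothetical values, discrepancy = 1

adj_cw = wls_adjustments(x0, np.ones_like(x0), a)   # constant weights
adj_nw = wls_adjustments(x0, x0, a)                 # neutral weights: sigma = x0 (assumed)
print(np.abs(adj_cw))                 # same size for every variable
print(np.abs(adj_nw) / x0 ** 2)       # constant ratio: proportional to squared size
```

With several overlapping restrictions the adjustments interact, which is why the proportionality is only approximate in the full SU system.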
It should be noted that GDP is in fact not an aggregate compiled directly in NA, but is derived from the balanced SU tables. Different balancing approaches consequently lead to different GDP estimates. The Swedish annual GDP estimates (excluding non-market products) derived from the SCM approach and from the actual balancing are shown in Table 5. Observe that there is no true value for the GDP estimate, so the estimates in Table 5 cannot be used directly to evaluate the balancing approaches. However, it seems after all too early to apply the automatic balancing at the time of Test Round 1, while the times of Test Rounds 2 and 3 might be appropriate.
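Table 5. Swedish annual GDP estimates (excluding non-market products) from the SCM approach in each test round and from the actual balancing
|Round 1|Round 2|Round 3|Actual|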
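Why GDP can be read off the balanced SU tables is easy to see on aggregates: once total supply equals total use, the production-based and expenditure-based GDP coincide. The following sketch uses the standard textbook definitions and entirely hypothetical aggregate values; it ignores trade margins, which net to zero over all products, and is not a reproduction of the Swedish figures.

```python
def gdp_production(acc):
    """GDP from the production side: output minus IC plus taxes less subsidies."""
    return acc["gvo"] - acc["ic"] + acc["taxes"]


def gdp_expenditure(acc):
    """GDP from the expenditure side: final uses plus exports minus imports."""
    return (acc["hfce"] + acc["g"] + acc["gfcf"] + acc["ci"]
            + acc["exports"] - acc["imports"])


# Hypothetical balanced aggregates: supply 9850 equals use 9850.
balanced = {
    "gvo": 8000.0, "imports": 1400.0, "taxes": 450.0,
    "ic": 4300.0, "hfce": 1800.0, "g": 1000.0,
    "gfcf": 950.0, "ci": 10.0, "exports": 1790.0,
}
# The same accounts before balancing, with HFCE initially 5 units too high.
unbalanced = dict(balanced, hfce=1805.0)

print(gdp_production(balanced), gdp_expenditure(balanced))
print(gdp_expenditure(unbalanced) - gdp_production(unbalanced))
```

In the unbalanced accounts the two GDP figures differ by exactly the supply-use discrepancy, so the way the balancing distributes that discrepancy determines the resulting GDP estimate.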
5. Discussion and final remarks
Estimating the uncertainties and balancing the double entities of NA estimates are known to be difficult problems in national statistical institutes (NSIs). The SCM approach that we generalized from [2] is investigated in our study. It is shown that an automatic balancing approach is possible for the compilation of NA in NSIs. With the SCM approach, the balancing is not only more objective, but also fully replicable. Furthermore, the manual balancing procedure existing today requires much more resources and expertise. However, in order to implement the SCM approach in official statistics production, more experiments have to be carried out and evaluated, since there is no obvious evaluation criterion for the comparison with the manual procedure.
It is a big challenge to estimate the uncertainties in the initial NA estimates. There is little work available in the literature concerning the quantification of the non-sampling errors and how to combine them with the sampling errors. In our study, the sampling errors are used as the basis and direct expert judgment is applied to the non-sampling errors. This approach may be arguable. It is nevertheless a first step in tackling this difficult problem. Recently, [16] proposed, instead of handling the variance-covariance matrix for individual input variables, to consider accounting equations as single entities, and developed scalar uncertainty measures for those entities. They showed appealing theoretical properties of this approach. It would be of great interest to follow applications of this approach in the NA.
The balancing investigated in our study covers only one period. It is possible to use this approach for multiple-year balancing; see [17] for such an application, where the balanced result not only satisfies the accounting constraints, but also shows movements that are as close as possible to the preliminary information. Moreover, in a broader context of balancing, not only accounting constraints but also temporal constraints (i.e. the sum of the quarterly accounts in a year being equal to the annual ones) have to be satisfied in systems of time series. Readers who are interested in this area are referred to [18] for approaches to the reconciliation of both accounting and temporal constraints, and to the monograph [19] by Dagum and Cholette for general reconciliation methods.
Our application is carried out in current prices. It is not trivial to extend the approach to constant prices. The possibility of computing the uncertainty of the balanced aggregates, such as GDP, and the theoretical property Eq. (2) are of great interest. However, the conditions are very difficult to verify. All of these are possible topics for our future work.
The authors are very grateful to the associate editor and three anonymous referees for their comments and suggestions, which have greatly improved the readability of the paper. The contribution of our colleagues to this study is acknowledged, in particular, the efforts to make the uncertainty estimates available. The study has been presented at several seminars and conferences. The authors thank the participants for helpful discussions and suggestions.
[1] Eurostat. Task Force on accuracy assessment of National Accounts statistics – Final Report. Eurostat/B1/CPNB/296. 2001.
[2] Stone, R., Champernowne, D.G., Meade, J.E. The precision of national income estimates. The Review of Econ. Studies. 1942; 9: 111-135.
[3] Byron, R.P. The Estimation of Large Social Account Matrices. Journal of Royal Statistical Society, Series A. 1978; 141: 359-367.
[4] Byron, R.P. Diagnostic testing and sensitivity analysis in the construction of social accounting matrices. JRSS, Series A. 1996; 159: 133-148. DOI: 10.2307/2983474.
[5] Van der Ploeg, F. Reliability and the adjustment of sequences of large systems and tables of National Accounts matrices. JRSS, Series A. 1982; 145: 169-194.
[6] Barker, T., van der Ploeg, F., Weale, M. A balanced system of national accounts for the United Kingdom. Review of Income and Wealth. 1984; 30: 461-485. DOI: 10.1111/j.1475-4991.1984.tb00491.x.
[7] Chen, B. A Balanced System of Industry Accounts and Structural Distribution of Aggregate Statistical Discrepancy. Working paper WP2006-8, Bureau of Economic Analysis, Washington, DC. 2006. Available from: www.bea.gov/papers/pdf/reconciliation_wp.pdf.
[8] Chen, B. A Balanced System of U.S. Industry accounts and distribution of aggregate statistical discrepancy by industry. J. Business and Econ. Stat. 2012; 30: 202-211. DOI: 10.1080/07350015.2012.669667.
[9] Eurostat, Handbook on Quarterly National Account – 2013 Edition, Chapter 8 Annex C. Available from http://ec.europa.eu/eurostat/en/publications/manuals-and-guidelines. DOI: 10.2785/46080.
[10] Fortier, S., Quenneville, B. Reconciliation and balancing of accounts and time series, from concepts to a SAS Procedure. Proceedings of the Business and Economic Statistics Section, American Statistical Association; 2009.
[11] Bertsekas, D.P., Nedic, A., Ozdaglar, A. Convex analysis and optimization. Athena Scientific: Belmont, MA; 2003.
[12] Jackson, R., Murray, A. Alternative Input-Output Matrix Updating Formulations. Econ. Sys. Res. 2004; 16: 135-148. DOI: 10.1080/0953531042000219268.
[13] Biemer, P., Trewin, D., Bergdahl, H., Japec, L. A system for managing the quality of official statistics. J. Official Stat. 2014; 30: 381-415. DOI: 10.2478/jos-2014-0022.
[14] Calzaroni, M., Puggioni, A. Evaluation and analysis of the quality of the national accounts aggregates. Report to the Task Force on Accuracy Assessment of National Accounts, Eurostat; 1998.
[15] Bachrach, M. Biproportional matrices and Input-Output change. Cambridge: Cambridge University Press; 1971.
[16] Mushkudiani, N., Pannekoek, J., Zhang, L-C. Scalar measures of uncertainty in reconciled economic accounts. Discussion paper 2016-09, Statistics Netherlands. Available from https://www.cbs.nl/-/media/_pdf/2016/36/2016-scala-measures-of-uncertainly.pdf.
[17] Chen, B., Di Fonzo, T., Howells, T., Marini, M. The Statistical reconciliation of time Series of accounts after a benchmark revision. In JSM Proceedings, Business and Economic Statistics Section. Alexandria, VA: American Statistical Association. 2014. pp. 3077-3087.
[18] Di Fonzo, T., Marini, M. Simultaneous and two-step reconciliation of systems of time series: Methodological and practical issues. JRSS Series C (Applied Statistics). 2011; 60(2). DOI: 10.1111/j.1467-9876.2010.00733.x.
[19] Dagum, E.B., Cholette, P.A. Benchmarking, temporal distribution and reconciliation methods for time series data. New York: Springer-Verlag, Lecture Notes in Statistics #186; 2006. DOI: 10.1007/0-387-35439-5.