
Of science and statistics: The scientific basis of the census1

Abstract

For millennia the census has been an instrument of public administration; it has been used in a wide variety of formats, counting different populations in different scientific disciplines. Over time a standard format emerged based on the scientific method; the population and housing census was formalised in the second half of the 19th century. Its function was the enumeration of every individual in the population of a specific geographical area at a specific time, through direct contact with the households to obtain information about each household member. Practically in parallel, another administrative instrument was developed: the register, an officially sanctioned list of objects or events. Throughout time these two administrative instruments, with complementary functions, have contributed to the management of society. Since the second half of the 20th century the sample survey has become another source of statistical information. At the same time some countries started to redesign their statistical organisation, favouring registers and replacing the census with a system of combined registers and sample surveys. Proponents of this approach indicate that this procedure has no theoretical basis and that there are methodological challenges with its implementation. This paper will review these developments and make a call for a science-wide review.

1.Introduction

During the 2010 World Population and Housing Census Programme, 2005–2014, 214 countries or territories carried out a census, only 21 did not, mainly due to political instability and security issues [1]. According to a survey carried out among countries in 2010, out of the 180 countries that responded a small number did not use the standard census methodology, but used alternatives like the “rolling census” as yet only used by France and a variety of “register-based” censuses used by 15 countries or areas, 12 in Europe, two in Asia, and one in Northern America [2]. According to the United Nations Economic Commission for Europe (UNECE) in the 2010 census round nine countries carried out their census using primarily a register-based approach, Austria, Belgium, Denmark, Finland, Iceland, the Netherlands, Norway, Slovenia, and Sweden [3]. The alternative census approaches were first mentioned in the 2008 version2 of the United Nations’ Principles and recommendations for population and housing censuses [4]. A census typology of four categories was presented: the traditional census, the rolling census, the register-based census and the traditional enumeration with yearly updates of characteristics. In this paper only the traditional and the register-based approaches are dealt with because the rolling census and the traditional enumeration with yearly updates of characteristics are only used by one country each, France and the United States of America respectively.

Throughout time all societies of a certain complexity required quantitative information to manage their affairs. The information was probably recorded in lists, inventories, or registers of persons, objects or transactions, kept by specific officials or scribes. The exact procedures used are not well documented. There is historical evidence that in addition to routine administrative records, occasionally ad hoc counts or inventories were taken to satisfy the demand for specific information. These two procedures were the precursors of what currently are registers and censuses. The historical development of the census has been closely linked with the emergence of science in general and statistics in particular, with regard to both the underlying philosophical principles and the technological and operational capabilities. In this paper the development of the census will be presented and emerging methodological and operational issues will be identified. This paper will consist of five additional sections. In Section 2 a description of the basic principles of science will be provided based on a review of relevant historical methodological literature. In Section 3 some salient events in the historical development of the census methodology will be presented, showing the convergence toward the established methodology of the census. In Section 4 the definition of the population (and housing) census and the introduction of the register-based census by the United Nations Statistical Commission based on the different versions of the Principles and Recommendations for (National) Population Censuses of the different decennial census rounds will be reviewed. In Section 5 a number of issues identified by proponents and users of the register-based census will be reviewed and some of the methodological issues arising from the use of the register-based approach will be highlighted. In Section 6 the current situation of the two approaches to the census will be placed in a historical and methodological perspective and a thorough review will be suggested on the basis of statistics being a self-regulating science.

2.The scientific method

2.1Introduction

The “Eureka” moment of Archimedes may never have happened, but the story provides a good idea of the status of scientific thinking in antiquity. It involved the solution of a concrete problem by direct observations of facts, and the formulation and testing of hypotheses. The facts were: displacement of liquids by submerging or submerged solid objects, and the hypothesis, the relation between the weight (mass) of the solid object and the volume and weight of the displaced liquid. Assumedly, concrete measurements were taken to establish the facts. Modern science essentially still follows the same basic principles and phases in the scientific solution of a concrete problem. However, currently the philosophical underpinning of the scientific method has been made explicit with some differences in emphasis. The procedures to be employed have been developed and standardised, and the means of observation (the instruments) have greatly improved and are differentiated according to problems faced by the different sciences.

The development of both science and statistics in the modern sense started after the Middle Ages in Europe and has historically been interrelated. Scholars in continental Europe, especially those of the German principalities (later Germany) and France, worked on the formulation of the underlying principles of the general scientific method, whereas scholars in English-speaking countries tended to dedicate their time and energies to studying and solving specific practical problems. That is why an English translation of the authoritative study by August Meitzen on the history, theory, and technique of statistics was published in the USA in 1891 in two supplements to the Annals of the American Academy of Political and Social Science [5, 6]. According to the translator, Roland P. Falkner, “no exhaustive treatise on the subject exists in our language despite the valued contributions made by English science to the subject. Hence the translation of the present work has been prepared partly with a view to the opportunity which it afforded for the comparison of German methods of thought with our own” [6, p. 3]. During the second half of the 19th and especially during the 20th century important theoretical and practical developments and standardisation of concepts and procedures in both science and statistics took place.

Because of the complexity of the universe and the diversity of humanity and human societies, different sciences have been and are still being developed. There are different schools of thought about science and the scientific method, but there seems to be a general consensus on the basic characteristics of the scientific method. There also seems to be general consensus on the phases of the cycle of empirical scientific enquiry that tries to respond to a concrete problem. By way of illustration the five-phase cycle as proposed by de Groot is presented:

  • “Phase 1: ‘Observation’: collection and grouping of empirical materials: (tentative) formation of hypothesis.

  • Phase 2: ‘Induction’: formulation of hypothesis.

  • Phase 3: ‘Deduction’: derivation of specific consequences from the hypotheses in the form of testable predictions.

  • Phase 4: ‘Testing’: of the hypotheses against new empirical materials, by way of checking whether or not the predictions are fulfilled.

  • Phase 5: ‘Evaluation’: of the outcome of the testing procedures with respect to the hypotheses or theories stated as well as with a view to subsequent, continued, or related, investigations” [7, p. 28]

In any scientific enquiry there is a role for statistics as a science; elements of descriptive statistics are of importance in phase 1 and so are elements of inferential statistics in phase 4. As part of the scientific enquiry statistics has to comply with the principles of science. However, current treatises on science and statistics do not provide a description of the underlying principles of science in general and of empirical science, including statistics in particular. One has to review historical studies on science, the scientific or statistical methods to be able to extract the main underlying principles of the scientific method. In this respect the treatise of Meitzen will be used as main text supplemented with references from other relevant publications.

2.2Basic principles of the scientific method

There is a multitude of definitions of what statistics as a science is. Nevertheless, there seems to be a general consensus about the object of statistics as a science and the nature of the statistical method. The basic principles of empirical science and the scientific method can be summarised as follows:

  • Any empirical science observes physical objects (people, animals, celestial objects, rocks, etc.) and their characteristics at a clearly delimited location and specific time. With regard to statistical queries Yule stated that “Any statistical inquiry is necessarily confined to a certain time, space or material” [8, p. 17]. Deming stated it as follows “Empirical investigation consists of observations on material of some kind” [9, p. 1885]. The physical objects are referred to as units of observation, units of count or units of enumeration.

  • Meitzen defines the statistical method as “a process, based on an enumeration of characteristic phenomena, of forming empirical judgments and conclusions relating to the varied and complicated aggregates of existence” [6, p. 106]. The object of study is the aggregate, universe or population of certain things. However, the observation and subsequent enumeration should be done “with a certain purpose in view, and to observe merely such things whose number in the particular aggregate it is necessary to ascertain for this purpose. Things which are to be sought must be known beforehand” [6, p. 108]. Yule defines statistics as “quantitative data affected to a marked extent by a multiplicity of causes” and the statistical methods as “methods especially adapted to the elucidation of quantitative data affected by a multiplicity of causes” [8, p. 4]. For Yule the objects of the statistical methods are given and can be processed or manipulated by the statistical methods, whereas for Meitzen the data have to be acquired through a purposeful process of observation and enumeration. Consequently the observable object needs to comply with certain conditions.

  • The physical objects and their characteristics need to have an objective existence at least as long as the observation period or, as Meitzen stated, “A necessary premise of enumeration is that all the qualities of an object which characterize it as the unit of the count must be fixed and invariable for the period of the observation upon which the count is to be based” [6, p. 115].

  • Only concrete things can be observed and enumerated. Hence, “these objects can only be enumerated according to a previously well-defined idea. The things to be included in the enumeration must correspond entirely with the preconceived notion of the unit of enumeration” [6, p. 117]. This implies that the physical objects and their characteristics need to be defined before the observation, as should the observation and enumeration procedures.

  • All physical objects that possess the defined characteristics need to be enumerated.

  • The enumeration of the objects and their characteristics needs to be carried out with the same procedures and with reference to a specific location and time.

  • If multiple elements with specific characteristics are to be enumerated, they have to be enumerated with the same procedures at the same location and time.

  • Characteristics of an object cannot be observed and enumerated directly, but only through the observation and enumeration of the object.

  • The concept of physical object requires some clarification as in some of the sciences not all objects are visible, but their existence can be inferred by observable or measurable effects. This is the case in nuclear physics (subatomic particles, such as the Higgs boson) and astronomy (black holes or exoplanets).

  • In the scientific process all objects, elements, characteristics or qualities and procedures should be uniquely defined and the definitions should be consistently applied.

  • Deming emphasises that the statistical inference based on the findings of a specific study is only valid for the specific population studied, its location, time, the circumstances and conditions and procedures under which the study was carried out [9].

    This is the reason why the ceteris paribus principle is invoked. Repeated studies of populations with the same objectives and procedures carried out in different locations and times will lead to a cumulative body of evidence that will allow for the formulation of statements of statistical regularity, which might eventually lead to statements of facts or statistical laws and ultimately to the formulation of scientific laws.

In the second half of the 19th century a distinction was made between the direct and indirect observation according to who carried out the observation. When a researcher himself observed or measured an object and its characteristics this was considered a direct observation. However, the observation was considered indirect if the researcher was using the records or results of an observation made by someone else. The indirect observation was acceptable if the material in the records or registers fully and correctly reflected the findings of the direct observation. This is similar to what presently is considered primary research and secondary research. Indirect observation does not imply that the activity is not scientific. Much of astronomy, even today, is based on indirect observation in the 19th century sense.

In science, objects and their characteristics are observed at a certain location and a specific time. The majority of administrative data refer to events or occurrences that take place at one specific moment but do not exist as independent observable units thereafter, e.g. births, deaths, registration of property, etc. The Dutch statistician Verrijn Stuart observed that “Of these phenomena that occur intermittently, there can be no question of an observation at a specific time. It would make no sense to count the number of born at midnight of a certain day or the number of crimes committed at a specific time. The time period (usually a year) replaces the moment” [10, p. 34]. Hence, period data do not have a uniquely defined reference time.

These are some of the main underlying principles of the scientific method, and some of the unwritten practices followed by scientists. The census as a generic scientific tool used in many disciplines, including population statistics, has to comply with these principles and rules.

3.Historical development of the population census

There is historical evidence that in the ancient civilisations of Babylon, Egypt, China and the Americas, in addition to routinely kept administrative records, occasionally ad hoc full counts or enumerations were taken to satisfy the demand for specific information. These enumerations have taken several forms. The classification that Rolf Gehrmann of the Max Planck Institute for Demographic Research developed for early German enumerations up to 1871 can be applied to the overall historical development of the census. In the classification he made the following distinctions:

“Distinction by the product:

  • 1. Anonymous table (if nothing else was noted, then the underlying procedure can be called a count),

  • 2. Nominal listing of the household heads with only numeric information about the members of the household (the underlying procedure can be considered to be an enumeration), and

  • 3. Census list with individual names and data for all persons.

Distinction by the procedure of the collection of information:

    • a) Extract of a population register (Registerzählung = Register count),

    • b) Transcript of the statements of convened household heads (Protokollzählung = Protocol count), and

    • c) Collection of the data from house to house (Naturalzählung = Regular count = Census)” [11, p. 5].

The category of the regular count was considered the true census, and was standardised and formalised at the first International Statistical Conference in 1853 in Brussels [12].

In medieval Europe the rulers of the political entities (city states, duchies and kingdoms) required information to manage their affairs. Records on a wide variety of social and economic topics were kept in more or less systematic fashion. Of special interest were issues related to taxation and defence (the number of men capable of bearing arms), which were especially relevant after a change of ruler or government or after the conquest of new lands. Most of these records were relatively simple counts of single elements: heads of cattle, taxable persons, persons of military age, etc. However, some could be quite complicated undertakings, displaying elements of what later would become characteristic of modern data collection procedures. Reference is made to observation protocols, lists of questions (forerunners of the questionnaires), a specific organisation for the collection of the information, and procedures for the collection and verification of the completeness and quality of the data.

The Domesday Book of England was probably the most comprehensive data collection exercise ever carried out in the Middle Ages. When faced with the need to increase taxes to finance military operations, at Christmas 1085 William the Conqueror ordered an inventory to be made of the resources and taxable value of all the manors and the boroughs in England. A special administrative structure was set up for the collection and verification of the information. The information to be collected was contained in ten questions, each of which had to be answered with reference to three times: the time before the conquest (1066), the time when the manor was granted by William (1066 or later) and the time of the inventory (1086). The questions referred to: the name of the manor, the name of the owner(s), the amount of different types of land (woodlands, arable land, meadows, etc.), the types of means of production, the number of cattle and livestock, the population by status (including slaves), the possessions of the different population groups, the value of the property, and changes between 1066 and 1086. The exercise probably started in January 1086 and was probably discontinued in September 1087 when William died. The results are conserved at the National Archives in London and can be consulted online [13]. The Domesday Book could be considered a census of manors and showed that even in the medieval period it was possible to organise complex data collection. This was possible because the rulers could obtain the services of well-educated persons who would be well versed in the principles and practices of the sciences of the day: philosophy, law and mathematics.

During the last half of the 16th century but especially in the 17th century important changes took place in European societies. These affected the economic activities and structure of societies, and the emergence of new philosophical ideas that emphasised reason, logic and freedom of thought over authority, dogma and faith. It was the period leading up to the Enlightenment in Europe and to the development of the modern idea of the State. It is also the period that saw the emergence and development of the empirical sciences, the development of instruments (telescope, microscope) to measure natural phenomena. Efforts were also made to apply the measurement principles to social phenomena as well. The data requirement of the emerging states led to the collection of data on a wide variety of subjects in society, including the population size, as a large population was increasingly considered an asset to the state and an important indicator of the wealth of a nation. To satisfy these demands the data collection began to apply the emerging scientific method of the empirical sciences to the state, society and different social and economic activities.

At the beginning of the 17th century France had an administrative system capable of collecting quantitative information for taxation and defence, and the clergy maintained a basic system of registration of baptisms, deaths and marriages. Due to political developments, both internal (centralisation) and external (the Thirty Years' War with Spain), there was an increasing need for information on the new sectors of the economy, as well as for better ways of measuring the size of the national population. There were different systems to obtain information on the population size, depending on the needs. A common method to estimate the population size was to enumerate the number of households (the hearths or fireplaces) and multiply this by the average household size. The need to have detailed information about each person was recognised, but this could be achieved only in small populations, as up to then there was no proper instrument to do so for larger populations.
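The household-multiplier estimate mentioned above can be sketched in a few lines. This is an illustrative sketch only: the function name and all figures (1,200 hearths, average household sizes of 4.0 to 5.0) are hypothetical assumptions, not values taken from historical records.

```python
def estimate_population(hearths: int, avg_household_size: float) -> float:
    """Hearth-multiplier estimate: counted households times an assumed mean size."""
    if hearths < 0 or avg_household_size <= 0:
        raise ValueError("hearths must be >= 0 and avg_household_size must be > 0")
    return hearths * avg_household_size


# Hypothetical example: one count of hearths, three assumed multipliers.
if __name__ == "__main__":
    for size in (4.0, 4.5, 5.0):
        print(size, estimate_population(1200, size))
```

The estimate is proportional to the assumed household size, so any error in the multiplier carries over fully into the population figure. This is one reason why enumerating each person individually was seen as desirable.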

In 1686 a booklet (15 pages) was printed anonymously in Paris and was distributed among a select number of persons in government and among friends and acquaintances of the author. Its title was “Méthode generalle et facile pour faire le dénombrement des Peuples” (“A general and easy method to enumerate people”) [14]. It proposed a system for the enumeration of the population by obtaining information on the composition of households (families) in each street of the lowest administrative level and successively adding up the information for each level, until the highest administrative level desired was reached. It recommended the use of standardised printed forms consisting of two two-dimensional tables. In the form for the households the name of the head of each household was entered in the rows and information on the household and its members was recorded in the columns. For each household the numbers of persons (family members and domestic servants) were recorded according to sex and rough age groups, such as adults, grown up (grand) and young (petit) children. The author provided detailed instructions as to what information should be recorded and how. This could be considered a precursor of the household schedule. In the higher level table the cumulated information of each of the lower level units (say, a village) was presented in the rows, and the characteristics of the recorded households (families) and their members were provided for in the columns. In a set of additional columns information on a number of additional items, such as land by type and the type and number of animals kept by each unit, was recorded. The author also provided detailed information on how the data collection should be organised using existing administrative structures and resources, but also by employing additional temporary resources to overcome operational and motivational constraints. There are indications, scattered in French historical documents, that the proposals by the author were well accepted by persons in power and that they were applied in France and its colonies. It was only in 1954 that the French historian Edmond Esmonin (1877–1965) was able to establish the identity of the author of this extraordinary booklet [15]. The author turned out to be Sébastien Le Prestre de Vauban (1633–1707) [16]. He is better known as a military officer and an engineer specialised in the construction of fortifications and the art of siege craft, who served during the reign of Louis XIV. He was a prolific writer, including on non-military matters; best known is his treatise on the reform of the French tax system, “La Dime Royale” (The Royal tithe) [17], which was initially published anonymously in 1707, although it was widely known who the author was. In a chapter of this book he provided a slightly modified and expanded version of his method. One of his recommendations could be considered a precursor of the division of the country into census areas and enumeration areas (EAs) and of the establishment of the workload of one enumerator in modern censuses, namely 50 households [17, pp. 164–165]. In 1975 Eric Vilquin provided a detailed overview and assessment of Vauban’s “Methode” and its use in France, its colonies and Europe. He extended the influence of Vauban and his method beyond the end of the 18th century, when he stated: “Indeed, the many censuses of the revolutionary epoch, of the Consulate and the Empire, often very different from each other, by objective and method, are for the most part based on the most cherished general principles of Vauban: distribution of pre-printed tables to the enumerators, individual enumeration by sex and marital status” [18, p. 244].
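The bottom-up tabulation described above, in which household rows are summed per street and the street totals are summed again at each higher level, can be illustrated with a short sketch. The column names, the sample households and the helper names are hypothetical and only loosely follow the categories mentioned in the text; the workload of 50 households per enumerator is the figure cited above from “La Dime Royale”.

```python
from collections import Counter
from math import ceil


def total_units(units):
    """Sum the column-wise counts of lower-level units into one higher-level row."""
    total = Counter()
    for row in units:
        total.update(row)  # adds the counts column by column
    return total


def enumerators_needed(households: int, workload: int = 50) -> int:
    """Number of enumerators if each one covers at most `workload` households."""
    return ceil(households / workload)


# Hypothetical household rows (one dict per household) for two streets.
street_a = [
    {"men": 1, "women": 1, "young_children": 3},
    {"men": 2, "women": 1, "older_children": 1, "servants": 1},
]
street_b = [{"men": 1, "women": 2, "young_children": 2}]

# Street totals become the rows of the village table, and so on upwards.
village = total_units([total_units(street_a), total_units(street_b)])

if __name__ == "__main__":
    print(dict(village))
    print(enumerators_needed(120))
```

The same function serves every administrative level because each level's table has the same columns as the one below it. Only the meaning of a row changes, from household to street to village.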

Throughout the 17th and 18th centuries counts and enumerations in different formats were carried out all over Europe. At least two enumerations are considered forerunners of the modern census because all persons were enumerated separately with their individual characteristics. These were the 1665 census of the population of Quebec (New France) [19] and the 1703 census of Iceland, which was then a dependency of Denmark [20].

In the German principalities and the Nordic countries, several procedures to enumerate the population were used. In some, special church records were kept: the community registers (“Seelenregisters”, or soul registers), which were used to count the total population (the register count). Gehrmann stated: “They contain all the information of a census, and they can be regarded as equivalents [to the census] provided the date of the creation of the lists is clearly defined, and the whole population is registered” [11, pp. 5–6]. Compare this with the statement on indirect observation in Section 2.2. Probably the best preserved example of such a soul register is that of the diocese of Münster for the years 1749–1750. The reason for establishing this register was a pastoral challenge. The Prince-bishop was concerned about the cohabitation of unmarried persons or persons who were not engaged to be married. By edict of 3 October 1749 all parish priests were ordered to establish an annual list of all their parishioners by household, indicating names, sex, age, marital status and occupation. After some delays, due to the reluctance of the priests to undertake this task, such lists were prepared for the years 1749 and 1750 for practically all parishes of the diocese and were deemed to be nearly complete. The original records have been preserved and can be consulted in print or on-line [21]. One of the on-line organisations describes it as follows: “The status Animarum of 1749 and 1750 is the first almost complete survey of the population of the diocese of Münster and offers the first census according to standard specifications, even if they were implemented very differently in the parishes. Nevertheless, the status Animarum provides a good overview of the total population, with information on households and inhabitants, age, occupations and other information that can be used for family, regional, social and demographic research” [22].

In the early 19th century municipal population registers were kept in some countries, which could be used to establish the size of the national population. Some countries started to have regular census programmes; the United States had started its decennial enumeration programme in 1790 and the United Kingdom in 1801. The earliest enumerations in these countries were not true censuses, because they did not enumerate each individual separately. That started in 1841 in the UK and in 1850 in the USA. In 1846 Belgium implemented a census programme consisting of a population, an agricultural and an industrial census. The population census was considered a model of a true population census. It used individual enumeration, questionnaires, a special organisation for field operations (along the lines of the Vauban system) and special enumerators to collect the information in the field [23]. While the population census was considered a success, the agricultural and industrial censuses were not.

In the second half of the 19th century international efforts were made to standardise the methodology of the population census, under the dynamic leadership of the Belgian scholar Adolphe Quetelet. These efforts culminated in a series of International Statistical Conferences between 1853 and 1876. At the first conference held in 1853 in Brussels detailed recommendations on the population census, its methodology and characteristics were made.

These included: “1. That the census of population should exhibit the number of individuals actually in the country at the date of enumeration; and also such particulars as may be required of those individuals who have legal domicile in the country, although absent from it; 2. The census to be taken not less frequently than every ten years, and in the month of December; 3. A special return for each family or household to be employed; 4. Employ special agents or enumerators; and 5. The returns to state name and surname, age, place of birth, spoken language, religion, condition, whether single, married, or widowed, profession, or occupation, residence, whether temporary or permanent, children receiving education, houses by stories, and number of rooms occupied by each family, gardens in connection with the house, existing sickness, number of blind, deaf and dumb, absentees, and number of persons residing in public or private establishments” [12, p. 5]. These were reconfirmed and expanded at the eighth conference held in 1872 in St. Petersburg. The main recommendations dealt with the type of population to enumerate, frequency of census taking (once in ten years, preferably in years ending in zero), reference to the census date, use of the family or household to identify individuals, and use of a questionnaire and specially trained enumerators to collect the information. The questionnaire should include questions on: name, sex, age, relation to the head of the family and household; civil state or conjugal condition; profession or occupation; religious affiliation; language(s) spoken; ability to read and write; origin; place of birth; nationality; usual residence; nature of the residence; where the census took place; and whether the individual was physically or mentally disabled. The Congresses also recommended that annual population registers be maintained recording births, deaths and marriages [24, pp. 443–445]. At the St. Petersburg congress an issue outstanding since the first congress, the type of population to enumerate, was resolved as follows: “The rules for indicating the usual residence, temporary residence, legal domicile, etc., are for the present left to the arrangements of the different States” [24, p. 445]. The congresses confirmed the scientific basis of the census and established as the standards for population censuses the direct collection of information on all individuals in a household (family) by house-to-house visits “in one day or at least relate to a fixed day and hour” [24, p. 444], the use of special agents or enumerators, and the use of a questionnaire. Some countries continued to use the register count beyond the end of the 19th century. The main functions of the census were to obtain updated information on the total population size, its socio-economic characteristics and specific topics of national interest, to verify the completeness of existing population registers [25, p. 2] and to correct errors in the municipal population registers [26, p. 488].

Adolphe Quetelet passed away on 17 February 1874; after the Ninth International Statistical Congress, held in Budapest from 1 to 7 September 1876, no more congresses were organised. At the Jubilee meeting of the Royal Statistical Society (RSS) in London in 1885 the International Statistical Institute (ISI) was established to continue the tradition of the development of statistics as a science and the standardisation of methods through international meetings of statisticians [27]. The late 19th and the early 20th century saw important developments in statistical theory and techniques, notably the development of the sample survey. The ISI, the International Labour Organisation (ILO) and the League of Nations continued to promote statistics, including the census methodology, until 1940. After the Second World War the Statistical Commission and the Population Commission of the United Nations became responsible for the promotion of the census methodology. In the second half of the 20th century there were important theoretical developments and operational improvements in statistics, including the introduction of the sample survey and electronic equipment. The census benefitted from the introduction of the cartographic method to improve the quality of the data collection. The development of the post-enumeration survey [28] provided a means to establish the completeness of the census.

4.The United Nations and the census

4.1Introduction

Since 1947 the United Nations and its agencies have promoted the development and standardisation of the census methodology. In 1949 it issued its first methodological publication “Population Census Methods” [29], followed in 1958 [30, 31] and 1969 [32, 33] by separate manuals on population and housing censuses. Since 1980 its publications cover population and housing censuses together. In P&R 1980 a technical linkage was established between the housing and population census, the linkages of the population census with other types of censuses were highlighted and a regional approach was promoted [34]. This manual, with its 1990 supplement [35] and 1997 revision [36], set the global standards for census taking which remained valid till 2008 when a second revision was issued [4]. In 2015 the Statistical Commission approved a third revision of the Principles and Recommendations for Population and Housing Censuses [37]. In addition to these general publications the United Nations and its agencies have published a large number of technical handbooks, manuals and guidelines on specific aspects of the census methodology and topics covered in the census. The United Nations system established a global framework for the execution of national censuses through the decennial World Population (and Housing) Census Programmes.

4.2The definition of the census and its essential features

In 1958 the United Nations Statistical Commission defined the population census as: “the total process of collecting, compiling and publishing demographic, economic and social data pertaining, at a specified time or times, to all persons in a country or delimited territory” [30, p. 3]. The definition remained basically unchanged till 2015, even though in P&R 1969 two new elements were included, namely “evaluating” and “analysing” [32, p. 2]. In P&R 2017 two more elements were added to the definition: “planning” and the “capacity to produce small-area statistics” [37, p. 2]. The definition of the population census became: “A population census is the total process of planning, collecting, compiling, evaluating, disseminating and analysing demographic, economic and social data at the smallest geographic level pertaining, at a specified time, to all persons in a country or in a well-delimited part of a country” [37, p. 2]. The addition of “planning” does not affect the essence of the process nor the definition. It is not known why the “capacity to produce small area statistics” has been included in the definition of P&R 2017 because it was common practice to prepare, but not necessarily publish, tabulations for the smallest of the operational areas of the census, namely the enumeration area (EA). Also, one of the recommendations of P&R 1980 was that “Data from population censuses may be presented and analysed in terms of statistics on persons, and for a wide variety of geographical units ranging from the country as a whole to individual small localities or city blocks” [34, p. 2]. In essence nothing changed in the definition of the census, except that efforts were made to legitimise the so-called register-based census. This idea is further supported by the way the cartographic basis of the census has been treated in the different versions.
The cartographic basis of the census was not included in the essential features of the census, because some countries did not use this approach. In P&R 1980, P&R 1997 and P&R 2008 a statement consisting of two parts is included under the section “plan of enumeration”. The first part, dealing with the division of the country in enumeration areas, was the same in all three versions: “The universal enumeration of population and living quarters should be made exclusively on a geographical basis, that is to say, the country should be divided into census enumeration areas and each area should be small enough to be covered by one enumerator during the period of time allowed for the enumeration” [34, p. 25]. The second part dealt with the use of alternatives to the universal enumeration and in P&R 1980 [34, p. 25] and P&R 1997 [36, p. 21] it was formulated as: “Other sources of information, such as registers of population or registers of properties, cannot normally be considered adequate for the purpose of a census, although they could be used for checking the completeness of the enumeration or the accuracy of the replies to certain questions.” In P&R 2008 this is replaced by the following statement: “Other sources of information, such as registers of population or registers of properties, could be used to produce census data in countries that have established continuously updated population registers of high quality and good coverage” [4, p. 57]. P&R 2017 no longer contains a section on the plan of enumeration.

The treatment of the sources of enumeration presented above is revealing. Up to the 2000 census round the international consensus seems to have been that the enumeration of the population should be based exclusively on a cartographic approach, with the exclusion of the use of “registers of population or registers of properties” as a source of useful information for census purposes. However, ten years later, in P&R 2008 the exclusive use of the cartographic approach still seems to be recommended but, contradictorily, the use of registers as a source of census data is allowed, although only for countries that have “continuously updated population registers of high quality and good coverage”. No justification for this contradictory approach is given. In P&R 2017, again ten years later, there is no more discussion of the preferred method of census taking and there is no reference to the cartographic approach, but the register-based approach is extensively covered, without explanation or justification.

In addition to the census definition, each publication refers to the essential features of the census, providing additional explanatory information on the census process. These explanatory statements were meant to emphasize the scientific nature of the census and its operations. In P&R 1958 six essential features of an official national census were listed: Governmental sponsorship, a defined territory, universality (counting every member of the community), simultaneity (the total population should be counted with reference to a “well-defined point of time”), individuality (each person should be enumerated separately and directly and not by registration), and the results to be compiled and published. In P&R 1969 the essential features had been reduced to four by excluding Government sponsorship, and the compilation and publication. This remained unchanged till P&R 2017 when the capacity to produce small-area statistics was added as a fifth feature. In the six versions of the Principles and recommendations four common essential features were considered: individual enumeration, universality within a defined territory, simultaneity and defined periodicity. The “defined periodicity” is not a methodological but an operational feature, hence there are three common methodological essential features that need to be considered: simultaneity, universality within a defined territory and individual enumeration.

With regard to simultaneity, in spite of a slightly shorter description in P&R 1958, it remained the same in all six versions. Using P&R 1969 as example, it stated “Each person should be enumerated as nearly as possible in respect of the same well-defined point of time and the data collected refer to a well-defined reference period. The time-reference period need not, however, be identical for all of the data collected. For most of the data, it will be the day of the census; in some instances, it may be a period prior to the census” [32, p. 2]. This essential feature refers to two different elements of the census process. The first is the need to enumerate all persons with reference to a uniquely defined time, the census date. The second refers to the content of the questionnaire, in which data are to be collected with reference to the census date, but period data should refer to a clearly defined period (week, month or year) linked to the census date. Hence, simultaneity refers to the enumeration of persons and the reference of the variables used in the census.

With regard to universality within a defined territory, in P&R 1958 universality and defined territory were treated separately, but together they express the same ideas as in the other versions. In P&R 1958 and P&R 1969 a special reference to the completeness of the enumeration was made, by adding “without omission or duplication” [30, 32]. The other four versions do not have this reference. P&R 1958 also contained an additional provision on sampling, which read: “The above description of a census does not preclude the simultaneous use of sampling techniques for obtaining data on supplementary topics. Basic information which is to be tabulated for small geographic areas or for which detailed cross-tabulations are required should, however, be collected for every person” [30, p. 3]. This provides the justification for the use of the long and short forms of the questionnaire. No such provision was made in the other five versions. However, in P&R 2008 and P&R 2017 the following provision was included: “This does not preclude the use of sampling techniques for obtaining data on specified characteristics, provided that the sample design is consistent with the size of the areas for which the data are to be tabulated and the degree of detail in the cross-tabulations to be made” [4, 37]. The intention of this description is not clear. It seems that it is included as a justification to use sampling to obtain additional information for small areas in cases where the census is not based on the cartographic approach.

The treatment of the essential feature “individual enumeration” was in essence the same in all six versions. In P&R 1969 this requirement was also extended to “a representative sample of the total population” [32, p. 2]. In all versions except P&R 1958 it was specifically mentioned that this feature was a precondition for cross-tabulating the characteristics of the population. P&R 1980 reads: “A census implies that each individual (and each living quarters) are enumerated separately and that their characteristics are separately recorded. Only by this procedure can the data on the various characteristics be cross-classified” [34, p. 3]. In P&R 1969, P&R 1980 and P&R 1997 the following statement was included: “Individual enumeration does not preclude the use of sampling techniques for obtaining data on specified characteristics, provided that the sample design is consistent with the size of the areas for which the data are to be tabulated and the degree of detail in the cross-tabulations to be made” [32, 34, 36]. This statement is identical to the one made under the essential feature “universality within a defined territory” in P&R 2008 and P&R 2017. Why the statement was transferred from one essential feature to another is not explained, and the observations made above for that feature apply in this case as well.

In P&R 2008 and P&R 2017 the following statement was included: “The requirement of individual enumeration can be met by the collection of information in the field, by the use of information contained in an appropriate administrative register or set of registers, or by a combination of these methods” [4, 37]. This statement, which is supposed to explain how individual enumeration can be achieved, creates more problems than it solves. The use of information contained in an appropriate administrative register is only possible under specific conditions supported by principles of the scientific method; see the reference to indirect observation in Section 2.2 and to the register count in Section 3. There is no justification for the statement that individual enumeration can be achieved by the use of sets of registers. No justification is provided for the use of combined methods. It seems that this statement was included as a justification of the practices used by the countries that did not use the standard census approach.

In general the United Nations Principles and Recommendations throughout the period 1958–2017 used the same definitions of the census, with some minor modifications, and also used the same descriptions for the key essential features, with some variations that may not always appear clear or methodologically justified. It seems that after 2000 the Principles and Recommendations shifted their focus from presenting methodologically approved or correct practices to presenting any practice followed by some countries to obtain “census-like” tabulations. In the latter part of the 20th century and the early 21st century there have not been any major changes or advancements in the philosophy or theory of science, or in the methodology of statistics, to indicate a shift in the tenets of the scientific method. The scientific basis for this shift remains unclear.

4.3The register-based census

In P&R 2008 the definition of the census and the description of the essential features were basically the same as in previous versions, with the exception of the additional statement in the part dealing with individual enumeration as an essential feature. However, in a section entitled Methodological approaches four types of census approaches were introduced. These were: 1. the traditional approach, 2. the register-based approach, 3. the rolling census approach, and 4. the traditional enumeration with yearly updates of characteristics [4, pp. 17–22]. No justification for these approaches is given, but in paragraph 1.58 it is stated that: “As part of their preparation for the 2010 global round of population and housing censuses, some countries are developing, testing, and implementing alternative methods for collecting, processing and disseminating key statistics that used to be generated by the traditional approach to population and housing censuses. Even so, the crucial principle of providing detailed statistics at the lowest geographical level remains of paramount importance” [4, p. 17]. Hence the typology is just a list of approaches tried by countries without any indication of their scientific basis or merit.

The traditional approach is the census in general parlance and in the general scientific literature. Why the term “traditional” was added has not been explained or justified. All language versions of the Principles and recommendations used the term “traditional”, except the French version that consistently used the term “recensement classique”, the classical census [38].

The register-based approach was introduced as follows: “The concept of producing census-like results based on registers emerged in the 2000 round of censuses, although it has been debated and tested to various degrees since the 1970s, and several countries succeeded in using this approach to generate census data in the 1990 round of censuses. The philosophy underlying this concept is to take advantage of the existing administrative sources, namely, different kinds of registers, of which the following are of primary importance: households, dwellings and individuals. In the next iteration these are linked at the individual level with information on business, tax, education, employment and other relevant registers.” The existence of “a unique identification number for each individual, household and dwelling is of crucial importance, as it allows much more effective and reliable linking of records from different registers” [4, p. 19]. Note that this approach was meant to produce “census-like results” and that for the linking of register data only the existence of a unique identification number was mentioned.

To use this approach the following conditions are mentioned: the existence of “an established central population register of high quality and good coverage linked with a system of continuous updating. In the case of local registers, continuous updating along with communication between the register systems must be good. It is essential to harmonize the concepts and definitions when linking registers, and forming the linkages will be difficult when no universal personal identifier exists. Quality assessments should be conducted. If these conditions are not met, the country should rely on the population census as the primary source of benchmark population statistics” [4, p. 19]. It can be assumed that the reference to the population census was to what previously had been called the traditional census.
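The role of the unique identification number in linking registers can be illustrated with a minimal sketch. It is not taken from P&R 2008 or from any national system; the register names, the `pin` field and the records are hypothetical, and the sketch assumes that the identifier is unique within each register. It also shows why coverage matters: a person missing from the auxiliary register ends up with incomplete variables.

```python
def link_registers(base, auxiliary, key="pin"):
    """Attach auxiliary-register variables to base-register records by identifier.

    Returns the linked records and the identifiers found in the base register
    but not in the auxiliary register. Assumes `key` is unique per register.
    """
    aux_index = {row[key]: row for row in auxiliary}
    linked, unmatched = [], []
    for row in base:
        merged = dict(row)
        match = aux_index.get(row[key])
        if match is None:
            # No auxiliary record: the variables stay missing for this person.
            unmatched.append(row[key])
        else:
            merged.update({k: v for k, v in match.items() if k != key})
        linked.append(merged)
    return linked, unmatched


# Hypothetical population register (base) and employment register (auxiliary).
population = [
    {"pin": "001", "age": 34, "municipality": "A"},
    {"pin": "002", "age": 61, "municipality": "B"},
    {"pin": "003", "age": 19, "municipality": "A"},
]
employment = [
    {"pin": "001", "occupation": "teacher"},
    {"pin": "003", "occupation": "student"},
]

linked, unmatched = link_registers(population, employment)
print(linked)
print("No employment record for:", unmatched)
```

Without a shared identifier the lookup in `aux_index` would have to be replaced by matching on names, addresses or dates of birth, which is where the linkage difficulties mentioned in the quotation arise.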

The main advantages of this approach were presented as “reduced cost for the census process and greater frequency of data” [4, p. 19]. The necessary caveats are made with regard to the costs of creating and maintaining registers, the need to update the register regularly, and the consequences that non-compliance will have on the quality of the results. No definition of a register or updating was given. In P&R 2017 a more elaborate treatment of the methodological and organisational aspects of the different census approaches is given, describing a number of combined procedures that were used to complete the census operation. In this version definitions of register and updating were given, but they are at variance with the usual definitions used by statistical services. Moreover, the definition of updating is partially circular. The concepts were defined as follows: “A register is defined as systematic collection of unit-level data organized in such a way that updating is possible. Updating is the processing of identifiable information with the purpose of establishing, updating, correcting or extending the register” [37, p. 17]. A generic definition of a register is “an officially sanctioned list of objects or events”, or, using the approach of the UK Government Digital Service, a register can be defined as “an official list of uniquely defined records, which contain standardised (raw) data of a specific type of object in existence at a specific time” [39]. The process of updating is important as it will allow the register data to be used to create “greater frequency of data”. The most generic definition of the process of updating can be obtained from the section on National Compensation Measures of the Handbook of Methods of the Quarterly Census of Employment and Wages of the U.S. Bureau of Labor Statistics, which states: “Update. The process of collecting current information from an initiated sample unit” [40].

The emergence of the register-based census, that is, the substitution of direct data collection by the use of existing administrative records and registers, has two origins: the first was the case of the Nordic countries after the Second World War and the second was the situation in Western Europe, more precisely in the Netherlands and Germany. The Nordic countries have had a long tradition of maintaining registers and of using parish registers to obtain total population counts, the register counts, before carrying out population censuses. After the Second World War they introduced the Personal Identification Number (PIN) for their citizens to make their public administration more efficient. During the sixties Denmark (1969), Finland (1969), Norway (1964) and Sweden (1967) introduced a Central Population Register (CPR). The PIN and advances in the use of electronic equipment in the public administration allowed them to create combined registers. Denmark was the first country to produce a complete set of population and housing census-like tables using this procedure in 1981. But, “the world’s first totally register based population and housing census in 1981 was both the first and the last ever published in Denmark. Denmark still compiles censuses based on administrative registers, but only to fulfil international commitments and the data are not published by Statistics Denmark. The census is now only one of several so-called integration registers in a statistical system where administrative data, transformed into statistical data, are used and reused in a number of statistical products” [41, p. 44]. The production of a full set of census-like tables using exclusively registers was achieved in Finland in 1990 and in Norway and Sweden in 2011 [3, pp. 5, 8].

The second origin was the situation of the 1971 census of the Netherlands. During the sixties and seventies there was widespread discontent and distrust of governments and their intentions in the countries of Western Europe. In Germany and the Netherlands this led to opposition to population census because some of its questions were considered intrusive. In the 1971 population census of the Netherlands, the non-response rate was 0.2%, which was considered high. The authorities decided not to take legal action against those who had not participated in the census despite their legal obligation. In the preparations for the 1981 census it was estimated that the expected non-response rate would be 26%. The authorities decided not to proceed with the population census. To comply with its obligations as a member of the European Union the national statistical office developed alternative procedures using population register data and sample surveys to produce census-like tables [25, 26]. Hence the origins of the register-based census were a desire to improve the efficiency of the national administrative systems, including cost reduction, technical capability and government’s acceptance of citizens’ concerns about sensitive social and political issues.

5.Statistics, the census and the register-based census

As indicated in Section 3, the science of modern statistics has its origin in the 17th century and is a synthesis of philosophical and mathematical principles applied to concrete problems in the universe and society. However, there is no universally accepted definition of what statistics as a science is. The online glossary of the Organisation for Economic Co-operation and Development (OECD) defines statistics as: “Numerical data relating to an aggregate of individuals; the science of collecting, analysing and interpreting such data” [42, p. 747]. This definition describes two different elements, the numerical value of an object and a scientific discipline. The Cambridge Dictionary of Statistics, 4th edition, treats the term similarly; it does not define the discipline but describes several options. It defines a statistic as: “A numerical characteristic of a sample. For example, the sample means and sample variance” [43, p. 411]. Statistics is defined as “Either the plural of statistic or the name of a discipline that many have tried to define; some examples are Statistics may be regarded as (i) the study of populations, (ii) as the study of variation, (iii) as the study of methods for the reduction of data. …There is clearly no consensus but certain elements appear in most definitions namely, variation, uncertainty, and inference. One thing that statistics is not is simply a branch of mathematics” [43, p. 413].

Although the census has been in use for millennia as an administrative activity one very seldom would find a definition of a census in the literature. When the methodology of the population census was established at the International Statistical Congress in 1853 and confirmed in 1872 no definitions were given, but a set of its characteristics and recommendations for its execution were provided. The congresses confirmed the scientific bases of census and the direct data collection of information of all individuals in a household (family) by house-to-house visits “in one day or at least relate to a fixed day and hour” [24, p. 444] by special agents or enumerators, through the use of a questionnaire as the standard elements for population censuses.

At present there are a large number of definitions of the census in general and specialised dictionaries, but they seldom convey the essence of the census. The online glossary of the Organisation for Economic Co-operation and Development (OECD) defines the census as “a survey conducted on the full set of observation objects belonging to a given population or universe”. It adds as context: “A census is the complete enumeration of a population or groups at a point in time with respect to well defined characteristics: for example, population, production, traffic on particular roads. In some connection the term is associated with the data collected rather than the extent of the collection so that the term sample census has a distinct meaning” [42, p. 94]. No definition or description of the term sample census has been found in the OECD glossary. The Cambridge Dictionary of Statistics, 4th edition defines the census as: “A study that aims to observe every member of a population. The fundamental purpose of the population census is to provide the facts essential to government policy-making, planning and administration. [SMP Chapter 5]” [43, p. 72]. The reference SMP cannot be found in their references.

The census is a complex operation covering a wide range of activities. The census methodology is designed to assure the completeness of the enumeration and the quality of the census results based on the validity and reliability concepts through adherence to scientific principles. The details of the procedures can be obtained from the description in any of the versions of the United Nations Principles and recommendations or by consulting the presentation by the author on the census related to the health sector [44].

Although the methodology covers all aspects of the census procedures, the implementation is often far from perfect. Issues often arise concerning the completeness of the census, the reference date, and the quality of the variables, especially the imputation of missing values.

  • The completeness of the census can be measured by a post-enumeration survey (PES) [28], but not all countries carry them out and often do not publish the results. In spite of repeated requests by some developing countries the international statistical community has failed to establish a grading system of the quality of the completeness of the census.

  • Issues sometimes arise with the correct application of the concept of the reference date (the census date). Ideally, for de facto enumeration the census should be carried out in one day, the census day, but in practice this is not possible. Hence the enumeration period, starting on the census day, is extended. The longer the enumeration period, the higher the propensity for errors, and organisers often fail to make the necessary arrangements in the questionnaire and the data processing phase to ensure that the data refer to the correct census date. The use of the usual residence concept requires carefully designed questions in the questionnaire to ensure compliance with the reference date.

  • Procedures to establish the quality of the data from the field (missing or incorrect data) exist, but it is becoming increasingly difficult to assess the quality of the data correctly, and missing (or incorrect) data are often modified (imputed) without taking into consideration the reasons behind the missing data. Imputation is methodologically justified only for randomly distributed missing data!

  • The United Nations recommends that a detailed administrative report “which is a record of the entire census undertaking, including problems encountered and their solutions” [37, p. 140] be prepared. In the last census round few countries published such reports.
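How a post-enumeration survey can yield a figure for the completeness of the census may be illustrated with the dual-system (capture-recapture) estimator, a widely used approach that is not described in this paper. The sketch below is a simplification: it assumes that the census and the PES are independent, that every person has the same chance of being counted in each, and that matching between the two is error-free. The function name and the counts are hypothetical.

```python
def dual_system_estimate(census_count, pes_count, matched_count):
    """Estimate the true population and the census coverage rate for the PES areas.

    census_count  -- persons enumerated by the census in the PES sample areas
    pes_count     -- persons enumerated by the PES in the same areas
    matched_count -- persons found in both enumerations
    """
    if matched_count <= 0:
        raise ValueError("matched_count must be positive")
    if matched_count > min(census_count, pes_count):
        raise ValueError("matched_count cannot exceed either enumeration")
    # Lincoln-Petersen estimator of the true population size.
    population_estimate = census_count * pes_count / matched_count
    # Share of the estimated population that the census actually counted.
    coverage_rate = census_count / population_estimate
    return population_estimate, coverage_rate


# Hypothetical counts for the sample areas.
estimate, coverage = dual_system_estimate(
    census_count=9000, pes_count=1000, matched_count=950
)
print(f"Estimated population: {estimate:.0f}")
print(f"Census coverage rate: {coverage:.1%}")
```

Under these assumptions the coverage rate reduces to the share of PES persons who were also found in the census, which is why the quality of the matching operation is decisive for the result.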

The description of the register-based census operation is less detailed as there are no standard procedures for this type of operation. The basic idea is to use one or more basic (base) registers and a number of auxiliary registers to create a complete data file of the registered population with a complete set of variables, constructed from a variety of registers and sometimes supplemented with data from sample surveys, to create the required tabulations. However, some of the main proponents of the use of registers to replace direct data collection methods have indicated that there is no theoretical basis for these procedures. In this paper only the ideas of B. F. M. Bakker (Statistics Netherlands) and Anders and Britt Wallgren (Statistics Sweden) will be briefly presented.

In a paper on the procedures of micro-integration presented at the Joint UNECE/Eurostat Expert Group Meeting on Register-Based Censuses in The Hague, Bart F. M. Bakker stated that “One of the limitations of registration data is that they usually have a small number of variables. It is not possible to produce the desired crosstables, if the two or more required variables are not in the same registration. Data linkage techniques should be used to combine data from different registrations and surveys” [45, p. 3]. He introduced the micro-integration method but indicated that although the method had been widely used over the last two decades “authoritative literature is absent. The existing literature (e.g. Statistics Denmark, 1995; Al en Bakker, 2000; Schulte Nordholt, Hartgers en Gircour, 2004; Statistics Finland, 2004; CBS, 2006; Wallgren en Wallgren, 2007) are more or less descriptions of best practices and not based on a theoretical basis” [45, p. 3]. He explained the methodology as it is used, but did not provide any theoretical justification. “Micro-integration is the method that aims at improving the data quality in combined sources by searching and correcting for the errors on unit level, in such a way that:

  • the validity and reliability of the statistical outcomes are optimized,

  • one figure on one phenomenon is published,

  • variables from different sources can be combined and as such, source and theme exceeding outcomes can be published, and

  • accurate longitudinal outcomes can be published.

The term “error” in the definition should be understood in a broad sense. It also covers the differences in concepts and operationalization of these concepts in the integrated sources” [45, p. 4]. As a result, “After the micro-integration process, all statistical output that is produced from the micro-integrated files is consistent” [45, p. 5]. The correction of errors in micro-integration is based on the concept of the total survey error over the lifetime of a survey as developed by Groves et al. [46]. Bakker presented interesting models to show the feasibility of correcting measurement and representational errors in the linkage process of registers and sample surveys.

In 2014, in the second (much modified) edition of their book “Register-based Statistics: Statistical Methods for Administrative Data”, which is considered the standard reference on register-based statistics, Wallgren and Wallgren stated: “Although register-based statistics are a common form of statistics used for official statistics and business reports, no well-established theory in the field exists. There are no recognised terms or principles, which makes the development of register-based statistics and register-statistical methodology all the more difficult. As a consequence, ad hoc methods are used instead of methods based on a generally accepted theory.” They refer explicitly to statistical inference and to probability and sampling theory [47, p. 3]. The assertion that there is no well-established theory is only partially correct, because there is solid ground for the use of registers (administrative data in general) in the principles underlying indirect observation, as indicated in Section 2. This solid ground relates only to the variables contained in a register, and not to the combination of registers (or of registers and surveys) to create new variables. They do propose the creation of a statistical information system in which all sources of statistics are included, and identify four principles (the transformation, system, consistency and quality principles) [47, p. 3] and two preconditions (identity number and legal principle) [47, p. 6] for the use of administrative registers in statistics. Using these principles they describe how register data can be used in the production of statistics, replacing direct data collection.

The implementation of register-based censuses has to comply with the same quality requirements as the census. With regard to the key issues identified for the census, the following observations can be made regarding the register-based census:

  • In the register-based approach the population covered is the registered population and there is no procedure to establish or measure how well this covers the total population. The only external way of establishing completeness was the comparison with the census. Also, there are no generally accepted standards for the inclusion of persons in the central population registers (CPR) in countries that use the register-based approach. Especially for non-citizens the criteria for inclusion vary.

  • In the register-based approach the characteristics of individuals are not observed but they are constructed using already existing information in registers, or by combining several registers or registers with survey data.

  • Many registers are period registers and therefore have no reference date. The date of closure, or an arbitrary date, also called the census date, is used as a substitute for the census reference date, but there is a fundamental difference between these two concepts. The reference date is also an important element in the combination of sample survey data and registers. The validity and reliability of data that result from such an operation are not assessed, and perhaps cannot be assessed.

  • A related issue is the updating of the information in the registers. Updating means obtaining the information for all units of a register at a particular new date. In practice only mutations are processed, which means that the only changes made are the inclusion of new units, the deletion of obsolete units and, if a particular characteristic has changed, the recording of its new value. All other units remain unchanged and it is assumed that their values at the previous moment still hold at the new time.

  • The quality of the variables that are constructed through the micro-integration process cannot be assessed as all errors are corrected in the process.

  • There is no methodological basis for imputation of missing data in the register-based approach.

  • Of the countries that used the register-based approach in the 2010 census round none has published an administrative report.
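
The mutation-based updating mentioned in the list above can be made concrete with a small sketch. The register contents, the years and the function `apply_mutations` are hypothetical and serve only to show which units are never re-observed when only mutations are processed.

```python
def apply_mutations(register, mutations):
    """Return a new register state after processing a list of mutations.

    Each mutation is a tuple (action, unit_id, values), where action is
    "add", "delete" or "change". Units without a mutation are copied
    unchanged, so their values still date from the previous state.
    """
    updated = {unit_id: dict(values) for unit_id, values in register.items()}
    for action, unit_id, values in mutations:
        if action == "add":
            updated[unit_id] = dict(values)
        elif action == "delete":
            updated.pop(unit_id, None)
        elif action == "change":
            updated[unit_id].update(values)
        else:
            raise ValueError(f"unknown mutation type: {action}")
    return updated


# Hypothetical register state and mutations; all values are invented.
register_2019 = {
    "P001": {"address": "Street 1", "marital_status": "single"},
    "P002": {"address": "Street 2", "marital_status": "married"},
}
mutations_2020 = [
    ("add", "P003", {"address": "Street 3", "marital_status": "single"}),
    ("change", "P001", {"address": "Street 9"}),
]
register_2020 = apply_mutations(register_2019, mutations_2020)

# Units whose values are simply carried forward from the previous state.
touched = {unit_id for _, unit_id, _ in mutations_2020}
carried_forward = sorted(set(register_2020) - touched)
```

Here unit `P002` is carried forward without any new observation: its values in the new state are those of the previous state, which is precisely the assumption described above.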

To overcome the lack of a theoretical or scientific basis, efforts are being made in several countries to establish the validity of the results of the register-based procedures with other sources which are based on the scientific method. Validation of a single variable can be based on the comparison of the distributions of the variable in the register and in a sample survey. In such cases the value of the sample survey with its confidence intervals should be the criterion. In 2012 Bakker [48] reported an ingenious procedure for measuring the construct validity of a set of variables that are linked by a theoretical framework using a structural equations model. He used four variables, age, gender, educational attainment and hourly wages, linked in a simple earnings function model. He compared two sets of data, one obtained from a sample survey and the other from registers. The educational attainment variable in the register data was a hybrid variable combining register-derived and sample survey data. His findings for these two sets of data were promising, with both data sets basically showing similar patterns, indicating that for this particular register-based data set the variables are valid measurements of the concepts they are meant to measure. However, he indicated that because the educational attainment variable was a hybrid, “its estimated validity could not be used for a conclusion on the quality of register data alone” [48, p. 17]. He thereby identified one of the key challenges of the data integration concept, the theoretical basis of the hybrid variable.
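
The single-variable validation described above, with the sample survey and its confidence intervals as the criterion, can be sketched as follows. This is an illustration under stated assumptions, not a procedure taken from the cited literature: the counts and shares are invented, the survey is assumed to be a simple random sample, and a normal-approximation 95% interval is used for each category proportion.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Normal-approximation confidence interval for a sample proportion."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Hypothetical survey counts and register shares for one categorical variable.
survey_counts = {"primary": 200, "secondary": 500, "tertiary": 300}
register_shares = {"primary": 0.21, "secondary": 0.52, "tertiary": 0.27}

n = sum(survey_counts.values())
comparison = {}
for category, count in survey_counts.items():
    low, high = proportion_ci(count, n)
    # True when the register share lies inside the survey interval.
    comparison[category] = low <= register_shares[category] <= high
```

With these invented figures the register shares for the first two categories fall inside the survey intervals, while the share for the third falls just below its interval, which would flag that category for further investigation. A complex sample design would require design-based variance estimates instead of the simple formula assumed here.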

6.Nothing new under the sun and way forward

A professional man takes orders, in technical matters, from standards set by his professional colleagues as unseen judges; never from an administrative superior. … A professional statistician will not follow methods that are indefensible, merely to please someone, nor support inferences based on such methods. … He can be a trusted and respected public servant. Deming (1965: 1885)

Complaints about the high costs of certain statistical procedures, especially direct observation, are not new. Neither are the efforts to replace direct observation by the use of already available information. Meitzen already mentioned this in his 1886 German study, and it is reported as follows in the 1891 English translation:

“Enumeration by observation and summation of the units is, as a rule, a very extensive, tedious, and expensive operation, as most statistical inquiries are concerned with the affairs of peoples and nations. In every case the endeavour must be to make the solution as simple as possible. But it is often altogether impossible to attain the practical end in view, unless it can be done with small means, or in a limited period of time. Hence operations have been devised which can be used in a measure as substitutes for enumeration. Such substitutes are, however, always imperfect. They are based on the effort to use results already known, in order to dispense with the necessity of new investigations” [6, p. 121].

Historically, statistical information has been obtained from administrative records (registers), censuses and, since the second half of the 20th century, sample surveys. Each of these sources has its advantages and disadvantages and has been used, alone or in combination, to produce the information required for public administration. The interrelationship between the census and registers, especially in population statistics, dates from the beginning of the 19th century, when data from population registers and censuses were used to produce continuous annual population estimates, using procedures akin to the demographic balancing equation. It is interesting to note that countries started to discuss the substitution of the census with register-based data at the time when sample surveys were being introduced in the national statistical systems. Using solely register-based information to produce statistics required an integrated register system with unique identification numbers that can be interrelated, consisting not only of good quality base registers but also of a number of auxiliary registers that cover the whole range of data needs. This required a specific configuration of the national statistical system, which could be at variance with a system that used the three sources of statistical information. Denmark has abolished the census as a source of statistical information, and claims that all the data needs can be covered by registers and surveys [41, p. 45]. The issue is how the completeness of the register-based data sets can then be measured independently.

The situation with regard to census procedures used in the 2010 round is as follows: the large majority of countries used a well-tested scientific census methodology, while a small number of mostly relatively prosperous countries used the register-based approach, which has no theoretical basis and no standard procedures to produce census-like tabulations. The population and housing census, especially if it is cartography based, has well-established procedures to establish the completeness and quality of the collected information. It has a proven record of flexibility in incorporating new technologies. It is capable of measuring emerging phenomena and of producing data for the smallest areas required; it can enumerate populations with new or rare characteristics; and it produces intermediate and final products, including for small areas, which satisfy the needs of users [44]. The register-based census attempts to create primary data out of secondary sources, or to describe the present using the past. It does not observe and enumerate an objective observation unit, but tries to create one by combining information from different registers and sample surveys. It is not an observation method but an attempt to create a person with his required characteristics. The efforts to replace the census are not based on scientific or philosophical deficiencies of the census, but on technological developments and political, financial and social considerations. Some of the proponents of register-based statistics state that “The existing literature (e.g. …) are more or less descriptions of best practices and not based on a theoretical basis” [45, p. 3] or “There are no recognised terms or principles, which makes the development of register-based statistics and register-statistical methodology all the more difficult” [47, p. 3]. In addition to the lack of a theoretical foundation, the implementation of some of the procedures used also creates a number of additional challenges, such as the nonexistence of a standard methodology, the impossibility of establishing the completeness of the coverage and the quality of the individual variables, and the lack of methods to establish the validity and reliability of the mixed or hybrid variables, see [48, p. 17].

The use of registers in official statistics creates a number of challenges for the statistical profession. In this case the challenges to the census in population statistics are of concern. There are two fundamentally different approaches: the census, based on established and proven methodology and procedures, and the register-based census, for which there is no theoretical basis and which has a number of other methodological challenges. But the latter is said to be cheaper and does not create respondent burden. So there are two sets of approaches: one based on scientific principles and the other on practical considerations. The application of the register-based approach in the Nordic countries has been going on for 50 years or more. There have been continuous national and regional efforts to establish a theoretical basis for the approach. The situation at present is that it is known how to perform certain operations and why they are done (they are economical and do not burden the heads of households), but not what their theoretical bases are. The census is not only an instrument for population statistics but also a generalised method of data collection in use by a wide variety of scientific disciplines. Hence it is time for the statistical, or maybe even the scientific, community to review the status of the register-based approach, and in particular the status of the register-based census.

Science is self-correcting; the issue now becomes who are the guardians of statistics as a science? Also, which statisticians are qualified to discuss the issues confronting their science? When Petter Jakob Bjerve, then Director of the Central Bureau of Statistics of Norway, made an impassioned presidential address at the 39th Session of the International Statistical Institute in Vienna in August 1973 to promote cooperation among different types of statisticians, he referred to three types of statisticians, all persons who had an academic or professional training in statistics. He stated “members of the International Statistical Institute are engaged in the development and dissemination of statistical theory, in production and dissemination of statistical data, and, in application of subject-matter theory, statistical theory and data for analysis of subject-matter problems” [49, p. 13]. In those days the membership of the ISI was highly selective. However, when considering issues of professional ethics, a statistician is currently no longer a person with specialised training in and knowledge of a certain branch of science. According to the Preamble of the 2010 ISI Declaration on Professional Ethics: “For the purposes of this document, the definition of who is a statistician goes well beyond those with formal degrees in the field, to include a wide array of creators and users of statistical data and tools” [50]. This greatly expanded the coverage of the term “statistician”, so in this case of professional ethics the issue becomes who are the professional statisticians capable of assessing the methodology of the census? Or more generally, who are the guardians of the science of statistics and “will they be able to bell the cat”?

Notes

2 Since 1958 the United Nations has issued guidelines containing international standards for population and housing census taking for each of the decennial census rounds. In 1958 and 1969 separate guidelines were issued for population and housing censuses. From 1980 onward the Principles and Recommendations dealt with the population and the housing census in the same publication. In the text, reference is made to the publication dealing with the population census unless otherwise stated. These publications will be referred to in abbreviated form as P&R followed by the year of publication, hence P&R 2008 refers to the Principles and recommendations for population and housing censuses, Revision 2, published in 2008.

Acknowledgments

The author is indebted to many people who have contributed to his understanding of the principles of science, research design and the practice of census taking across the globe. The Director of the General Bureau of Statistics (ABS) of Suriname, Drs. Iwan Sno, M.Sc., deserves special mention for his long-standing confidence and continued dialogues on methodological matters. Colleagues of Statistics Canada and Statistics Norway, especially the staff of the library, have provided the author with relevant historical documents. Former colleagues and friends of the World Fertility Survey (WFS) have provided useful comments on a previous version of the paper. The author acknowledges with appreciation the very helpful comments of two anonymous reviewers. The author greatly appreciates the confidence of the editor of the special issue and the editor-in-chief.

References

[1] 

United Nations Statistical Commission. Forty-sixth session, 2010 and 2020 World population and housing census programmes, Report of the Secretary-General E/CN.3/2015/6. 2015; Available at: http://unstats.un.org/unsd/statcom/doc15/2015-6-Censuses-E.pdf.

[2] 

United Nations Statistics Division. Report on the Results of a Survey on Census Methods used by Countries in the 2010 Census Round. Working Paper UNSD/DSSB/1. n.d., p. 9-10. Available at: https://unstats.un.org/unsd/censuskb20/KnowledgebaseArticle10696.aspx.

[3] 

United Nations Economic Commission for Europe (UNECE). Measuring population and housing practices of UNECE countries in the 2010 round of censuses. New York and Geneva: United Nations, 2014.

[4] 

United Nations Department of Economic and Social Affairs, Statistics Division. Principles and recommendations for population and housing censuses, Revision 2. New York: United Nations, 2008.

[5] 

Meitzen A. Geschichte, Theorie, und Technik der Statistik. Berlin: Verlag von Wilhelm Hertz (Bessersche Buchhandlung), 1886.

[6] 

Meitzen A. History, theory, and technique of statistics. Philadelphia: American Academy of Political and Social Science: 1891. Available at: https://archive.org/details/historytheorytec00meitrich/page/n6.

[7] 

Groot AD de. Methodology: Foundations of inference and research in the behavioral sciences. The Hague, Paris. Mouton: 1969.

[8] 

Yule GU. An introduction to the theory of statistics. London. Charles Griffin and Company/Philadelphia. J.B. Lippincott Company: 1911.

[9] 

Deming WE. Principles of professional statistical practice. Annals of Mathematical Statistics. December 1965; 36(6): 1883-1900.

[10] 

Verrijn Stuart CA. Inleiding tot de beoefening der statistiek. Eerste deel. De statistische methode en hare toepassing op het gebied der demografie. Haarlem: De Erven F Bohn: 1910.

[11] 

Gehrmann R. German census-taking before 1871. Rostock: 2009. Available at: https://www.demogr.mpg.de/papers/working/wp-2009-023.pdf.

[12] 

Levi L. Resumé of the Statistical Congress, held at Brussels, September 11th, 1853, for the purpose of introducing unity in the statistical documents of all countries. Journal of the Statistical Society of London. 1854; 17(1): 1. Available at: doi: 10.2307/2338350.

[13] 

The National Archives, London. Domesday: Britain’s finest treasure. Available at http://www.nationalarchives.gov.uk/domesday/.

[14] 

Anonyme, Méthode generalle et facile pour faire le dénombrement des peuples. Paris: l’Imprimerie de la Veuve d’Antoine Chrestien: 1686. Available at: https://gallica.bnf.fr/ark:/12148/bpt6k5579078w.texteImage.

[15] 

Esmonin E. Quelques données inédites sur Vauban et les premiers recensements de population. Population. 1954; 9(3): 507-512. Doi: 10.2307/1524558. Available at: https://www.persee.fr/doc/pop_0032-4663_1954_num_9_3_3274.

[16] 

Encyclopaedia Britannica. Quimby R S. Sébastien Le Prestre de Vauban. Available at: https://www.britannica.com/biography/Sebastien-Le-Prestre-de-Vauban#ref7606.

[17] 

Vauban, La dime royale, Paris: Librairie de la Bibliothèque Nationale: 1897. Available at: https://gallica.bnf.fr/ark:/12148/bpt6k5727570s.texteImage.

[18] 

Vilquin É. Vauban, inventeur des recensements. Annales de démographie Historique. 1975; 207-257. Available at: https://www.persee.fr/doc/adh_0066-2062_1975_num_1975_1_1282.

[19] 

Sulte B, Histoire des Canadiens-Français 1608-1880. Origine, histoire, religion, guerres, découvertes, colonisation, coutume, Tome IV Chapitre IV. Montréal, Wilson & Cie, Editeurs: 89 Rue Saint-Jacques, 89: 1882. Available at: https://archive.org/details/HistoireDesCanadiens-franais4.

[20] 

The 1703 Census of Iceland. Available at: http://www.unesco.org/new/fileadmin/MULTIMEDIA/HQ/CI/CI/pdf/mow/iceland_1703_census.pdf.

[21] 

Status animarum of Münster 1749/50. Available at https://www.mgen.de/presse/586-status-animarum-von-174950.

[22] 

0-Status Animarum. Available at http://data.matricula-online.eu/de/deutschland/muenster/status-animarum/.

[23] 

Ministre de l’Intérieur, Recensement général (15 Octobre de 1846). Bruxelles: 1849.

[24] 

Brown S. Report on the Eighth International Statistical Congress, held at St. Petersburg, 22nd/10th August to 29th/17th, 1872. Journal of the Statistical Society of London. 1872; 35(4): 431. Doi: 10.2307/2338764.

[25] 

van Maarseveen J. Dutch Occupational Censuses 1849-1971/2001. A component of the Population Census. 9 Febr. 2008. Voorburg, The Netherlands: Central Bureau of Statistics. Available at: http://www.volkstellingen.nl/nl/relevant/jacquesm2/index.doc.

[26] 

Schulte Nordholt, E. The usability of administrative data for register-based censuses. Statistical Journal of the IAOS. 2018; 34(4): 1-12. DOI 10.3233/SJI-180425.

[27] 

De Neumann Spallart MFX. La fondation de l’Institut International de la Statistique: Aperçu historique. Bulletin de l’Institut International de Statistique. 1886; 1(1-2): 1-34. Available at http://gallica.bnf.fr/ark:/12148/bpt6k615433?rk=42918;4.

[28] 

United Nations Department of Economic and Social Affairs, Statistics Division. Post Enumeration Surveys: Operational guidelines; Technical Report. United Nations: New York: 2010.

[29] 

United Nations. Population census methods. United Nations: Lake Success, New York: 1949.

[30] 

United Nations Statistical Office. Principles and Recommendations for National Population Censuses. Statistical Papers Series M No.27, United Nations: New York: 1958.

[31] 

United Nations Statistical Office. General Principles for a Housing Census. Statistical Papers Series M No. 28. United Nations: New York: 1958.

[32] 

United Nations Department of Economic and Social Affairs, Statistical Office. Principles and Recommendations for the 1970 Population Censuses (Second printing with changes of a non-substantive nature). Statistical Papers Series M No. 44. United Nations: New York: 1969.

[33] 

United Nations Department of Economic and Social Affairs, Statistical Office. Principles and Recommendations for the 1970 Housing Censuses (Second printing with changes of a non-substantive nature). Statistical Papers Series M No. 45. United Nations: New York: 1969.

[34] 

United Nations Department of Economic and Social Affairs, Statistical Office. Principles and recommendations for population and housing censuses. Statistical Papers Series M No. 67. United Nations: New York: 1980.

[35] 

United Nations Department of Economic and Social Affairs, Statistical Office. Supplementary Principles and Recommendations for Population and Housing Censuses. Statistical Papers Series M No.67/Add. 1. United Nations: New York: 1990.

[36] 

United Nations Department of Economic and Social Affairs, Statistics Division. Principles and Recommendations for Population and Housing Censuses, Revision 1. Statistical Papers Series M No. 67/Rev. 1. United Nations: New York, 1997.

[37] 

United Nations Department of Economic and Social Affairs, Statistics Division. Principles and recommendations for population and housing censuses, Revision 3. Statistical Papers Series M No. 67/Rev. 3. United Nations: New York, 2017.

[38] 

Nations Unies, Département des affaires économiques et sociales, Division de statistique. Principes et recommandations concernant les recensements de la population et des logements. Deuxième révision. Nations Unies: New York, 2009.

[39] 

Government Digital Service. Downey P. The characteristics of a register. 13 October 2015. Available at https://gds.blog.gov.uk/2015/10/13/the-characteristics-of-a-register/.

[40] 

U.S. Bureau of Labor Statistics. National Compensation Measures: Concepts. Available at https://www.bls.gov/opub/hom/ncs/concepts.htm.

[41] 

Lange A. The population and housing census in a register based statistical system. Statistical Journal of the IAOS. 2014; 30: 41-45. Doi 10.3233/SJI-140798.

[42] 

Organisation for Economic Co-operation and Development (OECD). Glossary of Statistical Terms. Paris: 2007. Available at: http://stats.oecd.org/glossary.

[43] 

Everitt BS, Skrondal A. Cambridge Dictionary of Statistics, 4th edition. Cambridge University Press: Cambridge, 2010.

[44] 

MacDonald AL. The population census: counting people because they count. In Macfarlane SB, Abou Zahr C, editors. The Palgrave Handbook of Global Health Data Methods for Policy and Practice. 2019; 105-123. Doi: 10.1057/978-1-137-54984-6_6.

[45] 

United Nations Economic Commission for Europe. Conference of European Statisticians. Statistical Office. Joint UNECE/Eurostat Expert Group Meeting on Register-Based Censuses (The Hague, The Netherlands, 10–11 May 2010). Statistics Netherlands. Micro-integration: State of the Art. Working paper 10. 12 May 2010.

[46] 

Groves RM, Fowler FJ, Jr., Couper M, Lepkowski JM, Singer E, Tourangeau R. Survey Methodology. New York: Wiley Interscience, 2004.

[47] 

Wallgren A, Wallgren B. Register-based statistics. Statistical methods for administrative data. Second edition, Chichester, UK: John Wiley & Sons Ltd: 2014.

[48] 

Bakker, BFM. Estimating the validity of administrative variables. Statistica Neerlandica. 2012; 66(1): 8-17.

[49] 

Bjerve PJ. Co-operation among Statisticians. Central Bureau of Statistics, Norway. Two addresses on statistical cooperation: Oslo. 1976; 11-19.

[50] 

International Statistical Institute. Declaration on professional ethics. Reykjavik: 2010; Available at: https://www.isi-web.org/images/about/Declaration-EN2010.pdf.