Affiliations: School of Mathematical and Physical Sciences, University of Newcastle, University Drive, Callaghan NSW, Australia
Corresponding author: Duy Tran, School of Mathematical and Physical Sciences, University of Newcastle, University Drive, Callaghan NSW, Australia. E-mail: [email protected]
Abstract: Data aggregation often occurs because of data collection methods or confidentiality laws imposed by government and institutional organisations. This practice protects an individual’s privacy, but it means that only selected information is released. When only aggregate data are available, it is difficult to draw conclusions about the association between categorical variables. This issue lies at the heart of Ecological Inference (EI) and is of growing concern for data analysts, especially those dealing with the aggregate analysis of a single, or multiple, 2 × 2 contingency tables. A number of EI approaches are currently available for analysing aggregated data, but their success has been mixed because of the variety of assumptions made about the individual level data, or about the models developed to analyse them. As an alternative to ecological inference, one may consider the Aggregate Association Index (AAI). This index gives the analyst an indication of the likely association structure between the two categorical variables of a single 2 × 2 contingency table when the individual level, or joint frequency/proportion, data are unknown. To date, the AAI has been developed only for the analysis of a single 2 × 2 table. The purpose of this paper is therefore to extend the AAI to the case where aggregated data from multiple 2 × 2 tables (i.e. stratified 2 × 2 tables) require analysis. To illustrate this extension, New Zealand voting data from 1893 are studied with a focus on gender. These data comprise fifty-five electorates, for each of which only the marginal information of a 2 × 2 table is available. The data are important because the 1893 election was the first in the world in which gender equality in voting was recognised at a national level.
Keywords: 2 × 2 tables, aggregate data, marginal information, ecological inference, Aggregate Association Index