Affiliations: [a] right.basedonscience, Frankfurt 60314, Germany | [b] Faculty of Automotive Engineering and Production, Cologne University of Technology, Arts and Sciences, Cologne 50679, Germany | [c] Bristol Business School, University of the West of England Bristol, Bristol, UK
Corresponding author: Felix Ritchie, Bristol Business School, University of the West of England Bristol, Coldharbour Lane, Bristol BS16 1QY, UK. Tel.: +44 117 32 81319; E-mail: [email protected].
Abstract: When producing anonymised microdata for research, national statistics institutes (NSIs) identify a number of ‘risk scenarios’ of how intruders might seek to attack a confidential dataset. This approach has been criticised for focusing only on data protection, without sufficient reference to other aspects of confidentiality management, and for emphasising theoretical possibilities rather than evidence-based attacks. An alternative ‘user-centred’ approach offers more efficient outcomes and is more in tune with the spirit, as well as the letter, of data protection legislation. The user-centred approach has been successfully adopted in controlled research facilities. However, it has not been systematically applied beyond these specialist facilities. This paper shows how the same approach can be applied to distributed data with limited NSI control. It describes the creation of a scientific use file (SUF) for business microdata, which are traditionally hard to protect. This case study demonstrates that an alternative perspective can produce dramatically different outcomes compared with established anonymisation strategies; in the case study discussed, the alternative approach reduces perturbation of continuous variables from 100% to under 1%. The paper also considers the implications for future developments in official statistics, such as administrative data and ‘big data’.
Keywords: Statistical disclosure control, risk management, statistical data confidentiality, data anonymisation