Synthetic establishment microdata around the world


Statistical agencies around the world release many public-use microdata samples for individuals and households, but virtually no establishment or firm microdata are available. In large part, the difficulty in providing access to business microdata is due to the skewed and sparse distributions that characterize business data. Synthetic data are simulated data generated from statistical models. We organized sessions at the 2015 World Statistical Congress and the 2015 Joint Statistical Meetings that highlighted work on synthetic establishment microdata. This overview situates those papers, published in this issue, within the broader literature.


