Affiliations: Department of Basic Sciences, College of Veterinary Medicine, Mississippi State University, Starkville, MS, USA | Department of Computer Science, Queens College/City University of New York, New York, NY, USA | Division of Biostatistics, Department of Medicine, School of Medicine, Indiana University, Bloomington, IN, USA
Note: Corresponding author: Xiu-Feng Wan, Department of Basic Sciences, College of Veterinary Medicine, Mississippi State University, Starkville, MS, USA. Tel.: +1 662 325 3559; Fax: +1 662 325 3884; E-mail: [email protected] or [email protected]
Abstract: Influenza A viruses have caused substantial loss of life around the world and continue to present a major public health challenge. In April 2009, a novel swine-origin H1N1 virus emerged in North America and caused the first pandemic of the 21st century. By the end of 2009, two waves of outbreaks had occurred, after which the disease moderated. It is critical to understand how this novel pandemic virus invaded and adapted to the human population. To understand the molecular dynamics and evolution of this pandemic H1N1 virus, we applied an Expectation-Maximization algorithm to estimate the Gaussian mixture in the genetic population of the hemagglutinin (HA) gene of these H1N1 viruses from April 2009 to January 2010 and compared the results with those for the viruses that cause seasonal H1N1 influenza. Our results show that, after the virus was introduced into the human population, the HA gene of the 2009 H1N1 virus changed its population structure from a single Gaussian distribution to two major Gaussian distributions. The breadth of HA genetic diversity of the 2009 H1N1 virus also increased from the first wave to the second wave of the pandemic. Phylogenetic analyses demonstrated that only certain HA sublineages of 2009 H1N1 viruses were able to circulate throughout the pandemic period. In contrast, the HA population structure of seasonal H1N1 virus was relatively stable, and the breadth of HA genetic diversity within a single-season population remained similar. This study reveals an evolutionary mechanism for a novel pandemic virus: after the virus is introduced into the human population, it expands its molecular diversity through both random mutation (genetic drift) and selection. Eventually, multiple levels of hierarchical Gaussian distributions replace the earlier single distribution. We propose an evolutionary model for pandemic H1N1 influenza A virus and demonstrate it with a simulation.
Keywords: Mixture model analysis, influenza A virus, influenza pandemic, Expectation-Maximization, population structure, hemagglutinin
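As a rough illustration of the mixture-model idea summarized in the abstract, the following minimal sketch fits a one-dimensional Gaussian mixture with a plain Expectation-Maximization loop and compares a one-component fit against a two-component fit. It is not the authors' implementation. The function names `fit_gmm_1d` and `bic`, the quantile-based initialization, the variance floor, and the use of the Bayesian information criterion for comparing component counts are all assumptions made for this example, and the input is synthetic data rather than HA sequence data.

```python
import numpy as np


def _joint(x, weights, means, variances):
    # Weighted Gaussian density of every observation under every component.
    dens = np.exp(-0.5 * (x[:, None] - means) ** 2 / variances)
    dens /= np.sqrt(2.0 * np.pi * variances)
    return weights * dens


def fit_gmm_1d(x, k=2, n_iter=500, tol=1e-9):
    """Fit a k-component 1-D Gaussian mixture with EM (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    # Assumed initialization: means at evenly spaced quantiles,
    # a shared variance, and equal mixing weights.
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var() + 1e-6)
    weights = np.full(k, 1.0 / k)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        joint = _joint(x, weights, means, variances)
        total = np.maximum(joint.sum(axis=1, keepdims=True), 1e-300)
        resp = joint / total
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0)
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
        # Log-likelihood of the parameters that entered this iteration.
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    # Final log-likelihood under the updated parameters.
    final = np.maximum(_joint(x, weights, means, variances).sum(axis=1), 1e-300)
    order = np.argsort(means)
    return weights[order], means[order], variances[order], np.log(final).sum()


def bic(log_likelihood, k, n):
    # A k-component 1-D mixture has 3k - 1 free parameters.
    return -2.0 * log_likelihood + (3 * k - 1) * np.log(n)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic stand-in for a population with two subgroups.
    data = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(10.0, 1.0, 300)])
    for k in (1, 2):
        w, m, v, ll = fit_gmm_1d(data, k=k)
        print(k, np.round(w, 3), np.round(m, 3), round(bic(ll, k, data.size), 1))
```

On clearly bimodal input such as the synthetic sample above, the two-component fit should obtain the lower BIC, which mirrors the kind of shift from one Gaussian distribution to two that the abstract describes.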