Infectious diseases can inflict immense losses and suffering on the human population. As at 23 of June, 2020 COVID-19 pandemic has caused 20,919 cases, 25 deaths and 7,109 had recovered in Nigeria. Nigeria Centre for Disease Control (NCDC) is tracing COVID 19 carriers for designing effective control measures and to prevent the spread. Authors have modeled COVID-19 cases, but there is a dearth of information on estimating the total number of hidden COVID-19 carriers in the population. Adaptive cluster sample was used for exploring populations of hidden COVID-19 carriers. The data on daily cases of COVID-19 were extracted from NCDC website. Nigeria population was partitioned into 37 regions (states and FCT). We considered a model based approached in Bayesian framework to make inference about the number of COVID-19 carriers in Nigeria. The fitted model showed that all COVID-19 carriers will only be captured at once if contact tracing is combined with methodology designed in this work.
The Coronavirus Disease 2019 (COVID-19) is an emerging respiratory infectious disease caused by SARS-CoV-2 (also known as 2019-nCoV), which first occurred in early December 2019 in Wuhan, China. According to the , COVID-19 situation report of 23 June 2020, there were 9,228,526 confirmed cases and 425,123 deaths in 216 countries, areas or territories. The pandemic is accelerating in Africa, it took 98 days to reach 100,000 cases and only 18 days to move to 200,000 cases. Ten out of 54 countries in Africa are currently driving the rise in numbers, accounting for nearly 80% of all the cases. More than 70% of the deaths are taking place in only five countries: Algeria, Egypt, Nigeria, South Africa and Sudan. World Health Organisation (WHO) and other partners have supported governments with scaling up health workforce and laboratory capacities and to set up points-of-entry screening at airports and border crossings. These public health and social measures have been effective in slowing the spread of COVID-19 in Africa. The first COVID-19 confirmed case in Nigeria was reported on 27th February 2020, when an Italian citizen in Lagos tested positive for the virus . The second case was recorded on the 9th March 2020 in Ewekoro, Ogun State, a Nigerian citizen who had contact with the Italian citizen was reported by . Within the first month, the confirmed cases were around 70 but drastically increased to number almost 1,350 cases before the end of the second month and the discharged cases increased from 3 to about 250. In Figs 1–3 there was a sharp increase in the cases towards the end of April. As at 23, June 2020  reported the number of cases is 20,919, 525 deaths and 7,109 had recovered.
Adaptive Cluster Sampling (ACS) is a sampling design that target hidden events that are spatially inconsistent, and can be used to collect valuable information with respect to temporal and spatial characteristics of individuals and areas  reviewed in .
Adaptive designs can offer significant ethical and cost advantages over standard fixed techniques. Naturally, adaptive sampling scheme maximizes survey effort where it is most valuable and minimizes survey effort elsewhere. It performs better for sparse and clustered populations than standard grid cell sampling methods.
Authors have modeled COVID-19 cases, but there is a dearth of information on estimating the total number of hidden COVID-19 carriers in Nigeria. Adaptive cluster sample was used for exploring populations of hidden COVID-19 carriers in Nigeria. The data on daily cases of COVID-19 were extracted from NCDC website. Nigeria population was partitioned into 37 regions (36 states and FCT). We considered a model based approach in Bayesian framework to estimate the number of COVID-19 carriers in Nigeria.
Let Nigeria population (R), be defined to consist of 774 local government areas (LGA). Then, , such that all LGA are assigned to regions (36 states and the Federal Capital Territory (FCT)), based on the vector of parameters according to a multinomial distribution, such that is the probability of a COVID-19 carrier being assigned to region .
Considering that the COVID-19 carriers are sparse and clustered, let controls the expected number of COVID-19 carriers in a region, controls the conditional expected number of regions with COVID-19 carriers and be fixed arbitrarily, to be expected number of confirmed COVID-19 cases in each region.
Based on the above prior knowledge, we considered a Bayesian-based model, the prior distribution of and , the prior distribution of is the product of two Gamma distributions. is equal to the density of Gamma with , also follows Gamma distribution as given by .
2.1Region inclusion probability
Suppose Nigeria population U with COVID-19 carriers is partitioned into regions (36 States and FCT), then 37 regions, in line with the methodology proposed by [5, 8, 9, 10, 6]. Assume each region can be assigned a value 1 or 0, if there is presence or absence of COVID-19 carriers upon being sampled. Also, the 1 regions are spatially clustered and 0 regions are left blank. Let denote the numbers of COVID-19 carriers associated with region .
Assume 37 regions are selected from the 37 regions via a sampling design, and the presence/absence value v is recorded. Also (i) whenever 1 for a sampled region all neighboring regions will subsequently be added to the sample and (ii) no additional sample will occur in the neighbourhoods of 0 sampled units. Although a neighbourhood can be define in many ways, but neighbourhood of a region represent itself and its nearest neighbours to the left, right, top and down. This is an example of an Adaptive cluster sampling design in .
2.1.1Region inclusion probabilities based on an initial simple random sample
The network inclusion probabilities based on an initial simple random sample (SRS) will be as follows.
Let be define as
For a region of size with COVID-19 carriers and non COVID-19 carriers, is the hypergeometric probability of not selecting any of the t units of the first type in a sample of size . Therefore for region of size the inclusion probability is
When including network of size and network of size the inclusion probability requires the use of the principle of inclusion/exclusion.
2.1.2A model for number of COVID-19 carriers in 36 states and FCT of Nigeria
We considered Nigeria population () which contains sparse and clustered population of COVID-19 carriers. The population was divided into 37 (36 states and Federal Capital Territory (FCT)) regions. These regions were bounded and the boundaries were considered as grid cell (region). A region is nonempty if it contains at least one COVID-19 carrier and empty otherwise. Let be the number of regions with COVID-19 carriers in . Let be the number of COVID-19 networks (collection of neighboring regions) in the Nigeria population and denote the number of regions with COVID-19 carriers with each networks so that . As there are regions without COVID-19 which are defined to be networks of regions without COVID-19 of size one, therefore, there are networks in
The variables describe the number of regions in each COVID-19 carrier network. It is then convenient to extend the -vector Y to the (- vector of dimension where is the vector of networks without COVID-19 carriers, with dimension . by defining for each network of regions with COVID-19 carriers and Zi 1 for each network of regions without COVID-19 carriers. Note that and contain the same information about the networks. Let denote the number of COVID-19 carriers in each network. are regions with COVID-19 carriers, so it convenient to write where is the number of Networks with COVID-19 carriers and is the () vector of zeros (Network without COVID-19 carriers). Therefore, inference was made about N (the total number of COVID-19 carriers in R (Nigeria)).
As June 22, 2020, 35 (states with COVID-19 carriers in Nigeria), and 1 (states without COVID-19 carriers in Nigeria; Cross River state). We constructed a model for the number of network in R by specifying the joint distribution of , , Y and N for the entire population and the sampling mechanism which leads to a particular sample of out of networks. The sampling mechanism describes the selection of networks, so can only be defined conditionally on the network structure described by , and . Then, we specified the joint distribution of , , and for . We first model the structure of network with/without COVID-19 carriers (determined by , and ) and then, conditionally on the network structure, model the number of COVID-19 carriers (N) networks of regions with COVID-19 carriers. The sums we need to evaluate to obtain the sample level likelihood are too complicated to be evaluated analytically. We have therefore chosen to use a Markov Chain Monte-Carlo (MCMC) approach to estimate the unknown parameters , and in the model and the unobserved quantities , , and . We used Bayesian to obtain a natural predictive distribution for predicting , the natural predictor of is given as
2.1.3Fitting model for COVID-19 Carriers in Nigeria
We have three unknown parameters , and which are apriori independent and their marginal distributions were denoted by and respectively, the joint distribution of all the quantities in the model (including the parameters) is
|Day||Samples tested||Confirmed COVID-19 cases in nigeria||Estimated COVID-19 carriers from the model in nigeria||Samples to be tested||Remaining COVID-19 carriers|
Gibbs sampler was used to fit the joint distributions with the following steps;
i. State initial values for the unsampled components , and
ii. Generate from the conditional distribution
iii. Generate from the conditional distribution
iv. Generate from the conditional distribution
v. Generate ( from the conditional distribution
vi. Iterate from ii.
The R package was used to fit the models and Winbugs by  software was also used to implement the MCMC algorithm.
The main goal of MCMC is to gather a representation of the true but incalculable posterior distribution. The diagnostics give confidence that samples are converging to a reasonable representation of the targeted posterior. In this study, Trace plot, correlogram, Geweke, Gelman-Rubin and Raftery-Lewis were used for the diagnostics check.
This is useful to check to get how correlated the proposed parameters are through the iterations, there is need to plot the autocorrelation of the chains. It will give the sense of the efficiency of the MCMC algorithm.
2.1.7Effective sample size
The effective sample size in this context is defined as the appropriate number of people to be tested for COVID-19 that could give actual estimate of total number of COVID-19 carriers in Nigeria. The effective sample size, follows directly from correlogram as
where is the autocorrelation at lag .
3.Results and discussion
The fitted model showed that the sample were converged to reasonable representation of the target posterior (Fig. 6). From diagnostics check, the trace plots showed that the chains were mixed extremely fast and concentrated around the mode of the posterior distribution and also stationary as shown Figs 5 and 6. Plotting the Autocorrelogram in Fig. 4, the three chains autocorrelation is quite similar at each lag.
Table 1 shows the number of tested individual for COVID-19, the confirmed COVID-19 cases, estimated total of hidden COVID-19 carriers in Nigeria, the appropriate number of individual to be tested in Nigeria such that all carriers would be in the selected samples and the remaining number of unidentified COVID-19 carriers in Nigeria for days 1 May 2020, 15 May 2020, 1 June 2020, 15 June 2020 and18 June 2020. It was observed that for all the considered days confirmed cases were lesser than the estimated total COVID-19 carriers in the population. This implies that the contact tracing approach used by the NCDC may not be sufficient to identify all COVID-19 carriers in the population at once. The contact tracing approach needs to be combined with appropriate sampling techniques for Nigeria government to be able to reduce the spread of COVID-19. In May 1 2020, NCDC confirmed 238 while 289 COVID-19 carriers were estimated with 51 cases still remaining in the population unidentified thereby continuing the spread of the virus. For 15 May, 2020 confirmed COVID-19 carriers were 288 while 328 COVID-19 carriers were estimated with 40 cases still remaining in the population unidentified. For 1 June, 2020 confirmed COVID-19 carriers were 416 while 451 COVID-19 carriers were estimated with 35 cases still remaining in the population. For 15 June, 2020 confirmed COVID-19 carriers were 573 while 632 COVID-19 carriers were estimated with 59 cases still remaining in the population. For 18 June, 2020 confirmed COVID-19 carriers were 745 while 781 COVID-19 carriers were estimated with 36 cases still remaining in the population (Table 1). All unidentified carriers for each of the days were responsible for further spreading of the virus.
The relevance of the estimation is to guide statisticians to know the estimated number of COVID-19 carriers within a population together with appropriate sample to be selected strategically and adaptively such that all carriers would be captured. Adaptive Cluster Design is a specialized sampling procedure for identifying hidden events such as COVID-19 carriers in a population. The proposed approach and contact tracing can work concretely by involving Statisticians in the contact tracing team. Statisticians will use their professional skills to handle all challenges of not at home, hard to reach, non- response due to sensitive issues, hard core, lack of knowledge of the estimated number of COVID-19 carriers and lack of appropriate knowledge and strategy by medical personnel for identify Covid-19 carriers during contact tracing.
We have developed a model for estimating total numbers of COVID-19 carriers in Nigeria. The fitted model shows that all COVID-19 carriers will only be captured at once if contact tracing is combined with methodology designed in this work. The existing contact tracing team lacks involvement of Statisticians. This work recommends involvement of Statisticians in contact tracing team so as to handle all challenges of not at home, hard to reach, non- response due to sensitive issues, hard core, lack of knowledge of the estimated number of COVID-19 carriers and lack of appropriate knowledge and strategy by medical personnel for identify Covid-19 carriers during contact tracing.
We thank the National Center for Disease Control in Nigeria for making number cases of COVID-19 patients available online. We also acknowledge the reviewers for their constructive corrections.
World Health Organization. Novel coronavirus (2019-nCoV) situation report-41. https://wwwwhoint/docs/default-source/coronaviruse/situation-reports/20200301-sitrep-41-covid-19pdf?sfvrsn=6768306d_2 2020. Accessed on June 23, 2020 2, 2020.
MacLean OA, Orton R, Singer JB, Robertson DL Response to “On the origin and continuing evolution of SARS-CoV-2”. http://virological.org/t/response-to-on-the-origin–and-continuing-evolution-of-sars-cov-2020. 2/418.
Wiki. http://en.wikipedia.org/wiki/2020 coronavirus pandemics in Nigeria. Accessed June 23, 2020.
NCDC. https://covid19.ncdc.gov.ng. Accessed June 18, 2020.
Thompson, S. Adaptive cluster sampling. Journal of the American Statistical Association; 1990. 85: 1050-1059.
Borkowski J. Network inclusion probabilities and Horvitz-Thompson estimation for adaptive simple Latin square sampling. Environmental and Ecological Statistics; 1999. 6: 291-311.
Kelly Cristina MG Fernando ASM A mixture model for rare and clustered populations under adaptive cluster sampling. Bayesian Analysis; 2016. 11(2), 519-544.
Thompson S. Adaptive cluster sampling: Designs with primary and secondary units. Biometrics; 1991a. 47: 1103-1115.
Thompson S. Stratified adaptive cluster sampling. Biometrika; 1991b, 78: 389-397.
Thompson S, Seber G. Adaptive Sampling. New York: John Wiley & Sons, 1996.
Lunn DJ, Thomas A, Best N, Spiegelhalter D. Winbugs-a bayesian modelling framework: concepts, structure, and extensibility. Statistics and Computing; 2000: 10(4): 325-337.
Worldometer. https://www.worldometers.info/coronavirus/#countries. Accessed on June 23, 2020 2.