
Hyperparameters optimization for ResNet and Xception in the purpose of diagnosing COVID-19

Abstract

COVID-19 has been declared a global pandemic. Recently, researchers have been using deep learning networks for the diagnosis of medical diseases. Some of this research focuses on optimizing deep neural networks to enhance network accuracy. Optimizing a Convolutional Neural Network involves testing various networks obtained by manually configuring their hyperparameters; the configuration with the highest accuracy is then implemented. Each time a different database is used, a different combination of hyperparameters is required. This paper introduces two COVID-19 diagnosis systems, based on the Residual Network and the Xception Network, both optimized by random search with the purpose of finding optimal models that give better diagnosis rates for COVID-19. The proposed systems showed that tuning the hyperparameters of the ResNet and the Xception Net using random search optimization gives more accurate results than other techniques, with accuracies of 99.27536% and 100%, respectively. We conclude that hyperparameter tuning using random search optimization, for either the tuned Residual Network or the tuned Xception Network, gives better accuracies than other techniques for diagnosing COVID-19.

1 Introduction

The spread of COVID-19 around the world has quarantined many people and affected many industries, which has had a negative effect on people’s lives and countries’ economies.

Chest radiography (X-ray) is one of the most popular methods used for pneumonia diagnosis. It is a cheap and fast diagnostic method [1]. A chest X-ray gives the patient a lower radiation dose than computed tomography (CT). However, correct diagnosis from X-ray images requires expert knowledge [1]. Experts look for certain patterns in the chest image, which are the common potential findings that give high confidence in the presence of COVID-19. These include the perilobular pattern, ground-glass opacity (GGO) ± crazy-paving and consolidation, air bronchograms, and the reverse halo [2].

Despite the effectiveness of chest X-ray images obtained using radiology imaging techniques, it is difficult for a relatively small number of expert physicians to provide an accurate evaluation of a large number of X-ray images. Thus, there is a significant need for Computer-Aided Diagnosis (CAD) systems that can help radiologists improve the accuracy of X-ray interpretation. With the advances in deep learning techniques employed in CAD systems, radiologists can enhance their diagnostic sensitivity by 10%. Due to the unknown characteristics of the novel coronavirus disease 2019 (COVID-19), it is critical to develop an efficient system for detecting positive cases as early as possible, which helps prevent further spread and allows affected patients to be treated quickly.

Various techniques have been investigated for medical imaging, such as multi-criteria decision-making and entropy optimization models [3, 4]. Several studies examined distinct machine/deep learning methods for medical imaging diagnosis [5–7]. Deep learning has achieved remarkable success in computer vision for medical imaging [8]. Convolutional neural networks (CNNs) are the most popular networks in computer vision. CNNs have been applied to many medical image classification problems due to their successful extraction of image features [9]. There are several types of deep CNNs, including the Visual Geometry Group Network (VGG-Net) [10], the Residual Network (ResNet) [11], the Dense Convolutional Network (DenseNet) [12], Inception [13] and Xception [14].

VGG-Net is a model pre-trained on the ImageNet dataset. Due to its good generalization performance, VGG-Net can improve classification accuracy. There are two versions of VGG-Net, VGG16 and VGG19, which differ in depth: VGG19 is deeper than VGG16. The VGG16 network has thirteen convolutional layers and three fully-connected layers, while the VGG19 network has sixteen convolutional layers and three fully-connected layers. VGG19 is more expensive to train than VGG16 [15].

The DenseNet network has many benefits, as it can alleviate the vanishing-gradient problem, reinforce feature propagation, encourage feature reuse, and reduce the number of parameters. DenseNet121 has 121 layers and can be loaded with weights pre-trained on the ImageNet database.

The Inception network, or GoogLeNet, is a 22-layer network. There are three versions of Inception: Inception V1, Inception V2 and Inception V3. Each block in the Inception V3 network has various convolutions, maximum pooling, average pooling, dropouts, and fully-connected layers.

The depth and width of a neural network are two important factors that determine its complexity. It is commonly observed that, beyond a certain point, the training error of a plain network increases as its depth increases. The ResNet was proposed to solve this problem. The ResNet accuracy exceeds that of traditional networks because it solves the training difficulty caused by network depth [15]. ResNets have fewer filters and lower complexity than VGG nets. They also converge faster and achieve better training results because of their greater depth and better feature learning [11, 27].

The Xception network consists of a linear stack of depthwise separable convolution layers with residual connections. Xception is an enhancement of Inception that replaces the regular Inception modules with depthwise separable convolutions. Xception and Inception-v3 have the same model size. In the original Inception module, there is a non-linearity after the first operation. In Xception, there is no intermediate ReLU non-linearity.

Building CNNs requires adjusting a number of configuration settings, which are usually tuned manually by the machine learning researcher. The variables that define the network structure and control how the network is trained are called hyperparameters [16]. Hyperparameter optimization mainly aims at achieving a satisfactory model or minimizing the loss. However, testing all possible hyperparameter configurations is computationally very expensive [11]. For this reason, efficient hyperparameter optimization is much needed and has been the subject of many studies. The most common techniques used for hyperparameter tuning are random search, grid search and manual search [17]. In [17], it was found that randomized trials are more efficient for hyperparameter optimization than grid search and are also easier to implement.

The main contributions of this work can be summarized as follows: (1) A novel COVID-19 diagnosis system using chest X-ray images is proposed, based on hyperparameter optimization of the Residual Network and the Xception Network. (2) Different pretrained convolutional neural networks (CNNs), including ResNet50, ResNet50V2, ResNeXt50, ResNet101, ResNet101V2, ResNeXt101 and Xception Net, are investigated for diagnosing COVID-19. (3) The hyperparameters of all CNN models under investigation are tuned automatically using the random search technique, without manual tuning, to achieve the best diagnosis performance. The diagnosis system is applied to Mendeley chest X-ray images, and the experimental results show its high performance compared with other recent methods in terms of test accuracy for both the Xception and the ResNet networks.

The paper is organized as follows: Part 2 discusses related work. Part 3 introduces the Residual Networks, the Xception Network, random search optimization, the dataset used (the Mendeley Augmented COVID-19 X-ray Images Dataset) and our proposed model. Part 4 presents the experimental results, and Part 5 presents the discussion. A comparison with other work is presented in Part 6. Finally, the conclusions of the proposed optimized system and the future work are provided in Part 7.

2 Related works

A number of studies have been carried out on the diagnosis of COVID-19 using artificial intelligence methodologies.

In reference [19], Majeed et al. proposed Convolutional Neural Network (CNN) architectures, such as Xception and DenseNet, that are suitable for small datasets. In addition, class activation maps were used to test the accuracy of the CNNs. In reference [20], the researchers focused on a model that could diagnose COVID-19 the way radiologists do, but in less time, and they achieved a performance of 65%. In reference [21], Butt et al. used ResNet23 and the classical ResNet-18 and recorded 86.7% accuracy on CT scan images. In [22], M. Rahimzadeh and A. Attar proposed a neural network that concatenates the Xception and ResNet50V2 networks for diagnosing COVID-19 and achieved an average accuracy of 91.4%.

In reference [23], R. Jain et al. compared the Inception V3, Xception, and ResNeXt models by analyzing their performance on 6432 chest X-ray scans collected from the Kaggle repository, and they recorded the highest accuracy, 97.97%, for the Xception model.

3 Methods

Section 3.1 describes the dataset used in this research. Sections 3.2 and 3.3 give an overview of the Residual Networks and the Xception Network. Finally, Section 3.4 introduces hyperparameter tuning, random search optimization and the proposed optimized systems.

3.1 Dataset

Mendeley has recently released the Augmented COVID-19 X-ray Images Dataset [18]. This dataset contains augmented chest X-ray images for COVID-19 detection. It consists of 912 COVID-19 images and 912 Non-COVID-19 images. All images are in the .jpeg format.

As shown in reference [24], data augmentation increases the accuracy of classification tasks because it allows a neural network to learn from the augmented variations, which improves its ability to classify images correctly.

The dataset was divided into three subsets (train/validation/test). The training set contains 1534 images (767 COVID-19 and 767 Non-COVID-19), the independent validation set contains 152 images (76 COVID-19 and 76 Non-COVID-19) and the independent test set contains 138 images (69 COVID-19 and 69 Non-COVID-19). Figure 1 shows samples of the COVID-19 and Non-COVID-19 X-rays present in the database.
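As a quick consistency check on these numbers, the following minimal Python sketch recomputes the split totals and proportions from the per-class counts given above. The dictionary and function names are illustrative only and are not taken from the authors' code.

```python
# Per-class image counts (COVID-19, Non-COVID-19) as reported in the text.
splits = {
    "train": (767, 767),
    "validation": (76, 76),
    "test": (69, 69),
}

def split_summary(splits):
    """Return images per split, the overall total, and each split's share."""
    totals = {name: sum(counts) for name, counts in splits.items()}
    grand_total = sum(totals.values())
    shares = {name: n / grand_total for name, n in totals.items()}
    return totals, grand_total, shares

totals, grand_total, shares = split_summary(splits)
for name in splits:
    print(f"{name}: {totals[name]} images ({shares[name]:.1%})")
print("total:", grand_total)
```

The three subsets add up to the 1824 images (912 per class) of the dataset, which corresponds to roughly an 84/8/8 split.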

Fig. 1

(a) and (b) Samples of COVID-19 images (c) and (d) Samples of Non-COVID-19 images.


3.2 Residual networks

The Residual Network (ResNet) is one of the most important deep neural networks [25]. ResNets consist of convolution, pooling, activation, and fully-connected layers arranged one after the other. There are many ResNet architectures: those built from two-layer blocks (for example, ResNet 18 and 34) and those built from three-layer blocks (for example, ResNet 50, 101 and 152). An overview of the different architectures is shown in Table 1 [11, 25, 26].

Table 1

Architecture of ResNet 18, 34, 50, 101 and 152 [11, 26]

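| Layer name | Output size | 18 layers | 34 layers | 50 layers | 101 layers | 152 layers |
|---|---|---|---|---|---|---|
| Conv1 | 112×112 | 7×7, 64, stride 2 | 7×7, 64, stride 2 | 7×7, 64, stride 2 | 7×7, 64, stride 2 | 7×7, 64, stride 2 |
| Conv2 (pooling) | 56×56 | 3×3 max pool, stride 2 | 3×3 max pool, stride 2 | 3×3 max pool, stride 2 | 3×3 max pool, stride 2 | 3×3 max pool, stride 2 |
| Conv2 (blocks) | 56×56 | [3×3, 64; 3×3, 64] ×2 | [3×3, 64; 3×3, 64] ×3 | [1×1, 64; 3×3, 64; 1×1, 256] ×3 | [1×1, 64; 3×3, 64; 1×1, 256] ×3 | [1×1, 64; 3×3, 64; 1×1, 256] ×3 |
| Conv3 | 28×28 | [3×3, 128; 3×3, 128] ×2 | [3×3, 128; 3×3, 128] ×4 | [1×1, 128; 3×3, 128; 1×1, 512] ×4 | [1×1, 128; 3×3, 128; 1×1, 512] ×4 | [1×1, 128; 3×3, 128; 1×1, 512] ×8 |
| Conv4 | 14×14 | [3×3, 256; 3×3, 256] ×2 | [3×3, 256; 3×3, 256] ×6 | [1×1, 256; 3×3, 256; 1×1, 1024] ×6 | [1×1, 256; 3×3, 256; 1×1, 1024] ×23 | [1×1, 256; 3×3, 256; 1×1, 1024] ×36 |
| Conv5 | 7×7 | [3×3, 512; 3×3, 512] ×2 | [3×3, 512; 3×3, 512] ×3 | [1×1, 512; 3×3, 512; 1×1, 2048] ×3 | [1×1, 512; 3×3, 512; 1×1, 2048] ×3 | [1×1, 512; 3×3, 512; 1×1, 2048] ×3 |
| Output | 1×1 | Avg pool, fully connected, softmax | Avg pool, fully connected, softmax | Avg pool, fully connected, softmax | Avg pool, fully connected, softmax | Avg pool, fully connected, softmax |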

In ResNet 50, each two-layer block of the 34-layer net is replaced with a three-layer block, resulting in a 50-layer ResNet, as shown in Table 1. ResNet 50 requires 3.8 billion floating-point operations (FLOPs). ResNet 101 and ResNet 152 consist of 101 and 152 layers, respectively, obtained by stacking more ResNet building blocks, as shown in Table 1. Even with this increased depth, ResNet 152 requires 11.3 billion FLOPs, which is still a lower complexity than the VGG16 and VGG19 nets, which require 15.3 and 19.6 billion FLOPs, respectively [11].

There are three versions of ResNets: ResNet Version 1, ResNet Version 2 and ResNeXt. Fig. 3 shows the architecture of ResNet version 1 and version 2 [25]. First, ResNet version 1 (ResNet V1) applies the second non-linearity after the addition of x and F(x). ResNet V1 performs the convolution followed by batch normalization and ReLU activation. The output of the addition in ResNet V1 is passed through a ReLU activation and then transferred to the next block as its input. Second, ResNet version 2 (ResNet V2) passes the output of the addition of the identity mapping and the residual mapping directly to the next block. In ResNet V2 the last non-linearity does not exist, which clears the path from input to output in the form of an identity connection. ResNet V2 applies batch normalization and ReLU activation to the input before the multiplication with the weight matrix (the convolution operation). Third, ResNeXt has several parallel paths of stacked layers whose outputs are added. ResNeXt defines a new hyperparameter called “cardinality”, which represents the number of paths in each block [28]. Figure 4 shows the architectural difference between a block of ResNet and a block of ResNeXt that includes 32 identical paths, so that the cardinality equals 32 [29].

Fig. 2

A Residual Block of a ResNet [11, 25].

Fig. 3

(a) Architecture of ResNet Version 1. (b) Architecture of ResNet Version 2. [25].

Fig. 4

(a) Block of ResNet. (b) block of ResNeXt with cardinality = 32 [31].


The residual unit calculates F(x) by processing x through two layers as shown in Fig. 2 [30] and H(x) is calculated using Equation (1):

(1)
H(x)=RELU(F(x)+x)
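Equation (1) can be illustrated with a minimal, dependency-free Python sketch. It is only a simplification under stated assumptions: the two layers of F(x) are modelled as small dense (matrix) layers rather than convolutions, batch normalization is omitted, and the helper names (`relu`, `linear`, `residual_block`) are illustrative, not part of any library.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]

def linear(W, v):
    """Matrix-vector product; W is given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def residual_block(x, W1, W2):
    """H(x) = ReLU(F(x) + x), with F(x) = W2 * ReLU(W1 * x) (two layers)."""
    fx = linear(W2, relu(linear(W1, x)))
    return relu([f + xi for f, xi in zip(fx, x)])

x = [1.0, -2.0]
zero = [[0.0, 0.0], [0.0, 0.0]]

# With all weights at zero, F(x) = 0 and the block reduces to ReLU(x):
print(residual_block(x, zero, zero))  # [1.0, 0.0]
```

The example shows why the shortcut helps with depth: when the residual branch F(x) contributes nothing, the block still passes its (rectified) input through instead of blocking the signal.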

3.3 Xception network

The Xception architecture, introduced by François Chollet [14], is a linear stack of depthwise separable convolution layers with residual connections [32]. The architecture of the Xception Network is shown in Fig. 5 [33]. Xception is based on two main methods:

Fig. 5

The Xception Network architecture [33].


First, the depthwise separable convolution, which is a depthwise convolution followed by a pointwise convolution, as shown in Fig. 6. The depthwise convolution is a channel-wise n×n spatial convolution. For example, if the input has 7 channels, then there are 7 separate n×n spatial convolutions, one per channel. The pointwise convolution is a 1×1 convolution [33].
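To make the benefit of this factorization concrete, the short Python sketch below compares the number of weights in a standard convolution with that of a depthwise separable convolution. It reuses the 7-channel example from the text; the 3×3 kernel and the 64 output channels are assumed values chosen only for illustration, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights of a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights of a depthwise separable convolution (biases ignored)."""
    depthwise = k * k * c_in   # one k x k spatial filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise

k, c_in, c_out = 3, 7, 64      # c_in = 7 as in the text; k and c_out assumed
standard = standard_conv_params(k, c_in, c_out)
separable = separable_conv_params(k, c_in, c_out)
print(standard, separable)     # 4032 511
```

For these assumed sizes the separable form needs roughly one eighth of the weights, which is the main reason it is attractive as a building block.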

Fig. 6

Depthwise separable convolution [35].


Second, the shortcut connections between convolution blocks, as in Residual Networks (see Fig. 2).

3.4 COVID-19 proposed system

Searching for a neural network architecture is difficult because there are many design choices. Researchers do not know in advance the optimal architecture for a specific application. Therefore, this paper defines a set of candidate configurations that are explored automatically by the machine, and the best model architecture is then selected.

A ResNet model is an improved CNN that adds shortcuts between layers to prevent the degradation that occurs in deeper and more complex networks. In addition, bottleneck blocks are used to enable faster training of ResNets. For these reasons, we have chosen ResNets to be optimized with the purpose of obtaining better diagnosis results than older networks. The Xception network consists of a linear stack of depthwise separable convolution layers with residual connections. The Xception model is an enhancement of Inception that replaces the regular Inception modules with depthwise separable convolutions.

In this paper, both the ResNet and the Xception Net configurations were optimized by tuning their hyperparameters to obtain the optimal architecture for diagnosing our COVID-19 dataset. The values of the hyperparameters of the ResNet and Xception Net models cannot be estimated from the data. Choosing the correct values of the model hyperparameters enhances the accuracy of the neural network model [17].

Random search is a simple technique that samples points from the hyperparameter space at random and executes an independent trial for each of them. Rather than defining a fixed set of points for each hyperparameter, the researcher defines a range of search values from which the points are drawn. Randomized trials are more efficient for hyperparameter optimization than grid search and are also easier to implement [17].

However, because the sampled points cannot be controlled, random search can result in a less evenly spaced set of points in the hyperparameter space than grid search [34].

In our proposed system, the input of the convolutional neural network is an image from the COVID-19 dataset [18], such as the samples shown in Fig. 1.

First, the original training images were resized to 128×128×3 and augmented (original, rotated and shifted versions of the images) with a batch size of 32. Xception Net, ResNet50, ResNet50V2, ResNeXt50, ResNet101, ResNet101V2 and ResNeXt101 were tested on the chosen dataset to measure their accuracy.

Then, the random search optimization technique was applied on the ResNet model with maximum number of trials = 30 and number of epochs = 24. The ResNet model was then trained to optimize the following parameters:

  • “Version of ResNet”, which defines which version of ResNet is chosen for the model.

  • “Batch size”, which represents the number of images processed in parallel. If the mini-batch size is too small, convergence will be slow, and if it is too large, the speed will be reduced [17].

  • “Conv3_depth”, which is the depth (the number of stacked blocks, cf. Table 1) of the third convolutional stage.

  • “Conv4_depth”, which is the depth (the number of stacked blocks, cf. Table 1) of the fourth convolutional stage.

  • “Pooling type”.

  • “Learning rate”, which is a very important hyperparameter that determines the size of the step taken in each iteration. If the learning rate is too low, convergence will take a long time, and if it is too high, training may diverge [17].

  • “Optimizer”, which is used with the fully connected layer.

The range of the hyperparameters configurations used for the Random Search optimization is shown in Table 2.

Table 2

Range of ResNet trained hyperparameters

| Hyperparameter | Range |
|---|---|
| Version | ['v1', 'v2', 'next'] |
| Batch size | [32] |
| conv3_depth | [4, 8] |
| conv4_depth | [6, 23, 36] |
| Pooling | ['avg', 'max'] |
| Learning rate | [0.1, 0.01, 0.001] |
| Optimizer | ['adam', 'rmsprop', 'sgd'] |
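The random search over the ranges of Table 2 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_evaluate` is a hypothetical stand-in for training a ResNet for 24 epochs and returning its validation accuracy, and all function and variable names are chosen for this example only.

```python
import random

# Search space taken from Table 2.
SEARCH_SPACE = {
    "version": ["v1", "v2", "next"],
    "batch_size": [32],
    "conv3_depth": [4, 8],
    "conv4_depth": [6, 23, 36],
    "pooling": ["avg", "max"],
    "learning_rate": [0.1, 0.01, 0.001],
    "optimizer": ["adam", "rmsprop", "sgd"],
}

def sample_config(space, rng):
    """Draw one value per hyperparameter, independently and uniformly."""
    return {name: rng.choice(values) for name, values in space.items()}

def random_search(space, evaluate, max_trials=30, seed=0):
    """Run max_trials independent trials and keep the best-scoring config."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(max_trials):
        config = sample_config(space, rng)
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

def toy_evaluate(config):
    """Hypothetical score; a real run would train and validate the model."""
    return -abs(config["learning_rate"] - 0.01) - 0.001 * config["conv3_depth"]

best_config, best_score = random_search(SEARCH_SPACE, toy_evaluate)
print(best_config, best_score)
```

Note that this simple sampler may draw the same configuration more than once; a production tuner would typically skip duplicates and record every trial in a log.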

Next, different hyperparameters were tuned for the Residual Network (ResNet) model and for the Xception Network (Xception Net) using random search optimization to find the best hyperparameters suitable for diagnosing COVID-19. This is achieved by finding the optimum model which consists of the combination of hyperparameters that give the highest accuracy for the ResNet architecture and for the Xception Net architecture as well.

Finally, the features were extracted using the optimized hyperparameters and then the fully connected layer was used for calculating the classification scores.

Note that the hyperparameter values are drawn at random from these ranges. The training results of the sampled hyperparameter combinations were saved in training logs, and the combination with the best accuracy was chosen as the best model for the database used.

Also, the random search optimization technique was applied on the Xception Net model with maximum number of trials = 30 and number of epochs = 24. The Xception Net model then was trained to optimize the following parameters:

  • “Activation of Xception”, which is the activation function used in the Xception Net.

  • “Number of Conv2d filters”, which represents the number of two-dimensional convolutional (Conv2d) filters.

  • “Kernel size”, which is the filter size.

  • “Initial strides”, which is defined as the step size of the filter.

  • “Number of separable filters”.

  • “Number of residual blocks”, which represents the number of blocks with shortcut connections, as in ResNets (see Fig. 2).

  • “Pooling type”.

  • “Number of dense layers” in the fully connected part of the network.

  • “Dropout rate”. Dropout is a regularization technique that prevents the network from over-fitting: during training, some neurons in the hidden layer are randomly dropped. Training therefore happens on various architectures of the neural network with different combinations of neurons, and the output of multiple networks is used to produce the final output.

  • “Batch normalization of dense layer”, a flag (1 or 0) that enables or disables batch normalization of the dense layers. Batch normalization normalizes the inputs of a layer over each mini-batch of the training data, which stabilizes training and helps the model generalize better.

  • “Learning rate”, which was explained above.

The range of the hyperparameters configurations used for the Random Search optimization of Xception Net is illustrated in Table 3.

Table 3

Range of Xception Network trained hyperparameters

| Hyperparameter | Range |
|---|---|
| Activation of Xception | ['relu', 'selu'] |
| No. of Conv2d Filters | [32, 64, 128] |
| Kernel Size | [3, 5] |
| Initial Strides | [2] |
| Separable No. Filters | min: 128, max: 768, step: 128 |
| No. of Residual Blocks | min: 2, max: 8, step: 1 |
| Pooling | ['avg', 'flatten', 'max'] |
| No. of Dense Layers | min: 1, max: 3, step: 1 |
| Dropout Rate | min: 0, max: 0.6, step: 0.1 |
| No. of Batch Normalization of Dense Layer | [1, 0] |
| Learning Rate | [0.001, 0.0001, 1e-05] |
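To give a sense of why an exhaustive search over Table 3 is impractical, the following Python sketch expands the min/max/step ranges into explicit value lists and counts the resulting configurations. It assumes that every stepped range is discrete and inclusive of both ends; the dictionary keys and the `stepped` helper are illustrative names, not the authors' code.

```python
from math import prod

def stepped(minimum, maximum, step):
    """Values from minimum to maximum (inclusive) in increments of step."""
    count = round((maximum - minimum) / step) + 1
    return [round(minimum + i * step, 10) for i in range(count)]

# Search space taken from Table 3.
XCEPTION_SPACE = {
    "activation": ["relu", "selu"],
    "conv2d_filters": [32, 64, 128],
    "kernel_size": [3, 5],
    "initial_strides": [2],
    "separable_filters": stepped(128, 768, 128),
    "residual_blocks": stepped(2, 8, 1),
    "pooling": ["avg", "flatten", "max"],
    "dense_layers": stepped(1, 3, 1),
    "dropout_rate": stepped(0.0, 0.6, 0.1),
    "dense_batch_norm": [1, 0],
    "learning_rate": [1e-3, 1e-4, 1e-5],
}

num_configs = prod(len(values) for values in XCEPTION_SPACE.values())
max_trials = 30
print(num_configs)                        # 190512
print(f"{max_trials / num_configs:.4%}")  # share of the space that is sampled
```

Under these assumptions the space contains 190,512 configurations, so the 30 random trials sample only a very small fraction of it, whereas a full grid search would require training every one of them.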

To our knowledge, random search optimization has not previously been used with ResNet and Xception Net for COVID-19 diagnosis or any other application, so this study leads to new optimized ResNet and Xception Net models. This combination could be further applied to diagnosing other diseases, in which case the diagnosis would be achieved with much less manual tuning effort on the part of the researchers.

4 Results

To test the search method, experiments were performed on the Mendeley Augmented COVID-19 X-ray Images Dataset using ResNet and Xception Net models tuned with the random search optimization method. The two methods were evaluated on an Intel(R) Core(TM) i7-7700HQ CPU @ 2.80 GHz.

4.1 ResNet_Random Search

The optimum combination of hyperparameters chosen by the random search optimization method for a Residual Network that fits the chosen database is listed in Table 4. It gives a training accuracy of 100%, a validation accuracy of 99.34211% and a test accuracy of 99.27536%, as shown in Table 5.

Table 4

The optimum combination of hyperparameters chosen by the ResNet tuned using random search optimization

| Hyperparameter | Optimum |
|---|---|
| Version | v1 |
| conv3_depth | 4 |
| conv4_depth | 23 |
| Pooling | avg |
| Learning rate | 0.01 |
| Optimizer | sgd |

Table 5

Results of the accuracies of some previous versions of ResNet and our proposed ResNet tuned using random search optimization for diagnosing COVID-19

| Model | Train Accuracy | Validation Accuracy | Test Accuracy |
|---|---|---|---|
| ResNet50 | 98.56584% | 96.710527% | 97.10145% |
| ResNet50V2 | 100% | 97.36842% | 98.5507% |
| ResNeXt50 | 81.6818% | 77.63158% | 81.8840% |
| ResNet101 | 99.60887% | 98.02632% | 99.2753% |
| ResNet101V2 | 72.75098% | 63.15789% | 71.01449% |
| ResNeXt101 | 99.86962% | 98.68421% | 99.2753% |
| ResNet50V2 [22] | - | - | 89.79% |
| Xception [22] | - | - | 91.31% |
| ResNet50V2 and Xception concatenated [22] | - | - | 91.40% |
| ResNet_Random Search (proposed) | 100% | 99.34211% | 99.27536% |

Table 5 summarizes the performance of several versions of ResNet trained for 24 epochs with the adam optimizer, max pooling, and 128 neurons for dense_1, 512 for dense_2 and 1 for dense_3, and compares them with our proposed ResNet whose hyperparameters were tuned using random search optimization. All models were evaluated on the test set.

In this paper, the ResNet hyperparameters were tuned using random search optimization and the resulting model, ResNet_Random Search, was evaluated on the test dataset. It was compared with ResNet50, ResNet50V2, ResNeXt50, ResNet101, ResNet101V2 and ResNeXt101 applied to the chosen COVID-19 database and showed better results than some of these previous versions of ResNet, as shown in Table 5.

The results of reference [22] are also compared with our results in Table 5. In [22], the authors used two open-source datasets. The covid chestxray dataset was taken from GitHub [36] and consists of X-ray and CT scan images of patients infected with COVID-19, SARS, Streptococcus, ARDS, Pneumocystis, and other types of pneumonia. The second dataset used in their paper was taken from [37]. They proposed a neural network that is a concatenation of the Xception and ResNet50V2 networks for detecting COVID-19 and achieved an average accuracy of 91.4% on the test data, compared with ResNet50V2 and Xception Net, which recorded 89.79% and 91.31%, respectively. Note that the authors of reference [22] reported test accuracies only.

4.2 Xception Net_Random Search

In this paper, Xception Net was tested on the chosen dataset to measure its accuracy on the test data, and it reached 100% training accuracy, 100% validation accuracy and 100% test accuracy for our chosen database.

The random search optimization technique was also performed on the Xception Net with a maximum number of trials of 30 and 24 epochs. The model was trained with the hyperparameter configurations in Table 3. The optimum combination of hyperparameters chosen for the Xception network by the random search optimization method for this dataset is given in Table 6.

Tuning the Xception Net hyperparameters using random search optimization gives a new model, with the best combination of hyperparameters listed in Table 6, that recorded an accuracy of 100% on the test data.

Table 6

The optimum combination of hyperparameters chosen by the Xception Network tuned using random search optimization

| Hyperparameter | Optimum |
|---|---|
| Activation of Xception | relu |
| No. of Conv2d Filters | 64 |
| Kernel Size | 5 |
| Initial Strides | 2 |
| Separable No. Filters | 768 |
| No. of Residual Blocks | 8 |
| Pooling | max |
| No. of Dense Layers | 1 |
| Dropout Rate | 0.3 |
| No. of Batch Normalization of Dense Layer | 1 |
| Learning Rate | 0.0001 |

Table 7

Results of the accuracies and hyperparameters of Xception, Xception tuned using random search optimization for COVID-19 diagnosis

| Model | Train Accuracy | Validation Accuracy | Test Accuracy |
|---|---|---|---|
| Xception Network | 100% | 97.36842% | 98.550725% |
| ResNet50V2 [22] | - | - | 89.79% |
| Xception [22] | - | - | 91.31% |
| ResNet50V2 and Xception concatenated [22] | - | - | 91.40% |
| Xception_Random Search (proposed) | 100% | 100% | 100% |

Table 7 summarizes the performance of an Xception Net trained for 24 epochs with the “adam” optimizer, max pooling, and 128 neurons for dense_1, 512 for dense_2 and 1 for dense_3, applied to the Mendeley Augmented COVID-19 X-ray Images dataset [18]; this baseline recorded a test accuracy of 98.550725%. It is compared with our proposed Xception whose hyperparameters were tuned using random search optimization, which achieved a 100% diagnosis rate on the test data, and with the results obtained by M. Rahimzadeh and A. Attar [22]. It can be noted from Table 7 that the accuracy improvement with random search optimization is about 1.45 percentage points on the test set (98.550725% to 100%) and about 2.6 percentage points, roughly 3%, on the validation set (97.36842% to 100%). This increase is significant, as the standard deviation of the accuracy results does not exceed 0.1% when the distribution of training and testing data is changed. This reveals the significant improvement in diagnosis accuracy obtained with random search optimization. Moreover, it shows the statistical significance of the proposed approach in comparison with some other studies that report very high diagnosis results but with a very high standard deviation (high error margin).

The Xception Net with hyperparameters tuned using random search optimization showed better results than the previous version of Xception, as shown in Table 7.

The validation accuracies, the validation losses curves, the training accuracies and the training losses are compared between the ResNet_Random Search, the Xception Net_Random Search and some previous versions of ResNets and Xception Net as shown in Fig. 7. In Fig. 7 the x-axis represents the number of epochs.

Fig. 7

(a) The Validation accuracy curves of ResNet50, ResNet101, Xception Net, ResNet_Random Search and Xception_Random Search. (b) The Validation Loss curves. (c) The Training accuracy curves.(d) The Training Losses curves.


It is clear from Fig. 7(a and c) that the validation and training accuracy curves of both the ResNet_Random Search and the Xception_Random Search are higher than those of ResNet50, ResNet 101 and Xception Net.

It is also noticeable in Fig. 7(b and d) that the loss curves of both the ResNet_Random Search and the Xception_Random Search converge better than those of ResNet50, ResNet 101 and Xception Net.

A confusion matrix is an N by N matrix used for evaluating the performance of a classification model, where N is the number of target classes. The matrix compares the target values with the values predicted by the machine learning model, as shown in Fig. 8, and it consists of the following:

  • True positives (TP): We predicted COVID-19, and the patients actually do have COVID-19.

  • True negatives (TN): We predicted Non-COVID-19, and they are Non-COVID-19 patients.

  • False positives (FP): We predicted COVID-19, but they are actually Non-COVID-19 patients.

  • False negatives (FN): We predicted Non-COVID-19, but the patients actually suffer from COVID-19.

Fig. 8

Confusion Matrix.


Figure 9 shows the confusion matrix of the COVID-19 ResNet_Random Search model. And Fig. 10 shows the confusion matrix of the COVID-19 Xception_Random Search model.

Fig. 9

The Confusion Matrix and the classification matrix of COVID-19 ResNet_Random Search.

Fig. 10

The Confusion Matrix and the Classification Matrix of COVID-19 Xception_Random Search.


The classification matrix is also used for measuring the performance of a binary classifier by calculating the accuracy, precision, recall and F1 score.

  • Accuracy: indicates how often the classifier is correct and it is calculated using Equation (2).

    (2)
    Accuracy=(TP+TN)/(TP+TN+FP+FN)

  • Precision: measures how often it correctly predicted the disease and it is calculated by Equation (3).

    (3)
    Precision=(TP)/(TP+FP)

  • Recall: is defined as the number of true positives (TP) over the number of true positives plus the number of false negatives (FN) as in Equation (4).

    (4)
    recall=(TP)/(TP+FN)

  • F1 Score: is the harmonic mean of precision and the true positive rate (recall) and it is calculated by Equation (5).

    (5)
    F1=2*(precision*recall)/(precision+recall)
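Equations (2) to (5) can be computed directly from the four confusion-matrix counts, as in the minimal Python sketch below. The counts in the example are hypothetical: they assume a 138-image test set with a single misclassified Non-COVID-19 image. One error out of 138 (137/138) is numerically consistent with the reported ResNet_Random Search test accuracy of 99.27536%, but which class the error belongs to is an assumption made only for this illustration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts: 69 COVID-19 and 69 Non-COVID-19 test images,
# with exactly one Non-COVID-19 image predicted as COVID-19.
metrics = classification_metrics(tp=69, tn=68, fp=1, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.5%}")
```

The guards against zero denominators are a design choice for this sketch; they return 0.0 when a metric is undefined, for example when the model never predicts the positive class.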

5 Discussion

5.1 ResNet_Random Search

Table 8 shows some of the hyperparameters combinations of the ResNet_Random Search trials saved in the training logs.

Table 8

Comparison to some ResNet_Random Search trials

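| Hyperparameter \ Trial | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Version | v1 | v1 | v1 | v1 | next | next | next | v2 | v2 | v2 |
| Conv3 depth | 4 | 4 | 8 | 4 | 4 | 8 | 8 | 8 | 8 | 4 |
| Conv4 depth | 23 | 23 | 23 | 36 | 6 | 6 | 23 | 36 | 36 | 36 |
| Pooling | avg | max | max | max | avg | max | avg | max | max | avg |
| Learning Rate | 0.01 | 0.1 | 0.001 | 0.1 | 0.001 | 0.1 | 0.1 | 0.001 | 0.1 | 0.001 |
| Optimizer | sgd | sgd | adam | adam | rmsprop | adam | rmsprop | rmsprop | adam | adam |
| Train Accuracy | 100% | 95.1760% | 98.891% | 63.820% | 99.8044% | 99.2829% | 93.8070% | 96.8057% | 51.9556% | 99.6740% |
| Val Accuracy | 99.34211% | 92.0289% | 96.376% | 70.289% | 95.6521% | 98.5507% | 84.7826% | 86.2318% | 52.8985% | 97.1014% |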

By analyzing the hyperparameter combinations tried by the random search optimization for the Residual Network, the authors noticed that:

  • The models with the sgd optimizer give better accuracies than those with adam and rmsprop when the other hyperparameters are unchanged.

  • Average pooling gives better results than max pooling in most combinations.
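
The two observations above can be roughly checked against the ten trials listed in Table 8. The sketch below averages the validation accuracy per optimizer and per pooling type; the helper name `mean_val_accuracy` is an assumption of this sketch, and the data are copied from the Optimizer, Pooling, and Val Accuracy rows of Table 8. Because the other hyperparameters also vary between these trials, the averages are only a coarse indication, not a controlled comparison.

```python
from collections import defaultdict
from statistics import mean

# (optimizer, pooling, validation accuracy in %) for trials 1-10 of Table 8.
TRIALS = [
    ("sgd", "avg", 99.34211),
    ("sgd", "max", 92.0289),
    ("adam", "max", 96.376),
    ("adam", "max", 70.289),
    ("rmsprop", "avg", 95.6521),
    ("adam", "max", 98.5507),
    ("rmsprop", "avg", 84.7826),
    ("rmsprop", "max", 86.2318),
    ("adam", "max", 52.8985),
    ("adam", "avg", 97.1014),
]


def mean_val_accuracy(trials, column):
    """Average validation accuracy grouped by the value in `column`
    (0 = optimizer, 1 = pooling)."""
    groups = defaultdict(list)
    for trial in trials:
        groups[trial[column]].append(trial[2])
    return {value: mean(scores) for value, scores in groups.items()}


by_optimizer = mean_val_accuracy(TRIALS, column=0)
by_pooling = mean_val_accuracy(TRIALS, column=1)

for name, table in (("optimizer", by_optimizer), ("pooling", by_pooling)):
    for value, score in sorted(table.items(), key=lambda kv: -kv[1]):
        print(f"{name:9s} {value:8s} mean val accuracy = {score:.2f}%")
```

On these ten trials, sgd has the highest mean validation accuracy among the optimizers, and average pooling has a higher mean than max pooling.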

5.2Xception Net_Random Search

Table 9 shows some of the hyperparameter combinations of the Xception Net_Random Search trials saved in the training logs.

Table 9

Comparison to some Xception Net_Random Search trials

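| Trial \ Hyperparameter | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Activation of Xception | relu | relu | relu | relu | relu | relu | selu | selu | selu | selu |
| No. of Conv2d Filters | 64 | 128 | 128 | 64 | 64 | 32 | 128 | 128 | 64 | 128 |
| Kernel Size | 5 | 3 | 5 | 3 | 5 | 3 | 5 | 5 | 5 | 5 |
| Initial Strides | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| Separable No. of Filters | 768 | 256 | 256 | 512 | 256 | 640 | 384 | 640 | 640 | 640 |
| No. of Residual Blocks | 8 | 6 | 7 | 5 | 5 | 6 | 5 | 4 | 6 | 4 |
| Pooling | max | avg | max | flatten | max | max | flatten | max | avg | max |
| No. of Dense Layers | 1 | 1 | 3 | 2 | 2 | 1 | 2 | 2 | 2 | 2 |
| Dropout Rate | 0.3 | 0.4 | 0.4 | 0.2 | 0.3 | 0.6 | 0.2 | 0.4 | 0.2 | 0.4 |
| No. of Batch Normalization of Dense Layer | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 |
| Learning Rate | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 0.0001 |
| Train Accuracy | 100% | 79.5958% | 59.6479% | 89.6349% | 50.5215% | 50% | 53.3898% | 64.5371% | 56.9752% | 64.5371% |
| Val Accuracy | 100% | 92.1052% | 76.9736% | 90.1315% | 50% | 50% | 50% | 85.5263% | 66.4473% | 85.5263% |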

By analyzing the hyperparameter combinations tried by the random search optimization for the Xception Net, the authors noticed that:

  • The models with the relu activation function give better accuracies than those with selu.

  • Using a dropout rate of 0.3 gives better results than 0.4 and 0.2.

  • 768 separable filters give better results than 512, 384, and 256.
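
The trials discussed above are produced by random search, which draws each hyperparameter independently from its set of candidate values and keeps the best-scoring configuration. The sketch below is a minimal, framework-free illustration of that procedure and is not the authors' implementation. The candidate values in `SEARCH_SPACE` are only those that appear in Table 9 (the full search space may be larger), and `toy_evaluate` is a hypothetical placeholder: in practice it would be replaced by a function that builds and trains the network for a given configuration and returns its validation accuracy.

```python
import random
from typing import Any, Callable, Dict, List, Tuple

Config = Dict[str, Any]

# Candidate values observed in Table 9 (assumed, possibly incomplete).
SEARCH_SPACE: Dict[str, List[Any]] = {
    "activation": ["relu", "selu"],
    "conv2d_filters": [32, 64, 128],
    "kernel_size": [3, 5],
    "separable_filters": [256, 384, 512, 640, 768],
    "residual_blocks": [4, 5, 6, 7, 8],
    "pooling": ["avg", "max", "flatten"],
    "dense_layers": [1, 2, 3],
    "dropout": [0.2, 0.3, 0.4, 0.6],
    "dense_batch_norm": [0, 1],
}


def sample_config(space: Dict[str, List[Any]], rng: random.Random) -> Config:
    """Draw one value per hyperparameter, uniformly and independently."""
    return {name: rng.choice(values) for name, values in space.items()}


def random_search(
    space: Dict[str, List[Any]],
    evaluate: Callable[[Config], float],
    n_trials: int,
    seed: int = 0,
) -> Tuple[Config, float, List[Tuple[Config, float]]]:
    """Evaluate n_trials random configurations and return the best one,
    its score, and the full trial history."""
    rng = random.Random(seed)
    best_config: Config = {}
    best_score = float("-inf")
    history: List[Tuple[Config, float]] = []
    for _ in range(n_trials):
        config = sample_config(space, rng)
        score = evaluate(config)
        history.append((config, score))
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score, history


def toy_evaluate(config: Config) -> float:
    """Hypothetical stand-in for 'train the model and return validation
    accuracy'. The formula is arbitrary and unrelated to real training."""
    return 1.0 - abs(config["dropout"] - 0.3) - 0.01 * config["dense_layers"]


best, score, history = random_search(SEARCH_SPACE, toy_evaluate, n_trials=10, seed=42)
print("best configuration:", best)
print("best (toy) score:", round(score, 4))
```

Fixing the seed makes the sampled trials reproducible, which helps when comparing runs saved in training logs.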

6Comparison with recent studies

The proposed method is compared with recent techniques [22, 27, 39–41, 43–46] as shown in Table 10. In [39], a deep learning neural network-based method, nCOVnet, was proposed for detecting COVID-19 with an overall accuracy of 88%. In [22], a neural network that concatenates the Xception and ResNet50V2 networks was proposed for detecting COVID-19 and achieved an average accuracy of 91.4%. In [40], PSSPNN introduced five improvements: (a) a proposed module called NCSPM, (b) stochastic pooling, (c) PatchShuffle, (d) an improved multiple-way data augmentation, and (e) explainability via Grad-CAM. These five improvements enabled the model to outperform nine other methods, achieving a 95.79% F1-score. In [41], COVIDX-Net was designed to diagnose COVID-19 in X-ray images with a 90% accuracy rate on a dataset of 25 COVID-19 patients and 25 normal people. In [42], COVID-Net was proposed for COVID-19 diagnosis and reached 92.4% accuracy using 16,756 radiography images obtained from different open-access data. In [27], three different CNN models (ResNet50, InceptionV3, and InceptionResNetV2) were implemented for diagnosing 50 normal images taken from the Kaggle repository and 50 open-access COVID-19 chest X-ray images. It can be noted from Table 10 that our method outperforms all other diagnosis systems in terms of classification accuracy. Our results show an improved diagnosis rate for the ResNet and the Xception Net tuned using the random search optimization technique compared with other work.

Table 10

Comparison of this work results with other methods

| Study | Method used for COVID-19 diagnosis | Number of cases in the dataset | Diagnosis rate |
|---|---|---|---|
| H. Panwar et al. [39] | nCOVnet | | 88.00 % |
| Shui-Hua Wang et al. [40] | PSSPNN | COVID-19; Community-Acquired Pneumonia (CAP); Secondary Pulmonary Tuberculosis (SPT) | 95.79 % |
| M. Rahimzadeh, A. Attar [22] | Concatenation of the Xception and ResNet50V2 networks | | 91.40 % |
| Hemdan et al. [41] | COVIDX-Net | 25 COVID-19; 25 Normal | 90.00 % |
| Narin et al. [27] | ResNet50, ResNet101, ResNet152, InceptionV3 and Inception-ResNetV2 | Dataset 1: 341 COVID-19 and 2800 Normal | 96.10 % |
| | | Dataset 2: 341 COVID-19 and 1493 Viral Pneumonia | 99.50 % |
| | | Dataset 3: 341 COVID-19 and 2772 Bacterial Pneumonia | 99.70 % |
| Sethy and Behera [43] | ResNet50 + SVM | 25 COVID-19; 25 Non-COVID-19 | 95.38 % |
| Song et al. [44] | DRE-Net | 777 COVID-19; 708 Non-COVID-19 | 86.00 % |
| Wang et al. [45] | M-Inception | 195 COVID-19; 258 Non-COVID-19 | 82.90 % |
| Zheng et al. [46] | UNet+3D Deep Network | 313 COVID-19; 229 Non-COVID-19 | 90.80 % |
| ResNet_Random Search (our proposed) | | 912 COVID-19; 912 Non-COVID-19 | 99.28 % |
| Xception_Random Search (our proposed) | | 912 COVID-19; 912 Non-COVID-19 | 100.00 % |

7Conclusions and future work

In response to the spread of COVID-19, scientists are trying to find an efficient diagnostic system for the disease. This area of research involves researchers from many fields, such as data science, machine learning, and artificial intelligence, working to prevent and manage this disease.

This paper elaborates on various design decisions, such as choosing the network architecture and the network’s hyperparameters, that can be used to diagnose COVID-19 quickly and reduce the pressure on physicians, considering the large number of X-rays that have to be examined each day around the world. It can be concluded that hyperparameter tuning for the Residual Network and the Xception Network using random search gives better COVID-19 diagnosis accuracy than some previous versions of the Residual Network and the Xception Network. Comparing ResNet_Random Search and Xception Net_Random Search, it was found that the Xception Net gives better results than the ResNet on the Mendeley Augmented COVID-19 X-ray Images Dataset, as Xception Net_Random Search gives higher accuracy and precision.

Despite the high performance of the proposed approach, it would benefit from being examined on a larger number of images from different databases. In the future, we intend to validate our model by incorporating more images. The developed model can be placed in a cloud to provide instant diagnosis, which should reduce clinician workload significantly. We will also try to collect local radiology images of COVID-19 cases from sites in Egypt and evaluate them with our model. After the necessary tests are done, we aim to deploy the developed model in local hospitals for screening. The hardware implementation of the proposed diagnosis system is a goal for future investigation as well. It would also be interesting to examine other optimization techniques, such as Bayesian optimization (https://en.wikipedia.org/wiki/Bayesian_optimization). Moreover, while residual and Xception networks were used in this paper for diagnosing COVID-19, other types of convolutional neural networks, such as GoogLeNet and Inception Net, can be used and their accuracies compared with those obtained by the ResNet and the Xception Net. In addition, the ResNet and the Xception Net tuned using random search optimization can be used for many other applications, such as heart disease diagnosis and breast cancer detection.

Ethical statement

All authors have reported that they have no relationships relevant to the contents of this paper to disclose. All the ethical guidelines were followed during the research work.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to thank Guangzhou Women and Children’s Medical Center for uploading their datasets used for improving the diagnosing systems.

References

[1] Ozturk T., Talo M., Yildirim E.A., Baloglu U.B., Yildirim O. and Rajendra Acharya U., Automated detection of COVID-19 cases using deep neural networks with X-ray images, Comput Biol Med 121 (2020), 103792. https://doi.org/10.1016/j.compbiomed.2020.103792.

[2] Ozsahin I., Sekeroglu B., Musa M.S., Mustapha M.T. and Uzun Ozsahin D., Review on Diagnosis of COVID-19 from Chest CT Images Using Artificial Intelligence, Comput Math Methods Med (2020), 2020. https://doi.org/10.1155/2020/9756518.

[3] Qu S., Xu Y., Wu Z., Xu Z., Ji Y., Qu D. and Han Y., An interval-valued best–worst method with normal distribution for multi-criteria decision-making, Arabian Journal for Science and Engineering 46(2) (2021), 1771–1785.

[4] Qu S., Cai H., Xu D. and Mohamed N., Correction to: Uncertainty in the prediction and management of CO2 emissions: a robust minimum entropy approach, Natural Hazards 1–1.

[5] Rizk M.R., Farag H.H. and Said L.A., Neural network classification for iris recognition using both particle swarm optimization and gravitational search algorithm, In 2016 World Symposium on Computer Applications & Research (WSCAR) (2016), (pp. 12–17). IEEE.

[6] Salama M.S., Eltrass A.S. and Elkamchouchi H.M., An improved approach for computer-aided diagnosis of breast cancer in digital mammography, 13th Annual IEEE International Symposium on Medical Measurements and Applications, Rome, Italy, (2018), 1–5.

[7] Eltrass A.S. and Salama M., Fully automated scheme for computer-aided detection and breast cancer diagnosis using digitised mammograms, IET Image Processing 14(3) (2020), 495–505.

[8] Malik N. and Singh P.V., Deep Learning in Computer Vision: Methods, Interpretation, Causation, and Fairness, In Operations Research & Management Science in the Age of Analytics (2019), (pp. 73–100). INFORMS.

[9] Abiyev R.H. and Ma’aitah M.K.S., Deep Convolutional Neural Networks for Chest Diseases Detection, J Healthc Eng (2018), 2018. https://doi.org/10.1155/2018/4168538.

[10] VGG Net n.d. https://neurohive.io/en/popular-networks/vgg16/ (accessed May 30, 2020).

[11] He K., Zhang X., Ren S. and Sun J., Deep residual learning for image recognition, Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2016 (2016), 770–778. https://doi.org/10.1109/CVPR.2016.90.

[12] Huang G., Liu Z., Van Der Maaten L. and Weinberger K.Q., Densely Connected Convolutional Networks, 2017 IEEE Conf. Comput. Vis. Pattern Recognit (2016), 2261–2269.

[13] Szegedy C., Liu W., Jia Y., Sermanet P., Reed S., Anguelov D., et al., Going Deeper with Convolutions, Des Track Knowl Manag Metrics 2015, 163–182. https://doi.org/10.1108/978-1-78973-723-320191012.

[14] Chollet F., Xception: Deep learning with depthwise separable convolutions, Proc - 30th IEEE Conf Comput Vis Pattern Recognition, CVPR 2017 2017;2017-Janua:1800–7. https://doi.org/10.1109/CVPR.2017.195.

[15] Han L., Yu C., Xiao K. and Zhao X., A new method of mixed gas identification based on a convolutional neural network for time series classification, Sensors (Switzerland) 19 (2019), 1–23. https://doi.org/10.3390/s19091960.

[16] Aszemi N.M. and Dominic P.D.D., Hyperparameter optimization in convolutional neural network using genetic algorithms, Int J Adv Comput Sci Appl 10 (2019), 269–278. https://doi.org/10.14569/ijacsa.2019.0100638.

[17] Llamas J., Lerones P.M., Medina R., Zalama E. and Gómez-García-Bermejo J., Classification of architectural heritage images using deep learning techniques, Appl Sci 7 (2017), 1–26. https://doi.org/10.3390/app7100992.

[18] Augmented COVID-19 X-ray Images Dataset n.d. https://data.mendeley.com/datasets/2fxz4px6d8/4 (accessed July 3, 2020).

[19] Majeed T., Rashid R., Ali D. and Asaad A., Covid-19 Detection using CNN Transfer Learning from X-ray Images, MedRxiv 2020:2020.05.12.20098954. https://doi.org/10.1101/2020.05.12.20098954.

[20] Chen J., Wu L., Zhang J., et al., Deep learning-based model for detecting 2019 novel coronavirus pneumonia on high-resolution computed tomography: a prospective study, MedRxiv 2020 (2019), 1–27. https://doi.org/10.1101/2020.02.25.20021568.

[21] Butt C., Gill J., Chun D. and Babu B.A., A Deep Learning System to Screen Novel Coronavirus Disease Pneumonia, Appl Intell 2020 (2019), 1–7. https://doi.org/10.1016/j.eng.2020.04.010.

[22] Rahimzadeh M. and Attar A., A modified deep convolutional neural network for detecting COVID-19 and pneumonia from chest X-ray images based on the concatenation of Xception and ResNet50V2, Informatics Med Unlocked 19 (2020), 100360. https://doi.org/10.1016/j.imu.2020.100360.

[23] Jain R., Gupta M., Taneja S. and Hemanth D.J., Deep learning based detection and analysis of COVID-19 on chest X-ray images, Appl Intell (2020). https://doi.org/10.1007/s10489-020-01902-1.

[24] Wang J. and Perez L., The effectiveness of data augmentation in image classification using deep learning, ArXiv 2017;abs/1712.0.

[25] Detailed Guide to Understand and Implement ResNets n.d. https://cv-tricks.com/keras/understand-implement-resnets/ (accessed June 9, 2020).

[26] Resnets 2 n.d. https://neurohive.io/en/popular-networks/resnet/ (accessed June 9, 2020).

[27] Narin A., Kaya C. and Pamuk Z., Automatic Detection of Coronavirus Disease (COVID-19) Using X-ray Images and Deep Convolutional Neural Networks, ArXiv Prepr ArXiv200310849 (2020).

[28] Keras ResNet: Building, Training & Scaling Residual Nets on Keras n.d. https://missinglink.ai/guides/keras/keras-resnet-building-training-scaling-residual-nets-keras/.

[29] Enhancing ResNet to ResNeXt for image classification n.d. https://medium.com/dataseries/enhancing-resnet-to-resnext-for-image-classification-3449f62a774c (accessed June 28, 2020).

[30] Resnet equations n.d. https://shuzhanfan.github.io/2018/11/ResNet/ (accessed June 11, 2020).

[31] Xie S., Girshick R., Dollár P., Tu Z. and He K., Aggregated residual transformations for deep neural networks, Proc - 30th IEEE Conf Comput Vis Pattern Recognition, CVPR 2017 2017;2017-Janua:5987–95. https://doi.org/10.1109/CVPR.2017.634.

[32] Kassani S.H., Kassani P.H., Khazaeinezhad R., Wesolowski M.J., Schneider K.A. and Deters R., Diabetic Retinopathy Classification Using a Modified Xception Architecture, IEEE 19th Int. Symp. Signal Process. Inf. Technol. ISSPIT 2019, (2019). https://doi.org/10.1109/ISSPIT47144.2019.9001846.

[33] Chollet F., Xception: Deep Learning with Depthwise Separable Convolutions, SAE Int J Mater Manuf 7 (2014), 1251–1258. https://doi.org/10.4271/2014-01-0975.

[34] Palacios Cuesta A., Hyperparameter Optimization for Large-scale Machine Learning, Technical University of Berlin (2018). https://doi.org/10.13140/RG.2.2.33876.65927.

[35] Bendersky E., Depthwise separable convolutions for machine learning n.d. https://eli.thegreenplace.net/2018/depthwise-separable-convolutions-for-machine-learning/ (accessed August 15, 2020).

[36] ieee covid-chestxray-dataset n.d. https://github.com/ieee8023/covid-chestxray-dataset (accessed July 3, 2020).

[37] RSNA Pneumonia Detection Challenge n.d. https://www.kaggle.com/c/rsna-pneumonia-detection-challenge.

[38] machine learning n.d. https://www.ritchieng.com/machine-learning-evaluate-classification-model/.

[39] Panwar H., Gupta P.K., Khubeb M. and Morales-menendez R., Application of deep learning for fast detection of COVID-19 in X-Rays using nCOVnet, Chaos, Solitons Fractals An Interdiscip J Nonlinear Sci 138 (2020), 1–8.

[40] Wang S.H., Zhang Y., Cheng X., Zhang X. and Zhang Y.D., PSSPNN: PatchShuffle Stochastic Pooling Neural Network for an Explainable Diagnosis of COVID-19 with Multiple-Way Data Augmentation, Comput Math Methods Med (2021), 2021. https://doi.org/10.1155/2021/6633755.

[41] Hemdan E.E.D., Shouman M.A. and Karar M.E., COVIDX-Net: A Framework of Deep Learning Classifiers to Diagnose COVID-19 in X-Ray Images, ArXiv (2020).

[42] Wang L., Lin Z.Q. and Wong A., COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images. vol. 10. Nature Publishing Group UK; (2020). https://doi.org/10.1038/s41598-020-76550-z.

[43] Sethy P.K. and Behera S.K., Detection of Coronavirus Disease (COVID-19) Based on Deep Features, (2020).

[44] Song Y., Zheng S., Li L., Zhang X., Zhang X., Huang Z. and Chong Y., Deep learning enables accurate diagnosis of novel coronavirus (COVID-19) with CT images, medRxiv (2020).

[45] Wang S., Kang B., Ma J., Zeng X., Xiao M., Guo J. and Xu B., A deep learning algorithm using CT images to screen for Corona Virus Disease (COVID-19), medRxiv (2020).

[46] Zheng C., Deng X., Fu Q., Zhou Q., Feng J., Ma H. and Wang X., Deep learning-based detection for COVID-19 from chest CT using weak label, medRxiv (2020), https://doi.org/10.1101/2020.03.12.20027185.