
Automatic gender and unilateral load state recognition for biometric purposes

Abstract

BACKGROUND:

Automatic recognition of a person’s gender as well as his or her unilateral load state are issues that are often analyzed and utilized by a wide range of applications. For years, scientists have recognized human gait patterns for purposes connected to medical diagnoses, rehabilitation, sport, or biometrics.

OBJECTIVE:

The present paper makes use of ground reaction forces (GRF) generated during human gait to recognize gender or the unilateral load state of a walking person as well as the combination of both of those characteristics.

METHODS:

To solve the above-stated problem, parameters were calculated for all GRF components: mean, variance, standard deviation, peak-to-peak amplitude, skewness, kurtosis, and Hurst exponent. These were combined with leading classification algorithms, including kNN, artificial neural networks, decision trees, and random forests. Data were collected with Kistler force plates during a study carried out at the Bialystok University of Technology on a sample of 214 people, with a total of 7,316 recorded gait cycles.

RESULTS:

The best results were obtained with the kNN classifier, which recognized the gender of the participant with an accuracy of 99.37%, the unilateral load state with an accuracy of 95.74%, and the combination of the two with an accuracy of 95.31%. Compared with results reported by other authors, these are among the most accurate.

CONCLUSION:

The study has shown that the given set of parameters, in combination with the kNN classification algorithm, allows effective automatic recognition of a person’s gender as well as of the presence of an asymmetrical load in the form of a hand-carried briefcase. The presented method can be used as a first stage in biometric systems.

1. Introduction

Gait is one of the most complex activities that human beings perform unconsciously. It is a natural and common manner of getting about. For this reason, it is measured and analyzed in a wide range of often completely unrelated applications, such as medical diagnostics, rehabilitation, healthcare, human-machine interaction, and marketing [1].

Human gait may also be used in biometrics, understood as the identification of a particular person. To improve security, gait-based biometric applications sometimes use so-called soft biometrics such as body height [2] or gender. A person’s gender can be identified from gait signals gathered with motion capture systems, including dynamometric platforms [3, 4, 5], video cameras [6], electromyography [1], and wearable sensors such as gyroscopes or accelerometers [7]. The paper [3], which deals with gender and age recognition, used features derived from centers of pressure. SVM classifiers achieved an accuracy of 99.65% for gender and 97.22% for age. These results were obtained from a sample of 24 participants.

The recognition of the type of activity a person is engaged in is a frequently analyzed issue in the literature [8, 9] and may be used in healthcare or to enhance security (e.g. in a video surveillance environment). One such everyday activity, walking with a briefcase (an asymmetrical load), is addressed in several works on human gait recognition. Compared with unencumbered walking, this type of movement significantly changes a person’s gait pattern [10]. In biometric systems, regardless of which physical parameters are measured, carrying a briefcase substantially reduces the accuracy of human gait recognition [11]. Lv et al. [12] have shown that asymmetrical carrying of loads reduces gait symmetry and that gait is more symmetrical when the load is carried in the right hand than in the left. Gait symmetry is inversely proportional to the level of loading. In [13], the impact of various ways of carrying backpacks by school-aged children on GRF and temporal characteristics was analyzed. The conditions included walking without a backpack, with it hanging low on the child’s back, high on the back, and carried by the handle. When carrying backpacks, children took smaller steps and walked at lower speeds, and greater vertical ground reaction forces were measured than when walking without them (P < 0.01). It was also found that, in comparison to typical gait biomechanics, the greatest changes were recorded when the backpack was carried by its handle. Uddin et al. [14] showed how walking with asymmetrical loads changes how a person moves and analyzed the individual parameters that were affected.

The purpose of the present study was to present a method and set of parameters identified on the basis of elements of GRF that allow the recognition of the participant’s gender as well as the presence of asymmetrical loading. In addition, it was assumed that this method can be used as the first stage of human identification in biometric systems.

2. Materials and method

2.1 Signals

Measurements of GRFs made as part of this study were performed using two Kistler platforms with dimensions of 60 cm × 40 cm, registering data at a frequency of 960 Hz. The registered signals form a time series $x_1, x_2, \ldots, x_N$, where $N$ is the number of samples. Generally, the duration of the support phase of a person’s gait depends on several factors and varies, so $N$ is variable.

2.2 Features

To eliminate the impact of the duration of the support phase on the possibilities of comparing two gait cycles the following parameters were applied. These parameters were selected on the basis of the study by Derlatka and Borowska [15]:

  • Mean of the signal:

    (1)
    $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{x_1 + x_2 + \cdots + x_N}{N}$

  • Variance of the signal:

    (2)
    $var = \frac{1}{N}\sum_{i=1}^{N} |x_i - \bar{x}|^2$

  • Standard deviation of the signal:

    (3)
    $\sigma_x = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2}$

  • Peak-to-peak (ptp) amplitude of the signal:

    (4)
    $ptp = \max(x) - \min(x)$

  • Skewness of the signal is computed as the Fisher-Pearson coefficient of skewness:

    (5)
    $skew = \frac{m_3}{m_2^{3/2}}$,

    where $m_k = \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^k$ is the biased $k$-th sample central moment

  • Kurtosis of the signal:

    (6)
    $kurtosis = \frac{m_4}{var^2}$

  • Hurst exponent of the signal is calculated from the rescaled range, averaged over all the partial time series of length N:

    (7)
    $(R/S)_t = \frac{R_t}{S_t}$

    where $R/S$ is averaged over the regions $[x_1, x_t]$, $[x_{t+1}, x_{2t}]$, ..., $[x_{(l-1)t+1}, x_{lt}]$, where $l = \lfloor N/t \rfloor$, $t = 1, 2, \ldots, N$, $R$ is the range series, and $S$ is the standard deviation series. The Hurst exponent is defined as the slope of the least-squares regression line fitted to the cloud of points obtained from the partial time series, with $R/S$ plotted against $t$ on logarithmic axes [16].
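The feature definitions in Eqs. (1)-(7) can be illustrated with a minimal standard-library sketch. It is not the implementation used in the study: the function names are hypothetical, and the Hurst estimate uses window lengths that double from `min_len` rather than every t from 1 to N, which is an assumption made to keep the example short.

```python
import math


def central_moment(x, k):
    """Biased k-th sample central moment m_k of the sequence x."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** k for v in x) / len(x)


def basic_features(x):
    """Features of Eqs. (1)-(6) for one GRF component of one leg."""
    var = central_moment(x, 2)
    return {
        "mean": sum(x) / len(x),
        "var": var,
        "std": math.sqrt(var),
        "ptp": max(x) - min(x),
        "skew": central_moment(x, 3) / var ** 1.5,
        "kurtosis": central_moment(x, 4) / var ** 2,
    }


def rescaled_range(segment):
    """R/S of one partial series: range of cumulative deviations over std."""
    t = len(segment)
    mean = sum(segment) / t
    cum = lo = hi = 0.0
    for v in segment:
        cum += v - mean
        lo, hi = min(lo, cum), max(hi, cum)
    std = math.sqrt(sum((v - mean) ** 2 for v in segment) / t)
    return (hi - lo) / std if std > 0 else None


def hurst_exponent(x, min_len=8):
    """Slope of log(R/S) against log(t); doubling window lengths are an assumption."""
    n = len(x)
    points = []
    t = min_len
    while t <= n // 2:
        # split the signal into l = floor(n / t) consecutive regions of length t
        ratios = [rescaled_range(x[s:s + t]) for s in range(0, (n // t) * t, t)]
        ratios = [r for r in ratios if r is not None]
        if ratios:
            points.append((math.log(t), math.log(sum(ratios) / len(ratios))))
        t *= 2
    if len(points) < 2:
        raise ValueError("signal too short for a regression line")
    mx = sum(p[0] for p in points) / len(points)
    my = sum(p[1] for p in points) / len(points)
    slope = sum((px - mx) * (py - my) for px, py in points)
    return slope / sum((px - mx) ** 2 for px, _ in points)
```

Calling `basic_features` and `hurst_exponent` on each of the three GRF components of each leg would yield the seven values per signal that the text describes.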

It is also necessary to specify that the signal features were calculated independently for each component of GRF and separately for each leg. The resulting input space consisted of a total of 42 parameters (7 features × 3 GRF components × 2 legs). Since the values of the obtained parameters differ significantly in scale, it is necessary to standardize them before classification using the following equation:

(8)
$x_{std} = \frac{x_{old} - \bar{x}_{old}}{\sigma}$

where: $\sigma$ – standard deviation of the i-th feature value before standardization; $\bar{x}_{old}$ – mean of the i-th feature value before standardization.
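As an illustration of Eq. (8), the following sketch standardizes each feature column of a small data set. The function names are hypothetical, the population form of the standard deviation (division by N, as in Eq. (3)) is an assumption, and the handling of a constant feature is likewise an assumption that the text does not address.

```python
import math

# 7 features x 3 GRF components x 2 legs, as described in the text
N_FEATURES = 7 * 3 * 2


def standardize_column(values):
    """Apply Eq. (8) to all values of one feature."""
    mean = sum(values) / len(values)
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if sigma == 0:
        # constant feature: no spread to scale by (handling is an assumption)
        return [0.0 for _ in values]
    return [(v - mean) / sigma for v in values]


def standardize(rows):
    """Standardize a list of feature vectors column by column."""
    columns = [standardize_column(col) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]
```

After this step every feature has zero mean and unit standard deviation, so distance-based classifiers such as kNN are not dominated by the features with the largest numeric range.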

2.3 Classifiers

The classifier plays an important role in identifying a person, his/her gender, or the activity he/she is performing. In the presented solution, several well-known algorithms were tested. Every classifier was trained on the group of features listed in subsection 2.2 using 10-fold cross-validation. The same division of data into folds was used each time, so the results obtained by different classifiers are comparable. The quality of every classifier was determined by its accuracy, i.e. the proportion of correct results (both true positives and true negatives) in the selected population.
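The evaluation protocol described above can be sketched as follows. This is an illustrative outline only: `make_folds`, `accuracy`, and `cross_validate` are hypothetical names, the folds are drawn by shuffling sample indices with a fixed seed, and whether the study split the data by gait cycle or by participant is not stated in the text.

```python
import random


def make_folds(n_samples, n_folds=10, seed=0):
    """Fixed division into folds, reused for every classifier."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def accuracy(y_true, y_pred):
    """Proportion of correctly classified samples."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def cross_validate(fit_predict, X, y, folds):
    """Mean accuracy over folds; fit_predict(X_train, y_train, X_test) returns labels."""
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        predictions = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        scores.append(accuracy([y[i] for i in test_idx], predictions))
    return sum(scores) / len(scores)
```

Passing the same `folds` list to `cross_validate` for each classifier reproduces the property the text relies on: every algorithm is scored on identical train/test partitions.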

2.3.1 K nearest neighbor (kNN)

The k nearest neighbor (kNN) classifier assigns a new point in feature space to a class on the basis of the distances from that point to its k nearest neighbors. The distance may be determined using various metrics, among which the most popular are the Euclidean distance, the city block (Manhattan) metric, and the Chebyshev distance.
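A minimal sketch of this decision rule with the three metrics named above is given below. The function names are hypothetical, and the defaults (k = 4, city block) simply mirror the best gender-recognition setting reported in the results. Ties are resolved in favor of the tied class that owns the nearest neighbor, following the tie-breaking rule mentioned in the results section.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def city_block(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))


def chebyshev(a, b):
    return max(abs(p - q) for p, q in zip(a, b))


def knn_predict(X_train, y_train, x, k=4, metric=city_block):
    """Label of x by majority vote among its k nearest training points."""
    distances = [(metric(row, x), label) for row, label in zip(X_train, y_train)]
    neighbors = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in neighbors)
    top = max(votes.values())
    tied = {label for label, count in votes.items() if count == top}
    # neighbors are ordered by distance, so the first tied label is the nearest
    for _, label in neighbors:
        if label in tied:
            return label
```

Swapping `metric=euclidean` or `metric=chebyshev` into the call changes only the distance function, which is the kind of comparison summarized in Fig. 1.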

2.3.2 Naive Bayes

The Naive Bayes classifier is based on Bayes’ theorem:

(9)
$P(C|X) = \frac{P(X|C)\,P(C)}{P(X)}$

where: $P(C)$ – the prior probability of class $C$; $P(X|C)$ – the likelihood, i.e. the probability of predictor $X$ given class $C$; $P(X)$ – the prior probability of the predictor.

In this classifier, the input variables are assumed to be independent of one another, which is usually not true of real data. Despite this unrealistic assumption, Bayes classifiers often produce good results.
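The following sketch shows how Eq. (9) combines with the independence assumption: the class likelihood is taken as the product of per-feature likelihoods, and P(X) is obtained by summing over classes. The function name and the likelihood values are hypothetical; only the priors, taken from the proportion of women (92) and men (122) among the 214 participants, come from the study description.

```python
import math


def naive_bayes_posterior(priors, likelihoods):
    """Posterior P(C|X) for each class, assuming independent features.

    priors: {class: P(C)}
    likelihoods: {class: [P(x_1|C), P(x_2|C), ...]} for one observation
    """
    joint = {c: priors[c] * math.prod(likelihoods[c]) for c in priors}
    evidence = sum(joint.values())  # P(X), the denominator of Eq. (9)
    return {c: joint[c] / evidence for c in joint}


# priors from the participant counts; likelihood values are made up for illustration
priors = {"W": 92 / 214, "M": 122 / 214}
example = naive_bayes_posterior(priors, {"W": [0.2, 0.5], "M": [0.6, 0.5]})
```

The predicted class is the one with the largest posterior; in practice the per-feature likelihoods would be estimated from training data, for example with a Gaussian model, which the text does not specify.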

2.3.3 Artificial neural networks (ANNs)

Artificial neural networks are structures built from appropriately connected single artificial neurons. In ANNs, signals provided to the input nodes are multiplied by values called weights, which are assigned to individual synaptic connections. Information is also processed within the neuron itself. Training an artificial neural network comes down to selecting the weights so that the output of the ANN for a given input signal is as close as possible to the desired value. In the present work, feedforward networks (MLPs) with no more than two hidden layers were used.

2.3.4 Linear discriminant analysis

Linear discriminant analysis (LDA) finds a linear combination of features that best differentiates between classes. The resulting combinations are used as linear classifiers or to reduce the dimensionality of the input space. The present work made use of the regularized linear discriminant analysis described in detail in [17], where it is assumed that all classes share the same covariance matrix:

(10)
$\hat{\Sigma}_{\gamma} = (1-\gamma)\,\hat{\Sigma} + \gamma\,\mathrm{diag}(\hat{\Sigma})$

where: $\hat{\Sigma}$ is the empirical, pooled covariance matrix; $\gamma$ is the amount of regularization.
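Eq. (10) can be read element by element: the diagonal of the pooled covariance matrix is left unchanged, while every off-diagonal entry is shrunk by the factor (1 − γ). The sketch below, with a hypothetical function name and a made-up 2 × 2 matrix, makes this explicit.

```python
def regularize_covariance(sigma, gamma):
    """Eq. (10): (1 - gamma) * sigma + gamma * diag(sigma) for a square matrix."""
    n = len(sigma)
    return [
        [
            (1 - gamma) * sigma[i][j] + (gamma * sigma[i][j] if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


# made-up pooled covariance matrix, for illustration only
pooled = [[2.0, 1.0], [1.0, 3.0]]
shrunk = regularize_covariance(pooled, gamma=0.5)
```

With γ = 0 the matrix is returned unchanged, and with γ = 1 only the diagonal survives, which corresponds to treating the features as uncorrelated.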

2.3.5 Support vector machines

Support vector machines (SVMs) are a classification algorithm that builds a hyperplane separating two classes in such a way as to separate the classes by a maximum margin. SVMs most often use a nonlinear transformation based on a so-called kernel, which projects the original problem into a feature space with more dimensions. This new feature space makes it easier to find a solution.

2.3.6 Classification and regression trees

Classification and Regression Trees (CART) are binary trees with one-dimensional splits. Within a node of the tree, a condition is created by checking all possible splits at the midpoints of the segments between consecutive sorted values $x_j$ and $x_{j+1}$. The best split is the one that separates the input data into relatively homogeneous subsets. The impurity $I(tr)$ after the split may be assessed, for example, with Gini’s index:

(11)
$I(tr) = 1 - \sum_{j=1}^{N_c} p_j^2$

where:

  • $p_j$ – frequency of the occurrence of elements from class $j$ after the division;

  • $N_c$ – the number of all classes.
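The split search described above can be sketched for a single feature as follows. The function names are hypothetical, and scoring a split by the size-weighted average of the Gini impurities of the two child nodes is the usual CART convention, assumed here because the text gives only the impurity formula itself.

```python
from collections import Counter


def gini(labels):
    """Gini index of Eq. (11) for the class labels in one node."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def candidate_thresholds(values):
    """Midpoints between consecutive distinct sorted feature values."""
    ordered = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]


def best_split(values, labels):
    """Threshold with the lowest weighted impurity, as (threshold, impurity)."""
    best = None
    n = len(values)
    for threshold in candidate_thresholds(values):
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        # weighting child impurities by node size is an assumed convention
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or score < best[1]:
            best = (threshold, score)
    return best
```

A full tree would repeat this search over all 42 features at every node and recurse on the two resulting subsets.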

2.3.7 Random forest

Random Forest (RF) is a collection of many relatively simple decision trees. Random forests generally give better results than single trees. For this advantage to materialize, the generated trees must differ from one another. This diversity can be provided by randomly selecting a training set for each tree. Such a set contains a strictly defined percentage of the cases in the entire training set. Classification proceeds as in other methods of combining ensemble classifiers, with majority voting being the most frequently chosen strategy.
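The two ingredients named above, a random training subset of fixed size for each tree and a majority vote over the trees, can be sketched as follows. The function names and the 60% fraction are hypothetical, and the subsets are drawn without replacement to follow the description in the text; many random forest implementations instead use bootstrap sampling with replacement together with random feature subsets.

```python
import random
from collections import Counter


def draw_training_subsets(n_samples, n_trees, fraction, seed=0):
    """Index subsets, one per tree, each holding a fixed share of the data."""
    rng = random.Random(seed)
    size = int(fraction * n_samples)
    return [rng.sample(range(n_samples), size) for _ in range(n_trees)]


def majority_vote(tree_predictions):
    """Final label for one sample from the labels returned by all trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# illustrative call: 5 trees, each trained on 60% of 100 samples
subsets = draw_training_subsets(100, n_trees=5, fraction=0.6)
```

Each subset would be used to train one decision tree, and `majority_vote` would then combine the per-tree predictions for a new gait cycle.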

2.4 The study group

The research was carried out at the laboratories of the Faculty of Mechanical Engineering of the Bialystok University of Technology on a sample of 214 people, including 92 women and 122 men. The participants were aged 21.34 ± 1.16 years, with a body weight of 74.32 ± 16.63 kg and a body height of 174.39 ± 9.49 cm. The investigations were performed according to the procedure described in [18]. During the tests, participants walked along a path more than 10 meters long in which two Kistler force plates were hidden. Every participant walked in their own sports shoes and at a self-selected speed. After 14 to 20 gait cycles had been recorded, participants were asked to carry a briefcase weighing 4.6 kg in the hand of their choosing. Each participant then performed another 14 to 16 recorded gait cycles with the briefcase. A total of 7,316 gait cycles were recorded, of which 3,941 were performed without loading (without a briefcase) and 3,375 while carrying a briefcase.

The research was approved by the Bioethics Commission, Regional Medical Chamber in Bialystok, Poland and the Bioethics Commission at the Medical University of Bialystok.

3. Results and discussion

Table 1

Accuracy of gender recognition, of briefcase-carrying recognition, and of the combination of those two characteristics, by classifier

| Classifier name | Gender | Carrying briefcase | Gender/carrying briefcase |
|---|---|---|---|
| kNN | 99.37% | 95.74% | 95.31% |
| Naive Bayes | 74.9% | 56.55% | 44.34% |
| SVM | 94.67% | 69.49% | 80.11% |
| ANN | 95.10% | 71.90% | 64.16% |
| LDA | 86.81% | 61.36% | 61.25% |
| CART | 90.00% | 76.79% | 72.38% |
| RF | 96.63% | 93.45% | 92.44% |

Table 1 presents the accuracy of recognizing a person’s gender, the briefcase-carrying state, and the combination of those two characteristics for the selected classifiers. It should be stressed that several simulations were conducted to find the optimum parameter values for each type of classifier; Table 1 contains the best of the obtained results. The impact of the number of neighbors and the distance function is shown in Fig. 1. The graphs show that the choice of metric has a significantly greater influence on the results than the number of neighbors considered during classification. It should be noted that in all kNN classifiers the tie-breaking algorithm uses the class with the nearest neighbor among tied groups.

Figure 1.

Accuracy of recognition as a function of the number of neighbors and the distance metric used for a) gender recognition, b) briefcase-carrying recognition, c) recognition of gender as well as briefcase carrying. Line styles for individual distances: Euclidean – red solid line (‘-’), city block – blue dashed line (‘- -’), Chebyshev – black dotted line (‘⋅⋅’), Mahalanobis – green dash-dotted line (‘-⋅’).

3.1 Gender recognition

The analysis of the results presented in Table 1 shows that recognizing a person’s gender from parameters derived from GRF is a relatively simple task. Most classifiers exceeded 90% correct identification. The kNN classifier (k = 4, city block) performed best, reaching an accuracy of 99.37%. The Naive Bayes classifier performed worst, with an accuracy below 75%. It is worth noting that the results were undoubtedly influenced by the fact that the vertical component of GRF depends strongly on body weight, and the average weight of the men who participated in the study (82.56 kg) was significantly higher than that of the women (63.45 kg). A more detailed analysis of the confusion matrix (Table 2) indicated that, for the kNN classifier, the error rate was minimal and that heavier women were mistaken for lighter men nearly twice as often as the reverse.

A similar issue was presented in [8], where 30 parameters were selected on the basis of the vertical and anterior-posterior components of GRF, additionally enriched with two temporal parameters. The authors had at their disposal a total of 64 parameters determined from the gait data of 15 people. These parameters were divided into 6 groups, and the accuracy of gender recognition was analyzed for different combinations of parameter groups. The classification used a feedforward neural network with two hidden layers. The most effective combination of parameter groups allowed gender to be recognized with an accuracy of 94.03%.

Table 2

Confusion matrix for gender recognition, kNN

| Actual label | Predicted: Women | Predicted: Men |
|---|---|---|
| Women | 99.13% | 0.87% |
| Men | 0.45% | 99.55% |
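A confusion matrix expressed in percentages of each actual class, as in Table 2, can be computed along the following lines. The function name and the toy labels in the usage note are hypothetical and do not reproduce the study data.

```python
def confusion_percent(y_true, y_pred, labels):
    """Row-normalized confusion matrix: rows are actual labels, columns predicted."""
    counts = {a: {p: 0 for p in labels} for a in labels}
    for actual, predicted in zip(y_true, y_pred):
        counts[actual][predicted] += 1
    matrix = {}
    for a in labels:
        total = sum(counts[a].values())
        matrix[a] = {
            p: (100.0 * counts[a][p] / total if total else 0.0) for p in labels
        }
    return matrix
```

For example, `confusion_percent(["Women", "Men"], ["Women", "Women"], ["Women", "Men"])` would place the misclassified man in the "Men" row and "Women" column, so each row sums to 100% of the cases of that actual class.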

3.2 Unilateral load state recognition

Differentiating between people who are carrying a briefcase and those who are not turned out to be a much more difficult task (Table 1). Only two classifiers (kNN and RF) exceeded 90% accuracy, and every classifier performed worse than in gender recognition. As before, the Naive Bayes classifier achieved the worst results and the kNN classifier (k = 6, city block) the best. The problem of gait recognition for a person carrying a briefcase is connected to, among other things, the fact that a case weighing 4.6 kg increases the mean of the vertical GRF by over 19 N and the ptp amplitude of that component by approximately 50 N. Of course, carrying a briefcase also affects the other parameters used for recognition in the present study, namely those connected to the anterior-posterior and medial-lateral components. This results from changes in system dynamics related to the greater weight added by the briefcase, a disrupted gait and, in some people, magnified gait asymmetry. The confusion matrix for kNN (Table 3) indicates that a person carrying a briefcase was classified as not carrying one only slightly more often than the reverse.

Table 3

Confusion matrix for carrying briefcase activity recognition, kNN

| Actual label | Predicted: Carrying briefcase | Predicted: Without briefcase |
|---|---|---|
| Carrying briefcase | 95.05% | 4.95% |
| Without briefcase | 3.68% | 96.32% |

The difficulty of this task is clearly reflected in the results reported by other authors. In [14], several carrying status levels were considered: no carried object; objects carried in the side middle region, the side bottom region, the front region, the back region, or multiple regions; and carried objects whose position changed from one region to another within a gait period. The method proposed by the authors of that work recognized 76.8% of cases in which a person did not carry anything and 73% of cases in which a person carried a load in a manner similar to the one considered in the present work. It is worth mentioning that, of all the carrying status levels, those two were most often confused.

Video cameras were also used to record people’s walking patterns in [11]. The authors of that work achieved 94.4% correct recognition for people carrying a briefcase. This result was obtained for 1,240 images of people (including 248 carrying bags) using an MLP. For the CASIA-B database, the level of person identification reached at most 86.7%.

3.3 Gender as well as unilateral load state recognition

The simultaneous recognition of both gender and briefcase carrying generally yielded even lower accuracy. As in the results presented above, kNN (k = 6, city block) reached the highest accuracy of all classifiers. Somewhat surprisingly, the SVM classifier achieved better results here than in the previous task. This may be because the SVM is a dichotomizer, so multi-class classification was performed with a one-vs-one scheme, which for four classes required 6 classifiers, with the final decision reached through a majority vote. It is also worth noting that the Random Forest classifier performed well in all tasks, indicating a certain potential of ensemble classifiers.
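The count of 6 binary classifiers follows from pairing the four combined classes, which can be checked with a short sketch. The class labels are those used in Table 4; the function names and the example votes are hypothetical.

```python
from collections import Counter
from itertools import combinations

CLASSES = ["W_CB", "W_nB", "M_CB", "M_nB"]


def one_vs_one_pairs(classes):
    """Every unordered pair of classes; one binary classifier is trained per pair."""
    return list(combinations(classes, 2))


def one_vs_one_decision(pairwise_winners):
    """Majority vote over the winners reported by the pairwise classifiers."""
    return Counter(pairwise_winners).most_common(1)[0][0]


pairs = one_vs_one_pairs(CLASSES)  # 4 * 3 / 2 pairs
```

For a single gait cycle, each of the pairwise classifiers names one of its two classes as the winner, and `one_vs_one_decision` returns the class that wins most often.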

Table 4

Confusion matrix for recognition of both gender and briefcase-carrying, kNN (W_CB – women with briefcase, W_nB – women without briefcase, M_CB – men with briefcase, M_nB – men without briefcase)

| Actual label | Predicted: W_CB | Predicted: W_nB | Predicted: M_CB | Predicted: M_nB |
|---|---|---|---|---|
| W_CB | 93.16% | 5.13% | 0.35% | 0.91% |
| W_nB | 3.06% | 95.88% | 0.24% | 0.82% |
| M_CB | 0.26% | 0.05% | 95.80% | 3.89% |
| M_nB | 0.36% | 0.45% | 3.66% | 95.54% |

The analysis of the confusion matrix (Table 4) performed for the classifier that gained the best results in this task shows that misidentifications most often occur in differentiating between people who carry a briefcase and those who do not.

It should be said that the author of the present work is not aware of any other work that deals with the recognition of a unilateral load state based on the measurement of GRFs. In addition, the presented method is intended as the first step in biometric systems, so direct comparisons of the above results with those achieved by others should be made with caution.

4. Conclusions

The presented results have shown the importance of the GRF and the proposed parameters for gender recognition, for recognizing the activity of carrying a briefcase by the handle, and for the combination of those two characteristics. On this basis, the kNN classifier achieved up to 99.37%, 95.74%, and 95.31% correct recognitions, respectively. Further work may involve a search for characteristics and classification algorithms (connected to, for example, deep learning or ensemble classifiers) that will allow even better results to be achieved. Another possibility is fusion with other measuring systems, which could help to isolate unique information within the considered phenomena.

Conflict of interest

The author does not have any conflict of interest to declare.

Funding

This work was financed by Bialystok University of Technology (project no. W/WM-IIB/2/2021).

References

[1] Lee M, Lee JH, Kim DH. Gender recognition using optimal gait feature based on recursive feature elimination in normal walking. Expert Syst Appl. (2022); 189: 116040. doi: 10.1016/j.eswa.2021.116040.

[2] Derlatka M, Bogdan M. Fusion of static and dynamic parameters at decision level in human gait recognition. In: Kryszkiewicz M, Bandyopadhyay S, Rybinski H, Pal S. (eds) Pattern Recognition and Machine Intelligence. PReMI 2015. LNCS, 9124. Springer, Cham. doi: 10.1007/978-3-319-19941-2_49.

[3] Chen YJ, Chen LX, Lee YJ. Systematic evaluation of features from pressure sensors and step number in gait for age and gender recognition. IEEE Sens J. (2021); 22(3): 1956-1963. doi: 10.1109/JSEN.2021.3136162.

[4] Jena S, Panda SK, Arunachalam T. Pattern Recognition for Identification of Gender of Individuals from Ground Reaction Force Parameters. In: 2018 IEE Congress (iEECON). IEEE. (2018); pp. 1-4. doi: 10.1109/IEECON.2018.8712215.

[5] Grabski JK, Walczak T, Michałowska M, Cieślak M. Gender recognition using artificial neural networks and data coming from force plates. In: Gzik M, Tkacz E, Paszenda Z, Piȩtka E. (eds) Innovations in Biomedical Engineering. Adv Intell Sys Comp, Springer, Cham. Vol. 623. (2018); pp. 53-60. doi: 10.1007/978-3-319-70063-2_6.

[6] Huang B, Luo Y, Xie J, Pan J, Zhou C. Attention-aware spatio-temporal learning for multi-view gait-based age estimation and gender classification. IET Comp Vis. doi: 10.1049/cvi2.12165.

[7] Jain A, Kanhangad V. Investigating gender recognition in smartphones using accelerometer and gyroscope sensor readings. In: 2016 Compt Techq Inf Comm Techn (ICCTICT). IEEE. (2016); pp. 597-602. doi: 10.1109/ICCTICT.2016.7514649.

[8] Gupta JP, Singh N, Dixit P, Semwal VB, Dubey SR. Human activity recognition using gait pattern. Inter J Comp Vis Image Process. (2013); 3(3): 31-53. doi: 10.4018/ijcvip.2013070103.

[9] Semwal VB, Gupta A, Lalwani P. An optimized hybrid deep learning model using ensemble learning approach for human walking activities recognition. J Supercomput. (2021); 77: 12256-12279. doi: 10.1007/s11227-021-03768-7.

[10] Nezhad PH. Effect of carrying a handbag on spine EMG muscles activity during walking. Gait Posture. (2021); 90: 80-81. doi: 10.1016/j.gaitpost.2021.09.042.

[11] Fendri E, Chtourou I, Hammami M. Gait-based person re-identification under covariate factors. Pattern Anal Applic. (2019); 22: 1629-1642. doi: 10.1007/s10044-019-00793-4.

[12] Lv S, Huan Z, Chang X, Huan Y, Liang J. Analysis of Gait Symmetry Under Unilateral Load State. In: Hao Z, Dang X, Chen H, Li F. (eds) Wireless Sensor Networks. CWSN 2020. Comm Comp Inform Sci. 2021. 1321. Springer, Singapore. doi: 10.1007/978-981-33-4214-9_18.

[13] Kellis E, Arampatzi F. Effects of sex and mode of carrying schoolbags on ground reaction forces and temporal characteristics of gait. J Pediatr Orthop B. (2009); 18(5): 275-282. doi: 10.1097/BPB.0b013e32832d5d3b.

[14] Uddin MZ, Ngo TT, Makihara Y, Takemura N, Li X, Muramatsu D, Yagi Y. The ou-isir large population gait database with real-life carried object and its performance evaluation. IPSJ T Comput Vis Appl. (2018); 10(5): 1-11. doi: 10.1186/s41074-018-0041-z.

[15] Derlatka M, Borowska M. Ensemble of heterogeneous base classifiers for human gait recognition. Sensors. (2023); 23(1): 508. doi: 10.3390/s23010508.

[16] Qian B, Rasheed K. Hurst exponent and financial market predictability. In: IASTED Int Conf Fin Egn App. (2004); pp. 203-209.

[17] Guo Y, Hastie T, Tibshirani R. Regularized linear discriminant analysis and its application in microarrays. Biostatistics. (2007); 8: 86-100. doi: 10.1093/biostatistics/kxj035.

[18] Derlatka M, Parfieniuk M. Real-world measurements of ground reaction forces of normal gait of young adults wearing various footwear. Sci Data. (2023); 10: 60. doi: 10.1038/s41597-023-01964-z.