
Geometric MDS Performance for Large Data Dimensionality Reduction and Visualization

Abstract

Multidimensional scaling (MDS) is a widely used technique for mapping data from a high-dimensional to a lower-dimensional space and for visualizing data. Recently, a new method, known as Geometric MDS, has been developed to minimize the MDS stress function by an iterative procedure, where coordinates of a particular point of the projected space are moved to the new position defined analytically. Such a change in position is easily interpreted geometrically. Moreover, the coordinates of points of the projected space may be recalculated simultaneously, i.e. in parallel, independently of each other. This paper has several objectives. Two implementations of Geometric MDS are suggested and analysed experimentally. The parallel implementation of Geometric MDS is developed for multithreaded multi-core processors. The sequential implementation is optimized for computational speed, enabling it to solve large data problems. It is compared with the SMACOF version of MDS. Python codes for both Geometric MDS and SMACOF are presented to highlight the differences between the two implementations. The comparison was carried out on several aspects: the comparative performance of Geometric MDS and SMACOF depending on the projection dimension, data size and computation time. Geometric MDS usually finds lower stress when the dimensionality of the projected space is smaller.

1. Introduction

Every day we receive an enormous amount of data, information, and knowledge. Data becomes information when it is contextualised and related to a specific problem or solution. Data mining is an important part of knowledge discovery in various fields and sectors, such as medicine, economics, finance, and telecommunications. It helps uncover hidden information in vast amounts of data, which is valuable for recognising important facts, relationships, trends, and patterns. As data sets become increasingly large, more efficient ways of visualizing, analysing and interpreting the information they contain are needed. Comprehending data is challenging, especially when the data describe a complex object or phenomenon characterised by many parameters, attributes, or features. Such data are called multidimensional, and the main goal is to gain visual insight into the data set being analysed. Visual information can be perceived much faster than textual information. For human perception, multidimensional data must be represented in a low-dimensional space, usually of two or three dimensions.

The main task in dimensionality reduction is to represent objects with a smaller set of more “compressed” features. Another reason for reducing the dimensionality or visualization of the data is to reduce the computational load for further data processing. Dimensionality reduction techniques enable the extraction of meaningful information hidden in the data. In addition, one of the most important purposes of data visualization is to get an idea of how close or far away certain points of the analysed multidimensional data are from each other.

Consider the multidimensional data set as an array $X=\{X_i=(x_{i1},\ldots,x_{in}),\ i=1,\ldots,m\}$ of $n$-dimensional data points $X_i\in\mathbb{R}^n$, $n\geqslant 3$. A data point $X_i$ is the result of an observation of some object or phenomenon that depends on $n$ features. Dimensionality reduction means finding a set of coordinates (locations) of points $Y_i=(y_{i1},\ldots,y_{id})$, $i=1,\ldots,m$, in a lower-dimensional space ($d<n$), where the particular point $X_i\in\mathbb{R}^n$ is represented by $Y_i\in\mathbb{R}^d$. If $d\leqslant 3$, the dimensionality reduction results can be presented visually for more convenient human decision-making.

Dimensionality reduction and data visualization techniques play an important role in machine learning (Murphy, 2022; Zhou, 2021; Ray et al., 2021; Dzemyda et al., 2013; Dos Santos and Brodlie, 2004; Buja et al., 2008). Such visualization is useful, especially in exploratory analysis: it provides insights into similarity relationships in high-dimensional data that would be unlikely to be obtained without visualization (Lee and Verleysen, 2007; Van Der Maaten et al., 2009; Markeviciute et al., 2022; Xu et al., 2019; Bernatavičienė et al., 2007; Kurasova and Molyte, 2011; Borg and Groenen, 2005; Dzemyda and Kurasova, 2006; Dzemyda et al., 2013; Jolliffe, 2002; Karbauskaitė and Dzemyda, 2015, 2016; Groenen et al., 1995). The classical dimensionality reduction methods for data visualization include linear principal component analysis (PCA) (Jolliffe, 2002; Jackson, 1991) and multidimensional scaling (MDS) (Torgerson, 1958; Borg and Groenen, 2005; Borg et al., 2018). PCA seeks to reduce the dimensionality of the data by finding orthogonal linear combinations (principal components) of the original variables with the highest variance (Jackson, 1991; Medvedev et al., 2011). The interpretation of principal components can sometimes be complex. PCA cannot capture non-linear structures consisting of arbitrarily shaped clusters or manifolds because it describes the data in terms of a linear subspace. Since global methods such as PCA and MDS cannot represent the local non-linear structure, methods have been developed that preserve the raw local distances, i.e. the distances between nearest neighbours. Isomap, Local Linear Embedding (LLE), Hessian Local Linear Embedding and Laplacian Eigenmaps are traditional methods that attempt to preserve local Euclidean distances from the original space (Wang et al., 2021; Espadoto et al., 2021). More recent methods that focus on local structure preservation are t-Distributed Stochastic Neighbour Embedding (t-SNE) (Van der Maaten and Hinton, 2008) and UMAP (McInnes et al., 2018). A comprehensive review of dimensionality reduction methods is presented in Murphy (2022), Espadoto et al. (2021), Wang et al. (2021), Vachharajani and Pandya (2022), Dzemyda et al. (2013), Lee and Verleysen (2007), Van Der Maaten et al. (2009), Xu et al. (2019).

Artificial neural networks can also be applied to reduce dimensionality and visualize data (Dzemyda et al., 2007). A feedforward neural network is used for a topographic, structure-preserving, dimension-reducing transformation of data, with the additional ability to incorporate various degrees of related subjective information. There is also a neural network architecture developed specifically for topographic mapping, known as a Self-Organizing Map (SOM) (Kohonen, 2001; Stefanovic and Kurasova, 2011), which uses implicit lateral connections in the output layer of neurons. SOM is a technique used for both clustering and data dimensionality reduction (Dzemyda et al., 2007, 2013). In addition, there is a special learning rule, similar to backpropagation, which allows an ordinary feedforward artificial neural network to learn the Sammon mapping, which is a special case of metric MDS, in an unsupervised way. The learning rule for this type of neural network is known as SAMANN (Mao and Jain, 1995; Ivanikovas et al., 2007; Medvedev et al., 2011; Dzemyda et al., 2007).

An alternative view of dimensionality reduction is offered by multidimensional scaling (Borg and Groenen, 2005). Multidimensional scaling (MDS) is a classical non-linear approach that maps an original high-dimensional data set onto a lower-dimensional one while attempting to preserve the proximities between the corresponding data points. It is one of the most popular methods for multidimensional data visualization (Murphy, 2022; Borg et al., 2018; Dzemyda et al., 2013). Although MDS demonstrates great versatility, it is computationally demanding, which becomes challenging as the amount of data increases. Traditional MDS approaches are limited when analysing very large data sets, as they require long computation times and large amounts of memory. Various studies have therefore aimed to create new solutions or modify MDS in order to analyse large amounts of data and speed up the visualization process (Orts et al., 2019; Qiu and Bae, 2012; Pawliczek et al., 2014; Ingram et al., 2008; Medvedev et al., 2011; Ivanikovas et al., 2007).

The input data for MDS is a symmetric $m\times m$ matrix $D=\{d_{ij},\ i,j=1,\ldots,m\}$ of proximities (similarities or dissimilarities) between pairs of objects. More often, dissimilarities are used to represent the proximities; however, the formulation of the multidimensional scaling problem and its solution methods remain the same in both cases. A lower dissimilarity value means that the objects are more similar. As a dissimilarity, $d_{ij}$ can be the distance between points $X_i$ and $X_j$, $i,j=1,\ldots,m$ (Dzemyda et al., 2013). The smaller the distance, the closer the points are.

One of the most popular algorithms for MDS is SMACOF (De Leeuw, 1977; De Leeuw and Mair, 2009). The algorithm is based on the majorization approach (Groenen et al., 1995), which iteratively replaces the original objective function with an auxiliary majorizing function that is much easier to optimize. Majorization is not just an algorithm; it is more of a prescription for constructing optimization algorithms. The principle of majorization consists in constructing an auxiliary function that majorizes a certain objective function (De Leeuw and Mair, 2009). Experimental studies have shown that SMACOF is the most accurate algorithm compared to others (Orts et al., 2019; Ingram et al., 2008). There are a number of popular software implementations of SMACOF (De Leeuw and Mair, 2009; Pedregosa et al., 2011; Orts et al., 2019; MATLAB, 2012).
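For instance, the scikit-learn implementation of SMACOF referred to above can be called directly as a function. A minimal, purely illustrative usage sketch (the random data here is an assumption for illustration, not data from the experiments in this paper):

import numpy as np
from sklearn.metrics import euclidean_distances
from sklearn.manifold import smacof

# Illustrative data: 100 random 10-dimensional points.
X = np.random.rand(100, 10)
D = euclidean_distances(X)  # m x m dissimilarity matrix

# One SMACOF run; returns the 2-dimensional embedding and its stress.
Y, stress = smacof(D, n_components=2, n_init=1, random_state=0)
print(Y.shape, stress)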

An alternative to SMACOF is Geometric MDS, proposed in Dzemyda and Sabaliauskas (2020) and Dzemyda and Sabaliauskas (2021a), and developed further in Sabaliauskas and Dzemyda (2021), Dzemyda and Sabaliauskas (2021b, 2021c), Dzemyda et al. (2022), Dzemyda and Sabaliauskas (2022). It is discussed in more detail further in the paper. Figure 1 illustrates the visualization by Geometric MDS of 10-, 50-, and 100-dimensional datasets generated with the Gaussian and ellipsoidal cluster generators (Handl and Knowles, 2005). The dimensionality of the multidimensional data has been reduced to 2 ($d=2$). The Gaussian cluster generator is based on a standard cluster model using multivariate normal distributions, while the ellipsoidal cluster generator creates ellipsoidal clusters with the major axis at an arbitrary orientation. For each cluster, data points are generated at a Gaussian-distributed distance from a uniformly random point on the major axis, in a uniformly random direction, and are rejected if they lie outside the boundary (Handl and Knowles, 2005). As the datasets are generated by cluster generators for large high-dimensional data sets with large numbers of clusters, some noisy data may be included. The example in Fig. 1 illustrates the use of visualization techniques to identify outliers, patterns in data in the form of clusters, relationships, and trends. The figure clearly shows clusters of the analysed data and the location of individual multidimensional data points among the rest.

Fig. 1

Examples of dimensionality reduction results obtained using Geometric MDS.

The novelty of this paper lies in the experimental investigation of the parallelization of the recently developed Geometric Multidimensional Scaling. Parallelization can speed up the dimensionality reduction and data visualization process and makes it possible to process large-scale multidimensional data. Theoretically proven properties of Geometric MDS are exploited in the parallelization. In addition, Geometric MDS is compared experimentally with SMACOF, one of the most popular implementations of MDS. It was found that Geometric MDS is superior to SMACOF in most cases.

2. Geometric MDS: Multidimensional Scaling From a Geometric Point of View

MDS looks for coordinates of points $Y_i$ representing $X_i$ in the lower-dimensional Euclidean space $\mathbb{R}^d$ by minimizing a stress function (Dzemyda et al., 2013). There are several MDS implementations with different stress functions (see the review in Dzemyda et al., 2013). However, their minimization is rather complicated. In Dzemyda and Sabaliauskas (2020), Sabaliauskas and Dzemyda (2021), Dzemyda and Sabaliauskas (2021a, 2021c), a new approach with low computational complexity, called Geometric MDS, is proposed, making metric MDS applicable to large-scale data. This method is valuable and advantageous because the stress function and multidimensional scaling are considered from a geometric point of view. The new interpretation of the stress function allows the proper step size and descent direction towards the minimum of the stress function to be found analytically when a single point of the projected space is considered and moved. A special property of the new approach is that no analytical expression of the stress function is needed. Moreover, no line search, commonly used for local descent in optimization, is needed. Theoretical investigation showed that the step direction determined by Geometric MDS coincides with the direction of the steepest descent method (Dzemyda and Sabaliauskas, 2021a). The analytically found step size guarantees stress reduction in that direction.

Geometric MDS has the advantage that it can use the simplest stress function; there is no need to normalize it by the number of data points or the scale of proximities. In Dzemyda and Sabaliauskas (2021a) it is shown that Geometric MDS does not depend on the scale of proximities and can therefore use a simple stress function, such as the raw stress function:

(1)
\[ S(Y_1,\ldots,Y_m)=\sum_{i=1}^{m}\sum_{j=i+1}^{m}\bigl(d_{ij}-d^{*}_{ij}\bigr)^{2}, \]
where $d^{*}_{ij}$ is the Euclidean distance between points $Y_i$ and $Y_j$ in the lower-dimensional space:
(2)
\[ d^{*}_{ij}=\sqrt{\sum_{l=1}^{d}(y_{il}-y_{jl})^{2}}. \]
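For reference in the code discussed later, a direct NumPy transcription of (1)–(2) might look as follows (an illustrative sketch, not one of the paper's listings; D holds the proximities $d_{ij}$ and Y the projected points):

import numpy as np

def raw_stress(D, Y):
    """Raw stress (1): sum over i < j of (d_ij - d*_ij)^2."""
    # Pairwise Euclidean distances d*_ij in the projected space, Eq. (2).
    diff = Y[:, None, :] - Y[None, :, :]
    Dstar = np.sqrt((diff ** 2).sum(axis=2))
    # The full matrix counts every pair twice, hence the factor 1/2.
    return ((D - Dstar) ** 2).sum() / 2.0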

The optimization problem is to find the minimum of the function $S(\cdot)$ (1) and the optimal coordinates of the points $Y_i=(y_{i1},\ldots,y_{id})$, $i=1,\ldots,m$:

(3)
\[ \min_{Y_1,\ldots,Y_m\in\mathbb{R}^{d}}S(Y_1,\ldots,Y_m). \]

Let there be some initial configuration of points $Y_1,\ldots,Y_m$. The simplest way to minimize the stress $S(\cdot)$ by Geometric MDS is to change the positions of separate points $Y_1,\ldots,Y_m$ consecutively, many times (iterations). This realization is denoted by GMDS1. One iteration of GMDS1 changes all points $Y_j=(y_{j1},\ldots,y_{jd})$ in consecutive order as $j$ runs from 1 to $m$ once. At each step, we compute a new position $Y^{*}_{j}=(y^{*}_{j1},\ldots,y^{*}_{jd})$ of $Y_j$ while the positions of the remaining points $Y_1,\ldots,Y_{j-1},Y_{j+1},\ldots,Y_m$ are fixed, and then update $Y_j$ by $Y^{*}_{j}$. So, Geometric MDS recalculates the coordinates of a single $d$-dimensional point $Y_j$ at each step, and one of its iterations consists of $m$ such steps. The stopping condition may be the number of iterations or the value of the decrease of the stress function (1). Convergence of this algorithm to a local minimum of the stress is proved.

The core formula of Geometric MDS, defining the transition from $Y_j$ to the new position $Y^{*}_{j}$, is as follows:

(4)
\[ Y^{*}_{j}=\frac{1}{m-1}\sum_{i=1,\,i\neq j}^{m}A_{ij}, \]
where the point $A_{ij}$ lies on the line through $Y_i$ and $Y_j$, $i\neq j$, at a distance $d_{ij}$ from $Y_i$ (see Dzemyda and Sabaliauskas, 2020; Dzemyda and Sabaliauskas, 2021a; Sabaliauskas and Dzemyda, 2021):
(5)
\[ A_{ij}=Y_i+(Y_j-Y_i)\frac{d_{ij}}{d^{*}_{ij}}. \]
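A compact NumPy sketch of this single-point step, illustrating Eqs. (4)–(5) under the notation above (not the paper's listing; D is the m x m proximity matrix and Y the current configuration):

import numpy as np

def gmds_step_single(D, Y, j):
    """New position Y_j* of point Y_j by Eqs. (4)-(5), other points fixed."""
    m = Y.shape[0]
    diff = Y[j] - Y                       # rows: Y_j - Y_i
    dstar = np.linalg.norm(diff, axis=1)  # distances d*_ij
    dstar[j] = 1.0                        # dummy value; the i = j term drops out
    # A_ij = Y_i + (Y_j - Y_i) * d_ij / d*_ij, Eq. (5)
    A = Y + diff * (D[j] / dstar)[:, None]
    # Y_j* = average of A_ij over i != j, Eq. (4); A[j] equals Y_j and is removed
    return (A.sum(axis=0) - A[j]) / (m - 1)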

Let’s consider a generalized way to update the entire set of points $Y=\{Y_1,\ldots,Y_m\}$ to $Y^{*}=\{Y^{*}_{1},\ldots,Y^{*}_{m}\}$. The main equation of Geometric MDS (4) is used for defining the transition from $Y_j$ to the new position $Y^{*}_{j}$. The raw stress function (1) decreases after $Y_j$ moves to $Y^{*}_{j}$; that is, moving any of the projected points by the Geometric MDS method reduces the stress (Dzemyda and Sabaliauskas, 2021a). In the generalized scenario, however, all points $Y_1,\ldots,Y_m$ change their coordinates to $Y^{*}_{1},\ldots,Y^{*}_{m}$ simultaneously and independently of each other during a single iteration of stress minimization. Therefore, we perform a simultaneous repositioning of all points $Y_1,\ldots,Y_m$ of the projected space in the directions determined by the Geometric MDS strategy for individual points. We replace $Y_1,\ldots,Y_m$ by $Y^{*}_{1},\ldots,Y^{*}_{m}$ only after all the points $Y^{*}_{1},\ldots,Y^{*}_{m}$ have been computed. This realization is denoted by GMDSm. One iteration of GMDSm changes all points $Y_i=(y_{i1},\ldots,y_{id})$ simultaneously once. Its convergence is proved theoretically in Dzemyda and Sabaliauskas (2022).

3. Multi-Core Implementation of Geometric MDS

The GMDSm version of Geometric MDS updates all points $Y_1,\ldots,Y_m$ in the lower-dimensional space simultaneously and independently of each other. This enables the use of parallel computing for updating each $Y_i$ independently, i.e. in parallel (see Algorithm 1). This possibility of parallelization is investigated further in this paper using multi-core processing. The computational resources of a single personal computer were used in this work. Experiments with different numbers of CPU threads were performed to obtain preliminary estimates of the efficiency of the multi-core implementation of GMDSm.

Algorithm 1

A parallel Geometric MDS algorithm for the simultaneous calculation of new points $Y^{*}_{j}$ in the projected space.
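The pseudo-code itself is not reproduced in this version of the text. The following Python sketch conveys its structure under stated assumptions: each $Y^{*}_{j}$ is computed from the old configuration by the per-point update of Eqs. (4)–(5) (the illustrative gmds_step_single above), the m independent updates are distributed over a pool of workers, and the configuration is overwritten only after all new positions are available. The actual implementation used in the experiments may differ in detail.

import numpy as np
from multiprocessing import Pool

def gmds_iteration_parallel(D, Y, n_workers=4):
    """One GMDSm iteration with the per-point updates computed in parallel."""
    m = Y.shape[0]
    # The m updates are independent of each other, so they may run concurrently.
    with Pool(processes=n_workers) as pool:
        new_rows = pool.starmap(gmds_step_single,
                                [(D, Y, j) for j in range(m)])
    # Replace Y only after all Y_j* are ready (simultaneous update).
    return np.vstack(new_rows)

(On platforms that spawn worker processes, the call must be guarded by if __name__ == "__main__":.)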

The purpose of the experiment is to compare the time required to compute the new points $Y^{*}_{1},\ldots,Y^{*}_{m}$ in the projected space from $Y_1,\ldots,Y_m$ when using a sequential implementation of GMDSm on one CPU core (a single thread) and, respectively, a parallel algorithm using 2, 4, 6, 8, 10 and 12 CPU threads. Each point $Y^{*}_{j}$ is computed using Eq. (4), and all points $Y_1,\ldots,Y_m$ change their coordinates to $Y^{*}_{1},\ldots,Y^{*}_{m}$ at once, simultaneously and independently of each other, during a single iteration. One iteration is a recalculation (optimal updating) of all points $Y_i=(y_{i1},\ldots,y_{id})$, $i=1,\ldots,m$, once. The pseudo-code of the algorithm used for the experiments is provided in Algorithm 1 (Dzemyda et al., 2022). Note that only the time taken to compute the new coordinates was measured during the experiments (see the part of Algorithm 1 that is marked as running in parallel). The California Housing dataset (Pace and Barry, 1997) was used for the experiments. The dataset consists of data collected in the 1990 California Census; it has 20640 examples of block groups, each containing an average of 1425.5 people living in a geographically compact area. There are 10 different features per example: 8 numeric attributes, one predictive attribute and the target. For the preliminary experiments, and due to limited computing resources, only part of the data was used (see Table 1): data of 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500 and 5000 observations were analysed, and the dimensionality of the multidimensional data was reduced to 2. This range of data sizes reveals a trend: with further increases in data size, the use of parallel computing becomes more efficient. The experiments yielded the average dependence of the time to calculate new point coordinates (not taking the full visualization process into account) on the size of the data when using different numbers of CPU threads for parallel processing. Each experiment was repeated 10 times to obtain more reliable results, so Table 1 presents averaged values. For the experiments we used an AMD Ryzen 5 2600 processor, 4000 MHz (6 cores, 12 threads), with 16 GB of DDR4-2666 1333 MHz RAM, and the Python-Anaconda (Anaconda, 2022) environment as the software.

Table 1

Average dependence of the computation time of new coordinates (not considering the complete visualization process) on the data size when using different numbers of CPU threads for parallel processing, n=8; the dimensionality of multidimensional data was reduced to 2 (d=2).

Number of points, m | Sequential algorithm | 2 threads | 4 threads | 6 threads | 8 threads | 10 threads | 12 threads
100 | 0.0445 | 0.6492 | 0.7018 | 0.8365 | 1.0246 | 1.2450 | 1.3315
500 | 1.1770 | 1.1783 | 0.9902 | 1.0372 | 1.2384 | 1.4430 | 1.5087
1000 | 4.8455 | 2.9246 | 1.9148 | 1.7087 | 1.9091 | 2.0435 | 2.0459
1500 | 10.8324 | 5.8416 | 3.4209 | 2.8879 | 2.9647 | 3.0228 | 3.0353
2000 | 19.1812 | 9.8804 | 5.5490 | 4.3496 | 4.3921 | 4.4029 | 4.3415
2500 | 30.3274 | 15.3864 | 8.2581 | 6.2743 | 6.3550 | 6.2527 | 6.2111
3000 | 43.6203 | 21.5616 | 11.6450 | 8.7684 | 8.7479 | 8.6064 | 8.2878
3500 | 59.4401 | 28.9800 | 15.8067 | 11.5609 | 11.2593 | 11.0542 | 10.6939
4000 | 79.0386 | 37.7887 | 20.7712 | 15.1054 | 14.3880 | 14.3113 | 13.4704
4500 | 99.2351 | 47.8711 | 25.8535 | 18.8063 | 18.0433 | 17.4048 | 16.8381
5000 | 124.6468 | 59.0778 | 31.7166 | 23.1730 | 22.3510 | 21.3549 | 20.6990
Fig. 2

Average dependence of the computation time of new coordinates (not taking into account the complete visualization process) on the size of the data, when using different numbers of CPU threads for parallel processing, n=8.

The results obtained (see Table 1 and Fig. 2) demonstrate that even using two CPU threads for parallel processing, a significant speedup (on average up to 2 times) can be achieved compared to sequential computing. They also show that when using more than two processor threads, the process of calculating new coordinates in the projected space can be accelerated on average up to 6 times. It is also worth noting that the difference in time grows with the size of the analysed data: the larger the data used for visualization, the more efficient parallel computing becomes (see Fig. 2). It should be noted that these data were obtained in initial experiments using the computational power of a personal computer. Consider also that the CPU used for the experiments had 6 cores and 12 threads, so, as can be seen from the results, the efficiency of calculations using more than 6 threads is not high. The results are promising and indicate the feasibility of using Geometric MDS to visualize large-scale multidimensional data with the computing power of a supercomputer or a cluster for parallel computing.

4. Sequential Implementation of Geometric MDS

Although MDS demonstrates great versatility, it is computationally expensive and not easily scalable, and it requires handling the entire matrix of distances or other proximities. This can be challenging when dealing with large-scale data. In this section, we propose an optimized Python implementation of GMDSm for sequential computations.

MDS-type algorithms require updating an $m\times m$ matrix of all pairwise Euclidean distances $d^{*}_{ij}$ between points $Y_i$ and $Y_j$: for $m$ points, an $m\times m$ matrix is required, each entry of which is a non-negative distance between a pair of points. Calculations of Euclidean distances occur frequently in machine learning and computer vision (Albanie, 2019; Song et al., 2016). The element of the distance matrix is defined by (2).

However, as noted in Albanie (2019), the Euclidean distance matrix can be computed in another way, equivalent to (2):

(6)
\[ (d^{*}_{ij})^{2}=(Y_i-Y_j)(Y_i-Y_j)^{T}=\|Y_i\|^{2}-2Y_iY_j^{T}+\|Y_j\|^{2}, \]
where the first and last terms are the squared norms of the vectors and can be calculated separately.

The method (6) of calculating the Euclidean distance matrix has been implemented and used in the Python implementation of Geometric MDS. Applying (6) speeds up the visualization significantly because of much simpler calculations and also allows the dimensionality reduction to be extended to relatively large-scale data. The Python function code for calculating the distance matrix from a given $m\times d$ matrix $Y=(Y_i=(y_{i1},\ldots,y_{id}),\ i=1,\ldots,m)$ is presented in Listing 1. This code can also be used to compute an $m\times m$ distance matrix from the $m\times n$ matrix $X=(X_i=(x_{i1},\ldots,x_{in}),\ i=1,\ldots,m)$.

Listing 1

Python function code for calculating the distance matrix.
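The listing is not reproduced in this version of the text; a sketch consistent with its description (the expansion (6), with zero distances replaced by $10^{-5}$ as explained below) could be:

import numpy as np

def distance_matrix(Y):
    """m x m Euclidean distance matrix via the expansion (6)."""
    sq = (Y ** 2).sum(axis=1)                      # squared norms |Y_i|^2
    D2 = sq[:, None] - 2.0 * Y @ Y.T + sq[None, :]
    D = np.sqrt(np.maximum(D2, 0.0))               # clip rounding negatives
    D[D == 0.0] = 1e-5                             # diagonal / coinciding points
    return D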
Listing 2

Python function code for calculating new points $Y^{*}=\{Y^{*}_{1},\ldots,Y^{*}_{m}\}$ in a low-dimensional space using Geometric MDS.
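As with Listing 1, the original code is not reproduced here. A vectorized sketch of one GMDSm iteration in the spirit of the listing, using the per-point formula (7) given below (equivalently, the matrix form (11)–(12)) together with the distance_matrix sketch above:

import numpy as np

def gmds_iteration(D, Y):
    """One GMDSm iteration: Eq. (7) applied to all points simultaneously."""
    m = Y.shape[0]
    Dstar = distance_matrix(Y)
    F = 1.0 - D / Dstar                # f_ij = 1 - d_ij / d*_ij
    np.fill_diagonal(F, 0.0)           # f_ii = 0
    # Y_j* = Y_j + (1/(m-1)) * sum_i (Y_i - Y_j) f_ij, for all j at once
    return Y + (F.T @ Y - F.sum(axis=0)[:, None] * Y) / (m - 1)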
Listing 3

Python function code for calculating new points $\hat{Y}^{*}=\{\hat{Y}^{*}_{1},\ldots,\hat{Y}^{*}_{m}\}$ in a low-dimensional space using SMACOF (optimized version of SMACOF, referred to as SMACOF-Fast).
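Again as a hedged sketch rather than the exact SMACOF-Fast code, one Guttman-transform iteration, Eqs. (8)–(10) below, can be written analogously:

import numpy as np

def smacof_iteration(D, Y):
    """One SMACOF iteration: the Guttman transform, Eq. (10)."""
    m = Y.shape[0]
    Dstar = distance_matrix(Y)
    R = D / Dstar                      # d_ij / d*_ij
    np.fill_diagonal(R, 0.0)
    # Y^_j* = (1/m) * sum_i (Y_j - Y_i) d_ij / d*_ij, for all j at once
    return (R.sum(axis=0)[:, None] * Y - R.T @ Y) / m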

The algorithm in Listing 1 returns $10^{-5}$ values on the diagonal of the proximity matrix and in other situations when points $X_i$ and $X_j$ coincide (the coincidence is only possible before Geometric MDS starts). This avoids division by zero in further calculations.

The Python implementations of Geometric MDS and SMACOF have been optimized for computational speed. Both the Geometric MDS stress minimization method and the SMACOF implementation share the property that all points in the low-dimensional space change their coordinates simultaneously and independently of each other during one iteration of stress minimization. We provide function code that calculates the new coordinates of points in $d$-dimensional space in one iteration using Geometric MDS and SMACOF (see Listings 2 and 3), as well as the code of a fully working Python program that implements the sequential Geometric MDS algorithm (see Listing 4), where the stopping condition for the minimization of stress (1) is either the iteration number or the accuracy. In one iteration (see Listings 2 and 3), the coordinates of all points in the projected space are recalculated once. Listing 3 presents the function code for calculating new coordinates using SMACOF, adapted from the scikit-learn library (Pedregosa et al., 2011) and updated to improve computation speed (this implementation is denoted as SMACOF-Fast). In this implementation, the stress majorization, also known as the Guttman transform, guarantees monotonic convergence of the stress and is more powerful than traditional methods such as gradient descent.

Listing 4

Main program of Geometric MDS.
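Listing 4 itself is likewise not reproduced; a minimal driver consistent with its description (stopping on an iteration budget or on a small stress decrease) could combine the sketches above as follows:

import numpy as np

def geometric_mds(X, d=2, max_iter=100, eps=1e-6, seed=0):
    """Sequential Geometric MDS driver (sketch in the spirit of Listing 4)."""
    rng = np.random.default_rng(seed)
    D = distance_matrix(X)              # proximities of the original data
    Y = rng.random((X.shape[0], d))     # random initial configuration
    stress = raw_stress(D, Y)
    for _ in range(max_iter):
        Y = gmds_iteration(D, Y)
        new_stress = raw_stress(D, Y)
        if stress - new_stress < eps:   # stop when the stress decrease is small
            break
        stress = new_stress
    return Y, stress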

Geometric MDS is thus the main competitor of SMACOF, and we would like to highlight the main differences between the two algorithms. A more detailed theoretical comparison is provided in Dzemyda and Sabaliauskas (2022).

First, we define a Geometric MDS step from point $Y_j$ to $Y^{*}_{j}$, $j=1,\ldots,m$, that calculates the new coordinates of point $Y_j$:

(7)
\[ Y^{*}_{j}=Y_j+\frac{1}{m-1}\sum_{i=1}^{m}(Y_i-Y_j)f_{ij},\qquad f_{ij}=1-\frac{d_{ij}}{d^{*}_{ij}},\quad f_{ii}=0. \]
SMACOF is an iterative procedure that calculates the Guttman transform (Guttman, 1968) at each iteration and simultaneously updates the coordinates of the $m$ points $Y_j=(y_{j1},\ldots,y_{jd})$, $j=1,\ldots,m$:
(8)
\[ \hat{Y}^{*}=\frac{1}{m}B(Y)Y, \]
where $\hat{Y}^{*}_{j}=(\hat{y}^{*}_{j1},\ldots,\hat{y}^{*}_{jd})$, $j=1,\ldots,m$, are the new points, $Y$ and $\hat{Y}^{*}$ are $m\times d$ matrices of coordinates of the points $Y_j$ and $\hat{Y}^{*}_{j}$, respectively, and $B=B(Y)$ is an $m\times m$ matrix with elements
(9)
\[ B_{ij}=-\frac{d_{ij}}{d^{*}_{ij}},\quad i\neq j;\qquad B_{ii}=-\sum_{j=1,\,j\neq i}^{m}B_{ij},\qquad i,j=1,\ldots,m. \]
It is easy to check that SMACOF works very similarly to Geometric MDS. Calculating the coordinates of point $\hat{Y}^{*}_{j}$, $j=1,\ldots,m$, by (8) and (9) can be expressed as
(10)
\[ \hat{Y}^{*}_{j}=\frac{1}{m}\sum_{i=1}^{m}(Y_j-Y_i)\frac{d_{ij}}{d^{*}_{ij}}. \]
It is proved in Dzemyda and Sabaliauskas (2022) that both Geometric MDS and SMACOF make a step from the point $Y_j$ to its new position $Y^{*}_{j}$ or $\hat{Y}^{*}_{j}$, respectively, in the same direction but with a different step size. The step by Geometric MDS is $\frac{m}{m-1}$ times longer.

Geometric MDS can be expressed in matrix form as well:

(11)
\[ Y^{*}=Y+\frac{1}{m-1}\hat{B}(Y)Y, \]
where $Y^{*}_{j}=(y^{*}_{j1},\ldots,y^{*}_{jd})$, $j=1,\ldots,m$, are the new points, $Y$ and $Y^{*}$ are $m\times d$ matrices of coordinates of the points $Y_j$ and $Y^{*}_{j}$, respectively, and $\hat{B}=\hat{B}(Y)$ is an $m\times m$ matrix with elements
(12)
\[ \hat{B}_{ij}=1-\frac{d_{ij}}{d^{*}_{ij}},\quad i\neq j;\qquad \hat{B}_{ii}=-\sum_{j=1,\,j\neq i}^{m}\hat{B}_{ij},\qquad i,j=1,\ldots,m. \]

4.1. Computation Time Cost by Geometric MDS and SMACOF

The dimensionality reduction performance was compared using the SMACOF (from the scikit-learn library, Pedregosa et al., 2011), SMACOF-Fast (Listing 3) and Geometric MDS (Listing 2) algorithms. Experiments were carried out using:

  • randomly generated sets of 10-dimensional data points $X_i=(x_{i1},\ldots,x_{in})$, $i=1,\ldots,m$, $n=10$, with coordinates uniformly distributed in the interval (0,1);

  • random proximity matrices with elements uniformly distributed in the interval (0,1).

Table 2

Average dependence of the time required to compute new coordinates and of the stress function values per iteration on the data size when n=10, using SMACOF, SMACOF-Fast and Geometric MDS.

Number of points, m | SMACOF: Stress | SMACOF: Time, s | SMACOF-Fast: Stress | SMACOF-Fast: Time, s | Geometric MDS: Stress | Geometric MDS: Time, s | SMACOF/SMACOF-Fast: Stress ratio | SMACOF/SMACOF-Fast: Time ratio | Geometric MDS/SMACOF-Fast: Stress ratio | Geometric MDS/SMACOF-Fast: Time ratio
10 | 11.6 | 0.00019 | 11.6 | 0.00003 | 11.7 | 0.00003 | 1.0000000 | 7.3152985 | 1.0095220 | 1.2097938
20 | 66.7 | 0.00036 | 66.7 | 0.00005 | 66.7 | 0.00006 | 1.0000000 | 6.9684499 | 1.0011044 | 1.1193416
50 | 479.1 | 0.00056 | 479.1 | 0.00009 | 478.8 | 0.00009 | 1.0000000 | 6.3983110 | 0.9992541 | 1.0824801
100 | 2128.1 | 0.00093 | 2128.1 | 0.00016 | 2127.0 | 0.00017 | 1.0000000 | 5.9893883 | 0.9995029 | 1.0680442
200 | 8946.6 | 0.00194 | 8946.6 | 0.00035 | 8944.1 | 0.00037 | 1.0000000 | 5.5188379 | 0.9997245 | 1.0489429
500 | 51188.6 | 0.00885 | 51188.6 | 0.00246 | 51183.4 | 0.00267 | 1.0000000 | 3.5984986 | 0.9998969 | 1.0861585
1000 | 218604.8 | 0.03845 | 218604.8 | 0.01419 | 218594.0 | 0.01452 | 1.0000000 | 2.7084907 | 0.9999505 | 1.0227548
2000 | 894694.9 | 0.18151 | 894694.9 | 0.06169 | 894672.8 | 0.06175 | 1.0000000 | 2.9422145 | 0.9999753 | 1.0009971
5000 | 5106657.3 | 1.15621 | 5106657.3 | 0.32358 | 5106605.3 | 0.32322 | 1.0000000 | 3.5731957 | 0.9999898 | 0.9988961
10000 | 21892993.0 | 5.06069 | 21892993.0 | 1.27452 | 21892881.6 | 1.26265 | 1.0000000 | 3.9706513 | 0.9999949 | 0.9906829
15000 | 59857709.1 | 14.66615 | 59857709.1 | 3.51422 | 59857507.2 | 3.46431 | 1.0000000 | 4.1733769 | 0.9999966 | 0.9857980
20000 | 127201601.7 | 32.07792 | 127201601.7 | 7.49513 | 127201281.2 | 7.45024 | 1.0000000 | 4.2798321 | 0.9999975 | 0.9940104
Table 3

Average dependence of the time required to compute new coordinates and of the stress function values per iteration on the data size, for random proximity matrices whose elements take uniformly distributed values in the interval (0,1), using SMACOF, SMACOF-Fast and Geometric MDS.

Number of points, m | SMACOF: Stress | SMACOF: Time, s | SMACOF-Fast: Stress | SMACOF-Fast: Time, s | Geometric MDS: Stress | Geometric MDS: Time, s | SMACOF/SMACOF-Fast: Stress ratio | SMACOF/SMACOF-Fast: Time ratio | Geometric MDS/SMACOF-Fast: Stress ratio | Geometric MDS/SMACOF-Fast: Time ratio
10 | 3.812807 | 0.00018 | 3.812807 | 0.00003 | 3.738581 | 0.00003 | 1.00000000 | 6.20922293 | 0.98053234 | 1.01140901
20 | 22.02273 | 0.00036 | 22.02273 | 0.00006 | 21.81476 | 0.00006 | 1.00000000 | 6.34562798 | 0.99055667 | 1.02805624
50 | 164.0261 | 0.00057 | 164.0261 | 0.00010 | 163.6571 | 0.00010 | 1.00000000 | 6.00706317 | 0.99774989 | 1.02165017
100 | 788.3272 | 0.00112 | 788.3272 | 0.00017 | 787.7029 | 0.00017 | 1.00000000 | 6.75047562 | 0.99920800 | 1.02180998
200 | 3295.794 | 0.00230 | 3295.794 | 0.00035 | 3294.927 | 0.00035 | 1.00000000 | 6.49822251 | 0.99973703 | 0.99185505
500 | 19294.97 | 0.00888 | 19294.97 | 0.00272 | 19293.7 | 0.00286 | 1.00000000 | 3.26174940 | 0.99993376 | 1.04896365
1000 | 83109.29 | 0.03834 | 83109.29 | 0.01447 | 83107.34 | 0.01415 | 1.00000000 | 2.65000273 | 0.99997652 | 0.97795314
2000 | 339234.9 | 0.17323 | 339234.9 | 0.05924 | 339231.9 | 0.05665 | 1.00000000 | 2.92438158 | 0.99999104 | 0.95639967
5000 | 1940633 | 1.15612 | 1940633 | 0.30542 | 1940628 | 0.29682 | 1.00000000 | 3.78540252 | 0.99999716 | 0.97186003
10000 | 8348942 | 5.42154 | 8348942 | 1.22316 | 8348932 | 1.23237 | 1.00000000 | 4.43240329 | 0.99999878 | 1.00752752
15000 | 22765379 | 15.06911 | 22765379 | 3.37158 | 22765362 | 3.44193 | 1.00000000 | 4.46944395 | 0.99999926 | 1.02086564
20000 | 48401588 | 32.37652 | 48401588 | 7.35390 | 48401562 | 7.41715 | 1.00000000 | 4.40263131 | 0.99999946 | 1.00860033

The dimensionality was reduced to $d=2$. After one iteration, the time taken to calculate the new coordinates of the points in the projected space was measured, and the stress value $S(\cdot)$ defined by (1) was obtained. Data with different numbers of points $m$ were used: 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000, 15000, and 20000. To obtain more accurate and reliable results, each experiment was repeated 100 times. The results of these experiments are shown in Tables 2 and 3. Since the number of computational operations using Geometric MDS and SMACOF-Fast is approximately the same (as can be concluded from Listings 2 and 3), the time required to compute new coordinates using Geometric MDS and SMACOF-Fast is about the same.

Fig. 3

Dependence of computation time on the number of points when reducing the dimensionality of 10-dimensional data (n=10) to d=2. The number of points (m=10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000, 15000, and 20000) is given on a logarithmic scale. All results are averaged over 100 repetitions of each one-iteration experiment.
Fig. 4

Dependence of computation time on the number of points when reducing the dimensionality of 10-dimensional data (n=10) to d=2. Both the calculation time and the number of points (m=10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000, 15000, and 20000) are given on logarithmic scales. All results are averaged over 100 repetitions of each one-iteration experiment.

The dependence of the calculation time on the number of points using Geometric MDS (see Listing 2) and SMACOF (Pedregosa et al., 2011) is shown in Fig. 3. The number m of points (m=10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000, 15000, and 20000) is given on a logarithmic scale. Each experiment was repeated 100 times, so the figures summarise the results with the average time and stress value for each m. To present the results in a more informative form and to emphasize the differences in the time required to calculate new coordinates using Geometric MDS (see Listing 2) and SMACOF (Pedregosa et al., 2011), both the calculation time and the number of points are given on logarithmic scales in Fig. 4. The results demonstrate that Geometric MDS (Listing 2) outperforms SMACOF (from the scikit-learn library, Pedregosa et al., 2011) in all cases in terms of the computation time required to recalculate the coordinates of the new points $Y^{*}_{1},\ldots,Y^{*}_{m}$ in the projected space. Comparing the results obtained using SMACOF, SMACOF-Fast and Geometric MDS, it can be observed that with increasing data size Geometric MDS requires less computing time, and the resulting stress is smaller than that of SMACOF under the same conditions (see Fig. 4, Tables 2 and 3). Meanwhile, the time required to recalculate new coordinates using Geometric MDS and SMACOF-Fast is approximately the same.

Fig. 5

Stress comparison between Geometric MDS and SMACOF depending on the number of points m and the dimensionality of the projected space d.

4.2. Comparative Efficiency of Geometric MDS and SMACOF

To compare which method, Geometric MDS or SMACOF, better reduces the MDS stress $S(\cdot)$ defined by (1), we tested both methods on random proximity matrices whose elements take uniformly distributed values in the interval (0,1). 101 tests were performed for different $m=3,\ldots,1000$ and $d=1,\ldots,30$. The experiments were carried out using the programming implementations provided earlier in this section: SMACOF-Fast based on Listing 3 and Geometric MDS based on Listing 2. The results are given in Fig. 5, which shows visually which algorithm, SMACOF or Geometric MDS, reaches a smaller value of the stress function $S(\cdot)$ after updating the points $Y_1,\ldots,Y_m$ only once, i.e. after one iteration, depending on the number m of points and the dimensionality d of the projected space. Red indicates cases in which SMACOF achieved lower stress than Geometric MDS in all trials, light red cases in which SMACOF was better in most trials, light green cases in which Geometric MDS was better in most trials, and green cases in which Geometric MDS was better in all trials.

Figure 5 discloses a very interesting and essential difference between the performance of Geometric MDS and SMACOF. Geometric MDS finds smaller stress when the dimensionality of projected space is lower. This is particularly important for the multidimensional data visualization task.

Fig. 6

Stress comparison between Geometric MDS and SMACOF depending on the number of points m and the dimensionality n, when reducing the dimensionality to d=2.

In Fig. 5, we also present a curve that allows choosing either Geometric MDS or SMACOF in order to get a lower stress after their first step, assuming that the values of the elements of the symmetric distance matrix are uniformly distributed in the interval (0,1). The upper part of a parabola was used to determine the distinguishing curve. The optimization package of the MAPLE software (Bernardin et al., 2021) was used to define the coefficients of the equation, which in this particular case is

\[ f(m)\approx 0.003646\,m+0.362826\,\sqrt[3]{m}+6.782796. \]
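Read as a decision rule, and assuming the reconstructed form of the curve above, the choice could be scripted as follows. The threshold direction (Geometric MDS preferred below the curve) is an interpretation based on the observation that Geometric MDS wins at lower projection dimensionalities, not a rule stated in the original experiments:

def prefer_geometric_mds(m, d):
    """Heuristic for uniform-(0,1) proximities: prefer Geometric MDS
    when the projection dimensionality d lies below the fitted curve f(m)."""
    f = 0.003646 * m + 0.362826 * m ** (1.0 / 3.0) + 6.782796
    return d < f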

A particular case is d=2, since it is the most used in multidimensional data visualization. Figure 6 illustrates the values of the stress function $S(\cdot)$ depending on the number of points m and the dimensionality n of the analysed multidimensional data. By randomly generating a set of m n-dimensional points $X_1,\ldots,X_m$, the $m\times m$ distance matrix was calculated. Then a random set of 2-dimensional (d=2) points $Y_1,\ldots,Y_m$ was selected as the starting position for the optimization of the stress. The coordinates of these points were recalculated in one iteration using Geometric MDS and SMACOF (see Listings 2 and 3). The colours in Fig. 6 follow the same rules as in Fig. 5: red means that in 101 independent trials SMACOF gave better stress in all cases, light red that SMACOF gave better stress in most cases, light green that Geometric MDS gave better stress in most cases but not all, and green that Geometric MDS gave better stress in all cases. Figure 6 shows visually whether Geometric MDS or SMACOF gives lower stress after one iteration over 101 trials, depending on the number of points m and the dimensionality n. The results in Fig. 6 can be useful for choosing an appropriate method when the dimensionality n of the data to be analysed and the number of points m are known.

5. Conclusions

A new Geometric MDS method with low computational complexity has been proposed in Dzemyda and Sabaliauskas (2021a, 2022). Geometric MDS provides an iterative procedure to minimize the MDS stress, where the coordinates of a particular point of the projected space are moved to a new position defined analytically. Such a change in position is easily interpreted geometrically. Moreover, the coordinates of the points of the projected space may be recalculated simultaneously, i.e. in parallel, independently of each other during one iteration of stress minimization. Two implementations of Geometric MDS are suggested and analysed experimentally: parallel and sequential.

A parallel implementation of Geometric MDS is developed for dimensionality reduction using multithreaded multi-core processors. Based on the results obtained, we can claim that Geometric MDS lends itself to parallel computing on multithreaded multi-core processors; as a result, the time to calculate the new coordinates of points in the low-dimensional space may be reduced by a factor of 6 on average, depending on the data size.

The sequential implementation given in this paper is optimized for computational speed, enabling it to solve large data problems. It is compared with the SMACOF version of MDS. Python codes for both Geometric MDS and SMACOF are presented to highlight the differences between the two implementations. These codes were optimized for computational speed. The comparison was carried out on several aspects: the comparative performance of Geometric MDS and SMACOF depending on the projection dimension, data size and computation time. Geometric MDS usually finds lower stress when the dimensionality of the projected space is smaller.

SMACOF is an iterative procedure that calculates the Guttman transform. We discovered that the procedure of updating lower-dimensional points by Geometric MDS is a generalization of the Guttman transform. If we apply $\frac{1}{m}$ instead of $\frac{1}{m-1}$ in Eq. (7) to all the lower-dimensional points $Y_1,\ldots,Y_m$ simultaneously and independently of each other during a single iteration of stress minimization, we get the same result as with the Guttman transform. However, it is proved in Dzemyda and Sabaliauskas (2022) that both Geometric MDS (GMDSm) and SMACOF make a step from the point $Y_j$ to its new position in the same direction but with a different step size; the step by Geometric MDS is $\frac{m}{m-1}$ times longer. In both cases, the stress (1) decreases. Moreover, Geometric MDS allows the coordinates of a single point $Y_j$ to be updated, and this, too, guarantees a decrease of the stress. These are the reasons for further exploring the potential of Geometric MDS and applying the advantages discovered.

References

1. Albanie, S. (2019). Euclidean Distance Matrix Trick. Visual Geometry Group, University of Oxford.
2. Anaconda (2022). Anaconda Software Distribution. Anaconda Inc. https://docs.anaconda.com/.
3. Bernardin, L., Chin, P., DeMarco, P., Geddes, K.O., Hare, D., Heal, K., Labahn, G., May, J., McCarron, J., Monagan, M. (2021). Maple Programming Guide. Maplesoft, a division of Waterloo Maple Inc., Waterloo, Ontario.
4. Bernatavičienė, J., Dzemyda, G., Kurasova, O., Marcinkevičius, V., Medvedev, V. (2007). The problem of visual analysis of multidimensional medical data. In: Models and Algorithms for Global Optimization. Springer, Boston, MA, pp. 277–298. https://doi.org/10.1007/978-0-387-36721-7_17.
5. Borg, I., Groenen, P.J. (2005). Modern Multidimensional Scaling: Theory and Applications. Springer Science & Business Media, New York, NY, USA.
6. Borg, I., Groenen, P.J., Mair, P. (2018). Applied Multidimensional Scaling and Unfolding, 2nd ed. Springer, Cham, Switzerland. https://doi.org/10.1007/978-3-319-73471-2.
7. Buja, A., Swayne, D.F., Littman, M.L., Dean, N., Hofmann, H., Chen, L. (2008). Data visualization with multidimensional scaling. Journal of Computational and Graphical Statistics, 17(2), 444–472. https://doi.org/10.1198/106186008X318440.
8. De Leeuw, J. (1977). Application of convex analysis to multidimensional scaling. In: Barra, J.R., Brodeau, F., Romier, G., Van Cutsem, B. (Eds.), Recent Developments in Statistics. North Holland Publishing Company, Amsterdam, pp. 133–145.
9. De Leeuw, J., Mair, P. (2009). Multidimensional scaling using majorization: SMACOF in R. Journal of Statistical Software, 31(3), 1–30. https://doi.org/10.18637/jss.v031.i03.
10. Dos Santos, S., Brodlie, K. (2004). Gaining understanding of multivariate and multidimensional data through visualization. Computers & Graphics, 28(3), 311–325. https://doi.org/10.1016/j.cag.2004.03.013.
11. Dzemyda, G., Kurasova, O. (2006). Heuristic approach for minimizing the projection error in the integrated mapping. European Journal of Operational Research, 171(3), 859–878. https://doi.org/10.1016/j.ejor.2004.09.011.
12. Dzemyda, G., Sabaliauskas, M. (2020). A novel geometric approach to the problem of multidimensional scaling. In: Sergeyev, Y.D., Kvasov, D.E. (Eds.), Numerical Computations: Theory and Algorithms, NUMTA 2019. Lecture Notes in Computer Science, Vol. 11974. Springer, Cham, pp. 354–361. https://doi.org/10.1007/978-3-030-40616-5_30.
13. Dzemyda, G., Sabaliauskas, M. (2021a). Geometric multidimensional scaling: a new approach for data dimensionality reduction. Applied Mathematics and Computation, 409, 125561. https://doi.org/10.1016/j.amc.2020.125561.
14. Dzemyda, G., Sabaliauskas, M. (2021b). New capabilities of the geometric multidimensional scaling. In: WorldCIST 2021. Advances in Intelligent Systems and Computing. Trends and Applications in Information Systems and Technologies, Vol. 1366. Springer, Cham, pp. 264–273. https://doi.org/10.1007/978-3-030-72651-5_26.
15. Dzemyda, G., Sabaliauskas, M. (2021c). On the computational efficiency of geometric multidimensional scaling. In: 2021 2nd European Symposium on Software Engineering, ESSE 2021. Association for Computing Machinery, New York, NY, USA, pp. 136–141. https://doi.org/10.1145/3501774.3501794.
16. Dzemyda, G., Sabaliauskas, M. (2022). Geometric multidimensional scaling: efficient approach for data dimensionality reduction. Journal of Global Optimization. https://doi.org/10.1007/s10898-022-01190-8.
17. Dzemyda, G., Kurasova, O., Medvedev, V. (2007). Dimension reduction and data visualization using neural networks. In: Frontiers in Artificial Intelligence and Applications, Emerging Artificial Intelligence Applications in Computer Engineering, Vol. 160, pp. 25–49.
18. Dzemyda, G., Kurasova, O., Žilinskas, J. (2013). Multidimensional Data Visualization: Methods and Applications. Springer Optimization and Its Applications, Vol. 75. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-0236-8.
19. Dzemyda, G., Medvedev, V., Sabaliauskas, M. (2022). Multi-core implementation of geometric multidimensional scaling for large-scale data. In: Information Systems and Technologies. WorldCIST 2022, Lecture Notes in Networks and Systems, Vol. 469. Springer International Publishing, Cham, pp. 74–82. https://doi.org/10.1007/978-3-031-04819-7_8.
20. Espadoto, M., Martins, R.M., Kerren, A., Hirata, N.S.T., Telea, A.C. (2021). Toward a quantitative survey of dimension reduction techniques. IEEE Transactions on Visualization and Computer Graphics, 27(3), 2153–2173. https://doi.org/10.1109/TVCG.2019.2944182.
21. Groenen, P.J., Mathar, R., Heiser, W.J. (1995). The majorization approach to multidimensional scaling for Minkowski distances. Journal of Classification, 12(1), 3–19. https://doi.org/10.1007/BF01202265.
22. Guttman, L. (1968). A general nonmetric technique for finding the smallest coordinate space for a configuration of points. Psychometrika, 33, 469–506. https://doi.org/10.1007/BF02290164.
23. Handl, J., Knowles, J. (2005). Cluster generators for large high-dimensional data sets with large numbers of clusters. Dimension, 2, 20. https://personalpages.manchester.ac.uk/staff/Julia.Handl/data.tar.gz.
24. Ingram, S., Munzner, T., Olano, M. (2008). Glimmer: multilevel MDS on the GPU. IEEE Transactions on Visualization and Computer Graphics, 15(2), 249–261. https://doi.org/10.1109/TVCG.2008.85.
25. Ivanikovas, S., Medvedev, V., Dzemyda, G. (2007). Parallel realizations of the SAMANN algorithm. In: International Conference on Adaptive and Natural Computing Algorithms. Lecture Notes in Computer Science, Vol. 4432. Springer, pp. 179–188. https://doi.org/10.1007/978-3-540-71629-7_21.
26. Jackson, J.E. (1991). A User's Guide to Principal Components, Vol. 587. John Wiley & Sons, Hoboken, NJ. https://doi.org/10.1002/0471725331.
27. Jolliffe, I. (2002). Principal Component Analysis, 2nd ed. Springer-Verlag, New York, Berlin, Heidelberg.
28. Karbauskaitė, R., Dzemyda, G. (2015). Optimization of the maximum likelihood estimator for determining the intrinsic dimensionality of high-dimensional data. International Journal of Applied Mathematics and Computer Science, 25(4), 895–913. https://doi.org/10.1515/amcs-2015-0064.
29. Karbauskaitė, R., Dzemyda, G. (2016). Fractal-based methods as a technique for estimating the intrinsic dimensionality of high-dimensional data: a survey. Informatica, 27(2), 257–281. https://doi.org/10.15388/Informatica.2016.84.
30. Kohonen, T. (2001). Self-Organizing Maps, 3rd ed. Springer Series in Information Sciences. Springer-Verlag, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-56927-2.
31. Kurasova, O., Molyte, A. (2011). Quality of quantization and visualization of vectors obtained by neural gas and self-organizing map. Informatica, 22(1), 115–134. https://doi.org/10.15388/informatica.2011.317.
32. Lee, J.A., Verleysen, M. (2007). Nonlinear Dimensionality Reduction. Springer Science & Business Media, New York, NY. https://doi.org/10.1007/978-0-387-39351-3.
33. Mao, J., Jain, A.K. (1995). Artificial neural networks for feature extraction and multivariate data projection. IEEE Transactions on Neural Networks, 6(2), 296–317. https://doi.org/10.1109/72.363467.
34. Markeviciute, J., Bernataviciene, J., Levuliene, R., Medvedev, V., Treigys, P., Venskus, J. (2022). Attention-based and time series models for short-term forecasting of COVID-19 spread. Computers, Materials and Continua, 70(1), 695–714. https://doi.org/10.32604/cmc.2022.018735.
35. MATLAB (2012). MATLAB and Statistics Toolbox Release 2012b. The MathWorks Inc., Natick, Massachusetts, United States.
36. McInnes, L., Healy, J., Saul, N., Großberger, L. (2018). UMAP: uniform manifold approximation and projection. Journal of Open Source Software, 3(29), 861. https://doi.org/10.21105/joss.00861.
37. Medvedev, V., Dzemyda, G., Kurasova, O., Marcinkevičius, V. (2011). Efficient data projection for visual analysis of large data sets using neural networks. Informatica, 22(4), 507–520. https://doi.org/10.15388/informatica.2011.339.
38. Murphy, K.P. (2022). Probabilistic Machine Learning: An Introduction. MIT Press, Cambridge, Massachusetts.
39. Orts, F., Filatovas, E., Ortega, G., Kurasova, O., Garzón, E.M. (2019). Improving the energy efficiency of SMACOF for multidimensional scaling on modern architectures. The Journal of Supercomputing, 75(3), 1038–1050. https://doi.org/10.1007/s11227-018-2285-x.
40. Pace, R.K., Barry, R. (1997). Sparse spatial autoregressions. Statistics & Probability Letters, 33(3), 291–297. https://doi.org/10.1016/s0167-7152(96)00140-x.
41. Pawliczek, P., Dzwinel, W., Yuen, D.A. (2014). Visual exploration of data by using multidimensional scaling on multicore CPU, GPU, and MPI cluster. Concurrency and Computation: Practice and Experience, 26(3), 662–682. https://doi.org/10.1002/cpe.3027.
42. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E. (2011). Scikit-learn: machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
43. Qiu, J., Bae, S.-H. (2012). Performance of windows multicore systems on threading and MPI. Concurrency and Computation: Practice and Experience, 24(1), 14–28. https://doi.org/10.1002/cpe.1762.
44. Ray, P., Reddy, S.S., Banerjee, T. (2021). Various dimension reduction techniques for high dimensional data analysis: a review. Artificial Intelligence Review, 54(5), 3473–3515. https://doi.org/10.1007/s10462-020-09928-0.
45. Sabaliauskas, M., Dzemyda, G. (2021). Visual analysis of multidimensional scaling using GeoGebra. In: Dzitac, I., Dzitac, S., Filip, F.G., Kacprzyk, J., Manolescu, M.-J., Oros, H. (Eds.), Intelligent Methods in Computing, Communications and Control. Springer International Publishing, Cham, pp. 179–187. https://doi.org/10.1007/978-3-030-53651-0_15.
46. Song, H.O., Xiang, Y., Jegelka, S., Savarese, S. (2016). Deep metric learning via lifted structured feature embedding. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, pp. 4004–4012. https://doi.org/10.1109/CVPR.2016.434.
47. Stefanovic, P., Kurasova, O. (2011). Visual analysis of self-organizing maps. Nonlinear Analysis: Modelling and Control, 16(4), 488–504. https://doi.org/10.15388/NA.16.4.14091.
48. Torgerson, W.S. (1958). Theory and Methods of Scaling. John Wiley, Oxford, England.
49. Vachharajani, B., Pandya, D. (2022). Dimension reduction techniques: current status and perspectives. Materials Today: Proceedings. https://doi.org/10.1016/j.matpr.2021.12.549.
50. Van der Maaten, L., Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9(11), 2579–2605.
51. Van Der Maaten, L., Postma, E., Van den Herik, J. (2009). Dimensionality reduction: a comparative review. Journal of Machine Learning Research, 10(13), 66–71.
52. Wang, Y., Huang, H., Rudin, C., Shaposhnik, Y. (2021). Understanding how dimension reduction tools work: an empirical approach to deciphering t-SNE, UMAP, TriMap, and PaCMAP for data visualization. Journal of Machine Learning Research, 22(201), 1–73.
53. Xu, X., Liang, T., Zhu, J., Zheng, D., Sun, T. (2019). Review of classical dimensionality reduction and sample selection methods for large-scale data processing. Neurocomputing, 328, 5–15. https://doi.org/10.1016/j.neucom.2018.02.100.
54. Zhou, Z.-H. (2021). Dimensionality reduction and metric learning. In: Machine Learning. Springer, Singapore, pp. 241–264. https://doi.org/10.1007/978-981-15-1967-3_10.