Affiliations: Department of Mathematics, University of California,
San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA. E-mail:
[email protected] | Sidney Kimmel Cancer Center, 10835 Altman Row, San
Diego, CA 92121, USA. E-mail: [email protected]
Note: [] Corresponding author
Abstract: Discerning significant relationships in small data sets remains
challenging. We introduce here the Hamming distance matrix and show that it is
a quantitative classifier of similarities among short time-series. Its elements
are derived by computing a modified form of the Hamming distance of pairs of
symbol sequences obtained from the original data sets. The values from the
Hamming distance matrix are then amenable to statistical analysis. Examples
from stem cell research are presented to illustrate different aspects of the
method. The approach is likely to have applications in many fields.
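
As a rough illustration of the pipeline summarized in the abstract, the following sketch converts short time series to symbol sequences and tabulates pairwise Hamming distances. It is a minimal sketch under stated assumptions, not the authors' method: the up/down/flat encoding in `symbolize` is a hypothetical choice, and the plain Hamming distance stands in for the modified form, which the abstract does not specify. All function names are illustrative.

```python
from itertools import combinations


def symbolize(series):
    """Encode a numeric series as a string over {U, D, F}.

    Assumption: each symbol records whether the series went up (U),
    down (D), or stayed flat (F) between consecutive time points.
    The abstract does not say which encoding is actually used.
    """
    symbols = []
    for prev, curr in zip(series, series[1:]):
        if curr > prev:
            symbols.append("U")
        elif curr < prev:
            symbols.append("D")
        else:
            symbols.append("F")
    return "".join(symbols)


def hamming(a, b):
    """Plain Hamming distance: number of positions where a and b differ.

    Assumption: this is the standard definition, used here in place of
    the modified form mentioned in the abstract.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def hamming_distance_matrix(series_list):
    """Return the symmetric matrix of pairwise Hamming distances."""
    sequences = [symbolize(s) for s in series_list]
    n = len(sequences)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = hamming(sequences[i], sequences[j])
        matrix[i][j] = d
        matrix[j][i] = d
    return matrix


if __name__ == "__main__":
    # Made-up toy series, for illustration only.
    data = [[1, 2, 3, 2], [1, 3, 4, 1], [5, 4, 3, 3]]
    for row in hamming_distance_matrix(data):
        print(row)
```

In this toy example the first two series share the same rise-rise-fall pattern, so their distance is zero even though their values differ; the entries of the matrix could then be passed to whatever statistical analysis is appropriate.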