Affiliations: Institute of Cytology and Genetics SB RAS, Acad.
Lavrentiev ave., 10, Novosibirsk, 630090, Russia. Tel.: +7 3832 332971; Fax: +7
3832 331278 | Sobolev Institute of Mathematics SB RAS, Acad. Koptyug
prospect, 4, Novosibirsk, 630090, Russia. E-mail: [email protected],
[email protected], [email protected], [email protected]
Abstract: A method has been developed for constructing a tree source model of
genetic text generation. Visualising the model as a suffix (context) tree
provides a new way to analyse the context structure of symbol sequences. The
stochastic complexity of the data, estimated within the framework of the model,
serves as the criterion for choosing the model. The model and the complexity
values are used to analyse genetic texts. The software implementation of this
algorithm makes it possible to reveal statistical properties of genetic
sequences using an information measure. The program is available on the
Internet at
http://wwwmgs.bionet.nsc.ru/mgs/programs/complexity/.
Keywords: complexity, information measure, suffix tree visualisation, variable memory Markov model, genetic texts, statistical modelling
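
To illustrate how stochastic complexity can act as a model criterion, the following minimal sketch computes the code length of a DNA string under a context model and picks the context depth that minimises it. It is a simplification and not the method of the paper: it uses a fixed context depth rather than a variable-depth tree, and it assumes the Krichevsky-Trofimov sequential estimator. The names kt_code_length and best_depth are hypothetical.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumed four-letter nucleotide alphabet


def kt_code_length(seq, depth):
    """Code length of seq in bits under a fixed-depth context model.

    Each symbol is coded sequentially with the Krichevsky-Trofimov
    estimate (count + 1/2) / (context total + |alphabet| / 2).
    """
    counts = defaultdict(lambda: defaultdict(int))
    bits = 0.0
    for i, sym in enumerate(seq):
        # Context = up to `depth` preceding symbols (shorter at the start).
        ctx = seq[max(0, i - depth):i]
        ctx_counts = counts[ctx]
        total = sum(ctx_counts.values())
        p = (ctx_counts[sym] + 0.5) / (total + 0.5 * len(ALPHABET))
        bits -= math.log2(p)
        ctx_counts[sym] += 1  # update counts only after coding the symbol
    return bits


def best_depth(seq, max_depth):
    """Return the depth in 0..max_depth with the shortest code length."""
    return min(range(max_depth + 1), key=lambda d: kt_code_length(seq, d))


if __name__ == "__main__":
    text = "ACGT" * 50  # toy periodic sequence, not real genomic data
    for d in range(4):
        print(d, round(kt_code_length(text, d), 2))
    print("selected depth:", best_depth(text, 3))
```

In this toy case each symbol is fully determined by its predecessor, so deeper contexts only add parameters to learn and lengthen the code. A variable-depth tree model applies the same trade-off separately to each branch of the context tree.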