Clustering-Based Behavioural Analysis of Biological Objects

Arnis Kirshners


The article examines the problem of processing short time series for bioinformatics tasks in the field of pharmacology using data mining methods. The experiments used heart contraction power data (contraction and relaxation) obtained from laboratory animals, with the goal of registering changes in heart contraction power at different stages of an experiment over a given period of time. The selected data were first preprocessed. The short time series were then compared using various time-point similarity search methods with agglomerative hierarchical clustering, k-means clustering, modified k-means clustering and expectation-maximization clustering algorithms. Based on an evaluation of the clustering results, the most suitable algorithm was chosen and the number of clusters that gives the smallest clustering error was determined. The resulting clusters were used to create cluster prototypes that aggregate groups of similar heart contraction power objects. The article examines the errors produced by the algorithms and methods and discusses the clustering results obtained under different evaluation methodologies. It also draws conclusions about the application of data mining methods to bioinformatics tasks and outlines directions for further research.


clustering short time series; clustering algorithms; cluster prototypes
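
The pipeline summarised in the abstract (preprocess short time series, cluster them, compare the clustering error across cluster counts, and derive one prototype per cluster) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: z-normalisation as the preprocessing step, squared Euclidean distance as the time-point similarity measure, plain k-means with a deterministic farthest-first initialisation, the sum of squared errors as the clustering error, the point-wise mean as the cluster prototype, and the toy series themselves.

```python
import math


def z_normalize(series):
    """Rescale one series to zero mean and unit variance (assumed preprocessing)."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    return [(x - mean) / std if std > 0 else 0.0 for x in series]


def squared_distance(a, b):
    """Squared Euclidean distance between two series of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(data, k):
    """Deterministic initialisation: start at data[0], then repeatedly add
    the series that lies farthest from its nearest chosen prototype."""
    prototypes = [list(data[0])]
    while len(prototypes) < k:
        nxt = max(data, key=lambda s: min(squared_distance(s, p) for p in prototypes))
        prototypes.append(list(nxt))
    return prototypes


def k_means(data, k, iterations=100):
    """Plain k-means; returns (labels, prototypes, sum of squared errors)."""
    prototypes = farthest_first(data, k)
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(s, prototypes[c]))
            for s in data
        ]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [s for s, lab in zip(data, labels) if lab == c]
            if members:  # an empty cluster keeps its previous prototype
                prototypes[c] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(squared_distance(s, prototypes[lab]) for s, lab in zip(data, labels))
    return labels, prototypes, sse


# Hypothetical toy data: three rising and three falling four-point series.
raw_series = [
    [1, 2, 3, 4], [2, 4, 6, 8], [1, 2, 3, 5],
    [4, 3, 2, 1], [8, 6, 4, 2], [5, 3, 2, 1],
]
normalized = [z_normalize(s) for s in raw_series]

# Clustering error for several candidate cluster counts.
errors = {k: k_means(normalized, k)[2] for k in (1, 2, 3)}
labels, prototypes, _ = k_means(normalized, 2)
```

Because the sum of squared errors generally keeps shrinking as the number of clusters grows, it is usually read together with a further criterion (for example, the point where the decrease levels off) rather than minimised outright; which evaluation measure the article applies is not stated in the abstract.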






