Institute of Computer Science III
University of Bonn
Databases * Information Systems * Software Engineering * Pattern Recognition * Image Processing * Artificial Intelligence * Robotics


Pairwise Data Clustering by Deterministic Annealing

Thomas Hofmann, Joachim M. Buhmann

Revised version from 14.3.1996. If you rely at the original version, please contact jan@cs.uni-bonn.de

Partitioning a data set and extracting hidden structure from the data arises in different application areas of pattern recognition and image processing. Pairwise data clustering is a combinatorial optimisation method for data grouping which extracts structure from proximity data. We descibe a deterministic annealing approach to pairwise data clustering which shares the robustness properties of maximum entropy inference. The resulting Gibbs probability distributions are estimated by meanfield approximation, a well-known technique from statistical physics. A new algorithm to group dissimilarity data and to simultanously embed these data in a Euclidian vector space is discussed which can be used for dimension reduction and data visualisation. The suggested algorithms have been implemented to analyse dissimilarity data from protein analysis and from linguistics. Furthermore, pairwise data clustering is used to segment textured images.

Click here to obtain the full paper (PS, gzip, 1058748 bytes, 29 pages)


webmaster@www.informatik.uni-bonn.de - 16.12.05