|
Clustering and Classification via Lossy Compression |
|||||||||||||||||||||
© 2007 |
Welcome!
This website introduces a new mathematical framework for two related classical problems in statistical learning: data clustering and data classification. The basic idea is to cast data clustering or classification as a (lossy) data compression problem. By minimizing the coding length of the data, the resulting clustering and classification solutions are provably optimal and the proposed algorithms are extremely simple, efficient, and robust. They can achieve the state of the art clustering and classification results, especially for high-dimensional data such as images or gene expression data. This page gives a visual introduction to some representative problems that can be solved by this new method. Please click the links at left for a basic introduction to the new approach, or complete technical details in the reference papers, or sample code for all the algorithms, or implementation details on some real-world applications.
Real data such as images and videos often have such complicated, mixed structure. Segmentation breaks the image or video into smaller, simpler pieces making it easier to represent, understand or process:
Given a set or training examples whose identity is known, determine the identity of a new sample:
One of the most prominent applications of this fundamental problem is in automatic face recognition, where the training examples images of human faces, taken under varying expression, pose, or lighting conditions. The test example is an image of a face, possibly taken under different pose, expression or lighting condition. The task is to identify which of the individuals in the training database is pictured in the test image:
To read more about the classification problem and its solution, please refer to our introduction page, or to our paper, Classification via Minimum Incremental Coding Length (MICL) on the references page. Credits This website is developed and maintained by the research group of professor Yi Ma at the University of Illinois at Urbana-Champaign. This material is based upon work supported by the National Science Foundation under Award No. IIS-0347456, CRS-EHS-0509151, and CCF-TF-0514955, and by the Office of Naval Research, under Award No. N0001405-1-0633. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation or Office of Naval Research. Please direct all questions and comments about this website to the webmaster: John Wright (jnwright@uiuc.edu). |
||||||||||||||||||||
|
|
|||||||||||||||||||||
|
|
|||||||||||||||||||||