Tin Kam Ho
Statistics and Learning Research Department
Bell Laboratories, Alcatel-Lucent
Room MH 2C-381, 600 Mountain Ave
Murray Hill, NJ 07974, USA
Research Activities

I believe deep insights that are critical for significant algorithmic advances can only come from intensive studies of real-world data originating in carefully chosen domains, and must undergo tests in serious applications. Over the years I have been following such a path.

On the algorithm side, I studied some geometrical and nonparametric statistical methods applicable to problems in very high dimensional feature spaces, and combinatorial approaches to classification. My topics include distribution maps, random decision forests, stochastic discrimination, and combination and coordination of multiple classifiers. These algorithms are applicable to classification tasks in any domain, such as image and speech processing, digital libraries, information retrieval, medical diagnosis, scientific data analysis, and financial engineering. Recently I have been looking into ways to characterize the complexity of classification problems and relate it to classifier behavior, and to find unifying themes in various methods for unsupervised learning.

On the application side, I am looking into a number of data analysis problems covering several areas of science and engineering. The problems originate in network traffic analysis, telecommunication engineering, multimedia information processing, computational physics, and astronomy. Generally they involve modeling, visualization, and retrieval of numerical data in very high dimensional spaces. I seek to understand and meet the unique challenges posed by each application area, and to develop algorithms and practical tools for both interactive and automated analyses.

Earlier, I worked on optical character recognition, concentrating on developing symbol classifiers and contextual analysis methods. I carried out several large-scale simulation studies to address issues like estimation of intrinsic error rate, asymptotic accuracy of classifiers, and systematic evaluation of classifiers. I also studied recognition strategies that exploit the contextual information in a text page. These include word-based recognition methods, and image enhancement by clustering and averaging. Later I focused on adaptive methods, like font learning by identifying stop words, and text recognition without shape training. I also studied text categorization as a way to organize documents for an information retrieval system.

Complementary to analysis of real-world observations is the attempt to model and simulate the physical processes that generate the data. I pursue an interest in this as well. A recent integration of these interests led to my use of pattern recognition methods in analyzing the classical mathematical models that describe complex photonics systems.

Data Complexity in Pattern Recognition
My recent work on exploring the intrinsic limits of pattern learning is collected in our book, Data Complexity in Pattern Recognition.

One of my recent projects is Mirage, an experimental tool for interactive pattern recognition. It is available for download.

Another major project I worked on is FROG, a physical layer simulator for optical network systems. There we calculate how light changes as it travels through transcontinental fiber links connected by many amplification devices.

A Challenge
Are humans necessarily superior to machines in perceptual tasks? If you believe so, try taking this challenge. Can you read the text printed in this image? If so, tell me what it says and how long it took you to find out. If not, my program can do better than you, and it did the job in just a few minutes. Find out how.

My papers.

My patents.

Slides of some talks.