The EM Algorithm and its Extensions
Since it is simple and stable,
the EM algorithm
(Dempster, Laird, and Rubin, 1977, JRSSB)
has been widely used to fit models from
incomplete data.
Our current research program
in this area includes the following.
 Acceleration
 The PX-EM
algorithm (Chuanhai Liu,
Donald B. Rubin, and
Ying Nian Wu, 1998)
shares the simplicity and stability of ordinary EM but
is often much faster. The intuitive idea is to use a covariance
adjustment to correct the M step, capitalizing on
extra information captured in the imputed complete data. This
is accomplished by parameter expansion; we expand
the complete-data model while
preserving the observed-data model and use the expanded
complete-data model in the EM algorithm.
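A minimal sketch of the idea, assuming the textbook example of a univariate t model with known degrees of freedom. The function name `fit_t_location_scale`, the data, and the stopping rule are illustrative assumptions, not taken from the paper. In this example, parameter expansion changes only the scale update: the weighted sum of squares is divided by the sum of the imputed weights instead of by the sample size.

```python
def fit_t_location_scale(y, nu, expanded, tol=1e-10, max_iter=10000):
    """Fit location mu and squared scale sigma2 of a t model with known nu.

    expanded=False runs ordinary EM; expanded=True uses the expanded
    scale update. Returns (mu, sigma2, iterations_used).
    """
    n = len(y)
    mu = sum(y) / n
    sigma2 = sum((v - mu) ** 2 for v in y) / n
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # E step: expected latent weights given the current parameters.
        w = [(nu + 1.0) / (nu + (v - mu) ** 2 / sigma2) for v in y]
        sum_w = sum(w)
        # M step: weighted mean and weighted sum of squares.
        mu_new = sum(wi * v for wi, v in zip(w, y)) / sum_w
        ss = sum(wi * (v - mu_new) ** 2 for wi, v in zip(w, y))
        # Ordinary EM divides by n; the expanded version divides by the
        # sum of the weights, which is the covariance adjustment.
        sigma2_new = ss / (sum_w if expanded else n)
        change = abs(mu_new - mu) + abs(sigma2_new - sigma2)
        mu, sigma2 = mu_new, sigma2_new
        if change < tol:
            break
    return mu, sigma2, iterations


# Hypothetical data with one outlier.
y = [-1.2, -0.5, -0.3, 0.1, 0.2, 0.4, 0.7, 1.1, 1.5, 8.0]
em_fit = fit_t_location_scale(y, nu=3.0, expanded=False)
px_fit = fit_t_location_scale(y, nu=3.0, expanded=True)
print("EM:   ", em_fit)
print("PX-EM:", px_fit)
```

Both variants target the same maximum likelihood estimate, so the first two entries of each result should agree closely; the third entry gives the number of iterations each variant used on this data set.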
 Supplementation

Computing the Information Matrix
from conditional information via normal approximation (Liu, 1998).
The basic idea is to approximate the likelihood function by a normal density
when maximum likelihood estimates are assumed to be approximately normally
distributed. The method uses two facts: the information
for a one-dimensional parameter can be computed when the log-likelihood
is approximately quadratic over a range that corresponds to
a small positive confidence interval; and the covariance matrix of
a normal distribution can be obtained from a set of one-dimensional
conditional distributions whose sample spaces
span the sample space of the joint distribution.
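The two facts can be illustrated with a short sketch. This is one possible reading, not the procedure of Liu (1998): it assumes the log-likelihood is close to quadratic near its mode, measures the one-dimensional information along a direction by a central second difference, and recovers the off-diagonal entries of the information matrix from the directions e_i, e_j, and e_i + e_j. The function names, the step size, and the example matrix are all hypothetical.

```python
def directional_information(loglik, mode, direction, step=1e-3):
    """Negative curvature of loglik along `direction` through `mode`."""
    plus = [m + step * d for m, d in zip(mode, direction)]
    minus = [m - step * d for m, d in zip(mode, direction)]
    second_diff = loglik(plus) - 2.0 * loglik(mode) + loglik(minus)
    return -second_diff / step ** 2


def information_matrix(loglik, mode, step=1e-3):
    """Assemble the information matrix from one-dimensional curvatures."""
    k = len(mode)
    unit = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    diag = [directional_information(loglik, mode, unit[i], step)
            for i in range(k)]
    info = [[0.0] * k for _ in range(k)]
    for i in range(k):
        info[i][i] = diag[i]
        for j in range(i + 1, k):
            both = [a + b for a, b in zip(unit[i], unit[j])]
            q = directional_information(loglik, mode, both, step)
            # For a quadratic form, q = I_ii + I_jj + 2 * I_ij.
            info[i][j] = info[j][i] = 0.5 * (q - diag[i] - diag[j])
    return info


# Hypothetical exactly quadratic log-likelihood with mode (1, -2)
# and information matrix [[2.0, 0.5], [0.5, 1.0]].
def example_loglik(theta):
    a = theta[0] - 1.0
    b = theta[1] + 2.0
    return -0.5 * (2.0 * a * a + 2.0 * 0.5 * a * b + 1.0 * b * b)


info = information_matrix(example_loglik, [1.0, -2.0])
print(info)
```

Inverting the assembled matrix gives the approximate covariance matrix of the estimates. For a log-likelihood that is only approximately quadratic, the result depends on the chosen step, which plays the role of the range mentioned above.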
 Application
 EM can be used for maximum likelihood estimation of
many models, such as multivariate normal,
multivariate t,
mixed-effects,
general location,
factor analysis, and
mixture
models. For example,
the EM algorithm has been used in understanding and
modeling the relationships among questions/attributes at the company level in
Customer Value Analysis (CVA).
William S. Cleveland and Chuanhai Liu are working on a generalized version
of the time series model that has been used as a component in modeling CVA.
We implemented the EM algorithm for maximum likelihood estimation
of this class of time series models.
 EM also serves as a supplementary tool for
Markov chain Monte Carlo (MCMC)
methods for
Bayesian computation.
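As a concrete instance of the model classes listed under Application, here is a minimal sketch of EM for a two-component univariate normal mixture. The function names, starting values, fixed iteration count, and data are illustrative assumptions; it is not the CVA time series implementation mentioned above.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_two_component_mixture(x, n_iter=200):
    """Return (p, (m1, v1), (m2, v2), loglik_trace) after n_iter EM steps."""
    n = len(x)
    # Crude start: split the sorted data in half for the two means and
    # use the overall variance for both components.
    xs = sorted(x)
    half = n // 2
    m1 = sum(xs[:half]) / half
    m2 = sum(xs[half:]) / (n - half)
    mean_all = sum(x) / n
    v1 = v2 = sum((xi - mean_all) ** 2 for xi in x) / n
    p = 0.5
    loglik_trace = []
    for _ in range(n_iter):
        # E step: responsibility of component 1 for each observation,
        # plus the observed-data log-likelihood at the current parameters.
        r = []
        loglik = 0.0
        for xi in x:
            a = p * normal_pdf(xi, m1, v1)
            b = (1.0 - p) * normal_pdf(xi, m2, v2)
            r.append(a / (a + b))
            loglik += math.log(a + b)
        loglik_trace.append(loglik)
        # M step: weighted proportions, means, and variances.
        n1 = sum(r)
        n2 = n - n1
        p = n1 / n
        m1 = sum(ri * xi for ri, xi in zip(r, x)) / n1
        m2 = sum((1.0 - ri) * xi for ri, xi in zip(r, x)) / n2
        v1 = sum(ri * (xi - m1) ** 2 for ri, xi in zip(r, x)) / n1
        v2 = sum((1.0 - ri) * (xi - m2) ** 2 for ri, xi in zip(r, x)) / n2
    return p, (m1, v1), (m2, v2), loglik_trace


# Hypothetical data: two well-separated groups of six observations.
x = [-2.1, -1.8, -2.4, -1.9, -2.0, -2.3, 3.1, 2.8, 3.3, 2.9, 3.0, 3.4]
p, comp1, comp2, trace = em_two_component_mixture(x)
print(p, comp1, comp2)
```

The recorded log-likelihood should never decrease from one iteration to the next, which is the stability property referred to at the top of this page. A practical implementation would add a convergence check and guard against a component variance collapsing toward zero.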
The PostScript files of some
of our current papers are also available.