UWEE Tech Report Series

Graphical Models and Automatic Speech Recognition


UWEETR-2001-0005

Author(s):
Jeff A. Bilmes

Keywords:
graphical models, Bayesian networks, automatic speech recognition, hidden Markov models, pattern recognition, pattern classification, language processing

Abstract

Graphical models provide a promising paradigm for studying both existing and novel techniques for automatic speech recognition. This paper provides a general overview of graphical models and their uses as statistical models. It is shown that the statistical assumptions underlying many pattern recognition techniques commonly used for speech recognition can be described by a graph, including Gaussian distributions, mixture models, decision trees, factor analysis, principal component analysis, linear discriminant analysis, and hidden Markov models (HMMs). Moreover, this paper shows that many advanced models proposed for speech recognition and language processing can also be described by a graph. A number of speech recognition techniques born directly out of the graphical-models paradigm are also surveyed. In conducting this survey, it becomes apparent that the space of models describable by a graph is enormous. As will be seen, it seems quite probable that a thorough exploration of the space of models easily describable by a graph would ultimately yield a model that performs far better than the HMM. Given the overview presented in this paper, it will be easier to begin such an endeavor.
