UWEE Tech Report Series

Using Weakly Supervised Learning to Improve Prosody Labeling


UWEETR-2005-0003

Author(s):
D. Wong, M. Ostendorf and J. Kahn

Keywords:
weakly supervised learning, prosody, conversational speech, prosodic breaks, prominence, pitch accent, EM training, decision tree, co-training, self-training, bagging

Abstract

Automatic annotation of prosodic events could help improve speech understanding and synthesis. However, little annotated data is available for training prosody models because hand-labeling is prohibitively expensive. To address this issue, we explore weakly supervised learning techniques (EM, co-training, and self-training with bagging) that use only a small amount of hand-labeled data in combination with a large data set that has syntactic parses but no prosodic labels. In experiments on conversational speech, these techniques improve the performance of decision trees at labeling symbolic prosodic events, specifically break indices and pitch accents.
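
As an illustration of one of the techniques named above, the following is a minimal sketch of self-training with bagging, not the procedure used in the report. Everything specific in it is an assumption made for the example: the one-split decision stump standing in for a full decision tree, the vote-fraction confidence threshold, the number of rounds, the single numeric feature, and the "accent"/"no_accent" labels. The idea it shows is generic: train a bagged ensemble on the hand-labeled data, label the unlabeled data by majority vote, keep only the items the ensemble agrees on strongly, and retrain.

```python
import random
from collections import Counter

def train_stump(X, y):
    """Fit a one-split decision stump (stand-in for a decision tree)."""
    counts = Counter(y)
    majority = counts.most_common(1)[0][0]
    # Fallback: no split, predict the majority label everywhere.
    best = (len(y) - counts[majority], 0, float("inf"), majority, majority)
    for f in range(len(X[0])):
        values = sorted({x[f] for x in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between two observed values
            left = [lab for x, lab in zip(X, y) if x[f] <= t]
            right = [lab for x, lab in zip(X, y) if x[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab

def train_bagged(X, y, n_models, rng):
    """Bagging: train each model on a bootstrap resample of the data."""
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in X]
        models.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def vote(models, x):
    """Return the majority label and the fraction of models that chose it."""
    label, count = Counter(m(x) for m in models).most_common(1)[0]
    return label, count / len(models)

def self_train(X_lab, y_lab, X_unlab, n_models=11, threshold=0.9, rounds=5, seed=0):
    """Self-training: repeatedly add confidently auto-labeled items to the training set."""
    rng = random.Random(seed)
    X, y, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(rounds):
        models = train_bagged(X, y, n_models, rng)
        remaining, added = [], 0
        for x in pool:
            label, conf = vote(models, x)
            if conf >= threshold:      # ensemble agrees strongly: trust the label
                X.append(x)
                y.append(label)
                added += 1
            else:                      # keep for a later round
                remaining.append(x)
        pool = remaining
        if added == 0:
            break
    return train_bagged(X, y, n_models, rng)

# Toy usage with a single hypothetical feature per word.
X_lab = [[0.0], [1.0], [2.0], [8.0], [9.0], [10.0]]
y_lab = ["no_accent"] * 3 + ["accent"] * 3
X_unlab = [[0.5], [1.5], [3.0], [7.0], [8.5], [9.5]]
models = self_train(X_lab, y_lab, X_unlab)
print(vote(models, [0.2]), vote(models, [9.8]))
```

Using the ensemble's vote fraction as the confidence score is one common way to pair bagging with self-training: disagreement among the bootstrap models flags items whose automatic labels are too risky to add.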
