UWEETR-2005-0003

Abstract

Automatic annotation of prosodic events could help improve speech understanding and synthesis. However, little annotated data is available for training prosody models because hand-labeling is prohibitively expensive. To address this issue, we explore weakly supervised learning techniques (EM, co-training, and self-training with bagging) that use only a small amount of hand-labeled data in combination with a large unlabeled data set with syntactic parses. Experiments on conversational speech show improved performance of decision trees on labeling symbolic prosodic events, specifically break indices and pitch accents.