UWEE Tech Report Series

Semi-Supervised Learning with Measure Propagation


UWEETR-2010-0004

Author(s):
Amar Subramanya, Jeff Bilmes

Keywords:
semi-supervised learning, convexity, convex optimization, alternating minimization, large scale machine learning, SSL

Abstract

We propose a new objective for graph-based semi-supervised learning based on minimizing the Kullback-Leibler divergence between discrete probability measures that encode class membership probabilities. We show how the proposed objective can be eciently optimized using alternating minimization. We prove that the alternating minimization procedure converges to the correct optimum and derive a simple test for convergence. In addition, we show how this approach can be scaled to solve the semi-supervised learning problem on very large data sets, e.g., in one instance we use a data set with over 108 samples. In this context, we propose a graph node ordering algorithm that is also applicable to other graph-based SSL approaches. We compare the proposed approach against other standard SSL algorithms on the SSL benchmark data sets [17], and other real-world tasks such as text classi cation on Reuters and WebKB, speech phone classi cation on TIMIT and Switchboard, and linguistic dialog-act tagging on Dihana and Switchboard. In each case, the proposed approach outperforms the state-of-the-art. Lastly, we show that our objective can be generalized into a form that includes the standard squared-error loss, and we prove a geometric rate of convergence in that case

Download the PDF version

Download the Gzipped Postscript version