We address the problem of person identification in TV series. We propose a unified learning framework for multi-class classification that incorporates labeled data, unlabeled data, and constraints between pairs of features during training. We apply the framework to train multinomial logistic regression classifiers for multi-class face recognition. The method is completely automatic, since the labeled data is obtained by tagging speaking faces using subtitles and fan transcripts of the videos. We demonstrate our approach on six episodes each of two diverse TV series and achieve state-of-the-art performance.
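To make the idea of a single objective over labeled data, unlabeled data, and pairwise constraints more concrete, here is a minimal NumPy sketch. It is not the formulation used in the paper: the entropy term on unlabeled samples, the "these two samples are different people" form of the constraint, the restriction of constraints to unlabeled samples, and the names `combined_loss`, `lam_unl`, and `lam_con` are all assumptions made for illustration.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def combined_loss(W, X_lab, y_lab, X_unl, pairs, lam_unl=0.1, lam_con=0.1):
    """Illustrative loss for a multinomial logistic regression classifier.

    W      : (d, K) weight matrix, one column per identity
    X_lab  : (n_l, d) labeled features, y_lab: (n_l,) integer identities
    X_unl  : (n_u, d) unlabeled features
    pairs  : list of (i, j) indices into X_unl that are ASSUMED to show
             different people (a hypothetical form of pairwise constraint)
    """
    eps = 1e-12  # avoids log(0)

    # Labeled term: average negative log-likelihood of the true identity.
    P_lab = softmax(X_lab @ W)
    labeled = -np.mean(np.log(P_lab[np.arange(len(y_lab)), y_lab] + eps))

    # Unlabeled term (assumption): average prediction entropy, so that
    # minimizing it favors confident predictions on unlabeled samples.
    P_unl = softmax(X_unl @ W)
    entropy = -np.mean(np.sum(P_unl * np.log(P_unl + eps), axis=1))

    # Constraint term (assumption): penalize the probability that both
    # samples of a constrained pair receive the same identity.
    if len(pairs):
        i, j = np.asarray(pairs).T
        same = np.sum(P_unl[i] * P_unl[j], axis=1)
        constraint = -np.mean(np.log(1.0 - same + eps))
    else:
        constraint = 0.0

    return labeled + lam_unl * entropy + lam_con * constraint
```

The three terms share the same weight matrix `W`, so a single gradient-based optimizer could in principle minimize the sum; the weights `lam_unl` and `lam_con` would need to be tuned on held-out data.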
The following package contains the face tracks, features, speaker assignments and video events as used in this paper. This package does not contain person tracks. You can obtain person tracks for BBT1-6 from here.
CVPR2013_PersonID_data_v1.1.tar.bz2 (1.1 GB)