
Image and Video Analysis


  • Person Identification in Multimedia Data
  • Attribute based Object and Person Recognition
  • Text-to-Video Alignment for Story Understanding
  • Knowledge Transfer and Domain Adaptation
  • Thermal to Visible Image Matching
Selected Publications
Author Title Source

Z. Al-Halah, R. Stiefelhagen

IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Beach, HI, USA, January, 2015

M. Tapaswi, M. Bäuml, R. Stiefelhagen

In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), Boston, MA, USA

M. S. Sarfraz, R. Stiefelhagen

In Proceedings of the British Machine Vision Conference (BMVC) 2015, Swansea, UK, September 2015 (Best Industry Paper Award)

Multimedia Analysis

Videos engage multiple human senses and are a very powerful medium for conveying stories. However, automatically understanding the story they convey and making them accessible for search and summarization is a very challenging task. We focus on structured videos produced primarily for storytelling, namely TV series and films. In this domain, we have developed considerable expertise in generating metadata, e.g., through shot and scene boundary detection and face tracking and identification.

We also work on understanding the story underlying the videos from multiple angles. We propose to analyze as yet unexplored sources of natural language text to better understand videos. We focus on two forms: (i) plot synopses, short descriptions of the story conveyed in the video, often collected through crowdsourcing; and (ii) novels, from which films and TV series are increasingly being adapted. Visualization often plays an important role in understanding; toward this goal, we also propose an automatic method that displays character interactions and presents the story of the video at a glance.

Knowledge Transfer and Semantic Understanding

Knowledge transfer is the ability to apply experience and skills acquired previously through training to a new task or domain. It is an important characteristic of human learning. We do not learn tasks in isolation; rather, we draw on the experience we gather throughout our lives to facilitate learning a new task. The ability to transfer gives us a high initial performance and lets us learn faster on a new task from only a few trials (e.g., one- and zero-shot learning). In our work we tackle questions such as: what, how, and when to transfer?

Furthermore, we leverage mid-level semantics, such as visual attributes, for transfer learning in the domains of object, action, and person recognition.
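
As a rough illustration of how visual attributes can support zero-shot recognition, the following minimal sketch assigns an image to a class that was never seen during training by comparing predicted attribute scores with per-class attribute signatures. It is a generic textbook-style example and not the method used in the publications above. The class names, the attribute list, the signatures, and the function names are all hypothetical, and the attribute scores are assumed to come from attribute classifiers trained on other (seen) classes.

```python
from math import sqrt

# Hypothetical class-attribute signatures (1.0 = class has the attribute).
# Attribute order: [striped, four_legged, has_wings]
CLASS_ATTRIBUTES = {
    "zebra": [1.0, 1.0, 0.0],
    "horse": [0.0, 1.0, 0.0],
    "eagle": [0.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def zero_shot_classify(attribute_scores, class_attributes=CLASS_ATTRIBUTES):
    """Return the class whose attribute signature best matches the scores.

    attribute_scores: per-attribute confidences for one image, assumed to
    be produced by attribute classifiers trained on seen classes only.
    """
    return max(
        class_attributes,
        key=lambda name: cosine(attribute_scores, class_attributes[name]),
    )


# An image scored as strongly striped and four-legged, but not winged.
predicted = zero_shot_classify([0.9, 0.8, 0.1])
print(predicted)
```

The knowledge being transferred here is the attribute vocabulary: a new class only needs an attribute signature, not labeled training images.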