Description
We address the task of segmenting presentation slides captured as live photos during lectures.
Slides are important document types used as visual components accompanying presentations in a variety of fields ranging from education to business.
However, the automatic analysis of presentation slides has not been researched sufficiently; so far, only preprocessed images of already digitized slide documents have been considered.
We introduce the task of analyzing unconstrained photos of slides taken during lectures and present WiSe, a novel dataset for Page Segmentation with slides captured in the Wild. Our dataset provides pixel-wise annotations of 25 classes on 1,300 pages and allows overlapping regions (i.e., multi-class assignments).
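Because regions may overlap, a single-label class map cannot represent the annotations. A minimal sketch of one common representation, a per-class binary mask stack in which a pixel may belong to several classes at once (the class indices and helper below are illustrative, not the official WiSe format):

```python
import numpy as np

NUM_CLASSES = 25  # number of region classes in WiSe

def masks_from_regions(regions, height, width):
    """Rasterize (class_id, binary_mask) pairs into a multi-label stack
    of shape (NUM_CLASSES, H, W); overlapping regions simply set
    multiple class channels for the same pixel."""
    stack = np.zeros((NUM_CLASSES, height, width), dtype=np.uint8)
    for class_id, mask in regions:
        stack[class_id] |= mask.astype(np.uint8)
    return stack

# Hypothetical example: a region of class 0 overlapping a region of class 3.
h, w = 4, 6
region_a = np.zeros((h, w), np.uint8); region_a[0:2, :] = 1
region_b = np.zeros((h, w), np.uint8); region_b[1:4, 2:6] = 1
stack = masks_from_regions([(0, region_a), (3, region_b)], h, w)
# Pixel (1, 3) lies in both regions, i.e. it has a multi-class assignment.
print(stack[:, 1, 3].nonzero()[0])  # → [0 3]
```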
To evaluate the performance, we define multiple benchmark metrics and baseline methods for our dataset.
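One plausible metric for such multi-label segmentation is a per-class intersection-over-union, scoring each class channel as an independent binary mask and averaging over classes present in the prediction or ground truth. This is only a sketch of that kind of metric; the exact benchmark definitions used for WiSe may differ:

```python
import numpy as np

def mean_iou(pred, gt, eps=1e-9):
    """Mean per-class IoU for multi-label masks.

    pred, gt: binary arrays of shape (num_classes, H, W). Classes that
    appear in neither prediction nor ground truth are skipped.
    """
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = (pred & gt).sum(axis=(1, 2))
    union = (pred | gt).sum(axis=(1, 2))
    present = union > 0  # ignore classes absent from both masks
    return float((inter[present] / (union[present] + eps)).mean())

# Hypothetical usage: a perfect prediction scores (close to) 1.0.
gt = np.zeros((25, 4, 4), np.uint8)
gt[0, 1:3, 1:3] = 1
print(round(mean_iou(gt, gt), 6))  # → 1.0
```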
We further implement two different deep neural network approaches previously used for segmenting natural images and adapt them to our task.
Our evaluation results demonstrate the effectiveness of the deep learning-based methods, surpassing the baseline methods by over 30%.
To foster further research of slide analysis in unconstrained photos, we make the WiSe dataset publicly available to the community.
If you use this dataset in your research, please cite:
Paper
WiSe - Slide Segmentation in the Wild
Monica Haurilet, Alina Roitberg, Manuel Martinez, Rainer Stiefelhagen
International Conference on Document Analysis and Recognition (ICDAR), 2019
[paper]
@inproceedings{haurilet2019wise,
  author    = {Monica Haurilet and Alina Roitberg and Manuel Martinez and Rainer Stiefelhagen},
  title     = {{WiSe - Slide Segmentation in the Wild}},
  booktitle = {International Conference on Document Analysis and Recognition (ICDAR)},
  year      = {2019},
  month     = {Sep.},
}