The SPaSe Dataset for Slide Segmentation

Monica Haurilet, Ziad Al-Halah, Rainer Stiefelhagen



Highlights

  • The first benchmark dataset for slide-page segmentation
  • Annotations for 2000 complex slides from Slideshare-1M [1]
  • Fine-grained pixel-wise labels
  • 14 text, 6 image-based and 4 structural classes
  • Highly overlapping segments, i.e. multiple labels per pixel

Description

We introduce the first benchmark dataset for slide-page segmentation. Presentation slides are one of the most prominent document types used to exchange ideas across the web, educational institutes and businesses. This document format is marked by a complex layout that contains a rich variety of graphical (e.g. diagram, logo), textual (e.g. heading, affiliation) and structural components (e.g. enumeration, legend). This vast and popular knowledge source remains out of reach for modern machine learning techniques due to a lack of annotated data. To tackle this issue, we introduce SPaSe (Slide Page Segmentation), a novel dataset containing dense, pixel-wise annotations of 25 classes for a total of 2000 slides. We show that slide segmentation has some interesting properties that characterize this task. Unlike in the common image segmentation problem, regions of different classes tend to overlap heavily, which makes this segmentation task a multi-label problem. Furthermore, many of the classes frequently encountered in slides are location sensitive (e.g. title, footnote). Hence, we believe our dataset represents a challenging and interesting benchmark for novel segmentation models. Finally, we evaluate state-of-the-art segmentation networks on our dataset and show that it is suitable for developing deep learning models without any need for pre-training. The annotations will be released to the public to foster further research on this interesting task.
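To make the multi-label aspect concrete, here is a minimal sketch of one way overlapping pixel-wise annotations could be represented and scored. It is not the dataset's actual file format or the paper's evaluation code: the helper names (`make_mask`, `labels_at`, `iou`), the toy 4 x 6 layout, and the choice of one binary mask per class are all assumptions made for illustration. The class names are borrowed from the examples in the description above.

```python
from typing import Dict, List, Set

Mask = List[List[bool]]  # binary H x W mask for a single class


def make_mask(height: int, width: int, top: int, left: int, bottom: int, right: int) -> Mask:
    """Return an H x W mask that is True inside the half-open box [top, bottom) x [left, right)."""
    return [[top <= y < bottom and left <= x < right for x in range(width)]
            for y in range(height)]


def labels_at(masks: Dict[str, Mask], y: int, x: int) -> Set[str]:
    """Collect every class whose mask covers pixel (y, x); a pixel may carry several labels."""
    return {name for name, mask in masks.items() if mask[y][x]}


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks, computed for one class in isolation."""
    inter = sum(p and t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    union = sum(p or t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    # Convention chosen here (an assumption): two empty masks count as a perfect match.
    return inter / union if union else 1.0


# Toy 4 x 6 "slide" (invented layout): a diagram that contains a legend.
H, W = 4, 6
masks = {
    "title": make_mask(H, W, 0, 0, 1, 6),
    "diagram": make_mask(H, W, 1, 0, 4, 6),
    "legend": make_mask(H, W, 1, 1, 4, 6),
}

print(labels_at(masks, 0, 0))          # a pixel with a single label
print(sorted(labels_at(masks, 2, 3)))  # a pixel covered by two classes at once

# Score a hypothetical prediction for the "legend" class against its annotation.
predicted_legend = make_mask(H, W, 1, 2, 4, 6)
print(iou(predicted_legend, masks["legend"]))
```

Keeping one mask per class, rather than a single label map with one class index per pixel, is what allows the legend pixels to also count as diagram pixels. In a learned model the analogous choice would be independent per-class outputs instead of a single mutually exclusive choice per pixel.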

If you use this dataset in your research, please cite:

Paper

SPaSe - Multi-Label Page Segmentation for Presentation Slides
Monica Haurilet, Ziad Al-Halah, Rainer Stiefelhagen
Winter Conference on Applications of Computer Vision (WACV)
[paper] [supp.]

@inproceedings{haurilet2019spase,
author = {Monica Haurilet and Ziad Al-Halah and Rainer Stiefelhagen},
title = {{SPaSe - Multi-Label Page Segmentation for Presentation Slides}},
year = {2019},
booktitle = {Winter Conference on Applications of Computer Vision},
month = {Jan.},
}

Download


Licensing


Contact

If you have any questions regarding this dataset, please contact us at: haurilet (at) kit.edu

References