The Google-512 data set was collected to learn robust color term models for human-computer and, especially, human-robot interaction (HRI). To this end, 512 images were collected for each of the 11 basic English color terms (“black,” “white,” “red,” “green,” “yellow,” “blue,” “brown,” “orange,” “pink,” “purple,” and “gray”) using Google's image search engine.
Since we wanted to learn color models that are suitable for HRI tasks (see the publications below), we had to deal with the fact that images collected from the Internet are often synthetic, artificial, or highly processed. As a consequence, they have characteristics that lead to color models that are not well suited for real-world data as provided by "common" cameras. We therefore developed a randomization-based domain adaptation strategy, which uses a probabilistic hue-saturation-lightness (pHSL) model to randomize the input images in order to obtain a more natural color distribution. Later, we proposed the use of supervised latent Dirichlet allocation instead of probabilistic latent semantic analysis (pLSA). Furthermore, we use state-of-the-art salient object detection to improve the quality of the learned models.
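To illustrate the general idea of randomizing images in hue-saturation-lightness space, here is a minimal sketch. Note that this is only an illustration under assumptions: the actual pHSL model described in the papers is probabilistic and learned per color term, whereas this sketch simply adds Gaussian perturbations to the hue, saturation, and lightness of a pixel. The function name and all parameters are hypothetical.

```python
import colorsys
import random

def randomize_hsl(rgb, hue_sigma=0.02, sat_sigma=0.05, light_sigma=0.05, rng=None):
    """Perturb one RGB pixel (components in [0, 1]) in HSL space.

    Illustrative sketch only -- NOT the paper's pHSL model. Hue is
    perturbed with wrap-around (it is an angle), while saturation and
    lightness are perturbed and clamped to [0, 1].
    """
    rng = rng or random.Random()
    h, l, s = colorsys.rgb_to_hls(*rgb)              # note HLS order in colorsys
    h = (h + rng.gauss(0.0, hue_sigma)) % 1.0        # hue wraps around the circle
    l = min(1.0, max(0.0, l + rng.gauss(0.0, light_sigma)))
    s = min(1.0, max(0.0, s + rng.gauss(0.0, sat_sigma)))
    return colorsys.hls_to_rgb(h, l, s)
```

Applied independently to every pixel of a training image (or, more efficiently, vectorized over the whole image), such a perturbation broadens the color distribution seen during training, which is the intuition behind the randomization-based domain adaptation described above.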
In addition to the HRI scenario, we also apply the color term models to assist visually impaired and blind people.
To give a short impression of the data set, the 10th, 100th, 200th, and 300th image for each color term ("black," "grey," "white," "brown," "green," "blue," "yellow," "orange," "red," "purple," and "pink") are shown below:
 B. Schauerte, G. A. Fink, "Web-based Learning of Naturalized Color Models for Human-Machine Interaction".
In Proceedings of the 12th International Conference on Digital Image Computing: Techniques and Applications (DICTA), IEEE, Sydney, Australia, December 1-3, 2010.
 B. Schauerte, G. A. Fink, "Focusing Computational Visual Attention in Multi-Modal Human-Robot Interaction".
In Proceedings of the 12th International Conference on Multimodal Interfaces (ICMI), ACM, Beijing, China, November 8-12, 2010.
 B. Schauerte, R. Stiefelhagen, "Learning Robust Color Name Models from Web Images".
In Proceedings of the 21st International Conference on Pattern Recognition (ICPR), Tsukuba, Japan, November 11-15, 2012.
 B. Schauerte*, M. Martinez*, A. Constantinescu*, R. Stiefelhagen, "An Assistive Vision System for the Blind that Helps Find Lost Things".
In Proceedings of the 13th International Conference on Computers Helping People with Special Needs (ICCHP), Linz, Austria, July 11-13, 2012. (* equal contribution)
You can directly access the images of the data set with your browser (follow the [direct] links), or you can download the images of each color term compressed in a .zip file. The number in each image file name ([1-512].$FILE_ENDING) represents the ranking by Google's image search engine. Note, however, that this number does not correspond exactly to Google's rank, because a few images could not be accessed due to strict content protection methods or had already been removed from the Web (i.e., dead links).
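The ranked naming scheme above can be enumerated programmatically, for example when iterating over a downloaded and unpacked .zip file. The following sketch assumes one directory per color term containing files named by rank; the fixed "jpg" extension is a placeholder assumption, since the actual extension ($FILE_ENDING) varies per image.

```python
# Hypothetical helper for enumerating the expected ranked file names of one
# color term. The extension is an assumption ($FILE_ENDING varies per image).
COLOR_TERMS = ["black", "white", "red", "green", "yellow", "blue",
               "brown", "orange", "pink", "purple", "gray"]

def image_names(color, count=512, ext="jpg"):
    """Return the ranked file names '1.<ext>' .. '<count>.<ext>' for a color term."""
    if color not in COLOR_TERMS:
        raise ValueError(f"unknown color term: {color}")
    return [f"{rank}.{ext}" for rank in range(1, count + 1)]
```

Keep in mind that, as noted above, a file's number reflects the download order rather than Google's exact rank, and individual files may be missing where the original image was a dead link.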
The main Google-512 data set:
The Google-512 image border masks that we used for training (see the publications above):
The Google-512 region-contrast saliency maps that we used for training (see the publications above):