Predicting Foreground Object Ambiguity and Efficiently Crowdsourcing the Segmentation(s)

Danna Gurari, Kun He, Bo Xiong, Jianming Zhang, Mehrnoosh Sameki, Suyog Dutt Jain, Stan Sclaroff, Margrit Betke, and Kristen Grauman

[ PDF: arXiv Version ]

Abstract

We propose the ambiguity problem for the foreground object segmentation task and motivate the importance of estimating and accounting for this ambiguity when designing vision systems. Specifically, we distinguish between images that lead multiple annotators to segment different foreground objects (ambiguous) and images that yield only minor inter-annotator differences on the same object (not ambiguous). Taking images from eight widely used datasets, we crowdsource labels indicating whether each image is “ambiguous” or “not ambiguous” to segment, and use them to construct a new dataset we call STATIC. Using STATIC, we develop a system that automatically predicts which images are ambiguous. Experiments demonstrate the advantage of our prediction system over existing saliency-based methods on images from vision benchmarks and on images taken by blind people who are trying to recognize objects in their environment. Finally, we introduce a crowdsourcing system that achieves cost savings when collecting the diversity of all valid “ground truth” foreground object segmentations by collecting extra segmentations only when ambiguity is expected. Experiments show our system eliminates up to 47% of human effort compared to existing crowdsourcing methods, with no loss in capturing the diversity of ground truths.
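The cost savings come from allocating annotation effort adaptively: one segmentation suffices when an image is predicted to be unambiguous, while several are collected when ambiguity is expected. The Python sketch below illustrates that allocation step only; the 0.5 threshold and the count of five redundant segmentations are placeholder assumptions rather than the values used in the paper, and predict_ambiguity stands in for any classifier (such as the CNN-FT models released below) that scores an image's ambiguity.

    # Illustrative sketch of the adaptive allocation idea; the threshold and the
    # redundant segmentation count are assumptions, not the paper's settings.
    def plan_annotation_budget(images, predict_ambiguity, threshold=0.5, redundant_count=5):
        """Return the number of segmentations to request for each image."""
        plan = {}
        for image in images:
            score = predict_ambiguity(image)  # estimated probability the image is ambiguous
            # Ambiguous images get several annotators so every valid foreground
            # object is captured; unambiguous images need only one annotator.
            plan[image] = redundant_count if score >= threshold else 1
        return plan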

Publication

D. Gurari, K. He, B. Xiong, J. Zhang, M. Sameki, S. D. Jain, S. Sclaroff, M. Betke, and K. Grauman. “Predicting Foreground Object Ambiguity and Efficiently Crowdsourcing the Segmentation(s).” International Journal of Computer Vision (IJCV), accepted January 2018.

STATIC Built from 7 Vision Datasets

Training and Testing Data

Caffe model: top-performing CNN-FT model and prototxt file

STATIC Built from VizWiz

Training and Testing Data

Caffe model: top-performing CNN-FT model and prototxt file
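For reference, either released CNN-FT model can be run with pycaffe along the lines of the sketch below. The file names, mean values, input image, and output blob name are assumptions for illustration; substitute the prototxt and caffemodel files from the downloads above and the preprocessing used during training.

    # Minimal pycaffe sketch for scoring one image with a released CNN-FT model.
    # Paths, mean values, and the 'prob' output blob name are assumptions.
    import numpy as np
    import caffe

    caffe.set_mode_cpu()
    net = caffe.Net('deploy.prototxt', 'cnn_ft.caffemodel', caffe.TEST)  # hypothetical paths

    # Standard ImageNet-style preprocessing (assumed): CHW layout, BGR mean subtraction.
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2, 0, 1))               # HWC -> CHW
    transformer.set_mean('data', np.array([104, 117, 123]))    # assumed BGR mean
    transformer.set_raw_scale('data', 255)                     # [0,1] floats -> [0,255]
    transformer.set_channel_swap('data', (2, 1, 0))            # RGB -> BGR

    image = caffe.io.load_image('example.jpg')                 # hypothetical input image
    net.blobs['data'].data[...] = transformer.preprocess('data', image)
    probs = net.forward()['prob'][0]                           # assumed output blob name
    print('P(ambiguous) = %.3f' % probs[1])                    # assumed class ordering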

Acknowledgements

The authors gratefully acknowledge funding from the Office of Naval Research (ONR YIP N00014-12-1-0754) and National Science Foundation (IIS-1421943) and thank the anonymous crowd workers for participating in our experiments.

Contact

For questions or comments, feel free to contact:


Danna Gurari
danna.gurari@ischool.utexas.edu