ABSTRACT

Most existing co-segmentation algorithms rely mainly on low-level features and therefore fail on complex scenes. In this paper, we propose a novel weakly supervised method that learns high-level features across the images in a group for accurate common object segmentation. Specifically, we introduce a top-down neural attention model to generate class-specific maps as a prior hint about possible foreground locations. Then, we apply a Fully Connected Conditional Random Field (FC-CRF) model for accurate boundary recovery. Our method requires only image-level classification tags as supervision, so it significantly reduces the cost of producing training data. Comparison experiments on the public datasets iCoseg and MSRC demonstrate the superior performance of the proposed method.