Lei Yang¹, Ziwei Yan², Yisheng He³, Wei Sun¹, Zhenhang Huang⁴, Haikun Zhang⁴, Haibin Huang⁵, Haoqiang Fan¹
¹ Megvii Research Beijing, Megvii Technology Ltd., Beijing, China
² School of Software, Beihang University, Beijing, China
³ Hong Kong University of Science and Technology, Hong Kong, China
⁴ Beijing University of Chemical Technology, Beijing, China
⁵ Kuaishou Technology, Beijing, China
A typical scene of objects with irregular shape and similar appearance. It has many characteristics that challenge instance segmentation algorithms, including the large overlaps between bounding boxes of objects, extreme aspect ratios (bounding box of the grey mask), and large numbers of connected components in one instance (green and blue masks).
In this paper, we introduce a new dataset to promote the study of instance segmentation for objects with irregular shapes. Our key observation is that although irregularly shaped objects are common in daily life and industrial scenarios, they have received little attention in the instance segmentation field due to the lack of corresponding datasets. To fill this gap, we propose iShape, an irregular shape dataset for instance segmentation. iShape contains six sub-datasets, one real and five synthetic, each representing a scene of a typical irregular shape. Unlike most existing instance segmentation datasets of regular objects, iShape has many characteristics that challenge existing instance segmentation algorithms, such as large overlaps between bounding boxes of instances, extreme aspect ratios, and large numbers of connected components per instance. We benchmark popular instance segmentation methods on iShape and find that their performance drops dramatically. Hence, we propose an affinity-based instance segmentation algorithm, called ASIS, as a stronger baseline. ASIS explicitly combines perception and reasoning to solve Arbitrary Shape Instance Segmentation, including irregular objects. Experimental results show that ASIS outperforms the state-of-the-art on iShape.
In this work, we present iShape, a new dataset designed for irregular Shape instance segmentation. Our dataset consists of six sub-datasets, namely iShape-Antenna, iShape-Branch, iShape-Fence, iShape-Log, iShape-Hanger, and iShape-Wire. As shown in the figure below, each sub-dataset represents scenes of a typical irregular shape, for example, strip shape, hollow shape, and mesh shape.
iShape download URL:
http://39.105.21.95:9000/ishape/ishape_dataset.tar
Or from our OSS URL:
https://ylshare.oss-cn-shanghai.aliyuncs.com/ishape_dataset.tar
$ md5sum ishape_dataset.tar
# 2b3bd15e6ec762bbc03dddc5e4bc24df
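The same integrity check can be done without the `md5sum` command line tool. The following is a minimal Python sketch using only the standard library; the helper names `file_md5` and `verify_archive` are hypothetical and not part of the iShape code base, and the default file name assumes the archive was saved under its original name.

```python
import hashlib

# Checksum published above for ishape_dataset.tar.
EXPECTED_MD5 = "2b3bd15e6ec762bbc03dddc5e4bc24df"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path="ishape_dataset.tar"):
    """Return True if the file at `path` matches the published checksum."""
    return file_md5(path) == EXPECTED_MD5
```

Calling `verify_archive()` in the download directory returns `False` if the archive is truncated or corrupted.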
- Dataset format: iShape provides both Cityscapes and COCO style instance segmentation annotations.
  - The Cityscapes style annotations are the `*.png` files under the directory `instance_map`. Similar to `*_instanceIds.png` in the Cityscapes dataset, these PNG files are Height * Width * 16-bit. A pixel value of `x` means that the pixel belongs to the instance whose ID is `x`.
- Source code about the dataset:
  - `build_synthetic_ishape`: Source code for building the iShape synthetic data.
  - `bpycv`: Computer vision utils for the open-source CG software Blender.
- Dataset license: Public domain (CC0)
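As an illustration of the `instance_map` format described above, the sketch below splits an instance-ID map into per-instance boolean masks and measures the bounding-box aspect ratio of one mask. It operates on a NumPy array so that it does not depend on any particular image reader. The function names are hypothetical, and treating ID `0` as background is an assumption that is not confirmed by the text above.

```python
import numpy as np


def instance_masks(instance_map, background_id=0):
    """Split an H x W integer instance map into {instance_id: boolean mask}.

    background_id=0 is an assumption; adjust it if the dataset uses another value.
    """
    ids = np.unique(instance_map)
    return {int(i): instance_map == i for i in ids if i != background_id}


def bbox_aspect_ratio(mask):
    """Return the long side / short side ratio of the tight bounding box of a mask."""
    ys, xs = np.nonzero(mask)
    height = int(ys.max() - ys.min() + 1)
    width = int(xs.max() - xs.min() + 1)
    return max(height, width) / min(height, width)
```

To use it on a real file, the PNG has to be loaded without bit-depth conversion, for example with OpenCV's `cv2.imread(path, cv2.IMREAD_UNCHANGED)`, which keeps the 16-bit values intact.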
In this paper, we introduce a stronger baseline that accounts for irregular shapes and explicitly combines perception and reasoning. Our key insight is to simulate how a person identifies an irregular object. Taking the wire shown in Figure 1 as an example, one natural way is to start from a local point and gradually expand by following the wire contour until the entire object is identified. This "following the contour" procedure is a process of continuous iterative reasoning based on local clues, which is similar to recent affinity-based approaches. Based on this observation, we propose a novel affinity-based instance segmentation baseline, called ASIS, which includes principles for generating effective and efficient affinity kernels based on dataset properties to solve Arbitrary Shape Instance Segmentation. Experimental results show that the proposed baseline outperforms existing state-of-the-art methods by a large margin on iShape.
Overview of ASIS. In the training stage, the network learns to predict the semantic segmentation as well as the affinity map. In the inference stage, the build-graph operation first transforms the predicted affinity map into a sparse undirected graph, with pixels as nodes and the affinities between pixels as edges. The graph merge algorithm is then applied to this graph and clusters the pixels to yield a class-agnostic instance segmentation. Finally, the class assign module adds a category with a confidence score to each instance using the result of the semantic segmentation.
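The three inference steps above (build graph, graph merge, class assign) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the ASIS implementation: the merge step here is a plain thresholded union-find, the affinity layout `(K, H, W)` with one channel per kernel offset is assumed, semantic label `0` is assumed to be background, and the confidence is taken to be the fraction of pixels voting for the winning class. The names `find` and `segment` are hypothetical.

```python
import numpy as np


def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def segment(affinity, semantic, offsets, threshold=0.5):
    """Cluster foreground pixels into instances from a predicted affinity map.

    affinity: (K, H, W) floats; affinity[k, y, x] scores whether pixel (y, x) and
              pixel (y + dy, x + dx) share an instance, with (dy, dx) = offsets[k].
    semantic: (H, W) ints; 0 is assumed to be background.
    Returns (instance_map, classes), where classes maps
    instance id -> (category, confidence).
    """
    h, w = semantic.shape
    parent = list(range(h * w))
    fg = semantic > 0

    # Build graph + merge: every sufficiently strong affinity is an edge to union.
    for k, (dy, dx) in enumerate(offsets):
        for y in range(h):
            for x in range(w):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w):
                    continue
                if fg[y, x] and fg[ny, nx] and affinity[k, y, x] > threshold:
                    ra = find(parent, y * w + x)
                    rb = find(parent, ny * w + nx)
                    if ra != rb:
                        parent[rb] = ra

    # Turn union-find roots into consecutive instance ids starting at 1.
    instance_map = np.zeros((h, w), dtype=np.int32)
    root_to_id = {}
    for y in range(h):
        for x in range(w):
            if fg[y, x]:
                root = find(parent, y * w + x)
                instance_map[y, x] = root_to_id.setdefault(root, len(root_to_id) + 1)

    # Class assign: majority vote over the semantic labels inside each instance.
    classes = {}
    for inst_id in root_to_id.values():
        labels = semantic[instance_map == inst_id]
        values, counts = np.unique(labels, return_counts=True)
        best = counts.argmax()
        classes[inst_id] = (int(values[best]), float(counts[best] / labels.size))
    return instance_map, classes
```

Because edges follow the kernel offsets rather than pixel adjacency, a long-range offset such as `(0, 2)` can join two pixels that are separated by background, which is how an affinity-based method can recover an instance made of several disconnected components.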
We also benchmark existing instance segmentation algorithms on iShape and find their performance degrades significantly.
Quantitative results on iShape. We report the mask mmAP on each of the six sub-datasets and the average mmAP. For a fair comparison, all methods use ResNet-50 as the backbone. "w/o" denotes "without".
Method | Antenna | Branch | Fence | Hanger | Log | Wire | Average | Config | Download | Code |
---|---|---|---|---|---|---|---|---|---|---|
SOLOv2 | 6.6 | 27.5 | 0.0 | 28.8 | 22.2 | 0.0 | 14.07 | config | model / log | Link |
PolarMask | 0.0 | 0.0 | 0.0 | 0.0 | 18.6 | 0.0 | 3.10 | config | model / log | Link |
SpatialEmbeddings | 38.3 | 0.0 | 0.0 | 49.8 | 20.9 | 0.0 | 18.17 | config | model / log | Link |
Mask RCNN | 16.9 | 4.2 | 0.0 | 22.1 | 32.6 | 0.0 | 12.63 | config | model / log | Link |
DETR | 2.1 | 2.6 | 0.0 | 32.2 | 46.2 | 0.0 | 13.85 | config | model / log | Link |
GMIS* | 67.6 | 14.9 | 30.6 | 24.8 | 63.2 | 46.1 | 41.21 | config | model / log | Link |
ASIS without OHEM | 82.1 | 17.6 | 48.0 | 40.5 | 66.4 | 66.5 | 53.51 | config | model / log | Link |
ASIS (ours) | 88.5 | 24.6 | 60.4 | 57.4 | 69.4 | 77.3 | 62.93 | config | model / log | Link |
More qualitative results are available here.
If you have any questions about iShape, feel free to submit an issue here: issues.