【Panoptic】 Domain adaptive panoptic segmentation

1. Panoptic segmentation, CVPR18

Paper
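PQ metric
1. Compare GT segments with predicted segments and find the matches whose IoU exceeds 0.5. (Each GT segment is matched with either one predicted segment or none. Matching two is mathematically impossible, because the segments do not overlap.)
2. PQ is computed separately for each class. In the example image, consider only person: matching is performed on the segments whose GT or Pred class is person.
3. The matched segments are divided into TP, FN, and FP, which can be interpreted as follows. (1) TP: segments that are matched correctly. (2) FN: as with the brown region, the GT class is person, but the class or instance id of the matched Pred segment is wrong. (3) FP: the Pred class is person, but the matched GT segment is not even person.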
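The matching steps above can be sketched in a few lines of NumPy. This is a minimal illustration for a single class, assuming the standard definition PQ = (sum of IoU over TP) / (|TP| + 0.5|FP| + 0.5|FN|), where unmatched GT segments count as FN and unmatched predicted segments count as FP. The function names `iou` and `pq_for_class` are hypothetical, and the masks are assumed to be non-overlapping boolean arrays of one class.

```python
import numpy as np

def iou(a, b):
    """IoU between two boolean masks of the same shape."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union > 0 else 0.0

def pq_for_class(gt_masks, pred_masks, iou_thresh=0.5):
    """PQ for one class, given lists of boolean masks of that class."""
    matched_pred = set()
    tp_ious = []
    for g in gt_masks:
        for j, p in enumerate(pred_masks):
            if j in matched_pred:
                continue
            v = iou(g, p)
            # With non-overlapping segments, IoU > 0.5 makes the match unique.
            if v > iou_thresh:
                tp_ious.append(v)
                matched_pred.add(j)
                break
    tp = len(tp_ious)
    fn = len(gt_masks) - tp    # GT segments left without a match
    fp = len(pred_masks) - tp  # predicted segments left without a match
    denom = tp + 0.5 * fp + 0.5 * fn
    return sum(tp_ious) / denom if denom > 0 else 0.0
```

The overall PQ is then the average of the per-class values over all classes.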
Post-processing: To obtain outputs for Panoptic segmentation
- Instance segmentation
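  1. Non-overlapping predictions are obtained with a procedure similar to non-maximum suppression (NMS).
  2. Thresholding (removing instances with low scores).
  3. Iterating over the sorted instances, starting from the most confident.
  4. For each instance, removing the pixels already assigned to earlier instances and keeping it only if a sufficient fraction of the segment remains.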
- Panoptic segmentation: Get instance & semantic segmentation output.
  1. Instance segmentation → Non-overlapping predictions.
  2. The two outputs are combined, resolving any overlap in favor of the thing class.
  3. Removing any stuff labeled ‘other’ or under a given area threshold.
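The post-processing steps above can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's reference implementation: the function name `merge_panoptic`, the parameter names, and the default thresholds are all hypothetical, and the stuff class 'other' is assumed to be simply left out of `stuff_ids`.

```python
import numpy as np

def merge_panoptic(instances, semantic, stuff_ids, score_thresh=0.5,
                   keep_fraction=0.5, stuff_area_min=4, void=255):
    """
    instances: list of (score, class_id, boolean HxW mask)
    semantic:  integer HxW array of class ids from the semantic branch
    Returns (class_map, instance_map); unassigned pixels get `void` / 0.
    """
    class_map = np.full(semantic.shape, void, dtype=np.int64)
    inst_map = np.zeros(semantic.shape, dtype=np.int64)
    occupied = np.zeros(semantic.shape, dtype=bool)
    next_id = 1

    # Drop low-score instances, then visit the rest from most to least confident.
    kept = [x for x in instances if x[0] >= score_thresh]
    for score, cls, mask in sorted(kept, key=lambda x: x[0], reverse=True):
        area = mask.sum()
        if area == 0:
            continue
        free = mask & ~occupied  # pixels not claimed by earlier instances
        if free.sum() / area < keep_fraction:
            continue             # too little of the segment remains
        class_map[free] = cls
        inst_map[free] = next_id
        occupied |= free
        next_id += 1

    # Fill the remaining pixels with stuff; things win every overlap.
    for cls in stuff_ids:
        region = (semantic == cls) & ~occupied
        if region.sum() >= stuff_area_min:
            class_map[region] = cls
    return class_map, inst_map
```

Writing instances first and stuff only into unoccupied pixels is one simple way to realize the "in favor of the thing class" rule.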

Paper
Current panoptic segmentation methods use separate and dissimilar networks for instance and semantic segmentation.
The paper aims to unify them and design a single, fast, accurate baseline network: a minimally extended version of Mask R-CNN with FPN.
Because the architecture is so straightforward, the paper spends much of its explanation on training settings such as loss balancing, learning rate, and data augmentation.
It follows the simple post-processing of the Panoptic segmentation paper above.
Performance is reported in terms of AP, mIoU, and PQ.

Paper
Problem: CVPR18 methods have two separate branches designed for semantic and instance segmentation.
Solution: the model uses a single network as a shared backbone.
New points
- Semantic segmentation head: based on deformable convolutions.
- Panoptic head: see Sec. 3.1.
- Unknown prediction: see Sec. 3.1. UPSNet can classify a pixel as the unknown class. In evaluation, any pixel belonging to unknown is ignored.

Paper
The first domain adaptive panoptic segmentation network
- Method 1: Inter-style regularization (consistency) (ISR)
- Method 2: Inter-task regularization (ITR)
- ISR is “consistency training” from semi-supervised learning (cf. CPS for semi-supervised segmentation, pseudo-labeling).
- ITR exploits the complementary predictions of semantic and instance segmentation.
  - In terms of pseudo-label quality, the semantic branch is better for stuff and the instance branch is better for things, so each branch is used to rectify the other's pseudo labels.
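The idea of rectifying pseudo labels across tasks can be illustrated with a small sketch. This is only one possible reading of the note above, not the paper's actual procedure: the function `rectify_pseudo_labels`, its arguments, and the rule "trust the instance branch for things, the semantic branch for stuff, and ignore the rest" are all assumptions made for illustration.

```python
import numpy as np

def rectify_pseudo_labels(sem_pl, inst_cls_pl, thing_ids, ignore=255):
    """
    sem_pl:      HxW class ids predicted by the semantic branch
    inst_cls_pl: HxW class ids painted from instance masks,
                 `ignore` where no instance was predicted
    thing_ids:   iterable of class ids that are things
    """
    sem_is_thing = np.isin(sem_pl, list(thing_ids))
    has_inst = inst_cls_pl != ignore
    out = np.full(sem_pl.shape, ignore, dtype=sem_pl.dtype)

    # Stuff: trust the semantic branch where no instance is present.
    stuff_ok = ~sem_is_thing & ~has_inst
    out[stuff_ok] = sem_pl[stuff_ok]

    # Things: trust the instance branch wherever it predicts an instance.
    out[has_inst] = inst_cls_pl[has_inst]

    # Pixels the semantic branch calls a thing but no instance covers
    # stay `ignore`, so they contribute no training signal.
    return out
```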
Experiments compare against semantic segmentation models, instance segmentation models, and panoptic segmentation models.
Implementation details
- Model: Panoptic segmentation model (two separate inferences & post-processing for merging)
- Multi-task self-training (MTST): The training process is too complex. For example, semantic training → instance training → semantic training → instance training → panoptic evaluation.
WARNING: Figures 1 and 2 do not show the whole training process, which can be confusing.