Urban scene segmentation

1. Cars Can’t Fly up in the Sky: Improving Urban-Scene Segmentation via Height-driven Attention Networks

  • Motivation
    1. The pixel-wise class distributions differ significantly across horizontally segmented sections of an image. Thus, height-wise contextual information should be emphasized during pixel-level classification.
    2. Most semantic segmentation networks do not exploit attributes unique to urban scenes, such as perspective geometry and positional patterns.
  • Method
    • The height-driven attention network (HANet) is an add-on module that is easy to attach and cost-effective.
    • As illustrated in Tab. 1 and Fig. 1 of the paper, the uncertainty (entropy) of the class distribution is reduced if we divide an image into several parts horizontally.
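
The entropy argument above can be checked on a toy example. The sketch below is not HANet itself; it only compares the class entropy of a whole label map with the entropy of each horizontal band. The function names and the toy label map are made up for illustration.

```python
import math
from collections import Counter

def class_entropy(labels):
    """Shannon entropy (in bits) of the class distribution of a flat label list."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def bandwise_entropy(label_map, num_bands):
    """Split a 2D label map into horizontal bands and return each band's entropy."""
    height = len(label_map)
    band_size = height // num_bands
    entropies = []
    for b in range(num_bands):
        start = b * band_size
        # The last band absorbs any leftover rows.
        end = height if b == num_bands - 1 else start + band_size
        pixels = [c for row in label_map[start:end] for c in row]
        entropies.append(class_entropy(pixels))
    return entropies

# Hypothetical 6x4 label map: sky on top, buildings/cars in the middle, road below.
toy = [
    ["sky"] * 4,
    ["sky"] * 4,
    ["building", "building", "car", "car"],
    ["building", "building", "car", "car"],
    ["road"] * 4,
    ["road"] * 4,
]
whole = class_entropy([c for row in toy for c in row])
bands = bandwise_entropy(toy, 3)
```

Here every band has a lower entropy than the whole image, which is the property that motivates conditioning on vertical position.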


2. Standardized Max Logits for Identifying Unexpected Road Obstacles

  • Motivation
    • [problem1] Existing approaches need external datasets or additional training. → [solution1] One possible alternative is to use the max logit (i.e., the maximum value among classes before the final softmax layer). → [problem2] The distribution of max logits differs from one predicted class to another. → [solution2] Standardize the max logit per predicted class.
    • High prediction scores (e.g. softmax probability or max-logit) indicate low anomaly (unexpected object) scores and vice versa.
  • Methods
    1. Standardized max logits
    2. Iterative boundary suppression: replacing the high anomaly scores of boundary regions with low anomaly scores of non-boundary pixels.
    3. Dilated smoothing: both boundary and non-boundary regions are smoothed.
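
Step 1 can be sketched for a single pixel as follows. This is a minimal illustration, assuming the per-class mean and standard deviation of the max logit have already been estimated (for example on training data). The statistics and logits below are made-up numbers, and boundary suppression and smoothing are not covered.

```python
def standardized_max_logit(logits, class_mean, class_std):
    """Return (predicted class, standardized max logit) for one pixel."""
    pred = max(range(len(logits)), key=lambda c: logits[c])
    sml = (logits[pred] - class_mean[pred]) / class_std[pred]
    return pred, sml

def anomaly_score(logits, class_mean, class_std):
    """A high standardized max logit means a low anomaly score, hence the negation."""
    _, sml = standardized_max_logit(logits, class_mean, class_std)
    return -sml

# Hypothetical per-class statistics of the max logit.
class_mean = [10.0, 6.0]
class_std = [1.0, 2.0]

# Both pixels have the same raw max logit (8.0) but different predicted classes.
pixel_a = [8.0, 1.0]  # predicted class 0, whose max logits are usually around 10
pixel_b = [1.0, 8.0]  # predicted class 1, whose max logits are usually around 6

score_a = anomaly_score(pixel_a, class_mean, class_std)
score_b = anomaly_score(pixel_b, class_mean, class_std)
```

With the raw max logit the two pixels would be indistinguishable; after standardization, `pixel_a` is scored as more anomalous than `pixel_b`.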


Test-time Adaptation in 2022

1. SHOT: Do we really need to access the source data? ICML, 2020

  • Summary: Source free (Offline) / Entropy minimization + Pseudo labeling
  • Methods
    1. Freeze the classifier and optimize the target-specific feature extractor (backbone).
    2. Source model generation: cross-entropy loss with label smoothing.
    3. Target training
      1. Information maximization (IM loss) = entropy loss + diversity-promoting loss
      2. Self-supervised pseudo labeling = pseudo labels assigned by distance to class prototypes
  • Notes: Classification / Complex writing / simple methods / large experiments / poor figures
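
The information maximization loss listed under target training can be sketched as below. This is a minimal version on plain Python lists, assuming the common formulation: mean per-sample entropy plus the negative entropy of the batch-mean prediction. The function names and the `eps` constant are illustrative choices.

```python
import math

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(pk * math.log(pk + eps) for pk in p)

def information_maximization_loss(probs):
    """Entropy term (confident predictions) + diversity term (balanced classes)."""
    n = len(probs)
    k = len(probs[0])
    mean_entropy = sum(entropy(p) for p in probs) / n
    mean_pred = [sum(p[j] for p in probs) / n for j in range(k)]
    # Minimizing the negative entropy of the mean prediction promotes diversity.
    diversity = -entropy(mean_pred)
    return mean_entropy + diversity

confident_diverse = [[1.0, 0.0], [0.0, 1.0]]    # confident and balanced
confident_collapsed = [[1.0, 0.0], [1.0, 0.0]]  # confident but one class only
uncertain = [[0.5, 0.5], [0.5, 0.5]]            # balanced but not confident

loss_diverse = information_maximization_loss(confident_diverse)
loss_collapsed = information_maximization_loss(confident_collapsed)
loss_uncertain = information_maximization_loss(uncertain)
```

Only the confident and balanced batch reaches the minimum; the collapsed and the uncertain batches are both penalized relative to it.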

2. AdaContrast: Contrastive Test-time Adaptation. CVPR, 2022

  • Key methods
    1. Memory bank (queue) of length M: Storing (1) pred-outputs for contrastive learning and (2) prob-outputs for pseudo labels
    2. Loss1: Generating pseudo label
      • Bank [ pred-output of length M + current-batch pred ]
      • Using the current-batch pred as the query, find the N nearest neighbors in the bank.
      • Average of N neighbors = soft-voted pseudo label
      • pseudo label → cross-entropy loss
    3. Loss2: Contrastive learning
      • Key: Bank [ prob-output of length M ]
      • Query: current-batch prob
      • InfoNCE → only considers negative prob in the Bank
    4. Loss3: An additional diversity regularization loss.
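
The soft-voting step in Loss1 can be sketched as follows. This is a rough illustration rather than the AdaContrast implementation: it assumes each bank entry holds an embedding vector (used for the neighbor search) and a probability vector (averaged for the vote), and it uses cosine similarity as the distance. All names and numbers are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def soft_vote_pseudo_label(query_feat, bank_feats, bank_probs, num_neighbors):
    """Average the probabilities of the nearest bank entries and take the argmax."""
    ranked = sorted(
        range(len(bank_feats)),
        key=lambda i: cosine(query_feat, bank_feats[i]),
        reverse=True,
    )
    nearest = ranked[:num_neighbors]
    k = len(bank_probs[0])
    avg = [sum(bank_probs[i][j] for i in nearest) / len(nearest) for j in range(k)]
    label = max(range(k), key=lambda j: avg[j])
    return label, avg

# Hypothetical bank with three entries and two classes.
bank_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
bank_probs = [[0.8, 0.2], [0.6, 0.4], [0.1, 0.9]]

label, soft_label = soft_vote_pseudo_label([1.0, 0.05], bank_feats, bank_probs, 2)
```

The resulting `label` would then serve as the target of the cross-entropy loss for that sample.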


3. When does TTT fail or thrive? NeurIPS, 2021

  • Methods
    1. Distribution alignment
      • TTT can fail because the test-time update is unconstrained.
      • (method1) offline feature summarization: storing the mean and covariance matrix of the source features (i.e., the method is not agnostic to the pre-trained model).
      • (method2) online moment matching: a distribution alignment loss. However, (problem1) this strategy struggles when the number of classes is large.
      • (solution1) (method3) batch-queue decoupling: maintaining a large queue of encoded features that is updated mini-batch by mini-batch (i.e., global-view alignment, not local (one-class) view alignment).
    2. Contrastive self-sup learning
      • It is applied in both training and testing.
  • Notes: Classification / Distribution alignment well suited to TTA specifically
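
The moment matching idea in methods 1 and 2 can be sketched as below. This is a minimal version, assuming the alignment loss is the squared distance between means plus the squared Frobenius distance between covariance matrices. The paper's exact weighting and the feature queue are not reproduced, and all names are illustrative.

```python
def mean_and_cov(features):
    """Mean vector and unbiased covariance matrix of a list of feature vectors."""
    n = len(features)
    d = len(features[0])
    mu = [sum(f[j] for f in features) / n for j in range(d)]
    cov = [
        [sum((f[a] - mu[a]) * (f[b] - mu[b]) for f in features) / (n - 1)
         for b in range(d)]
        for a in range(d)
    ]
    return mu, cov

def alignment_loss(src_mu, src_cov, tgt_features):
    """Squared mean difference + squared Frobenius norm of the covariance difference."""
    tgt_mu, tgt_cov = mean_and_cov(tgt_features)
    d = len(src_mu)
    mean_term = sum((s - t) ** 2 for s, t in zip(src_mu, tgt_mu))
    cov_term = sum(
        (src_cov[a][b] - tgt_cov[a][b]) ** 2 for a in range(d) for b in range(d)
    )
    return mean_term + cov_term

# Offline: summarize (hypothetical) source features once and store the statistics.
source = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
src_mu, src_cov = mean_and_cov(source)

# Online: compare target features against the stored summary.
shifted = [[x + 1.0, y] for x, y in source]  # same shape, mean shifted by 1 in x
loss_same = alignment_loss(src_mu, src_cov, source)
loss_shifted = alignment_loss(src_mu, src_cov, shifted)
```

Only the stored `src_mu` and `src_cov` are needed at test time, so the source data itself does not have to be kept.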


4. Continual Test-Time Domain Adaptation. CVPR, 2022

  • Motivation
    1. The environment changes continually, so the target domain distribution shifts over time.
  • Method
    • Augmentation-averaged predictions: Consistency training
    • Stochastic restoration: preserves source knowledge in the long term and prevents strong domain shifts from causing catastrophic failure.
  • Notes: The table containing experiment results looks brilliant.
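
Stochastic restoration can be sketched on a flat list of weights as below. This is a minimal illustration, assuming each weight is independently reset to its source value with a small probability after an update. The function name, the weights, and the probability are hypothetical.

```python
import random

def stochastic_restore(current, source, restore_prob, rng):
    """Reset each weight to its source value with probability restore_prob."""
    return [
        s if rng.random() < restore_prob else c
        for c, s in zip(current, source)
    ]

rng = random.Random(0)
source_weights = [0.0, 0.0, 0.0, 0.0]
adapted_weights = [0.3, -0.2, 0.5, 0.1]  # hypothetical weights after adaptation

never = stochastic_restore(adapted_weights, source_weights, 0.0, rng)
always = stochastic_restore(adapted_weights, source_weights, 1.0, rng)
partial = stochastic_restore(adapted_weights, source_weights, 0.5, rng)
```

With a small restore probability, most weights keep their adapted values while a random subset is pulled back to the source model at each step.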


5. Parameter-free Online Test-time Adaptation

  • TTA sometimes fails catastrophically. Instead of adapting the parameters of a pre-trained model, they only adapt its output by finding the latent assignments that optimize a manifold-regularized likelihood of the data.
  • A correction of the output probabilities is more reliable and practical than NAMs (network adaptation methods).
  • The proposed method is called Laplacian Adjusted Maximum-likelihood Estimation (LAME). This could be viewed as a graph clustering of the batch data, penalized by a KL term discouraging substantial deviations from the source model predictions.
  • The method as written is hard to understand, so I need to figure it out from the code.
    • It seems that offline data access is needed (to find the latent assignment vector z in the paper).
    • It seems that the k-nearest-neighbors algorithm is used in the code.
  • Notes: Output optimization, no parameter optimization, no concern about catastrophic failure.
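
My current reading of the output correction is the fixed-point iteration sketched below: each latent assignment is pulled toward the source prediction and toward the assignments of its neighbors. This is an assumption-laden sketch, not the official LAME code. The update rule, the hand-written affinity matrix (in practice built from k nearest neighbors in feature space), and the iteration count are all assumptions.

```python
import math

def softmax(v):
    """Numerically stable softmax of a list of scores."""
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def laplacian_adjust(probs, affinity, num_iters=10, eps=1e-12):
    """Iterate z_i <- softmax(log q_i + sum_j w_ij * z_j), starting from z = q."""
    n = len(probs)
    k = len(probs[0])
    z = [list(p) for p in probs]
    for _ in range(num_iters):
        new_z = []
        for i in range(n):
            scores = [
                math.log(probs[i][c] + eps)
                + sum(affinity[i][j] * z[j][c] for j in range(n))
                for c in range(k)
            ]
            new_z.append(softmax(scores))
        z = new_z
    return z

# Three mutually connected samples; the third is weakly predicted as class 1.
probs = [[0.9, 0.1], [0.8, 0.2], [0.45, 0.55]]
affinity = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]

adjusted = laplacian_adjust(probs, affinity)
```

The model parameters are never touched; only the output probabilities of the batch are corrected, and the third sample is moved to the class of its neighbors.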

6. Sketch3T: TTT for Zero-Shot SBIR

  • Consistency training
  • Meta learning

7. Ev-TTA: TTA for Event-Based Object Recognition

  • Consistency training

