Posts

BARACK: partially supervised group robustness with guarantees

Background: Neural networks can fail to perform well on certain groups of the data, and group information may be expensive to obtain.
Previous work: improve worst-group performance even when group labels are unavailable, for robustness and fairness.
Problem: improve group robustness when only some group labels are available.
Methods: a two-step framework that uses the partial group labels to predict group labels for the rest of the training data, then uses the predicted group labels in a robust optimization objective.
Keywords:
DRO - Distributionally Robust Optimization
GDRO - Group Distributionally Robust Optimization
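A minimal sketch of the second step (a group DRO update on predicted group labels), assuming PyTorch; this is the generic GDRO objective, not the paper's implementation. The names `model`, `optimizer`, and the batch tensors `x`, `y`, `g` are hypothetical, and `g` holds group ids taken from the partial labels where available and from a first-stage group-label classifier otherwise.

```python
import torch
import torch.nn.functional as F

NUM_GROUPS = 4  # illustrative

def group_dro_step(model, optimizer, x, y, g, q, eta=0.01):
    """One training step: exponentiated-gradient update of the adversarial group
    weights q (initialized uniform, e.g. torch.ones(NUM_GROUPS) / NUM_GROUPS),
    then a gradient step on the q-weighted per-group loss."""
    logits = model(x)
    per_example = F.cross_entropy(logits, y, reduction="none")
    group_losses = torch.zeros(NUM_GROUPS, device=per_example.device)
    for k in range(NUM_GROUPS):
        mask = g == k
        if mask.any():
            group_losses[k] = per_example[mask].mean()
    # up-weight the groups that currently have high loss (worst-group focus)
    q = q.to(group_losses.device) * torch.exp(eta * group_losses.detach())
    q = q / q.sum()
    loss = (q * group_losses).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), q
```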

Reading CutPaste

CutPaste: a simple data augmentation strategy that cuts an image patch and pastes it at a random location of a large image.
Dataset: MVTec anomaly detection dataset
Aim: detect various types of real-world defects
[ Code ]
Paper Source: https://openaccess.thecvf.com/content/CVPR2021/papers/Li_CutPaste_Self-Supervised_Learning_for_Anomaly_Detection_and_Localization_CVPR_2021_paper.pdf
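A minimal sketch of the CutPaste augmentation as I read it (not the released code); the patch area and aspect-ratio ranges below are illustrative, not the paper's exact settings.

```python
import math
import random
from PIL import Image

def cutpaste(img: Image.Image, area_ratio=(0.02, 0.15), aspect_ratio=(0.3, 3.3)) -> Image.Image:
    """Cut a random rectangular patch from the image and paste it back at a
    different random location, creating a local irregularity to train against."""
    w, h = img.size
    # sample the patch size from the area range and a log-uniform aspect ratio
    patch_area = random.uniform(*area_ratio) * w * h
    ratio = math.exp(random.uniform(math.log(aspect_ratio[0]), math.log(aspect_ratio[1])))
    patch_w = min(int(round(math.sqrt(patch_area * ratio))), w - 1)
    patch_h = min(int(round(math.sqrt(patch_area / ratio))), h - 1)
    # cut the patch from one random location ...
    x1, y1 = random.randint(0, w - patch_w), random.randint(0, h - patch_h)
    patch = img.crop((x1, y1, x1 + patch_w, y1 + patch_h))
    # ... and paste it at another
    x2, y2 = random.randint(0, w - patch_w), random.randint(0, h - patch_h)
    out = img.copy()
    out.paste(patch, (x2, y2))
    return out
```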

Reading Deep One-class Classification

Deep One-class Classification: a two-stage framework for deep one-class classification.
1) learn self-supervised representations from one-class data
2) build one-class classifiers on the learned representations
Covers self-supervised representation algorithms, their connections to existing one-class classification methods, and SOTA contrastive representation learning for one-class classification.
A: the stochastic data augmentation process, including resize-and-crop, horizontal flip, color jittering, gray-scale, and Gaussian blur.
Feature extractor f, loss L, projection head g. The network is structured as g∘f, where g is the projection head used to compute proxy losses and f outputs the representations used for the downstream task.
One way of representation learning is to learn by discriminating augmentations applied to the data. Although not trained to do so, the likelihood of learned rotation classifiers is shown to approximate the normality score well and has been used for one-class classification. A plausible ex...
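A minimal sketch of the two-stage recipe, assuming PyTorch and scikit-learn: stage 1 learns the feature extractor f (through a projection head g) from one-class data with self-supervised proxy losses, and stage 2 fits a shallow one-class classifier on the frozen representations. The tiny CNN and random tensors below are stand-ins, and the OC-SVM is just one possible choice of second-stage detector.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import OneClassSVM

# Stage 1 (placeholder): in practice f is trained with a self-supervised proxy
# loss computed through the projection head g on one-class training data; a
# tiny untrained CNN stands in for the learned encoder here.
f = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)

def extract_features(f, images, batch_size=64):
    """Run the frozen feature extractor f over a batch of images."""
    f.eval()
    feats = []
    with torch.no_grad():
        for i in range(0, len(images), batch_size):
            feats.append(f(images[i:i + batch_size]).cpu().numpy())
    return np.concatenate(feats)

# Stage 2: fit a shallow one-class classifier on the learned representations.
one_class_images = torch.randn(128, 3, 64, 64)  # stand-in for normal-only data
test_images = torch.randn(16, 3, 64, 64)        # stand-in for test data
detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
detector.fit(extract_features(f, one_class_images))

# Normality score: a lower decision_function value means more anomalous.
scores = detector.decision_function(extract_features(f, test_images))
```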

OOD-related papers

Deep One-Class Classification, ICLR 2021 [ Paper ] [ Code ] Google AI [ Blog ] [ Code ]
Learning from Failure: Training Debiased Classifier from Biased Classifier [ Paper ] [ Code ] NeurIPS 2020
Learning Debiased Representation via Disentangled Feature Augmentation [ Paper ] [ Code ] NeurIPS 2021
Graph Convolution for Semi-Supervised Classification: Improved Linear Separability and Out-of-Distribution Generalization [ Paper ]
Explainable Deep One-class Classification [ Paper ]
GAN Ensemble for Anomaly Detection [ Paper ]
Neural Transformation Learning for Deep Anomaly Detection Beyond Images [ Paper ]
Learning Semantic Context from Normal Samples for Unsupervised Anomaly Detection
Dual Compositional Learning in Interactive Image Retrieval
[*] SSL
Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity [ Paper ] [ Code ] NeurIPS 2021
Unsupervised Learning of Discriminative Attributes and Visual Representations [ Paper ] CVPR 2016
Common Visual Pattern Dis...

Reading Very Deep VAE

VDVAE: a hierarchical VAE that generates samples quickly and outperforms the PixelCNN in log-likelihood on all the natural image benchmarks.
In theory, VAEs can actually represent autoregressive models. VAEs can learn to first generate global features at low resolution, then fill in local details in parallel at higher resolutions.
Many types of generative models have flourished in recent years, including likelihood-based generative models, which include autoregressive models, VAEs, and invertible flows. Minimizing their objective, the negative log-likelihood, is equivalent to minimizing the KL divergence between the data distribution and the model distribution.
1) provide theoretical justification for why greater depth could improve VAE performance
2) introduce an architecture capable of scaling past 70 layers
3) verify that depth, independent of model capacity, improves log-likelihood, and allows VAEs to outperform the PixelCNN on all benchmarks
4) uses fewer parameters, generates sam...
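A short reminder of why that equivalence holds (a standard identity, not specific to this paper): the expected negative log-likelihood differs from the KL divergence only by the entropy of the data distribution, which does not depend on the model parameters θ, so minimizing one minimizes the other.

```latex
\mathbb{E}_{x \sim p_{\mathrm{data}}}\big[-\log p_\theta(x)\big]
  = \mathrm{KL}\!\left(p_{\mathrm{data}} \,\|\, p_\theta\right) + H\!\left(p_{\mathrm{data}}\right)
```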

Reading TransFG

TransFG:
1) verify the effectiveness of the vision transformer on fine-grained visual classification, offering an alternative to the dominant CNN-backbone-with-RPN model design
2) naturally focuses on the most discriminative regions of the objects and achieves SOTA performance
3) visualization helps show the ability to capture discriminative image regions
Methods:
1) vision transformer as feature extractor
   image sequentialization: first preprocess the input image into a sequence of flattened patches, generating overlapping patches with a sliding window
2) TransFG architecture: propose the Part Selection Module (PSM) and apply contrastive feature learning to enlarge the distance between representations of similar sub-categories
3) contrastive feature learning: minimizes the similarity of classification tokens corresponding to diffe...
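A minimal sketch of the overlapping-patch image sequentialization step, assuming PyTorch (not the released TransFG code); the patch size and stride below are illustrative. A stride smaller than the patch size is what makes neighboring patches overlap, while a stride equal to the patch size recovers standard non-overlapping ViT patching.

```python
import torch

def sequentialize(images: torch.Tensor, patch_size: int = 16, stride: int = 12) -> torch.Tensor:
    """images: (B, C, H, W) -> (B, N, C * patch_size * patch_size) flattened
    patches extracted with a sliding window of the given stride."""
    B, C, H, W = images.shape
    patches = images.unfold(2, patch_size, stride).unfold(3, patch_size, stride)
    # patches: (B, C, nH, nW, patch_size, patch_size)
    patches = patches.permute(0, 2, 3, 1, 4, 5).contiguous()
    return patches.view(B, -1, C * patch_size * patch_size)

# Each flattened patch is then linearly projected to the transformer hidden size,
# and a classification token plus position embeddings are added, as in ViT.
```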