DADA: Differentiable Automatic Data Augmentation
Data augmentation (DA) techniques aim to increase data variability, and thus
help train deep networks with better generalisation. The pioneering AutoAugment
automated the search for optimal DA policies with reinforcement learning.
However, AutoAugment is extremely computationally expensive, limiting its wide
applicability. Follow-up works such as Population Based Augmentation (PBA) and
Fast AutoAugment improved efficiency, but their optimization speed remains a
bottleneck. In this paper, we propose Differentiable Automatic Data
Augmentation (DADA) which dramatically reduces the cost. DADA relaxes the
discrete DA policy selection to a differentiable optimization problem via
Gumbel-Softmax. In addition, we introduce an unbiased gradient estimator,
RELAX, leading to an efficient and effective one-pass optimization strategy
for learning an accurate DA policy. We conduct extensive experiments on
CIFAR-10, CIFAR-100, SVHN, and ImageNet datasets. Furthermore, we demonstrate
the value of Auto DA in pre-training for downstream detection problems. Results
show that DADA is at least one order of magnitude faster than the state of the
art while achieving comparable accuracy. The code is
available at https://github.com/VDIGPKU/DADA.
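
For intuition, the minimal sketch below illustrates the core idea the abstract describes: relaxing a discrete choice among candidate augmentation operations into a differentiable one via Gumbel-Softmax, so that policy parameters receive gradients from the training loss. The candidate operations, magnitudes, and temperature here are illustrative placeholders, not DADA's actual search space or implementation (which also uses the RELAX estimator).

```python
# Toy sketch (not the authors' code): Gumbel-Softmax relaxation of a
# discrete augmentation-operation choice.
import torch
import torch.nn.functional as F

# Hypothetical candidate operations; DADA's real search space differs.
def rotate(x):   return torch.rot90(x, 1, dims=(-2, -1))
def flip(x):     return torch.flip(x, dims=(-1,))
def identity(x): return x
OPS = [rotate, flip, identity]

# Learnable logits over operations: the "policy" parameters to optimize.
op_logits = torch.zeros(len(OPS), requires_grad=True)

def augment(x, tau=1.0):
    # Sample a soft (differentiable) one-hot vector over operations.
    weights = F.gumbel_softmax(op_logits, tau=tau, hard=False)
    # Mix the operations' outputs by the sampled weights; as tau -> 0 the
    # mixture approaches picking a single operation, i.e. a discrete policy.
    outs = torch.stack([op(x) for op in OPS])
    return (weights.view(-1, 1, 1, 1) * outs).sum(dim=0)

# Usage: gradients of a downstream loss flow back into op_logits.
x = torch.randn(3, 32, 32)           # dummy CIFAR-like image
loss = augment(x).pow(2).mean()      # stand-in for the training loss
loss.backward()
print(op_logits.grad)
```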