ReViT: Enhancing Vision Transformers with Attention Residual Connections for Visual Recognition
In this paper, we revisit uncertainty estimation in deep neural networks and
consolidate a suite of techniques to make these networks more reliable. Our
investigation reveals that an integrated application of diverse techniques,
spanning model regularization, classifier design, and optimization,
substantially improves the accuracy of uncertainty predictions in image
classification tasks. The synergistic effect of these techniques
culminates in our novel SURE approach. We rigorously evaluate SURE against the
benchmark of failure prediction, a critical testbed for uncertainty estimation
efficacy. Our results show consistently better performance than models that
deploy each technique individually, across various datasets and model
architectures. When applied to real-world challenges, such as data corruption,
label noise, and long-tailed class distribution, SURE exhibits remarkable
robustness, delivering results that are superior to or on par with current
state-of-the-art specialized methods. In particular, on Animal-10N and Food-101N
for learning with noisy labels, SURE achieves state-of-the-art performance
without any task-specific adjustments. This work not only sets a new benchmark
for robust uncertainty estimation but also paves the way for its application in
diverse, real-world scenarios where reliability is paramount. Our code is
available at \url{https://yutingli0606.github.io/SURE/}.