Sample Selection with Uncertainty of Losses for Learning with Noisy Labels
In this paper, we introduce ARBEx, a novel attentive feature extraction
framework driven by a Vision Transformer (ViT) with reliability balancing to
cope with poor class distributions, bias, and uncertainty in the facial
expression learning (FEL) task. We combine several data pre-processing and
refinement methods with a window-based cross-attention ViT to get the most out
of the data. We also employ learnable anchor points in the embedding space
with label distributions and a multi-head self-attention mechanism to optimize
performance against weak predictions through reliability balancing, a
strategy that leverages anchor points, attention scores, and confidence values
to enhance the resilience of label predictions. To ensure correct label
classification and improve the model's discriminative power, we introduce
anchor loss, which encourages large margins between anchor points.
Additionally, the multi-head self-attention mechanism, which is also trainable,
plays an integral role in identifying accurate labels. This approach provides
critical elements for improving the reliability of predictions and has a
substantial positive effect on final prediction capabilities. Our adaptive
model can be integrated with any deep neural network to address challenges in
various recognition tasks. Extensive experiments conducted in a variety of
contexts show that our strategy outperforms current state-of-the-art methods.
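
The anchor loss described above rewards large margins between anchor points in
the embedding space. The abstract does not give its exact form, so the
following is only a minimal sketch under an assumed formulation: the loss is
taken to be the negative mean squared Euclidean distance over all distinct
anchor pairs, so that minimizing it pushes anchors apart. The function name
`anchor_margin_loss` and the toy anchor arrays are hypothetical and are not
taken from the paper.

```python
import numpy as np


def anchor_margin_loss(anchors: np.ndarray) -> float:
    """Assumed anchor loss: negative mean squared distance between anchors.

    anchors has shape (n_anchors, dim). A lower (more negative) value means
    the anchors are farther apart on average. This is an illustrative
    formulation, not necessarily the one used in ARBEx.
    """
    # Pairwise differences via broadcasting: shape (n, n, dim).
    diffs = anchors[:, None, :] - anchors[None, :, :]
    # Squared Euclidean distance for every ordered pair: shape (n, n).
    sq_dists = np.sum(diffs ** 2, axis=-1)
    # Exclude the zero distance of each anchor to itself.
    n = anchors.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    return -float(sq_dists[off_diagonal].mean())


# Toy anchors (hypothetical values): one tight cluster, one spread out.
close_anchors = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]])
spread_anchors = 10.0 * close_anchors

loss_close = anchor_margin_loss(close_anchors)
loss_spread = anchor_margin_loss(spread_anchors)
```

Under this assumed form, the spread-out anchors yield the lower loss, which is
the behavior a margin-encouraging term should have. In a trainable model the
same quantity would be written in an autodiff framework and added to the
classification loss with a weighting factor.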