USB: Universal-Scale Object Detection Benchmark
Benchmarks, such as COCO, play a crucial role in object detection. However,
existing benchmarks are insufficient in scale variation, and their protocols
are inadequate for fair comparison. In this paper, we introduce the
Universal-Scale object detection Benchmark (USB). USB covers a wide range of
object scales and image domains by combining COCO with the recently proposed
Waymo Open Dataset and the Manga109-s dataset. To enable fair comparison and inclusive
research, we propose training and evaluation protocols. They have multiple
divisions for training epochs and evaluation image resolutions, like weight
classes in sports, and compatibility across training protocols, like the
backward compatibility of the Universal Serial Bus. Specifically, we ask
participants to report results not only for higher protocols (longer training)
but also for lower protocols (shorter training). With the proposed benchmark
and protocols, we conducted extensive experiments on 15 methods and identified
weaknesses in existing COCO-biased methods. The code is available at
https://github.com/shinya7y/UniverseNet.
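
The reporting rule described above (a result under a longer-training protocol should come with results under the shorter ones) can be illustrated with a minimal sketch. The protocol names and epoch budgets below are hypothetical placeholders chosen for illustration, not the divisions defined by the benchmark. The helper `division_for` reflects one possible reading of compatibility across protocols, in which a run is assigned to the smallest division whose budget covers it.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Protocol:
    name: str
    max_epochs: int  # upper bound on training epochs for this division


# Hypothetical divisions; the real benchmark defines its own names and budgets.
PROTOCOLS = [
    Protocol("short", 12),
    Protocol("medium", 24),
    Protocol("long", 48),
]


def missing_lower_protocols(reported: Set[str],
                            protocols: List[Protocol] = PROTOCOLS) -> List[str]:
    """Return lower protocols that still need results.

    If results exist for some protocol, every protocol with a smaller
    epoch budget should also have been reported.
    """
    ordered = sorted(protocols, key=lambda p: p.max_epochs)
    indices = [i for i, p in enumerate(ordered) if p.name in reported]
    if not indices:
        return []
    highest = max(indices)
    return [p.name for p in ordered[:highest] if p.name not in reported]


def division_for(epochs: int,
                 protocols: List[Protocol] = PROTOCOLS) -> Optional[Protocol]:
    """Return the smallest division whose budget covers `epochs` (assumed rule)."""
    for p in sorted(protocols, key=lambda p: p.max_epochs):
        if epochs <= p.max_epochs:
            return p
    return None  # exceeds every division's budget


if __name__ == "__main__":
    # A submission that reports only the longest protocol is incomplete.
    print(missing_lower_protocols({"long"}))
    print(division_for(20))
```

Under these assumptions, a submission that reports only the longest protocol is flagged as missing the two shorter ones, which is the kind of check a leaderboard could run before accepting an entry.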