Benchopt: Reproducible, efficient and collaborative optimization benchmarks
Numerical validation is at the core of machine learning research, as it allows
researchers to assess the actual impact of new methods and to confirm the agreement
between theory and practice. Yet, the rapid development of the field poses
several challenges: researchers face a profusion of methods to compare, limited
transparency and consensus on best practices, and tedious re-implementation
work. As a result, validation is often only partial, which can lead to
incorrect conclusions that slow the progress of research. We
propose Benchopt, a collaborative framework to automate, reproduce and publish
optimization benchmarks in machine learning across programming languages and
hardware architectures. Benchopt simplifies benchmarking for the community by
providing an off-the-shelf tool for running, sharing and extending experiments.
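As a rough illustration of this workflow, the sketch below shows what a solver
plugged into Benchopt's Lasso benchmark could look like. This is a minimal
sketch assuming the BaseSolver hooks (set_objective, run, get_result) of the
benchopt Python package; exact signatures may vary between versions, and the
ISTA solver itself is our illustrative choice, not a method from the paper.
\begin{verbatim}
from benchopt import BaseSolver
import numpy as np

class Solver(BaseSolver):
    # Proximal gradient descent (ISTA) for the Lasso objective
    # 0.5 * ||y - X beta||^2 + lmbd * ||beta||_1.
    name = 'ISTA'

    def set_objective(self, X, y, lmbd):
        # Benchopt passes the problem data once, before timing starts.
        self.X, self.y, self.lmbd = X, y, lmbd

    def run(self, n_iter):
        X, y, lmbd = self.X, self.y, self.lmbd
        n, p = X.shape
        L = np.linalg.norm(X, ord=2) ** 2  # Lipschitz constant of the gradient
        beta = np.zeros(p)
        for _ in range(n_iter):
            beta -= X.T @ (X @ beta - y) / L  # gradient step
            # soft-thresholding (proximal operator of the l1 norm)
            beta = np.sign(beta) * np.maximum(np.abs(beta) - lmbd / L, 0)
        self.beta = beta

    def get_result(self):
        return self.beta
\end{verbatim}
Such a solver file would then be picked up by the command-line runner (e.g.,
benchopt run ./benchmark_lasso), which handles timing, repetitions and result
collection across all registered solvers and datasets.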
To demonstrate its broad usability, we showcase benchmarks on three standard
learning tasks: $\ell_2$-regularized logistic regression, Lasso, and ResNet18
training for image classification. These benchmarks highlight key practical
findings that give a more nuanced view of the state-of-the-art for these
problems, showing that for practical evaluation, the devil is in the details.
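For concreteness, the first two tasks correspond to the standard objectives
(with design matrix $X \in \mathbb{R}^{n \times p}$ with rows $x_i$, targets
$y$, and regularization parameter $\lambda$; notation ours):
\[
\min_{\beta \in \mathbb{R}^p} \sum_{i=1}^{n}
\log\bigl(1 + \exp(-y_i x_i^\top \beta)\bigr)
+ \frac{\lambda}{2} \|\beta\|_2^2
\qquad\text{and}\qquad
\min_{\beta \in \mathbb{R}^p} \frac{1}{2} \|y - X\beta\|_2^2
+ \lambda \|\beta\|_1 .
\]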
We hope that Benchopt will foster collaborative work in the community, thereby
improving the reproducibility of research findings.