These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.
This metric is used to assess performance on the Mathematics Aptitude Test of Heuristics (MATH) dataset.
It first canonicalizes the inputs (e.g., converting 1/2 to \\frac{1}{2}) and then computes accuracy.
About the metric
You can click on the links to see the associated metrics
Objective(s):
Purpose(s):
Lifecycle stage(s):
Target users:
