Tool Call Accuracy evaluates how effectively a large language model (LLM) identifies and invokes the tools needed to accomplish a specified task. The metric assesses the model's ability to select the appropriate tools and call them in a sequence that matches the task requirements. A higher Tool Call Accuracy indicates that the LLM recognizes and employs the correct tools in the proper order, improving task performance and reliability.
Formula:
Tool Call Accuracy = (Number of Correct Tool Calls Made by the Model) / (Total Number of Reference Tool Calls)
This formula gives the proportion of reference tool calls that the model reproduces correctly, taking both the choice of tool and the order of the calls into account.
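To make the formula concrete, below is a minimal illustrative sketch in Python. The ToolCall class, the argument encoding, and the tool names are assumptions made for demonstration and are not taken from any particular library; position-by-position matching of tool name and arguments against the reference sequence is one straightforward way to implement the order-sensitive comparison described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    """A single tool invocation: the tool's name and its arguments (hypothetical representation)."""
    name: str
    args: tuple  # arguments as key-value pairs, e.g. (("city", "Tokyo"),)


def tool_call_accuracy(predicted: list[ToolCall], reference: list[ToolCall]) -> float:
    """Proportion of reference tool calls that the model reproduced,
    position by position: correct tool, correct arguments, correct order."""
    if not reference:
        return 0.0
    correct = sum(
        1
        for pred, ref in zip(predicted, reference)
        if pred.name == ref.name and pred.args == ref.args
    )
    return correct / len(reference)


# Example: the model matches two of the three reference calls, in order.
reference = [
    ToolCall("search_flights", (("destination", "Tokyo"),)),
    ToolCall("check_weather", (("city", "Tokyo"),)),
    ToolCall("book_hotel", (("city", "Tokyo"),)),
]
predicted = [
    ToolCall("search_flights", (("destination", "Tokyo"),)),
    ToolCall("check_weather", (("city", "Tokyo"),)),
    ToolCall("book_restaurant", (("city", "Tokyo"),)),  # wrong tool at position 3
]
print(tool_call_accuracy(predicted, reference))  # 0.666...
```

In this sketch, an extra or missing call simply fails to match its reference position, so it lowers the score; a real evaluation library may apply more lenient matching (for example, ignoring argument order or scoring partial argument overlap).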
About the metric
GitHub stars: 7100
GitHub forks: 723