These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.
Agentic Benchmark for CRM
The Agentic Benchmark for CRM is a benchmarking framework developed by Salesforce to evaluate the performance of AI agents and models in enterprise customer relationship management (CRM) use cases using metrics such as accuracy, cost, speed, trust and safety, and sustainability. The benchmark is designed to assess the readiness of AI systems for enterprise workflows by measuring how well AI agents perform tasks grounded in real CRM environments.
The framework evaluates AI models across use cases drawn from CRM domains including sales, service, and field service operations. These tasks are based on realistic enterprise workflows and datasets that reflect how organisations use CRM systems in practice. The evaluation combines automated metrics with human assessments conducted by Salesforce employees and customers to ensure that results reflect real-world expectations and operational requirements.
The benchmark measures model performance across five core dimensions:
- Accuracy: factuality, instruction following, conciseness, and completeness in generated responses.
- Cost: the computational and operational expense associated with using a model.
- Speed: the time required to generate responses in enterprise workflows.
- Trust and safety: characteristics such as privacy, safety, and truthfulness.
- Sustainability: the environmental impact associated with the computational resources required by different models.
Benchmark results can be explored through a dashboard that allows users to filter models and results by CRM domain, use case type, model provider, and model size. This enables organisations to compare different AI models and agents for specific enterprise CRM tasks and determine which systems are most suitable for operational deployment. By providing a structured evaluation framework grounded in enterprise CRM workflows, the Agentic Benchmark for CRM supports organisations in assessing the performance and readiness of AI agents for use in business environments.
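The kind of comparison the dashboard filters support can be sketched offline in a few lines of Python. This is a minimal illustration only, not the benchmark's data format or any Salesforce API: the BenchmarkResult record, its field names, the filter_results and rank_by_accuracy helpers, the use case type labels, and every model name and score below are hypothetical placeholders invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """Hypothetical record for one model on one use case (not the real schema)."""
    model: str
    provider: str
    domain: str          # e.g. "sales", "service", "field service"
    use_case_type: str   # placeholder labels, assumed for illustration
    accuracy: float      # illustrative score, higher is better
    cost: float          # illustrative cost, lower is better
    latency_s: float     # illustrative response time in seconds


def filter_results(results, domain=None, provider=None, use_case_type=None):
    """Keep the records that match every filter that is not None."""
    selected = []
    for r in results:
        if domain is not None and r.domain != domain:
            continue
        if provider is not None and r.provider != provider:
            continue
        if use_case_type is not None and r.use_case_type != use_case_type:
            continue
        selected.append(r)
    return selected


def rank_by_accuracy(results):
    """Sort by accuracy (descending), breaking ties by lower cost."""
    return sorted(results, key=lambda r: (-r.accuracy, r.cost))


# Made-up placeholder data; these are not benchmark results.
RESULTS = [
    BenchmarkResult("model-a", "provider-x", "sales", "summarisation", 0.82, 1.5, 2.0),
    BenchmarkResult("model-b", "provider-y", "sales", "summarisation", 0.82, 0.9, 3.1),
    BenchmarkResult("model-c", "provider-x", "service", "generation", 0.91, 2.4, 1.2),
]

sales_ranked = rank_by_accuracy(filter_results(RESULTS, domain="sales"))
print([r.model for r in sales_ranked])
```

Ranking on accuracy with cost as a tie-breaker is just one possible choice; an organisation that weighs speed, trust and safety, or sustainability more heavily would swap in a different sort key.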
About the tool
Tags:
- Accuracy and performance
- readiness
- AI evaluation
- customer relationship management