Section 1 - Risk identification and evaluation
As context for this question and our broader approach to contributing to the efforts of the G7 and OECD on the voluntary reporting framework:
- Microsoft is committed to helping advance shared expectations for transparency among developers and deployers of advanced AI systems, including by fulfilling our voluntary commitments to submit this Report. Through our experience with various frameworks, we have identified consistent themes and significant overlap in core expectations, such as risk assessment, risk mitigation, incident reporting, and governance. This Report helpfully enables us to provide an overview of our practices organized by key focus areas.
- Microsoft is committed to helping advance shared expectations for transparency among developers and deployers of advanced AI systems, including by fulfilling our voluntary commitments to submit this Report. Through our experience with various frameworks, we have identified consistent themes and significant overlap in core expectations, such as risk assessment, risk mitigation, incident reporting, and governance. This Report helpfully enables us to provide an overview of our practices organized by key focus areas.
- We’re at an inflection point for the adoption of AI. As AI contributes to opportunities for economic growth and scientific advancement around the world, we’re seeing new regulatory efforts and laws emerge in parallel. Making the most of AI’s potential will require broad adoption—enabled by putting in place the infrastructure and skilling programs necessary for workers and companies to flourish as well as advancing the trust that underpins people and organizations actually using new technology to enhance their lives and serve their goals. That’s where AI governance has a critical role.
- We recognize that good AI governance will result from an iterative process, and as AI technology continues to develop and evolve rapidly, we remain committed to building tools, processes, and practices that allow us to adapt AI governance at the speed of AI innovation. We also remain committed to advancing our understanding of AI risks and effective mitigations. We invite feedback from the AI ecosystem and policymakers to help inform our future efforts. We look forward to continuing to engage in dialogue related to advancing trustworthy AI, and to continuing to share our learnings with stakeholders and the broader public as our program grows and evolves.
Steps to define and/or classify risks are part of the comprehensive AI governance program Microsoft has put in place to map, measure, and manage risks. We have developed and regularly evolve a risk taxonomy to apply as applicable across technology scenarios and leverage for governance.
Under our comprehensive program, AI models and systems are subject to relevant evaluation, with mitigations then applied to bring overall risk to an appropriate level. Microsoft’s Responsible AI Standard and integrated set of policies and tools detail requirements for that process across technology scenarios. In some cases, we have also made available further details regarding policies for particular scenarios.
In the context of the development and deployment of highly capable AI models, within our Frontier Governance Framework, we have defined risks that warrant additional governance steps. In particular, we have defined “tracked high-risk capabilities,” which include:
- Chemical, biological, radiological, and nuclear (CBRN) weapons. A model’s ability to provide significant capability uplift to an actor seeking to develop and deploy a chemical, biological, radiological, or nuclear weapon.
- Offensive cyberoperations. A model’s ability to provide significant capability uplift to an actor seeking to carry out highly disruptive or destructive cyberattacks, including on critical infrastructure.
- Advanced autonomy. A model’s ability to complete expert-level tasks autonomously, including AI research and development.
As the Frontier Governance Framework further details, models assessed as posing low or medium risk in relation to tracked high-risk capabilities may be deployed with appropriate safeguards. Models assessed as having high or critical risk are subject to further review and safety and security mitigations prior to deployment. If, during the implementation of our Frontier Governance Framework, we identify a risk that we cannot sufficiently mitigate, then we will pause development and deployment until the point at which mitigation practices evolve to meet the risk.
In the context of the development and deployment of advanced AI systems, we have defined areas of policy and established governance tools to address various risk scenarios, including: “restricted uses,” “sensitive uses,” and “unsupported uses.”
- Restricted uses are subject to specific restrictions, typically on AI development or deployment. They are defined by our Office of Responsible AI and updated periodically.
- Sensitive uses involve scenarios that could have significant impacts on individuals or society, such as those affecting life opportunities, physical safety, or human rights. These uses require notification to our Office of Responsible AI and additional governance steps.
- Unsupported uses refer to reasonably foreseeable uses for which the AI system was not designed or evaluated or that we recommend customers avoid. As part of our Microsoft Responsible AI Impact Assessment process, we provide guidance to teams to think through unsupported uses.
This layered approach ensures that highly capable AI models and AI systems are developed and deployed responsibly, with ongoing efforts to map, measure, manage, and govern risks. By categorising AI technologies and use scenarios and implementing robust governance processes, Microsoft aims to safeguard against risks while promoting the effective use of AI technologies.
At Microsoft, tactics we use to identify and prioritize AI risks include threat modeling, responsible AI impact assessments, customer feedback, incident response and learning programs, external research, and AI red teaming. These exercises inform decisions about planning, mitigations, and the appropriateness of deploying an AI model or application for a given context. Equally important is our ability to remain flexible and responsive to new or previously unforeseen risks that arise at any stage of development or deployment, including post-deployment.
Red teaming in particular has become an industry best practice to identify potential risks by simulating adversarial user behavior. For pre-deployment red teaming of our highest risk AI systems and models, we leverage the expertise of Microsoft’s AI Red Team (AIRT), a centralized team of professional red teamers that operates independently of product teams. Guided by tools and resources developed by expert red teamers, product teams across Microsoft also perform pre-deployment red teaming of their AI systems and models. Risks that are identified during red teaming inform how we prioritize and design measurement and mitigation tasks.
A hallmark feature of Microsoft’s AI governance program is the intentional collaborations we nurture between engineering, research, and policy. This collaboration is particularly important to quickly advance the science of AI measurement and evaluation. Our ability to develop effective and valid risk measurement capabilities that move at the speed of innovation became increasingly more evident in 2024 as AI capabilities and the creative ways they are used continued to grow more complex.
AI risk measurement helps us to prioritize mitigations and assess their efficacy. For example, we seek to measure our AI applications' abilities to generate certain types of content and the efficacy of our mitigations in preventing that behavior. In addition to regularly updating our measurement methods, we also share resources and tools that support the measurement of risks and risk mitigations with our customers.
We continue to leverage the power of generative AI models to scale our measurement practices. Our automated measurement pipelines involve three main components. The first component is the AI system or model that is being evaluated. The second component is an AI model, usually an LLM, or in some cases a multimodal model, that is instructed to interact with the first component by simulating adversarial user behavior. The interaction between these first two components generates simulated interactions that make up test sets. The third component is also an AI model that serves as a judge, assessing and annotating each simulated interaction in the test sets based on instructions developed by human experts. The accuracy of the AI annotations is compared against human annotations, which informs how the instructions provided to the AI model need to be adjusted. Finally, the annotated test sets are used to calculate metrics about the risks, which inform downstream mitigation tasks.
In 2024, we made significant improvements to our measurement pipelines with the primary goal of expanding risk coverage across different modalities and risk types, enhancing the reliability of metrics generated, and leveraging new approaches to expose safety vulnerabilities. We expanded our measurement pipeline to cover two new risk categories: the generation of election-critical information and reproduction of protected materials. Our broader approach to mapping, measuring, and managing AI-related risks for 2024 elections is covered in Section 5D.
Our testing coverage for protected materials included content such as song lyrics, news, recipes, and code from public, licensed GitHub repositories. We also expanded our ability to measure an AI system’s ability to generate sexual, violent, and self-harm content and content related to hate and unfairness across both image generation and image understanding modalities. Furthermore, with increased support for audio modalities in the latest releases of generative AI models, we expanded measurement support for audio interactions by adding a transcription layer and running the text output through our measurement pipelines.
To improve the reliability of our metrics, we leveraged several prompt engineering techniques to optimize the performance of the annotation component of our measurement pipeline. To better measure safety vulnerabilities, we applied adversarial fine-tuning to components of our measurement pipeline to generate prompts that are more effective at revealing potential safety vulnerabilities in the system, which in turn guides risk management.
Looking ahead, we are integrating more advanced adversarial techniques and attack strategies to systematically measure vulnerabilities that could be exploited by malicious actors. We also plan to improve our evaluators for accuracy and support granular metrics, which in turn will improve their interpretability and provide transparency through safety scorecards.
Further, we will continue to expand our testing risk coverage while continuing to refine our existing evaluations across various settings, newer models, modalities, and tools. We will also continue fostering collaborations with Microsoft Research to incorporate the latest advances in the science of AI risk evaluation into our tools and practices. This includes building measurement frameworks to better understand, interrogate, and compare measurements comprehensively through multiple lenses.
Microsoft leverages red teaming, quantitative evaluations, and external expertise to implement and improve our approach to testing. We consider red teaming as more focused on risk identification and systematic measurement as more enabling of risk evaluation. As an overarching framework for risk identification and evaluation (or “mapping and measurement”) across the AI lifecycle, including during development, we focus on: 1) manual red teaming (or “adversarial testing”), during which, depending on risk level, the product team or an independent team of human experts manually probes AI technologies and products; 2) automated red teaming, during which we use tools to build upon human-generated adversarial probes to test variations at scale; and 3) systematic automated measurement, during which we use AI tools to test for risks at scale (enabling quantitative as well as qualitative evaluation).
As one approach to systematic automated measurement (detailed in A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications), we have leveraged a framework comprised of two key components: a) a data generation component, through which real-world AI content generation is simulated through the use of templates and parameters; and b) an evaluation component, through which AI-generated content is evaluated, providing both quantitative and qualitative outputs (e.g., numerical annotations of harm severity and written snippets about annotation reasoning)). More detail on our AI risk measurement practices is also included in Section 1B.
We find value in leveraging both internal and external expertise in risk identification and evaluation. As elaborated in Section 1F, external vendors and other experts contribute to our risk identification and evaluation at different stages along the AI lifecycle and for various reasons, including where our processes and practices may benefit from additional scale or specific expertise.
Microsoft internally manually red teams a wide range of AI technologies and products, including AI models, platform services, and applications, for AI security and responsible AI risks. Manual red teaming of AI models and products not designated as high risk is generally carried out by the teams developing them and reviewed as part of our launch readiness assessment. Manual red teaming of AI models and products designated as high risk is carried out by our internal AI Red Team, which is comprised of multi-disciplinary experts and independent from any products teams. Microsoft established our AI Red Team in 2018 to identify AI security vulnerabilities, and it has since evolved to evaluate other risks associated with AI.
We’ve also developed automated red teaming tools—used by both product-based red teams and our expert AI Red Team—and made some available open access. In February 2024, we released a red teaming accelerator, Python Risk Identification Tool for generative AI (PyRIT), enabling developers to proactively identify risks in their generative AI applications. PyRIT accelerates an evaluator’s work by expanding on initial red teaming prompts and automatically scoring outputs using content filters. PyRIT has received over 2,000 starts on GitHub and been copied more than 200 times by developers for use in their own repositories, where it can be modified to fit their use cases. In April 2025, Microsoft announced PyRIT’s integration with Azure AI Foundry. Customers using this new capability in Azure AI Foundry can simulate adversarial attack techniques and generate red teaming reports that help track risk mitigation improvements throughout the AI development lifecycle.
As the science of AI evaluations is still developing, there are numerous limitations to current approaches to evaluations, impacting Microsoft's internal quantitative tests and external benchmarks. Scientifically valid approaches to generative AI evaluations are nascent and rapidly evolving; there are gaps in how to systematize, operationalize, and measure relevant theoretical constructs needed to perform reliable testing (for a link to a published paper in which this is discussed, see Measurement and Fairness (arxiv.org)).
Similarly, many existing widely used benchmarks are not robust to distribution shifts and other measures of validity over time and fail to translate well to real-world settings (for a link to a published paper in which this is discussed, see: 2404.09932 (arxiv.org)). Benchmarks are also not advancing as rapidly as AI functionality; for example, the breadth of language capabilities goes beyond the breadth of languages benchmarked.
While evaluation is critical to risk management, there are also additional steps, such as leveraging vulnerability and incident reporting, that can be leveraged across the AI lifecycle as part of a comprehensive governance approach. For the past two decades, the Secure Development Lifecycle (SDL) that Microsoft has implemented across the company has included a response phase that focuses on handling unforeseen issues and applying these learnings to future releases. We apply the infrastructure, processes, and best practices developed over the years through SDL to our AI systems.
Across Microsoft, product teams are required to have repeatable processes to collect user feedback and to triage and address issues that arise after the release of an AI system. Teams are also required to build feedback collection mechanisms within their products so users can more easily report concerns. When possible, teams employ automation to enable quick action on well-understood problems.
Initial concerns reported via the Microsoft Security Response Center’s (MSRC) researcher portal are triaged and assessed by expert teams. Microsoft employees are also provided with multiple avenues to raise concerns, including an anonymous reporting channel. If these concerns are assessed to warrant an incident response, appropriate teams are assembled and coordinated by response specialists, who manage root-cause analysis, mitigation, and communication. After the incident is mitigated, a postmortem analysis is typically conducted to distill and absorb the learnings from the event and convert them into long-term improvements in system robustness.
We incentivize the responsible disclosure of issues of concern, including risks and potential vulnerabilities and incidents, through our commitment to Coordinated Vulnerability Disclosure, bug bounty programs, which we have extended to AI productsAI products, and efforts to build community and offer public thanks through conferences like BlueHat and the Microsoft Researcher Recognition Program.
Microsoft leverages independent external expertise in conducting tests in multiple ways. In advance of the deployment of highly capable models subject to our Frontier Governance Framework, evaluations for a set of tracked high-risk capabilities (see Section 1A) involve qualified and expert external actors that meet relevant security standards, including those with domain-specific expertise, as appropriate. As our Frontier Governance Framework acknowledges, we also benefited from the advice of external experts in formulating our list of tracked high-risk capabilities for which we apply these governance steps.
More broadly, external experts contribute to our evaluation practices in important ways. We identify third-party evaluators who have special expertise in evaluating specific risks, and we engage them to help us with tailored evaluations. These engagements help us better understand how to conceptualize certain risks and may surface opportunities for us to improve our risk mitigations.
Specifically, external experts have contributed to our development of testing sets, including by writing and seeding prompts, and to our paradigms for red teaming and systematic measurement, with consultations with external subject matter experts helping refine our risk conceptualization. As highlighted in A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications, creating harm- or risk-specific measurement resources requires domain-specific expertise. They have also supported translation or localization guidance for AI evaluations.
We also have in place mechanisms to receive reports from third parties of risks and potential vulnerabilities and incidents, including via reported via the Microsoft Security Response Center’s (MSRC) researcher portal, as described in Section 1E.
We see global efforts to build consensus-based frameworks, best practices, and standards as critical to promoting the trust and coherence that underpin broad adoption of a global technology. The development of AI standards for risk assessment and evaluation is especially important to advance rapidly; specifying the types of evaluations needed, ways to substantiate reliability, and expectations for evidence will be crucial in enhancing AI assessment and evaluation science.
We have made significant contributions to and use various best practices, including international or other formal standards or industry technical specifications. For example, in the context of industry standards, best practices, and technical specification efforts, we have been developing specifications for content provenance and authentication through the Coalition for Content Provenance and Authenticity (C2PA), of which Microsoft is a founding member. We have implemented C2PA in multiple services, such as LinkedIn, where content carrying the technology is automatically labelled, and Bing Image Creator, where all images created include 'Content Credentials.’ Additionally, we are actively involved in developing tools and defining practices for AI evaluations, such as the MLCommons platform for AI risk and reliability tests and Frontier Model Forum (FMF) Issue Briefs.
Microsoft uses international standards for cybersecurity, including ISO/IEC 27001, ISO/IEC 27017, ISO/IEC 29147, and ISO/IEC 30111 for information security management systems, cloud security, and coordinated vulnerability disclosure and management. We have also led on standardization for securing AI systems with the intent of providing guidance and awareness of security risks to help organizations better protect AI systems against an evolving threat landscape.
Microsoft was a key contributor to the conception and development of ISO/IEC 42001:2023. In early 2025, we successfully achieved the ISO/IEC 42001:2023 certification for Microsoft 365 Copilot and Microsoft 365 Copilot Chat. This certification confirms that an independent third party has validated Microsoft's application of the necessary framework and capabilities to effectively manage risks and opportunities associated with the continuous development, deployment, and operation of M365 Copilot and M365 Copilot Chat.
Microsoft also actively participates in Standards Development Organizations (SDOs), including: 1) ISO/IEC JTC1 SC42 Artificial Intelligence, and 2) CEN/CENELEC JTC21 Artificial Intelligence. Our involvement in these committees is facilitated through the National Standards Body members, where we have taken on a number of leadership roles. In SC42, we have been leading efforts, sharing our experiences and knowledge in developing and implementing RAI practices, on key standards deliverables related to terminology and concepts, governance, responsible AI, risk management, data quality management, and conformity assessment.
In JTC21, we have focused on harmonized standards requested by the European Commission to support the EU AI Act. Microsoft also actively participates in the standardization efforts led by the U.S. National Institute of Standards and Technology (NIST) and contributed to the development of the NIST AI Risk Management Framework (RMF) by providing feedback on drafts.
We remain committed to advancing internationally recognized standards that help to establish consistent practices, enhance accountability, and foster trust in AI technologies.
Microsoft collaborates with industry peers, academia, civil society, and governments to develop, share, and adopt risk mitigation measures. These collaborations take many forms, including research, standards development, and open-source projects.
Through the Frontier Model Forum, for example, we have collaborated with others in industry to develop resources on safety evaluations (see: Issue Brief: Early Best Practices for Frontier AI Safety Evaluations - Frontier Model Forum) and security best practices (see: Issue Brief: Foundational Security Practices - Frontier Model Forum). Through Partnership on AI, we have also contributed to the development of guidance for safe foundation model deployment.
No answer provided


























