Microsoft Copilot Exposes Private GitHub Data

The information displayed in the AIM should not be reported as representing the official views of the OECD or of its member countries.

An Israeli cybersecurity firm discovered that Microsoft's Copilot AI tool can still retrieve cached data from GitHub repositories that were once public but have since been made private, exposing sensitive information and intellectual property. Over 16,000 organizations, including Microsoft, AWS, and Google, are affected, raising concerns over data security and privacy. [AI generated]

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (Microsoft Copilot) that accesses and exposes sensitive data due to its reliance on cached public GitHub repositories. This has directly led to a data breach risk affecting thousands of organizations, including major companies. The harm includes potential violations of intellectual property rights and privacy, which fits the definition of an AI Incident as the AI system's use has directly led to harm or risk of harm. The issue is not merely potential but ongoing, as Copilot can still access sensitive data despite mitigation efforts, thus constituting an AI Incident rather than a hazard or complementary information. [AI generated]

AI principles
Privacy & data governance, Robustness & digital security, Accountability, Transparency & explainability, Respect of human rights, Safety

Industries
IT infrastructure and hosting, Digital security

Affected stakeholders
Business

Harm types
Human or fundamental rights, Economic/Property, Reputational

Severity
AI incident

Business function:
Research and development, ICT management and information security

AI system task:
Content generation

Articles about this incident or hazard

Microsoft Copilot poses a major data-leak risk; sensitive information in more than 20,000 GitHub repositories is at risk

2025-02-26
tech.ifeng.com
Why's our monitor labelling this an incident or hazard?
The event involves an AI system (Microsoft Copilot) that accesses and exposes sensitive data due to its reliance on cached public GitHub repositories. This has directly led to a data breach risk affecting thousands of organizations, including major companies. The harm includes potential violations of intellectual property rights and privacy, which fits the definition of an AI Incident as the AI system's use has directly led to harm or risk of harm. The issue is not merely potential but ongoing, as Copilot can still access sensitive data despite mitigation efforts, thus constituting an AI Incident rather than a hazard or complementary information.

GitHub private repository data leak: more than 16,000 organizations affected

2025-02-27
news.zol.com.cn
Why's our monitor labelling this an incident or hazard?
The AI system involved is Microsoft's Copilot, an AI-powered code assistant that uses data from GitHub repositories to generate code suggestions. The incident stems from the use and data retention practices of this AI system, which has indirectly led to a violation of intellectual property rights and potential privacy harms by exposing private data through AI outputs. Therefore, this qualifies as an AI Incident due to realized harm linked to the AI system's use and data handling.

Thousands of exposed GitHub repos, now private, can still be accessed through Copilot | TechCrunch

2025-02-26
TechCrunch
Why's our monitor labelling this an incident or hazard?
The event involves an AI system, Microsoft Copilot, which uses generative AI capabilities to provide code suggestions and completions. The AI system's training or caching mechanism has led to the unintended exposure of sensitive data that was once public but later made private. This exposure has directly led to harm in the form of unauthorized access to confidential corporate data and intellectual property, fulfilling the criteria for an AI Incident under violations of intellectual property rights and harm to organizations. The issue is not merely potential but ongoing, as data remains accessible through Copilot despite repositories being private, indicating realized harm rather than just a hazard or complementary information.

Thousands of exposed GitHub repos, now private, can still be accessed through Copilot - RocketNews

2025-02-26
RocketNews
Why's our monitor labelling this an incident or hazard?
The AI system involved is Microsoft's Copilot, which uses generative AI to provide code suggestions based on training data that includes publicly available GitHub repositories. The incident arises from the AI system's use of cached data that includes repositories that were briefly public but are now private, leading to unauthorized exposure of sensitive information. This exposure directly harms affected organizations by leaking intellectual property and confidential data, fulfilling the criteria for an AI Incident under violations of intellectual property rights and harm to property or communities. The harm is realized, not just potential, as the data can be accessed through Copilot outputs.

Microsoft Copilot flaw exposes thousands of private GitHub repositories | Ctech

2025-02-26
ctech
Why's our monitor labelling this an incident or hazard?
Microsoft Copilot is an AI-powered tool that assists with coding and development tasks. The flaw in its caching mechanism caused private data to be exposed, which constitutes a violation of intellectual property rights and confidentiality. Since the AI system's malfunction directly led to the exposure of sensitive data, this qualifies as an AI Incident under the framework, specifically under violations of intellectual property rights and harm to property.

Thousands of exposed GitHub repositories, now private, can still be accessed through Copilot | TechCrunch

2025-02-26
TechCrunch
Why's our monitor labelling this an incident or hazard?
The AI system involved is Microsoft Copilot, a generative AI tool that uses cached data from Bing's indexing of GitHub repositories. The event involves the use and malfunction of this AI system, as it continues to expose sensitive data that should no longer be accessible. The harm includes exposure of intellectual property, sensitive corporate data, and security credentials, which are violations of intellectual property rights and security obligations. The harm is realized and ongoing, as the data can be retrieved by anyone using Copilot. This fits the definition of an AI Incident because the AI system's use and malfunction have directly led to harm to property and organizations. The event is not merely a potential risk or a complementary update but a concrete incident of harm caused by AI.

Lasso Uncovers Sensitive Private GitHub Repositories from Fortune 500 Companies found Exposed in Microsoft Copilot via Bing Cache - Global Security Mag Online

2025-02-26
Global Security Mag Online
Why's our monitor labelling this an incident or hazard?
The event involves an AI system (Microsoft Copilot) that used private code repositories as training or reference data, which were exposed via Bing's cache, leading to unauthorized access and exposure of sensitive corporate information. This directly caused harm by violating intellectual property rights and potentially enabling unauthorized access to enterprise environments. The involvement of the AI system in the development and use phases, and the resulting breach of confidentiality and security, meet the criteria for an AI Incident. The event is not merely a potential risk or a complementary update but a realized harm linked to AI system use.

Security researchers have big warning for developers on Microsoft Copilot - The Times of India

2025-02-27
The Times of India
Why's our monitor labelling this an incident or hazard?
Microsoft Copilot is an AI system that generates code suggestions by accessing data from GitHub repositories. The incident involves the AI system's use leading to unauthorized access and exposure of private and deleted repository data, including sensitive corporate information and access keys. This exposure is a direct harm related to intellectual property rights and data confidentiality, fulfilling the criteria for an AI Incident. The involvement of the AI system in retrieving and exposing this data is explicit and central to the harm. Hence, the event is classified as an AI Incident.

Thousands of GitHub repositories exposed via Microsoft Copilot

2025-02-27
TechRadar
Why's our monitor labelling this an incident or hazard?
Microsoft Copilot is a generative AI system that uses cached data from Bing to provide code suggestions. The incident involves the AI system retrieving private repositories that were once public and cached, exposing sensitive information such as credentials and company secrets. This exposure constitutes a violation of privacy and security, which falls under harm to property and potentially harm to organizations. The AI system's use directly led to this harm, fulfilling the criteria for an AI Incident. The fact that Microsoft acknowledged the issue but considered it low severity does not negate the realized harm. Hence, this event is classified as an AI Incident.

Copilot exposes private GitHub pages, some removed by Microsoft

2025-02-27
Ars Technica
Why's our monitor labelling this an incident or hazard?
The event involves an AI system (Microsoft Copilot) whose use and malfunction (continued access to cached private data) directly leads to harm, specifically violations of privacy and intellectual property rights, which are breaches of fundamental and legal protections. The exposure of sensitive credentials and proprietary tools constitutes realized harm to individuals and organizations. Therefore, this qualifies as an AI Incident rather than a hazard or complementary information.

Security researchers have big warning for developers on Microsoft Copilot - ET CISO

2025-02-28
ETCISO.in
Why's our monitor labelling this an incident or hazard?
Microsoft Copilot is an AI system that generates code suggestions based on data it has access to. The vulnerability described involves the AI system retrieving cached content that should be private or deleted, resulting in unauthorized access to sensitive data. This constitutes a violation of intellectual property rights and confidentiality, which falls under harm category (c) - violations of human rights or breach of obligations protecting intellectual property rights. Since the AI system's use directly led to this harm, this event qualifies as an AI Incident.

Thousands of GitHub repositories that have been made private can still be accessed through Copilot

2025-02-28
net.zhiding.cn
Why's our monitor labelling this an incident or hazard?
The event involves an AI system, Microsoft Copilot, which uses indexed and cached data from GitHub repositories to generate code suggestions. The incident arises from the AI system's use of data that was once public but later privatized, leading to unauthorized exposure of sensitive and proprietary information. This exposure constitutes a breach of intellectual property rights and confidentiality, fulfilling the criteria for harm under the AI Incident definition. The harm is indirect, as the AI system's training and retrieval processes enable access to data that should no longer be accessible, causing significant harm to organizations' property and rights. The event is not merely a potential risk but a realized harm, as data is actively accessible through Copilot. Hence, it qualifies as an AI Incident rather than a hazard or complementary information.

Chatbots are surfacing data from GitHub repositories that are set to private

2025-02-28
TechSpot
Why's our monitor labelling this an incident or hazard?
The event involves AI systems (Copilot and ChatGPT) that have been trained or operate in a way that allows them to surface data from private repositories, which is a direct misuse or malfunction of the AI system's data handling. The exposure of private and sensitive data through AI-generated outputs has already occurred, causing harm to organizations by compromising their confidential information and security credentials. This fits the definition of an AI Incident because the AI system's use and malfunction have directly led to violations of intellectual property rights and potential harm to property and organizations. The breach is not merely a potential risk but a realized harm, as data was accessed and is still retained within the AI model, posing ongoing risks.

Using generative AI within Radboud University | Radboud Universiteit

2025-02-24
Radboud Universiteit
Why's our monitor labelling this an incident or hazard?
The content is primarily about institutional guidelines, risk management, and policy planning for AI use, which fits the definition of Complementary Information. There is no indication of an AI Incident (no harm has occurred) or AI Hazard (no specific plausible future harm event is described). The article provides context and updates on how the university is managing AI-related risks and compliance with the AI Act, which is a governance and societal response to AI developments.

Sensitive GitHub data surfaces in Microsoft Copilot AI - TechPulse

2025-02-26
TechPulse
Why's our monitor labelling this an incident or hazard?
The event involves an AI system (Microsoft Copilot) that uses cached data from Bing to retrieve sensitive information that should have been private or deleted. The AI system's use and malfunction (retaining and exposing sensitive data) have directly led to harm in the form of violations of intellectual property rights and exposure of confidential data, affecting thousands of organizations. The harm is realized, not just potential, as the data is accessible via the AI chatbot. This fits the definition of an AI Incident because the AI system's use has directly led to a breach of obligations under applicable law protecting intellectual property and confidentiality, causing harm to property and organizations.

Amsterdam halts Microsoft Copilot trial over privacy concerns - Computable.nl

2025-02-25
Computable
Why's our monitor labelling this an incident or hazard?
Microsoft Copilot is an AI system used for generating content and processing sensitive data. The pilot was stopped because the system does not currently meet privacy requirements, indicating a risk of violation of data protection laws (a form of legal rights). Since no realized harm or incident is reported, but there is a credible risk of privacy violations if continued, this qualifies as an AI Hazard. The event is about plausible future harm due to non-compliance and privacy risks, not an actual incident or harm that has occurred yet.

Amsterdam halts pilot of AI application Microsoft Copilot

2025-02-25
Dutch IT Channel
Why's our monitor labelling this an incident or hazard?
Microsoft Copilot is an AI system used interactively for document and email processing. The privacy assessment identified risks mainly due to lack of transparency, leading to concerns about legal compliance and data protection. The municipality stopped the pilot to avoid potential privacy violations. Since no actual privacy breach or harm has occurred yet, but the risks could plausibly lead to violations of privacy rights if unmitigated, this qualifies as an AI Hazard rather than an AI Incident. The event is not merely complementary information because it reports a concrete decision based on risk assessment, nor is it unrelated as it involves an AI system and potential harm.