If reliable AI content detection tools become available, should we use them? If so, how?

The world is being inundated with AI-generated content. A recent investigation by Wired magazine found that 7% of global news stories were AI-generated, rising to 47% of Medium posts and as high as 78% for certain topics. This content is generated through interactions with AI systems such as chatbots and image generators. Once produced and published, it can take on a life of its own. It can be posted on discussion boards and social media platforms, added to websites such as Wikipedia and Stack Exchange, disseminated in newspapers and academic journals, and aired on TV or radio. From there, it can be shared and reshared without any reference to its AI origins.

New information needs for content consumers

In many situations, content consumers need to know whether a given piece of content was produced by a human, by an AI system or, as is increasingly the case, by some combination of the two. This is not because human-generated content is always of higher quality than AI-generated content. Rather, the reasons hinge on content generation as a social practice.

Enduring content plays vital roles in the mechanisms that organise society, such as written laws, scientific articles, commercial contracts, and news reports. Because of this, content creation in many domains is closely governed by social institutions, which serve to maintain reputation and trust, ensure accuracy, and discourage plagiarism and libel.

However, the widespread use of new generative AI tools could weaken these institutions in several ways. First, they let people deliver content they didn’t produce and may not properly understand. This can be problematic in educational contexts and, more broadly, for trust in domains like commerce and research. Second, generative AI tools let people and organisations produce much more content than ever before. This is likely to have destabilising effects on information ecosystems, which is particularly concerning in areas like political opinion, financial markets, and cybersecurity.  

New technologies to verify new technologies

We need to extend the current institutions governing content generation and dissemination to address these new threats. More specifically, we need new technologies to detect AI-generated content. Technology is necessary because AI content generators are now good enough to fool most human consumers most of the time. At the risk of flippancy, we might invoke some terminology from sci-fi: what we need is a version of Blade Runner’s Voight-Kampff test to distinguish between human- and AI-generated content.  

Two families of technical detection methods have been developed to meet this need. One involves provenance-authentication schemes, most notably the C2PA scheme, which labels content items with protected metadata that documents their origin and history. The other consists of detection tools that analyse arbitrary content items and determine whether they originate from an AI system by searching for characteristic patterns within the content itself, seeking out patterns hidden during generation known as ‘watermarks’, or consulting logs of generated content maintained by generator providers. Collectively, we refer to these technologies as AI content detectors.
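To make the distinction concrete, here is a minimal sketch in Python of how a consumer-side tool might layer the two families of checks. All function names and data structures are hypothetical placeholders: the sketch does not reflect the C2PA SDK or any vendor’s actual detection API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Verdict:
    method: str          # "provenance" or "content-analysis"
    ai_generated: bool   # best-guess label
    confidence: float    # rough confidence in the label, 0.0 to 1.0

def read_provenance_manifest(item_bytes: bytes) -> Optional[dict]:
    """Placeholder for a C2PA-style manifest parser.

    A real implementation would extract the embedded manifest and verify its
    cryptographic signatures; this stub simply reports that none was found.
    """
    return None

def content_analysis_score(text: str) -> float:
    """Placeholder for a statistical or watermark-based detector.

    A real detector would run a trained classifier or a watermark decoder;
    this stub returns a neutral score.
    """
    return 0.5

def detect(item_bytes: bytes, text: str) -> Verdict:
    # Prefer signed provenance metadata when it is present.
    manifest = read_provenance_manifest(item_bytes)
    if manifest is not None:
        return Verdict("provenance", manifest.get("ai_generated", False), 0.95)
    # Otherwise fall back to analysing the content itself.
    score = content_analysis_score(text)
    return Verdict("content-analysis", score > 0.5, score)
```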

A common perception is that AI content detectors don’t work well enough to be practically usable. But recent results and developments are changing that. During last year’s US presidential election campaign, the company TrueMedia deployed a tool focused on images in US political content, developed in close cooperation with generator companies, with a reported accuracy of around 90%. This tool will shortly be open-sourced. Google DeepMind recently reported a new watermarking scheme for text that ran on a portion of queries to its Gemini text generation system, with true positive detection rates as high as 95% [1]; their scheme has already been open-sourced.

The Wall Street Journal recently reported that OpenAI has internally experimented with a watermarking scheme of similarly high performance, though little is publicly known about it. Meanwhile, many companies are adopting the C2PA provenance-authentication protocol for AI-generated content. For instance, Google recently announced it is using it for images and YouTube videos, and Amazon has joined the C2PA steering committee.

It is worth noting the diversity of methods used in these initiatives. Indeed, there is a consensus that combining these detection methods will lead to further improvements, much as ensemble methods do for AI classifiers.
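As a simple illustration of such a combination, the sketch below assumes each detector emits a calibrated probability that an item is AI-generated and merges them with a weighted average; the detector names and weights are purely illustrative, not drawn from any of the initiatives above.

```python
def combine_scores(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-detector probabilities that an item is AI-generated.

    `scores` maps detector names to calibrated probabilities in [0, 1];
    `weights` encodes how much each detector is trusted. Detectors that
    abstain (for example, when no watermark key is available) are simply
    left out of `scores`.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("no usable detector scores")
    return sum(scores[name] * weights[name] for name in scores) / total_weight

# Illustrative use: three hypothetical detectors examine the same item.
scores = {"provenance": 0.90, "watermark": 0.80, "stylometry": 0.55}
weights = {"provenance": 3.0, "watermark": 2.0, "stylometry": 1.0}
print(f"combined P(AI-generated) = {combine_scores(scores, weights):.2f}")
```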

The policy crux: should generative AI service providers be responsible for reliable detection tools?

A clear message emerging from all the aforementioned initiatives is that the companies which build and deploy generative AI tools are in the best position to deliver reliable AI content detectors. Indeed, they are the only players able to report what proportion of a generated item was contributed by a human author, since only they have access to the prompts used to generate the content.

On all these grounds, our group at the Global Partnership on AI has argued that generator providers should be legally responsible for reliable content detection. Specifically, as we summarised in an article for the AI Wonk, a company developing a generative AI system should be required to demonstrate a reliable detection tool for the content it generates as a condition of its public release.

This argument has gained good traction with policymakers. It was discussed at the 2023 Senate Hearing on AI oversight, which furnished material for Joe Biden’s Executive Order on AI. Last year, it was incorporated into the EU’s AI Act: Article 50.2 states that providers of generative AI systems must ensure their outputs are ‘detectable as artificially generated or manipulated’.

In summary, recent technical advancements and regulatory initiatives have improved the prospects for reliable AI content detectors. To advance the conversation, we think it’s useful to imagine a world where reliable detectors are available at minimal cost.

Imagining a world with reliable AI content detection

In an article we wrote for the journal Ethics and Information Technology, we wanted to move the policy discussion about AI content detection forward by imagining a world in which reliable content detectors are readily available. How would AI content detectors be used in practice in such a scenario, which may be imminent? Which actors would use them naturally, out of self-interest? And which actors might be required to use them in the interest of stabilising the information ecosystem? Our intention is not to provide definitive answers but rather to initiate a policy discussion about this scenario so we are ready if it arrives. Here is a summary of the article’s main points.

First, we consider groups expected to adopt reliable detectors out of self-interest. Schools and universities, for instance, should be early adopters. Generative AI systems will continue to be productive tools in education, but constraints on their use will likely be beneficial in some learning and assessment contexts. Educational institutions have reputations to uphold: they must be able to vouch for their graduates, and judicious use of AI content detectors can help provide the relevant assurance. Competition in the free market provides an additional incentive: educational institutions that can provide the best guarantees will gain a competitive advantage, encouraging the adoption of relevant practices by other institutions.

Similar competitive incentives are likely to drive the adoption of AI content detectors across many sectors of the commercial world. They prove useful wherever it is essential for staff to possess personal knowledge or expertise: in hiring or contracting decisions, in consultancy relationships, in the delivery of reports and briefs, or when staff must assume personal responsibility for decisions. Furthermore, companies that make good use of AI content detectors will gain a competitive edge, so some applications are expected to become standard practice.

In some instances, the incentives for organisations to adopt detectors are not as clear. Media organisations are particularly significant due to their central role in disseminating content. Among these organisations, ‘traditional’ media companies have the most interest in identifying AI-generated material before publication, as they are unequivocally responsible for the content they distribute. In these organisations, human editors make publishing decisions. If editors can access a reliable AI content detector, they should consistently use it as part of the standard procedure for vetting submissions. Additionally, they should generally be free to publish AI-generated material, provided it meets the standards they typically impose on published items. They should also clearly label the published item as AI-generated to inform consumers.

There are a few cases where we believe publication should not be allowed. In particular, we follow the Paris Charter on AI and Journalism in arguing that multimodal AI content mimicking photographic or audiovisual recordings of actual people or events should be banned from publication—except in reports about the circulation of AI content, where the content has an important function.

New rules about content detection for social media and web search platforms?

The situation for social media companies differs significantly from that of traditional media companies. Under Section 230 of the US Communications Decency Act, these companies are famously not held responsible for the content they disseminate. As a result, it is less important for them to be perceived as trustworthy content providers. Instead, their business model hinges on maximising their user and viewer base. Consequently, social media companies have far fewer natural incentives to deploy AI content detectors. At the same time, social media companies are a natural point where the proliferation of AI-generated content can be controlled.

We believe social media companies should be required to scan systematically for AI-generated content if reliable detectors are available. Minimally, these scans could help identify and defuse AI-generated disinformation or spamming campaigns. We also argue that, where such detection is feasible, companies should let users choose in their personal settings how much AI-generated content they wish to see. Again, this requires systematic scanning for AI content.
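As a rough sketch of what honouring such a preference could look like on the platform side, the toy function below assumes the platform already attaches a detector score to each feed item; the data structure and threshold are hypothetical, not any platform’s actual design.

```python
from dataclasses import dataclass

@dataclass
class FeedItem:
    item_id: str
    ai_score: float      # detector's probability that the item is AI-generated

def filter_feed(items: list[FeedItem], max_ai_share: float) -> list[FeedItem]:
    """Apply a user's cap on likely-AI content to a candidate feed.

    `max_ai_share` is the user setting: 0.0 removes all likely-AI items,
    1.0 applies no filtering. Items scoring 0.5 or above are treated as
    likely AI-generated and kept only up to the chosen share of the feed.
    """
    human_like = [item for item in items if item.ai_score < 0.5]
    ai_like = [item for item in items if item.ai_score >= 0.5]
    budget = int(max_ai_share * len(items))
    return human_like + ai_like[:budget]
```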

Beyond these cases, we acknowledge that some AI-generated content is useful and should be allowed to circulate. Distinguishing such content from low-quality ‘slop’ remains an open research question, within companies and beyond. However, systematically identifying AI-generated content is clearly the first step in any such process and can already be mandated.

Next, our article considers web search companies such as Google and Bing. Do these companies have an obligation to scan for AI-generated content? These companies have reputations to maintain as providers of good search results, so they have an incentive to scan for AI-generated content systematically, and they likely already do so. For the web search domain, we suggest the appropriate level of oversight is simply to monitor what companies are currently doing to deal with AI-generated content. The EU’s Digital Markets Act may provide a useful instrument for this process.

Regulating the AI content detection arms race

We conclude by considering the arms race that is starting to play out between providers of AI content detection systems and those who seek to evade detection. Regulating an arms race is challenging, as successful measures and countermeasures vary over time. But in our view, the existence of an arms race is no reason for the big companies to backpedal on efforts to create reliable detectors. They have always engaged in similar adversarial interactions, notably in the area of search engine optimisation.

We think it is useful for policymakers to consider what can be done to tilt the playing field towards detector providers and away from evaders, and we suggest some actions to help achieve this. Some relate to funding and information dissemination: in particular, governments should help smaller generator companies implement measures supporting reliable detection of the content their systems produce. Another important measure is to restrict the dissemination of ‘frontier’ generative AI systems, so that requirements for generator providers to support reliable detection can be effectively enforced.

There is an active debate between groups advocating for and against open weights generative AI models. Both sides have interesting arguments, which Sayash Kapoor and colleagues helpfully summarise. But if we wish to encourage the emergence of reliable AI content detectors, we should certainly side with commentators such as David Evan Harris, Elizabeth Seger, and colleagues who caution against open weights.



Disclaimer: The opinions expressed and arguments employed herein are solely those of the authors and do not necessarily reflect the official views of the OECD or its member countries. The Organisation cannot be held responsible for possible violations of copyright resulting from the posting of any written material on this website/blog.