How can standard contract terms advance responsible AI data and model sharing for generative AI and other applications?

Distinguished Faculty Fellow in Ethical Technology, Duke University Science & Society

December 8, 2023 — 6 min read

This text initially appeared as a blog post on gpai.ai, coauthored by Lee Tiedrich and Alban Avdulla.

Access to data is crucial for artificial intelligence (AI) innovation, including training, testing, and validating AI models. Unlocking appropriate data in compliance with applicable laws can advance the OECD AI Principles by reducing risks of harmful bias, unfairness, hallucinations, and other unsafe outcomes. Additionally, increasing responsible AI data and model sharing can enable greater inclusive growth and competition, also central to the OECD AI Principles.

As highlighted by the G7 Hiroshima AI Process and other recent policy developments, generative AI has escalated the need to responsibly address AI data and model sharing, including for open-source large language models (LLMs). Particularly when combined with technical tools, business codes of conduct, educational programs, and training, standard contract terms could help address some of the G7's concerns. As an example, standard contract terms can potentially provide mutually acceptable, and more efficient, alternatives to intellectual property infringement. Standard contract terms may also help foster privacy compliance and safety and increase protection for a person’s likeness and publicity rights. Equally important, standard contract terms may also help address inequities in bargaining power.

1. The Standard Contract Terms Project

The Intellectual Property Advisory Committee (the “Committee”) of the GPAI Innovation & Commercialization Working Group launched a project in 2021 exploring how standard contract terms might encourage more responsible AI data and model sharing. The project began by investigating the existing contract landscape as well as stakeholder needs, including through several stakeholder interviews.

The Committee’s 2022 Preliminary Report explained that many stakeholders want to responsibly share AI data and models, but find it challenging to do so. As discussed in the 2022 Preliminary Report, the challenges arise from the lack of regulatory harmonization and clarity, insufficient technical tools, insufficient contractual tools and codes of conduct, and difficulties valuing data and AI models. The 2022 Preliminary Report affirmed stakeholder interest in having standard contract terms. However, the report revealed that no such widely embraced standard contract terms exist, particularly for sharing AI data.

To help overcome some challenges, the 2022 Preliminary Report recommended, among other things, creating more opportunities for diverse stakeholders to share information and collaborate on developing standard contract terms. An inclusive process should lead to more informed decisions about the content and structure of such terms and ultimately, their broader acceptance and adoption.

To help accommodate the range of potential AI data and model-sharing arrangements, the report emphasized that multiple standard contract forms may be needed. The Committee encouraged stakeholders to develop a menu of different contract terms that provide the community with options. This follows the approach used for Open Source and Creative Commons license agreements and is already reflected in some ongoing efforts to develop standardized AI data licensing terms. The report noted that, as with the Open Source and Creative Commons experience, standard contract terms would not necessarily replace bespoke contract terms in all situations. However, standard contract terms could potentially streamline the negotiation of bespoke arrangements.

The report further highlighted the importance of developing technical tools and business codes of conduct to complement standard contract terms. It also underscored the importance of addressing data justice issues.

2. Building an Inclusive Ecosystem to Foster Standard AI Contract Terms

Building on the 2022 Preliminary Report, the Committee focused its efforts in 2023 on developing an inclusive ecosystem to help translate AI principles to practice by fostering the development of standard contract terms for AI data and model sharing. Specifically, the Committee crafted and hosted two hybrid multi-stakeholder workshops in 2023, one at the Max Planck Institute for Innovation and Competition in Munich, Germany, and another at Duke University in Washington, D.C., United States.

The Committee’s well-attended workshops were conducted under the Chatham House Rule and included a broad range of stakeholders spanning many geographic regions and disciplines, such as lawyers, economists, and engineers, as well as policy and business experts. The workshops also drew together different viewpoints, including from civil society, academia, industry, governments, and multilateral organizations, such as the OECD, the World Intellectual Property Organization, and the World Bank.

The Committee’s 2023 Report includes the workshop agendas and background materials and summarizes the key takeaways of these events:

Demand for Standardized Contractual Terms Remains Strong, but Efforts are Still Relatively Nascent. The workshops confirmed that the demand for voluntary AI data and model sharing, as well as for standardized license terms, remains strong with the rise of generative AI and other AI applications. These needs extend to the research community as well as commercial organizations, governments, and other stakeholders.
Contracts Potentially Can Help Advance AI Safety. The Committee recognizes the mounting safety concerns of openly sharing AI models (such as through open-source licensing) and the importance of developing mechanisms that effectively prevent bad actors from using open AI models for unethical, unsafe, or nefarious purposes. The Committee believes that the community should continue to consider how contracts (and corresponding contract enforcement mechanisms, business codes of conduct and technical tools) can help contribute to supporting these safety imperatives.
Standard Contract Terms Will Not Likely Be the Sole Solution. Fostering responsible AI data and model sharing will likely require an array of complementary solutions and approaches. Standard contract terms should be more effective if supported by appropriate technical tools, business codes of conduct, educational programs, training, and laws. Compliance with applicable privacy, intellectual property, and other laws remains paramount.
Evolving Legal Landscape Raises New Challenges and Underscores Potential Benefits of Standard Contractual Terms. Since the 2022 Preliminary Report, several legal and policy developments, including the EU Data Act, have underscored the benefits of developing standard contract terms. Standard contract terms may also help advance government policies for addressing responsible AI through public procurement. Additionally, they have the potential to help allocate liability and responsibilities among parties, address the allocation of intellectual property and other rights in the face of legal uncertainties, and secure trade secret protection in some situations.
Developing Common Contractual Definitions Can Help Foster the Development of Standardized Contractual Terms. Building upon the 2022 Preliminary Report, many workshop participants agreed that having standard contract definitions could advance efforts to formulate standard contractual clauses for AI data and model sharing. The definitions should take into consideration relevant sources, such as evolving AI laws and policies, the OECD Framework for the Classification of AI Systems, and potentially procurement rules.
Standard Contract Terms Potentially Can Help Address Practices Involving the Ingestion of Publicly Accessible Data and Code. AI data collection has surged, including scraping from various sources like third-party websites and social media. Code scraping has also spiked. This has led to increased litigation and policy discussions, including on data use. These practices raise legal concerns in intellectual property, privacy, consumer protection, and more, with varying laws across jurisdictions. Workshop findings suggest exploring standard contract terms, business codes of conduct, technical tools, educational programs and training, and laws to enhance responsibility and clarity in using publicly accessible data and code. The Committee advocates further study on these efforts.
Multistakeholder Input is Essential. Workshop participants generally agreed that multi-stakeholder participation is essential for drafting standard contract terms that will be broadly adopted.
International Regulatory Harmonization. Workshop participants generally agreed that increasing international regulatory harmonization could facilitate responsible and voluntary AI data and model sharing.

3. Looking ahead for 2024

The Committee plans to continue its work during 2024 and launch an AI Contract Terms Incubator (AI CTI). The goal is to provide stakeholders with a forum to share ideas and get feedback from each other through virtual meetings and possibly at least one hybrid workshop. The AI CTI is still in the planning phase.

Once developed, the AI CTI could enhance opportunities for stakeholders to share draft contract terms or clauses with other AI CTI participants and solicit their comments. Furthermore, the AI CTI may serve as a forum for soliciting feedback on ideas and approaches that participants could use to develop standard contractual terms. The Committee plans to open the incubator to all interested participants worldwide, including academia, civil society, government, and industry. Parties interested in participating in the AI CTI should contact Kaitlyn Bove (kaitlyn.bove@inria.fr).

Disclaimer: The opinions expressed and arguments employed herein are solely those of the authors and do not necessarily reflect the official views of the OECD, the GPAI or their member countries. The Organisation cannot be held responsible for possible violations of copyright resulting from the posting of any written material on this website/blog.