Preqin data

Methodological note

OECD.AI estimates venture capital (VC) financial investment in AI firms worldwide based on private-source data from Preqin, processed by the AI lab of the Jožef Stefan Institute, Slovenia and analysed by the OECD. Preqin is a private company, founded in 2003, which collects data regarding private equity transactions, funds and fund managers.

Deal information provided by Preqin covers the firm raising VC investment, the deal itself and the investors. Information about the firm includes the name of the company, the country where it is located, the year it was established, a description of its activities, a classification of the industries in which it operates, and a set of cross-industry classifications labelled “verticals”. Information about the deal includes the date, the stage (e.g. seed funding, round A, etc.) and the amount of the deal. Information about the investors includes their names and the countries where they are located.

Approximately 170 000 VC deals were reported to have taken place between 2012 and 2020. Of those deals, 20 942 were categorised as VC investments concerning an AI firm. An AI start-up is considered to be a private company that researches and delivers all or part of an AI system or researches and delivers products and services that rely significantly on AI systems. The definition of an AI system follows that of the OECD principles: “An AI system is a machine-based system that is capable of influencing the environment by making recommendations, predictions or decisions for a given set of objectives. It does so by utilising machine and/or human-based inputs/data to: i) perceive real and/or virtual environments; ii) abstract such perceptions into models manually or automatically; and iii) use model interpretations to formulate options for outcomes.”

Start-ups are identified as AI start-ups in three ways: Preqin’s manual categorisation (84% of the monetary value of the transactions included), the OECD’s manual categorisation (2%) and the OECD’s automated analysis of keywords contained in the description of the company’s activity (14%). Keywords used are of three kinds: generic AI keywords, such as “artificial intelligence” and “machine learning”; keywords pertaining to AI techniques, such as “neural network”, “deep learning”, “reinforcement learning”; and keywords referring to fields of AI applications, such as “computer vision”, “predictive analytics”, “natural language processing”, “autonomous vehicles”.
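The automated keyword step can be illustrated with a minimal sketch. It is not the OECD's actual implementation: the keyword list contains only the examples quoted above (the full list is not given in this note), and the function names and the matching rule (case-insensitive, whole phrase, optional plural "s") are assumptions made for illustration.

```python
import re

# Example keywords quoted in the note; the complete list is not published here.
AI_KEYWORDS = [
    "artificial intelligence", "machine learning",                # generic
    "neural network", "deep learning", "reinforcement learning",  # techniques
    "computer vision", "predictive analytics",                    # applications
    "natural language processing", "autonomous vehicles",
]

# Assumed matching rule: case-insensitive, phrase boundaries, optional plural "s".
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in AI_KEYWORDS) + r")s?\b",
    re.IGNORECASE,
)

def matched_keywords(description):
    """Return the sorted list of listed keywords found in a company description."""
    return sorted({m.group(1).lower() for m in _PATTERN.finditer(description)})

def is_ai_by_keywords(description):
    """Flag a company as AI-related if at least one listed keyword appears."""
    return bool(matched_keywords(description))
```

A description such as "Builds Deep Learning tools for computer vision" would be flagged under these assumptions, while "Online marketplace for used furniture" would not.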

Of all deals with an AI firm, 20 549 were considered to be VC investments funding an AI firm. Deals reported as “Secondary Stock Purchase”, “Mergers” or “Add-ons” were excluded from the analysis. Such deals do not finance start-ups, i.e. the money does not go to the start-up to fund its development; they are secondary market transactions in which money goes directly from one investor to another.

Preqin data was processed and categorised by country and industry. The industry categorisation is based on grouping 228 Preqin industry labels into 20 broader categories. The full list of industry categories, with the number of deals corresponding to each category, is presented in Figure 1.

Figure 1. List of industry categories with number of deals, 2012 to 2020

Source: OECD.AI (2021), processed by JSI AI Lab, Slovenia, based on Preqin data of 23/04/2021

Many of the reported investment transactions (deals) from 2012 through 2020 do not include the amount invested, e.g. 18% of the deals for US start-ups and 63% for Chinese start-ups. Where possible, a missing amount was estimated as the median amount of a cluster of comparable deals, defined by the country of the start-up, the investment year and the investment stage. Estimating the missing amounts increases the total value of VC investments between 2012 and 2020 by 7%, with an increase of 5% for US-based start-ups and 11% for start-ups based in China over the period (Figure 2).
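The estimation of missing amounts can be sketched as follows. This is an illustrative reading of the method, not the code used by the OECD or the JSI AI Lab: the record layout (dictionaries with hypothetical keys `country`, `year`, `stage`, `amount`) and the function names are assumptions, and a deal whose cluster has no reported amount is left without an estimate, in line with "where possible".

```python
from collections import defaultdict
from statistics import median

def impute_missing_amounts(deals):
    """Fill missing amounts with the median of deals sharing country, year and stage."""
    # Collect the reported amounts of each (country, year, stage) cluster.
    reported = defaultdict(list)
    for d in deals:
        if d["amount"] is not None:
            reported[(d["country"], d["year"], d["stage"])].append(d["amount"])

    filled = []
    for d in deals:
        key = (d["country"], d["year"], d["stage"])
        new = dict(d, estimated=False)  # copy, so the input is not modified
        if d["amount"] is None and key in reported:
            new["amount"] = median(reported[key])
            new["estimated"] = True
        filled.append(new)
    return filled

def relative_uplift(filled):
    """Relative increase in total value due to estimated amounts (0.07 means 7%)."""
    reported = sum(d["amount"] for d in filled
                   if d["amount"] is not None and not d["estimated"])
    estimated = sum(d["amount"] for d in filled if d["estimated"])
    return estimated / reported
```

The second function corresponds to the percentages quoted above: the estimated value is expressed relative to the value that was actually reported.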

When considering the origin of the financing, about 16% of deals have no investor identified. The estimates prorate the value of those deals to the different countries in the sample following the distribution of deals with reported investors.
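A minimal sketch of this prorating step is given below. It assumes that the "distribution of deals with reported investors" is measured by each investor country's share of attributed deal value; the note does not say whether value or deal counts are used, so this choice, like the function and parameter names, is an assumption.

```python
def prorate_unattributed(attributed_by_country, unattributed_total):
    """Spread the value of deals without investors across investor countries.

    attributed_by_country: value already attributed to each investor country.
    unattributed_total: total value of deals with no investor identified.
    """
    total = sum(attributed_by_country.values())
    if total == 0:
        raise ValueError("no attributed value to derive country shares from")
    # Each country receives a part of the unattributed value equal to its
    # share of the attributed value.
    return {
        country: value + unattributed_total * value / total
        for country, value in attributed_by_country.items()
    }
```

For example, with 60 attributed to the United States, 40 to China and 20 unattributed, the sketch yields 72 and 48 respectively, and the overall total of 120 is preserved.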

When a single round of financing includes multiple investors, Preqin data does not specify how much each investor has contributed. For such deals, the invested value is split equally between investors.
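The equal split can be sketched as below, aggregating the result by investor country. The record layout (a deal with hypothetical keys `amount` and `investors`, each investor carrying a `country`) is an assumption for illustration; deals with no investor identified are skipped here, since they are handled by the prorating rule described earlier.

```python
from collections import defaultdict

def value_by_investor_country(deals):
    """Split each deal's amount equally between its investors, summed by country."""
    totals = defaultdict(float)
    for deal in deals:
        investors = deal["investors"]
        if not investors:
            continue  # no investor identified: not attributed in this step
        share = deal["amount"] / len(investors)  # equal share per investor
        for investor in investors:
            totals[investor["country"]] += share
    return dict(totals)
```

Under this rule, a deal of 9 with two US investors and one French investor contributes 6 to the United States and 3 to France.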

For more information and findings about venture capital investments in AI start-ups, please see OECD (2021).

Figure 2. Impact of estimating amounts relative to amounts reported per country

VC deals with AI firms from 2012 to 2020, excluding countries with fewer than 5 reported deals

Source: OECD.AI (2021), processed by JSI AI Lab, Slovenia, based on Preqin data of 23/04/2021


OECD (2021), Venture Capital Investments in Artificial Intelligence: Analysing trends in VC in AI companies from 2012 through 2020. OECD Publishing, Paris.