Preqin data

Methodological note

Background information

OECD.AI estimates venture capital (VC) financial investment in AI and data firms worldwide based on private-source data from Preqin, processed by the AI lab of the Jožef Stefan Institute, Slovenia, and analysed by the OECD. Preqin is a private company, founded in 2003, which collects data regarding private equity transactions, funds, and fund managers.

Deal information provided by Preqin covers the firm raising VC investment, the deal itself, and the investors. Information about the firm includes the name of the company, the country where it is located, the year it was established, a description of its activities, a classification of the industries in which it operates, and a set of cross-industry classifications, labelled “verticals”. Information about the deal includes the date, the stage (e.g. seed funding, round A, etc.), and the amount of the deal. Information about the investors includes their names and the countries where they are located.


Determining AI deals and start-ups

An AI start-up is considered to be a private company that researches and delivers all or part of an AI system, or researches and delivers products and services that rely significantly on AI systems. The definition of an AI system follows the revised OECD definition (2024):

“An AI system is a machine-based system that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments. Different AI systems vary in their levels of autonomy and adaptiveness after deployment.”

A data start-up is considered to be a private company that provides solutions for large volumes of data, through data gathering, storing, or analysis.

Start-ups are identified as AI or data start-ups based on Preqin’s cross-industry and vertical categorisation, as well as on the OECD’s automated analysis of the keywords contained in the description of the company’s activities. AI keywords used are of three kinds: generic AI keywords, such as “artificial intelligence” and “machine learning”; keywords pertaining to AI techniques, such as “neural network”, “deep learning”, “reinforcement learning”; and keywords referring to fields of AI applications, such as “computer vision”, “predictive analytics”, “natural language processing”, “autonomous vehicles”.

Data keywords include “data management”, “data collection” and “data tracking”. Firms with keywords related to digital security, cloud computing, or telecommunications were not considered data-related firms.

Furthermore, AI start-ups can be focused on generative AI, and the keywords pertaining to this field include “generative ai”, “generative artificial intelligence”, “generative adversarial network”, “creative adversarial network”, “text generation”, “image generation”, “audio generation”, “generative model”, “stable diffusion”, “chat gpt”, “creative ai”, “creative artificial intelligence”, “style transfer”, “content generation”, “creative coding”, “coding assistant” and “code generation”.

AI start-ups can also be focused on compute, and the keywords pertaining to this field include “compute”, “data centre”, “semiconductor”, “GPU”, “CPU”, “high-performance compute”, “core software system”, “processor chip”, “infrastructure-as-a-service”, “neuromorphic computing”, “full-stack”, “integrated circuit”, “FPGA” and “computing chips”.
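As an illustration only, keyword-based tagging of company descriptions could be sketched as follows. This is not the OECD's actual pipeline: the function names, the group labels, and the choice of case-insensitive whole-phrase matching are assumptions, and the keyword lists are short excerpts of those quoted above. The sketch only reports which keyword groups match a description; it does not model how that signal is combined with Preqin's own categorisation, nor the exclusion of digital security, cloud computing, and telecommunications firms from the data group.

```python
import re

# Excerpts of the keyword lists quoted in the note (not the full lists).
KEYWORD_GROUPS = {
    "ai": [
        "artificial intelligence", "machine learning", "neural network",
        "deep learning", "reinforcement learning", "computer vision",
        "predictive analytics", "natural language processing",
        "autonomous vehicles",
    ],
    "data": ["data management", "data collection", "data tracking"],
    "generative_ai": [
        "generative ai", "text generation", "image generation",
        "code generation",
    ],
    "compute": ["semiconductor", "gpu", "data centre", "integrated circuit"],
}


def compile_group(keywords):
    """Build one case-insensitive pattern matching any keyword as a whole phrase."""
    alternatives = "|".join(re.escape(keyword) for keyword in keywords)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


# Hypothetical helper structure: one compiled pattern per keyword group.
PATTERNS = {group: compile_group(words) for group, words in KEYWORD_GROUPS.items()}


def tag_description(description):
    """Return the set of keyword groups found in a company description."""
    return {group for group, pattern in PATTERNS.items() if pattern.search(description)}


example = "We build deep learning models for image generation on GPU clusters."
tags = tag_description(example)
```

Whole-phrase matching is used here so that, for instance, a keyword such as “gpu” does not match inside a longer unrelated word; a production system might instead rely on stemming or tokenisation.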

Deals reported as “Secondary Stock Purchase”, “Mergers” or “Add-ons” were excluded from the analysis because they do not correspond to the financing of start-ups, i.e. deals where the money goes to the start-ups to fund their development. Instead, they correspond to secondary market transactions, where the money goes directly from one investor to another.
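The exclusion rule above amounts to a simple filter on the reported deal type. A minimal sketch, assuming each deal is a dictionary with a hypothetical `deal_type` field (the actual Preqin field name is not given in this note) and using made-up sample records:

```python
# Deal types excluded from the analysis, as listed in the note.
EXCLUDED_DEAL_TYPES = {"Secondary Stock Purchase", "Mergers", "Add-ons"}


def keep_financing_deals(deals):
    """Keep only deals whose type is not one of the excluded types."""
    return [deal for deal in deals if deal["deal_type"] not in EXCLUDED_DEAL_TYPES]


# Made-up records for illustration; "deal_type" is an assumed field name.
sample_deals = [
    {"id": "d1", "deal_type": "Seed"},
    {"id": "d2", "deal_type": "Secondary Stock Purchase"},
    {"id": "d3", "deal_type": "Series A"},
    {"id": "d4", "deal_type": "Add-ons"},
]
kept_ids = [deal["id"] for deal in keep_financing_deals(sample_deals)]
```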

Preqin data was processed and categorised by country and industry. The industry categorisation is based on grouping 228 Preqin industry labels into 20 broader categories. The industries considered are:

  • IT infrastructure and hosting
  • Media, social platforms, marketing
  • Business processes and support services
  • Healthcare, drugs, and biotechnology
  • Robots, sensors, IT hardware
  • Financial and insurance services
  • Digital security
  • Mobility and autonomous vehicles
  • Education and training
  • Logistics, wholesale, and retail
  • Consumer products
  • Travel, leisure, and hospitality
  • Agriculture
  • Energy, raw materials, and utilities
  • Consumer services
  • Government, security, and defence
  • Environmental services
  • Construction and air conditioning
  • Real estate
  • Food and beverages
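
Grouping fine-grained labels into broader categories can be thought of as a lookup table followed by an aggregation. The sketch below is purely illustrative: the note does not list the 228 Preqin labels or their mapping, so the labels, field names (`industry_label`, `amount_usd_m`), and amounts shown are hypothetical placeholders. Only the category names are taken from the list above.

```python
from collections import defaultdict

# Hypothetical label-to-category mapping; the real mapping is not reproduced here.
LABEL_TO_CATEGORY = {
    "Biotechnology": "Healthcare, drugs, and biotechnology",
    "Pharmaceuticals": "Healthcare, drugs, and biotechnology",
    "Insurance": "Financial and insurance services",
    "Cybersecurity": "Digital security",
}


def total_by_category(deals):
    """Sum deal amounts per broad category; unknown labels go to 'Unmapped'."""
    totals = defaultdict(float)
    for deal in deals:
        category = LABEL_TO_CATEGORY.get(deal["industry_label"], "Unmapped")
        totals[category] += deal["amount_usd_m"]
    return dict(totals)


# Made-up deals for illustration only.
sample_deals = [
    {"industry_label": "Biotechnology", "amount_usd_m": 10.0},
    {"industry_label": "Pharmaceuticals", "amount_usd_m": 5.0},
    {"industry_label": "Insurance", "amount_usd_m": 2.0},
    {"industry_label": "Unknown label", "amount_usd_m": 1.0},
]
totals = total_by_category(sample_deals)
```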

The various subsets of AI VC deals are not mutually exclusive. Deals in one subset, such as generative AI, may overlap with deals in other subsets, including compute and environmental sustainability. Likewise, the subsets do not add up to the total of identified AI deals: some AI deals fall in none of these subsets, while others fall in several.
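
A small worked example may help show why subset counts should not be summed. The deal identifiers and subset memberships below are invented for illustration, and `overlap_summary` is a hypothetical helper, not part of any OECD tool.

```python
from collections import Counter


def overlap_summary(all_ai_deals, subsets):
    """Compare the sum of subset sizes with the distinct deals they cover."""
    # Count in how many subsets each deal appears.
    membership = Counter(deal for members in subsets.values() for deal in members)
    return {
        "sum_of_subset_sizes": sum(len(members) for members in subsets.values()),
        "deals_in_at_least_one_subset": len(membership),
        "deals_in_several_subsets": sum(1 for n in membership.values() if n > 1),
        "deals_in_no_subset": len(all_ai_deals - set(membership)),
    }


# Invented example: five AI deals, two overlapping subsets.
all_ai_deals = {"d1", "d2", "d3", "d4", "d5"}
subsets = {
    "generative_ai": {"d1", "d2"},
    "compute": {"d2", "d3"},
}
summary = overlap_summary(all_ai_deals, subsets)
```

In this invented case the subset sizes sum to 4, but only 3 distinct deals are covered because one deal sits in both subsets, and 2 of the 5 AI deals belong to neither subset.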

For more information and findings on venture capital investments in AI start-ups, please see www.oecd.ai/vc and:

OECD (2021), Venture Capital Investments in Artificial Intelligence: Analysing trends in VC in AI companies from 2012 through 2020. OECD Publishing, Paris.