LinkedIn data

Methodological note

Country sample

Included countries represent a select sample of eligible countries with at least 40% labour force coverage by LinkedIn and at least 10 AI hires in any given month. China and India were included in this sample because of their increasing importance in the global economy, even though LinkedIn coverage in these countries does not reach 40% of the workforce. Insights for these two countries may therefore provide a less complete picture than for other countries, and should be interpreted accordingly.

AI skills

LinkedIn members self-report their skills on their LinkedIn profiles. Currently, more than 38 000 distinct, standardised skills are identified by LinkedIn. These have been coded and classified by taxonomists at LinkedIn into 249 skill groupings, which are the skill groups represented in the dataset. The top skills that comprise the AI skill grouping are machine learning, natural language processing, data structures, artificial intelligence, computer vision, image processing, deep learning, TensorFlow, Pandas (software) and OpenCV, among others.

Skill groupings are derived by expert taxonomists through a similarity-index methodology that measures skill composition at the industry level. Industries are classified according to the ISIC 4 industry classification (Zhu et al., 2018).

Skills Genome

For any entity (occupation or job, country, sector, etc.), the skill genome is an ordered list (a vector) of the 50 ‘most characteristic skills’ of that entity. A TF-IDF algorithm – short for ‘term frequency–inverse document frequency’ – is used to identify the most representative skills of each entity and down-rank ubiquitous skills that add little information about that specific entity (e.g. Microsoft Word).

TF-IDF is a statistical measure that evaluates how important a word is to a document in a collection or corpus of documents (Rajaraman and Ullman, 2011). In this case, the TF-IDF algorithm is used to evaluate how representative a skill is of a selected entity. This is done by multiplying two metrics:

  1. The term frequency of a skill in an entity (‘TF’).
  2. The logarithmic inverse entity frequency of the skill across a set of entities (‘IDF’). This indicates how common or rare a skill is in the entire entity set. The closer IDF is to 0, the more common the skill is.

If the skill is very common across LinkedIn entities, and appears in many job or member descriptions, the IDF will approach 0. If, on the other hand, the skill is unique to specific entities, the IDF will approach 1. More details are available at LinkedIn’s Skills Genome and LinkedIn-World Bank Methodology.
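The two-step scoring described above can be sketched as follows. This is a minimal illustration and not LinkedIn's implementation: the input layout, the function names `tfidf_scores` and `skill_genome`, and the sample profiles are hypothetical. The IDF is divided by log(N), an assumption made here so that it ranges from 0 (skill present in every entity) to 1 (skill present in a single entity), in line with the description above.

```python
import math
from collections import Counter


def tfidf_scores(entity_skills):
    """Return {entity: {skill: TF-IDF score}}.

    entity_skills maps an entity name to the list of skills reported by its
    members, with repeats (a hypothetical input layout).
    """
    n_entities = len(entity_skills)

    # Entity frequency: the number of entities in which each skill appears.
    entity_freq = Counter()
    for skills in entity_skills.values():
        entity_freq.update(set(skills))

    scores = {}
    for entity, skills in entity_skills.items():
        counts = Counter(skills)
        total = sum(counts.values())
        scores[entity] = {}
        for skill, count in counts.items():
            tf = count / total
            # Assumed normalisation: log(N / ef) / log(N) is 0 for a skill
            # present in every entity and 1 for a skill present in only one.
            if n_entities > 1:
                idf = math.log(n_entities / entity_freq[skill]) / math.log(n_entities)
            else:
                idf = 0.0
            scores[entity][skill] = tf * idf
    return scores


def skill_genome(entity_skills, top_n=50):
    """Return {entity: top_n skills ordered by descending TF-IDF score}."""
    genomes = {}
    for entity, skill_scores in tfidf_scores(entity_skills).items():
        # Ties are broken alphabetically so the ordering is deterministic.
        ranked = sorted(skill_scores, key=lambda s: (-skill_scores[s], s))
        genomes[entity] = ranked[:top_n]
    return genomes


# Made-up profiles: "Microsoft Word" appears in every entity.
profiles = {
    "Data Scientist": ["Machine Learning", "Machine Learning", "Pandas", "Microsoft Word"],
    "Accountant": ["Accounting", "Accounting", "Microsoft Word", "Microsoft Word"],
    "Engineer": ["CAD", "Microsoft Word"],
}
genomes = skill_genome(profiles, top_n=2)
print(genomes["Data Scientist"])  # ['Machine Learning', 'Pandas']
```

In this toy example, Microsoft Word receives an IDF of 0 because it appears in all three entities, so it drops out of every genome regardless of how often it is listed.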

AI skills penetration 

The aim of this indicator is to measure the intensity of AI skills in an entity (a particular country, industry, gender, etc.) through the following methodology:

  • Compute frequencies for all self-added skills by LinkedIn members in a given entity (occupation, industry, etc.) in 2015-2020.
  • Re-weight skill frequencies using a TF-IDF model to get the top 50 most representative skills in that entity. These 50 skills compose the “skill genome” of that entity.
  • Compute the share of skills that belong to the AI skill group out of the top skills in the selected entity.

Interpretation: The AI skill penetration rate signals the prevalence of AI skills across occupations, or the intensity with which LinkedIn members utilise AI skills in their jobs. For example, the top 50 skills for the occupation of “Engineer” are calculated based on the weighted frequency with which they appear in LinkedIn members’ profiles. If four of these skills that engineers possess belong to the AI skill group, then this measure indicates that the penetration of AI skills is estimated to be 8% among engineers (i.e. 4/50).
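The last step of the methodology, and the 4/50 example, can be reproduced with a short sketch. The function name `ai_skill_penetration` and the placeholder skill names are hypothetical; the AI skill set shown is only a small subset taken from the list of top AI skills given earlier in this note.

```python
def ai_skill_penetration(genome, ai_skills):
    """Share of the skills in an entity's skill genome that are AI skills."""
    if not genome:
        return 0.0
    ai_count = sum(1 for skill in genome if skill in ai_skills)
    return ai_count / len(genome)


# Illustrative subset of the AI skill group.
ai_skill_group = {"machine learning", "deep learning", "computer vision", "TensorFlow"}

# Hypothetical genome for "Engineer": 4 AI skills and 46 placeholder skills.
engineer_genome = sorted(ai_skill_group) + [f"other skill {i}" for i in range(46)]

penetration = ai_skill_penetration(engineer_genome, ai_skill_group)
print(f"{penetration:.0%}")  # 8%
```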

Relative AI skills penetration 

To allow for skills penetration comparisons across countries, the skills genomes are calculated and a relevant benchmark is selected (e.g. OECD or G20 average). A ratio is then constructed between a country’s and the benchmark’s AI skills penetrations, controlling for occupations.

Note that a country’s AI penetration is benchmarked by considering only the overlapping occupations between this country and the benchmark set. For example, if Peru has only 2 occupations with AI skills, we calculate the benchmarks only using those two occupations.

Interpretation: A country’s relative AI skills penetration of 1.5 indicates that AI skills are 1.5 times as frequent as in the benchmark, for an overlapping set of occupations.
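A minimal sketch of this ratio is given below, restricted to overlapping occupations. It assumes that "controlling for occupations" can be approximated by a simple unweighted mean over the shared occupations; the weighting actually used is not specified in this note. The function name and all figures are hypothetical.

```python
def relative_penetration(country, benchmark):
    """Ratio of mean AI skill penetration, country versus benchmark.

    Both arguments map occupation -> AI skill penetration rate. Only
    occupations present in both are used (assumed unweighted mean).
    """
    overlap = sorted(set(country) & set(benchmark))
    if not overlap:
        raise ValueError("no overlapping occupations")
    country_mean = sum(country[o] for o in overlap) / len(overlap)
    benchmark_mean = sum(benchmark[o] for o in overlap) / len(overlap)
    if benchmark_mean == 0:
        raise ValueError("benchmark penetration is zero")
    return country_mean / benchmark_mean


# Made-up penetration rates. "Teacher" is ignored because the country
# has no AI skill penetration recorded for that occupation.
country_rates = {"Engineer": 0.12, "Analyst": 0.06}
benchmark_rates = {"Engineer": 0.08, "Analyst": 0.04, "Teacher": 0.02}

ratio = relative_penetration(country_rates, benchmark_rates)
print(round(ratio, 2))  # 1.5
```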

Relative AI skills penetration by country

For cross-country comparison, we present the relative penetration rate of AI skills, measured as the sum of the penetration of each AI skill across occupations in a given country, divided by the average global penetration of AI skills across the overlapping occupations in a sample of countries.

Interpretation: A relative penetration rate of 2 means that the average penetration of AI skills in that country is two times the benchmark average across the same set of occupations.

Relative AI skills penetration by industry 

The relative AI skills penetration by country and industry provides an in-depth sectoral decomposition of AI skill penetration across industries and sample countries.

Interpretation: A country’s relative AI skill penetration rate of 2 in the education sector means that the average penetration of AI skills in that country is two times the benchmark average across the same set of occupations in that sector.

Relative AI skills penetration by gender 

The relative AI skills penetration by country and gender provides an in-depth decomposition of AI skill penetration across genders and sample countries.

Interpretation: A country’s relative AI skill penetration rate for women equal to 2 means that the average penetration of AI skills among women in that country is 2 times the benchmark average for women across the same set of occupations.

Relative AI skills penetration: country rankings over time 

The ranking is calculated by estimating the ratio between a country’s AI skills penetration and the average AI skills penetration of all countries in the sample, controlling for occupations.

Top AI skills worldwide 

AI skills most frequently added by members during the 2015–2021 period.

Fastest growing AI skills 

Top 10 fastest growing AI skills (year-over-year growth rates) in global LinkedIn member profiles.

AI jobs or occupations

LinkedIn member titles are standardised and grouped into approximately 15 000 occupations. These are not sector or country specific. These occupations are further standardised into approximately 3 600 occupation representatives or ‘jobs’. Occupation representatives group occupations with a common role and specialty, regardless of seniority.

An ‘AI job’ is an occupation representative that requires AI skills to perform the job. Skills penetration is used as a signal for whether AI skills are prevalent in an occupation representative, in any sector where the occupation representative may exist. Examples of such occupations include Machine Learning Engineer, Artificial Intelligence Specialist, Data Scientist, and Computer Vision Engineer.

AI talent

A LinkedIn member is considered AI talent if they have explicitly added AI skills to their profile and/or they are employed in an AI job.

AI talent concentration

The counts of AI talent are used to calculate talent concentration metrics. For example, AI talent concentration at the country level is calculated as the count of AI talent relative to the count of LinkedIn members in that country. As such, AI talent concentration metrics may be influenced by a country’s LinkedIn coverage and should be used with caution. For example, as of 2021, 1 in every 10 LinkedIn members in India is classified as AI talent, which is a result of LinkedIn’s biased coverage in that country.

Since it also encompasses LinkedIn members with AI job titles – as opposed to only LinkedIn members with AI skills on their profiles – AI talent is considered to be a more comprehensive measure than AI skills.
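The AI talent definition and the country-level concentration can be sketched as follows. The member records, the field names `skills` and `job`, and both function names are hypothetical; the AI skill and AI job sets are small illustrative subsets.

```python
def is_ai_talent(member, ai_skills, ai_jobs):
    """True if the member lists at least one AI skill and/or holds an AI job."""
    has_ai_skill = bool(set(member["skills"]) & ai_skills)
    has_ai_job = member["job"] in ai_jobs
    return has_ai_skill or has_ai_job


def ai_talent_concentration(members, ai_skills, ai_jobs):
    """Count of AI talent divided by the count of members in the group."""
    if not members:
        return 0.0
    n_talent = sum(is_ai_talent(m, ai_skills, ai_jobs) for m in members)
    return n_talent / len(members)


ai_skill_set = {"machine learning", "computer vision"}
ai_job_set = {"Machine Learning Engineer", "Data Scientist"}

# Made-up member records for one country.
sample_members = [
    {"skills": ["machine learning", "SQL"], "job": "Analyst"},     # AI skill only
    {"skills": ["Excel"], "job": "Data Scientist"},                # AI job only
    {"skills": ["accounting"], "job": "Accountant"},               # neither
    {"skills": ["project management"], "job": "Project Manager"},  # neither
]

concentration = ai_talent_concentration(sample_members, ai_skill_set, ai_job_set)
print(concentration)  # 0.5
```

Because the denominator is the number of LinkedIn members rather than the labour force, the result inherits any coverage bias in the underlying membership.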

The aggregates displayed in the “AI talent concentration by industry and gender” chart include data from G20 countries, OECD member countries, Singapore, Hong Kong (China), United Arab Emirates, Cyprus, Uruguay, and Costa Rica.

AI talent migration 

Data on AI skills migration comes from the World Bank Group-LinkedIn “Digital Data for Development” partnership. Please see https://linkedindata.worldbank.org/ and Zhu et al. (2018) for more information. 

LinkedIn migration rates are derived from the self-identified locations of LinkedIn member profiles. For example, when a LinkedIn member updates his or her location from Paris to London, this is counted as a migration. Migration data is available from 2019 onwards.

LinkedIn data provide insights to countries on the AI talent gained or lost due to migration trends. AI talent migration is computed for all members with AI skills or holding AI jobs at time t, with country A as the country of interest and country B as the source of inflows and the destination of outflows. Thus, net AI talent migration between country A and country B – for country A – is calculated as follows:

Net AI talent migration (A, B, t) = [net flows between A and B in period t / LinkedIn members in country A at the end of period t] × 10 000

Net flows are defined as total arrivals minus departures within the given time period. LinkedIn membership varies considerably between countries, which makes interpreting absolute movements of members from one country to another difficult. To compare migration flows between countries fairly, migration flows are normalised for the country of interest. For example, if country A is the country of interest, all absolute net flows into and out of country A, regardless of origin and destination countries, are normalised based on LinkedIn membership in country A at the end of each year and multiplied by 10 000. Hence, this metric indicates relative talent migration from all countries to and from country A.
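The normalisation described above can be illustrated with a short sketch. The function name and the figures are hypothetical; the calculation follows the definition in the text (arrivals minus departures, divided by membership in the country of interest, multiplied by 10 000).

```python
def net_migration_per_10k(arrivals, departures, members_end_of_year):
    """Net AI talent flows for a country per 10 000 LinkedIn members."""
    if members_end_of_year <= 0:
        raise ValueError("membership must be positive")
    net_flows = arrivals - departures
    return net_flows / members_end_of_year * 10_000


# Made-up figures for country A with respect to country B in one year.
rate = net_migration_per_10k(arrivals=1_200, departures=900,
                             members_end_of_year=2_000_000)
print(round(rate, 2))  # 1.5
```

A positive value indicates a net gain of AI talent for the country of interest, and a negative value a net loss.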

Note that from 2019 onwards, this new and more comprehensive measure of AI talent – which also considers LinkedIn members with AI job titles, as opposed to only LinkedIn members with AI skills on their profiles – is used to measure migration. Thus, caution is advised when comparing migration figures before and after 2019.

Relative AI hiring index

The AI hiring over time chart on OECD.AI indicates the rate of hiring in the AI field for each country, compared to the overall hiring in that country.

  • The LinkedIn Hiring Rate or Overall Hiring Rate is a measure of hires normalised by LinkedIn membership. It is computed as the number of LinkedIn members who added a new employer in the same period the job began, divided by the total number of LinkedIn members in the corresponding location.
  • The AI Hiring Rate is computed following the overall hiring rate methodology, but only considering members classified as AI talent.
  • The Relative AI Hiring Index is the pace of change in AI Hiring Rate normalised by the pace of change in Overall Hiring Rate, providing a picture of whether hiring of AI talent is growing at a higher, equal or lower rate than overall hiring in a market. The relative AI Hiring Index is equal to 1.0 when AI hiring and overall hiring are growing at the same rate year-on-year.

Interpretation: The Relative AI Hiring Index shows how fast each country is experiencing growth in AI talent hiring relative to growth in overall hiring in the country. A ratio of 1.2 means the growth in AI talent hiring has outpaced the growth in overall hiring by 20%.
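The three definitions above can be combined in a short sketch. It assumes that the "pace of change" is the year-on-year ratio of hiring rates, which is consistent with the index being 1.0 when both rates grow equally; the exact formula is not given in this note. Function names and figures are hypothetical.

```python
def hiring_rate(hires, members):
    """Hires normalised by LinkedIn membership in the same location."""
    if members <= 0:
        raise ValueError("membership must be positive")
    return hires / members


def relative_ai_hiring_index(ai_now, ai_prev, overall_now, overall_prev):
    """Year-on-year growth in the AI hiring rate over growth in the overall rate.

    Assumes growth is measured as the ratio of this year's rate to last year's.
    """
    ai_growth = ai_now / ai_prev
    overall_growth = overall_now / overall_prev
    return ai_growth / overall_growth


# Made-up figures: AI hiring rate rises by 20%, overall hiring rate is flat.
ai_prev = hiring_rate(hires=100, members=10_000)        # AI talent only
ai_now = hiring_rate(hires=120, members=10_000)
overall_prev = hiring_rate(hires=5_000, members=100_000)
overall_now = hiring_rate(hires=5_000, members=100_000)

index = relative_ai_hiring_index(ai_now, ai_prev, overall_now, overall_prev)
print(round(index, 2))  # 1.2
```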

References

Rajaraman, A. and Ullman, J. (2011), Mining of Massive Datasets, pp. 1–17, https://doi.org/10.1017/CBO9781139058452.002, ISBN 978-1-139-05845-2.

Zhu, T., Fritzler, A. and Orlowski, J. (2018), World Bank Group-LinkedIn Data Insights: Jobs, Skills and Migration Trends Methodology and Validation Results (English), World Bank Group, Washington, D.C., http://documents.worldbank.org/curated/en/827991542143093021/World-Bank-Group-LinkedIn-Data-Insights-Jobs-Skills-and-Migration-Trends-Methodology-and-Validation-Results.