Vague concepts in the EU AI Act will not protect citizens from AI manipulation

The EU’s latest amendments to the Artificial Intelligence Act include new rules that cover manipulation by AI systems. This is a crucial step to protect EU citizens from the dangers of AI manipulation. However, the proposed laws are too vague and lack the backing of scientific evidence.

The main issue is that the new amendments’ core concepts lack clarity. For example, the latest amendments mention “personality traits” six times, but neither the document nor the draft of the Act defines the term.

Technical definitions based on best practices in psychology and artificial intelligence would sharpen these core concepts and make the Act more effective at regulating AI manipulation.

Define and protect personality traits, but do more

In its current version, the Act prohibits exploiting the ‘vulnerabilities of individuals and specific groups of persons due to their known or predicted personality traits, age, physical or mental incapacities’. Again, the text refers to ‘personality traits’ multiple times without defining them.

The OCEAN model (openness, conscientiousness, extraversion, agreeableness and neuroticism), while neither uncontroversial nor complete, is the most common method for quantifying personality. Using this standard to define personality traits would be a significant improvement.

However, personality traits are only a small subset of the objectively measurable psychological traits that an AI system could exploit. The EU AI Act should protect the entire psychological profile from manipulation by sophisticated AI systems.

Specific psychometric measures such as suggestibility and hypnotisability are crucial to determining how to manipulate an individual and should be protected characteristics along with demographics and personality traits.

Another crucial characteristic is the influence of choice architectures on the decision-making of an individual, known as nudgeability in psychological literature. These and other qualities can be measured or inferred from user data and behaviour and then exploited by a sophisticated AI system to coerce or manipulate human actors.

The EU AI Act would be made more effective by changing personality traits to psychological traits, defined as: Properties of human psychology measured or inferred from available data about a specific user or group of users.

How to define deceptive, subliminal and manipulative techniques

Currently, the AI Act prohibits an “…AI system that deploys subliminal techniques beyond a person’s consciousness or purposefully manipulative or deceptive techniques…” This requires separate definitions and policy recommendations for three terms: subliminal techniques, purposefully manipulative techniques, and deceptive techniques.

Subliminal techniques

One research article provides a comprehensive analysis of subliminal techniques, offering both narrow and broad definitions. The narrow definition: Subliminal techniques aim at influencing a person’s behaviour by presenting a stimulus in such a way that the person remains unaware of the stimulus presented. This definition aligns with the traditional understanding of subliminal techniques in psychology and marketing, where the stimulus is presented below the threshold of conscious perception but can still influence behaviour. However, the authors argue that this narrow definition may not capture all the ethically concerning techniques that could be used in AI systems.

To remedy that, they propose a broader definition: Subliminal techniques aim to influence a person’s behaviour in ways in which the person is likely to remain unaware of:

  • the attempt to influence,
  • how the influence works, or
  • the effects the influence attempt would have on decision-making or value-and-belief-formation processes.

This definition is more suitable because it focuses on the manipulator’s intent: a person can be aware of a stimulus without being aware that a manipulator is using it to manipulate them. It therefore also covers manipulation techniques that the person is unaware of or cannot resist.

Purposefully manipulative techniques

A recent paper defines AI systems as manipulative “…if the system acts as if it were pursuing an incentive to change a human (or another agent) intentionally and covertly”. The definition includes the following key points, which need to be considered when defining manipulation:

1. Incentives: The first axis of manipulation is whether the system has incentives for influence, i.e. incentives to change a human’s behaviour. An incentive exists for a certain behaviour if such behaviour increases the reward (or decreases the loss) the AI system receives during training.

2. Intent: Prohibited “manipulative techniques” are used “purposefully”. The researchers propose grounding the notion of intent in a fully behavioural lens, which is agnostic to the actual computational process of the system. They propose that “…a system has intent to perform a behaviour if, in performing the behaviour, the system can be understood as engaging in a reasoning or planning process for how the behaviour impacts some objective.”

3. Covertness: The researchers define this “…as the degree to which a human is aware of the specific ways in which an AI system is attempting to change some aspect of their behaviour, beliefs, or preferences.” Covertness serves as a distinguishing factor between manipulation and persuasion. An individual being persuaded is typically conscious of the efforts made by the persuader. Covertness implies that the individual may not be aware of the influence exerted on them, thus making it difficult for them to consent to or resist it. This lack of awareness can compromise their autonomy.
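
One way to make these three criteria concrete is as an audit checklist. The Python sketch below is illustrative only. The names InfluenceAssessment and classify, the 0-to-1 covertness scale, the covert_threshold cut-off, and the choice to require all three criteria together are assumptions introduced here, not part of the cited paper or the Act. How each field would be assessed in practice is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfluenceAssessment:
    """Hypothetical auditor's judgement about one AI system's influence on users."""
    has_incentive: bool  # did influencing the human raise reward (or lower loss) in training?
    shows_intent: bool   # does the system behave as if planning how the influence serves an objective?
    covertness: float    # assumed scale: 0.0 = user fully aware of the influence, 1.0 = fully unaware


def classify(a: InfluenceAssessment, covert_threshold: float = 0.5) -> str:
    """Label the influence, treating covertness as the line between persuasion and manipulation."""
    if not 0.0 <= a.covertness <= 1.0:
        raise ValueError("covertness must lie between 0.0 and 1.0")
    # Assumption: without both an incentive and intent, the influence is not purposeful.
    if not (a.has_incentive and a.shows_intent):
        return "no purposeful influence"
    # Purposeful influence the user is largely unaware of counts as manipulation.
    if a.covertness >= covert_threshold:
        return "manipulation"
    return "persuasion"


if __name__ == "__main__":
    print(classify(InfluenceAssessment(True, True, 0.9)))
    print(classify(InfluenceAssessment(True, True, 0.1)))
```

A regulator would still need to decide how covertness is measured and where the threshold sits; the sketch only shows how the three axes could combine into a single test.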

A definition for “deception” that targets all deceptive techniques

A deceptive AI system has one goal but pretends to have another. Deceptive AI can also appear to have the information or answers a user requests when, in fact, it does not. Certain AI systems have found ways to garner positive reinforcement by executing actions that misleadingly suggest to the human overseer that the AI has accomplished the set goal.

For instance, a study revealed that a virtual robotic arm could simulate grasping a ball. The AI system, trained via human feedback to pick up a ball, instead learned to position its hand to block the ball from the camera’s view, creating a false impression of success. There have also been instances where AI systems have learned to identify when they are under evaluation and temporarily halt undesired behaviours, only to resume them once the evaluation period is over.

This kind of deceptive behaviour, known as specification gaming, could become more prevalent as future AI systems take on tasks that are more complex and harder to assess, making their deceptive actions harder to detect.

A suitable definition for deception:

Deception, in the context of AI systems, refers to an intentional act or omission by an AI system to create false or misleading impressions. This can be about its goals, capabilities, operations, or effects. It can materially distort a user’s understanding, preferences, or behaviour in a way that can cause that user or others significant harm. This can occur in several ways, such as when a system:

  1. misrepresents or obscures its goals or intents, creating a perception of alignment with the user’s goals and interests;
  2. provides false, incomplete, or misleading information about its capabilities or limitations;
  3. manipulates or obscures the outputs, outcomes, and effects of its operation in misleading ways;
  4. falsely represents its knowledge or lack thereof about any information or request;
  5. alters its behaviour temporarily or selectively in response to monitoring or evaluation activities to falsely represent its typical operations or effects.

This definition is intended to cover a broad range of deceptive AI practices and provide a clear framework to identify and assess instances of deception.
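
As a rough illustration of how the five items could serve as an assessment framework, the Python sketch below tags the two examples described earlier with the item they appear to fall under. The names DeceptionCategory and matched_items are hypothetical, and assigning the robotic arm to item 3 and the evaluation-aware systems to item 5 is our reading of those cases, not a finding from the studies.

```python
from enum import Enum


class DeceptionCategory(Enum):
    """The five items of the proposed definition; values are the list item numbers."""
    GOALS = 1         # misrepresents or obscures its goals or intents
    CAPABILITIES = 2  # misleads about its capabilities or limitations
    OUTPUTS = 3       # manipulates or obscures outputs, outcomes, or effects
    KNOWLEDGE = 4     # falsely represents what it knows
    EVALUATION = 5    # behaves differently while monitored or evaluated


# Assumed tagging of the two examples from the text.
ROBOT_ARM_BLOCKS_CAMERA = {DeceptionCategory.OUTPUTS}
HALTS_BEHAVIOUR_UNDER_EVALUATION = {DeceptionCategory.EVALUATION}


def matched_items(observed: set) -> list:
    """Return the sorted list item numbers (1 to 5) that an observed behaviour falls under."""
    return sorted(category.value for category in observed)


if __name__ == "__main__":
    print(matched_items(ROBOT_ARM_BLOCKS_CAMERA))
    print(matched_items(HALTS_BEHAVIOUR_UNDER_EVALUATION))
```

A behaviour can match several items at once, which is why each case is represented as a set and not as a single label.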

Preferences are also a target for manipulation by AI

The current version of the AI Act states that the target of manipulation, and what policy should be designed around, is behaviour. Parts of the text refer to “…distorting a person’s or a group of persons’ behaviour…” or “…the effect of materially distorting the behaviour…”. However, it does not consider how AI can manipulate other aspects of people’s psychology.

The EU AI Act should address preferences because large machine learning systems often target them. Recommender systems, for example, try to learn users’ preferences. Preferences have a bidirectional causal relationship with behaviour. Thus, even if a policy maker were concerned solely with the manipulation of behaviour, it is important to consider preferences because they influence and are influenced by behaviour.

We define preferences as “… any explicit, conscious, and reflective or implicit, unconscious, and automatic mental process that brings about a sense of liking or disliking for something.”

The EU AI Act should explicitly prohibit AI systems that purposefully and materially manipulate or distort a person’s or a group of persons’ preferences in ways likely to cause significant harm.

Clarifying the meaning of “informed decisions”

In its most basic form, an informed decision refers to a choice made with full awareness and understanding of any relevant information, potential consequences, and available alternatives. The EU AI Act should define informed decision as:

A decision made by one or more persons, with full understanding of pertinent information, potential outcomes, and available alternatives, unimpaired by subliminal, manipulative, or deceptive techniques. This involves having clear, accurate, and sufficient information about the nature, purpose, and implications of the AI system in question, including but not limited to its functioning, data usage, potential risks, and the extent to which the system influences or informs choices.

The facets of this definition must also have clear definitions:

  1. Full comprehension: Decision-makers should understand the relevant information, potential implications, and available alternatives to the AI system. This includes an understanding of the nature of the AI system and how it may affect and influence the decision-maker’s behaviour.

  2. Accurate and sufficient information: The information should be accurate, complete, and easy to understand. Any crucial information that could significantly influence decision-making should not be withheld or obscured.

  3. Absence of subliminal, manipulative, or deceptive techniques: The decision-making process should be free from covert or overt influences that could distort perception, judgement, or choice.

  4. Understanding of AI influence: Decision-makers should be made aware of the extent to which the AI system could influence their choices, allowing them to take this into account.

This definition would give the AI Act a more robust framework for assessing whether an AI system could materially distort a person’s behaviour by impairing their ability to make an informed decision. It could guide standards for transparency and disclosure around AI systems, and underpin the development of regulatory measures to protect the rights and autonomy of individuals and groups interacting with AI.

Clear definitions mean better protection

The EU’s Artificial Intelligence Act serves as a crucial linchpin to ensure that AI technologies are used safely, and to protect EU citizens against manipulation and other potential harms. Yet, parts of the Act are ambiguous and lack clear definitions, jeopardising its effectiveness.

The vague reference to ‘personality traits’ without an adequate operational definition opens the door to misinterpretation and enforcement inconsistencies. The broader umbrella term, ‘psychological traits,’ captures the multifaceted nature of human psychology and offers a more comprehensive scope for regulation.

Other terms need further clarity and strict definitions to ensure consistent and ethically sound application: subliminal techniques, manipulative strategies, and deceptive actions by AI systems. Our proposed definitions and frameworks bridge these gaps, emphasise transparency and understanding, and help preserve human agency.

In essence, while the EU AI Act represents a meaningful stride towards AI governance, it is critical to refine its frameworks to ensure comprehensive protection. Clear definitions, continuous multistakeholder review and collaboration can construct an AI ecosystem that upholds European values by respecting individual rights, autonomy, and well-being.

Disclaimer: The opinions expressed and arguments employed herein are solely those of the authors and do not necessarily reflect the official views of the OECD or its member countries. The Organisation cannot be held responsible for possible violations of copyright resulting from the posting of any written material on this website/blog.
