
Artificial Intelligence (AI) now more persuasive than humans in debates?
The findings of the study conducted by the Complex Human Behaviour Laboratory at FBK, the École Polytechnique Fédérale de Lausanne (Switzerland), and Princeton University (USA) were published in the prestigious journal Nature Human Behaviour.
After two years of experimentation, the first scientific results show that Large Language Models (LLMs) — such as ChatGPT, Gemini, and Claude — are capable of generating highly persuasive arguments. The study involved 900 U.S. participants recruited via the academic research platform Prolific.
Unlike previous studies, which focused on individual messages without context, this research is the first to analyze persuasive performance in real conversations, the setting for which AI chatbots are typically optimized. Led by the École Polytechnique Fédérale de Lausanne, Princeton University, and the FBK Center for Digital Society with Riccardo Gallotti, the study demonstrates that conversational AI can outperform humans in debate-like exchanges. The work earned the researchers publication in Nature Human Behaviour, an internationally prestigious journal with an impact factor of 30.
The experiment was based on the simulation of “debate competitions,” a practice commonly found in the United States in which individuals or teams of students compete to argue persuasively on randomly assigned topics.
A total of 900 U.S. participants were recruited for the study via Prolific, a crowdsourcing platform for academic research. Each participant received financial incentives to encourage engagement and was assigned to a discussion either with a customized version of an AI chatbot based on GPT-4 or with a human counterpart. These discussions consisted of three brief exchanges lasting approximately 10 minutes. All participants were aware that they were taking part in a controlled experimental setting. The discussion topics were randomly selected from a set of 30 statements addressing prominent political and social issues in the U.S., organized into three tiers of argumentative “strength.”
“The main finding reveals that GPT-4, even when provided with only minimal personal information about participants, demonstrated superior persuasive capabilities compared to human debaters. In cases where there was a clear difference in persuasive effectiveness between the two, the artificial intelligence system was more convincing in 64.4% of cases,” FBK researcher Riccardo Gallotti explained.
Another result concerns the perception of the opponent’s identity: participants were able to recognize that they were interacting with a chatbot in three out of four cases, whereas they were less accurate in identifying human interlocutors – with a success rate no better than chance.
An intriguing psychological effect also emerged: when participants believed they were speaking with an AI, they were more inclined to change their minds or agree with their opponent than when they assumed the opponent was human. Nonetheless, the researchers caution that it remains unclear whether this difference in attitude stems from participants’ beliefs about their opponent’s nature, or if, conversely, those beliefs are influenced by the act of changing one’s opinion.
In any case, perception of the opponent alone does not fully account for the results, which appear to be driven primarily by the AI’s ability to produce more compelling and persuasive arguments.
“As a working group, we argue that online platforms and social media should take these threats seriously and strengthen their efforts to implement safeguards against AI-driven persuasion,” Gallotti added. “Moreover, we believe a promising strategy to counter large-scale disinformation campaigns could involve the very same large language models, generating personalized counter-narratives to educate bystanders who may be vulnerable to misleading content. At FBK, we are actively pursuing this direction through the AI4TRUST project. Early results are encouraging, showing a reduction in conspiratorial beliefs following dialogue-based interventions with AI.”