Opportunities and Challenges of Automated Verbal Deception Detection / Loconte, R. - (2026 Jun 05). [10.13118/loconte-riccardo_phd2026-06-05]

Opportunities and Challenges of Automated Verbal Deception Detection

Loconte Riccardo
2026

Abstract

Verbal deception detection refers to techniques that enable the detection of deceptive intentions or deceptive content in written or transcribed statements. Deception detection remains a well-known, unsolved problem with significant implications in various high-stakes contexts, including criminal investigations, financial fraud, and deceptive behaviors on online platforms. This thesis investigated the extent to which computational methods from artificial intelligence (AI) can be leveraged for the automated detection of verbal deception. This research question was addressed by investigating opportunities (Chapters 2, 3) and challenges (Chapters 3, 4, and 5) for automated verbal deception detection.

Chapter 1 presents an overview of automated verbal deception detection by systematically reviewing 248 papers and 5,148 machine learning (ML) models. Findings revealed that deception detection research is undergoing a technological shift, transitioning from training statistical models on low-level features (e.g., word frequency, part-of-speech, word statistics) to fine-tuning language models on high-level features (e.g., semantics). However, to advance the reliability and applicability of such models in real-life contexts, important limitations still concern the establishment of a clear ground truth, the investigation of deceptive strategies beyond fabrication, and sufficient out-of-domain generalization performance.

To further explore the topic of automated verbal deception detection, Chapters 2 and 3 investigate its opportunities for automated coding of statements and prediction of deception. Specifically, Chapter 2 closely compares the performance of naïve judges and expert judges trained on Reality Monitoring with that of theory-led and data-driven ML models in detecting verbal deception. Findings showed that both theory-led (accuracy = 69.4%) and data-driven (accuracy = 77.3%) ML algorithms significantly outperformed naïve (accuracy = 54.7%) and expert judges (accuracy = 59.4%), suggesting that such models may represent a valid alternative when manual psychological approaches to deception fall short. Chapter 3 builds upon these findings and investigates whether fine-tuning a large language model (LLM) for deception detection is effective and robust in cross-domain detection. Findings revealed that LLMs outperform previous models when trained on a single dataset or a combination of them, reaching an accuracy of up to 79.31%. However, accuracy dropped dramatically to chance level when the model was tested on a novel dataset. These results revealed that there is no “universal rule” for deception and that previous exposure to deceptive examples is necessary to achieve forms of generalization.

Conversely, Chapters 4 and 5 provide more critical insights into the challenges associated with such automated methods. One limitation of previous studies is that they mostly investigate deception in fabricated statements, overlooking that a more ecological form of deception involves the incorporation of deceptive information into truthful statements. This type of deception is known as embedded lies. Chapter 4 collected 2,088 truthful and deceptive statements with annotated embedded lies and showed that, this time, a fine-tuned language model (Llama-3-8B) could detect embedded lies with 64% accuracy at best. Additional findings on individual differences, linguistic properties, and explainability analysis revealed that embedded lies pose a significant challenge for automated verbal deception detection (and for deception detection in general) because they incorporate truthful information.

Finally, with the vision that automated methods may be integrated into real-life settings to aid experts in deception detection, Chapter 5 investigates the extent to which humans would endorse such algorithmic predictions. Given that only a few studies on hybrid decision-making are available, this chapter developed a behavioral experiment to examine how humans rely on or reject AI predictions based on varying degrees of information about the AI model (i.e., accuracy and confidence). Findings showed that the model’s accuracy played a role, as humans followed predictions from a highly accurate model more than from a less accurate one. Additionally, confidence had an unexpected effect: human judgments deviated more from highly confident AI predictions, especially if the model predicted deception. Ultimately, human interaction with algorithmic predictions either hindered the machine’s performance or was ineffective. These results on human aversion to AI judgments provide practical insights and limitations for future integrations of human oversight in algorithmic decision-making.

Altogether, the findings from these five studies highlight not only contexts where computational methods clearly outperform humans but also their current methodological, conceptual, and practical shortcomings, underscoring the conditions under which model application in real-life settings is constrained.
5 June 2026
37
CCSN
PIETRINI, PIETRO
Prof. Bennett Kleinberg (Tilburg University)
Files in this item:
File: PhD Thesis Riccardo Loconte.pdf
Access: open access
Type: Doctoral thesis
License: Creative Commons
Size: 5.71 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11771/42218