StyloAI: Distinguishing AI-Generated Content with Stylometric Analysis

Cited by: 0

Authors:

Opara, Chidimma [1]

Affiliations:

[1] Teesside Univ, Sch Comp Engn & Digital Technol, Middlesbrough, England

Source:

ARTIFICIAL INTELLIGENCE IN EDUCATION: POSTERS AND LATE BREAKING RESULTS, WORKSHOPS AND TUTORIALS, INDUSTRY AND INNOVATION TRACKS, PRACTITIONERS, DOCTORAL CONSORTIUM AND BLUE SKY, AIED 2024 | 2024 / Volume 2151

Keywords:

Stylometric Features; AI in Education; Natural Language Processing; ChatGPT

DOI:

10.1007/978-3-031-64312-5_13

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The emergence of large language models (LLMs) capable of generating realistic texts and images has sparked ethical concerns across various sectors. In response, researchers in academia and industry are actively exploring methods to distinguish AI-generated content from human-authored material. However, a crucial question remains: What are the unique characteristics of AI-generated text? Addressing this gap, this study proposes StyloAI, a data-driven model that uses 31 stylometric features to identify AI-generated texts by applying a Random Forest classifier on two multi-domain datasets. StyloAI achieves accuracy rates of 81% and 98% on the test set of the AuTextification dataset and the Education dataset, respectively. This approach surpasses the performance of existing state-of-the-art models and provides valuable insights into the differences between AI-generated and human-authored texts.
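To make the idea of stylometric features concrete, the following is a minimal sketch of how a handful of such features could be computed from raw text. It is not the StyloAI feature set: the abstract does not list the 31 features, so the function name `stylometric_features` and the five features chosen here are illustrative assumptions. The Random Forest step is omitted; in practice, vectors like these would be passed to a classifier such as scikit-learn's `RandomForestClassifier`.

```python
import re
from typing import Dict


def stylometric_features(text: str) -> Dict[str, float]:
    """Compute a few illustrative stylometric features for one text.

    Assumption: these features are examples only and are not taken
    from the StyloAI paper, whose 31 features are not listed here.
    """
    # Lowercased alphabetic tokens (apostrophes kept for contractions).
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Naive sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Guard against division by zero on empty input.
    n_words = len(words) or 1
    n_sents = len(sentences) or 1

    return {
        "word_count": float(len(words)),
        # Lexical diversity: unique words divided by total words.
        "type_token_ratio": len(set(words)) / n_words,
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "avg_sentence_length": len(words) / n_sents,
        "punctuation_per_word": len(re.findall(r"[,;:!?.]", text)) / n_words,
    }


if __name__ == "__main__":
    sample = "The cat sat. The cat ran!"
    for name, value in stylometric_features(sample).items():
        print(f"{name}: {value:.3f}")
```

Each text becomes one fixed-length numeric vector, which is the form a tree-based classifier expects. Tree ensembles also expose feature importances, one way a model of this kind could indicate which stylistic cues separate AI-generated from human-authored text.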


Pages: 105-114

Page count: 10
