FI Group at SemEval-2024 Task 8: A Syntactically Motivated Architecture for Multilingual Machine-Generated Text Detection

Authors
Ben-Fares, Maha [1, 2]
Zaratiana, Urchade [2, 3]
Hernandez, Simon D. [2]
Holat, Pierre [2, 3]
Affiliations
[1] CY Cergy Paris Univ Pontoise, ETIS, Cergy, France
[2] FI Grp, Puteaux La Defense, France
[3] Univ Sorbonne Paris Nord, LIPN, Villetaneuse, France
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we describe our system for the multilingual track of Subtask A at SemEval-2024 Task 8, which aims to classify whether a text was generated by an AI or written by a human. Our approach treats binary text classification as token-level prediction, with the final classification being the average of the token-level predictions. Drawing on the rich representations of pre-trained transformers, our model is trained to selectively aggregate information from different layers to score individual tokens, since each layer may contain distinct information. Our model performs competitively on the test dataset, achieving an accuracy of 95.8%. It ranks 2nd in the multilingual track of Subtask A, only 0.1% behind the leading system.
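The idea in the abstract (score each token from a mix of transformer layers, then average the token scores into one document-level decision) can be illustrated with a minimal sketch. This is not the authors' implementation: the softmax-weighted layer mix, the linear scoring head, the 0.5 threshold, and all names (`layer_logits`, `w`, `b`, `token_scores`, `classify`) are assumptions made for illustration, and the hidden states are toy vectors rather than outputs of a real pre-trained model.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def token_scores(layer_states, layer_logits, w, b):
    """Score every token as machine-generated (closer to 1) or human (closer to 0).

    layer_states[l][t] is the hidden vector of token t at layer l.
    layer_logits are (hypothetical) learned per-layer mixing weights.
    w and b form a (hypothetical) linear scoring head.
    """
    alphas = softmax(layer_logits)  # one mixing weight per layer
    n_tokens = len(layer_states[0])
    scores = []
    for t in range(n_tokens):
        # Weighted sum of this token's representations across layers.
        mixed = [
            sum(alphas[l] * layer_states[l][t][d] for l in range(len(alphas)))
            for d in range(len(w))
        ]
        logit = sum(wi * xi for wi, xi in zip(w, mixed)) + b
        scores.append(sigmoid(logit))
    return scores


def classify(layer_states, layer_logits, w, b, threshold=0.5):
    """Average the token-level scores into one document-level prediction."""
    scores = token_scores(layer_states, layer_logits, w, b)
    doc_score = sum(scores) / len(scores)
    return doc_score, doc_score >= threshold


# Toy input: 2 layers, 2 tokens, hidden size 2.
toy_states = [
    [[1.0, 0.0], [0.0, 1.0]],  # layer 0
    [[1.0, 0.0], [0.0, 1.0]],  # layer 1
]
doc_score, is_machine = classify(toy_states, [0.0, 0.0], [2.0, 2.0], 0.0)
```

In a trained system the mixing weights and scoring head would be learned jointly from labeled data; here they are fixed by hand only to show how the token scores feed the averaged decision.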
Pages: 1166-1171
Page count: 6