WASM: A Dataset for Hashtag Recommendation for Arabic Tweets

Authors
Al-Shaibani, Maged S. [1]
Luqman, Hamzah [1, 2]
Al-Ghofaily, Abdulaziz S. [1]
Al-Najim, Abdullatif A. [1]
Affiliations
[1] King Fahd Univ Petr & Minerals, Informat & Comp Sci Dept, Dhahran, Saudi Arabia
[2] SDAIA KFUPM Joint Res Ctr Artificial Intelligence, Dhahran 31261, Saudi Arabia
Keywords
Hashtag Recommendation; Hashtag Generation; Tweets Classification; Arabic Tweets; Twitter; Hashtags
DOI
10.1007/s13369-023-08567-1
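Chinese Library Classification (CLC) codes
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]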
Discipline classification codes
07; 0710; 09
Abstract
As one of the largest microblogging websites in the world, Twitter generates a huge amount of information daily. The sheer volume of this data makes it difficult for users to follow and receive information relevant to their interests. Twitter therefore allows users to annotate and categorize their tweets with hashtags. However, finding an appropriate hashtag for a tweet is not always straightforward. Furthermore, many users disrupt a hashtag's stream by posting content that is irrelevant to the hashtag's topic. These problems increase the need for hashtag recommendation and classification systems. The topic has received considerable attention from researchers for some languages, such as English and Chinese, but it has not yet been explored for Arabic owing to the lack of datasets. In this study, we bridge this gap by proposing WASM, an Arabic Twitter hashtag recommendation dataset consisting of more than 100,000 tweets annotated with 87 hashtags. The dataset underwent several rounds of automatic and manual filtering to ensure that it is suitable for tasks related to tweets and hashtags. We also propose three systems for hashtag recommendation and classification, each approaching the task differently: as a classification problem, as a generation problem, and as a named entity recognition problem. The results obtained with these systems are promising and can serve as benchmarks for the WASM dataset. The data and code are available at https://github.com/Hamzah-Luqman/wasm.
Pages: 12131-12145
Page count: 15