"Easy" meta-embedding for detecting and correcting semantic errors in Arabic documents

Cited by: 2
Authors
Zribi, Chiraz Ben Othmane [1]
Affiliation
[1] Manouba Univ, ENSI, La Manouba 2010, Tunisia
Keywords
Detection-correction; Real-word error; Semantic inconsistency; Meta-embedding; Collocation; SkipGram; FastText; BERT; TEXTS
DOI
10.1007/s11042-023-14553-4
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Word embedding models have enabled major advances in natural language understanding and achieved state-of-the-art performance in multiple natural language processing tasks. In this paper, we present an original method based on an "easy" meta-embedding to automatically detect and correct Arabic real-word errors, that is, valid words that are semantically inconsistent with the context of the sentence. Due to the lexical proximity of words in Arabic, the risk of this type of error occurring in documents is relatively high compared to other languages. Our method uses three word embedding techniques, namely SkipGram, FastText and BERT, as well as their combination, for both detection and correction. It checks the semantic affinity of each word with its immediate context in a collocation and with the near context of the sentence. Experiments have shown that the proposed meta-embedding improves the overall performance of our system.
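The idea summarized in the abstract (combine several embeddings, then check how well a word fits its context) can be illustrated with a minimal sketch. The abstract does not say how the three embeddings are combined or how affinity is scored, so everything below is an assumption: concatenation of L2-normalized vectors as the meta-embedding, mean cosine similarity as the affinity score, a fixed `threshold`, and tiny hand-made lookup tables (`skipgram`, `fasttext`, `bert`) with English placeholder words standing in for real trained models. All function names are hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def meta_embed(vectors):
    """Assumed combination: concatenate the L2-normalized source vectors."""
    combined = []
    for vec in vectors:
        combined.extend(l2_normalize(vec))
    return combined

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0

def context_affinity(word, context, sources):
    """Mean cosine similarity between a word and its context words."""
    word_vec = meta_embed([src[word] for src in sources])
    sims = [cosine(word_vec, meta_embed([src[c] for src in sources]))
            for c in context]
    return sum(sims) / len(sims) if sims else 0.0

def detect_and_correct(word, context, candidates, sources, threshold=0.5):
    """Keep the word if it fits the context; otherwise propose the best candidate."""
    score = context_affinity(word, context, sources)
    if score >= threshold or not candidates:
        return word, score
    best = max(candidates, key=lambda c: context_affinity(c, context, sources))
    best_score = context_affinity(best, context, sources)
    return (best, best_score) if best_score > score else (word, score)

# Toy lookup tables (invented values, not real model output).
skipgram = {"drink": [1.0, 0.0], "water": [0.9, 0.1], "waiter": [0.0, 1.0]}
fasttext = {"drink": [1.0, 0.0, 0.0], "water": [0.8, 0.2, 0.0], "waiter": [0.0, 0.0, 1.0]}
bert = {"drink": [0.0, 1.0], "water": [0.1, 0.9], "waiter": [1.0, 0.0]}
sources = [skipgram, fasttext, bert]

# "waiter" does not fit the context "drink", so "water" is proposed instead.
suggestion, score = detect_and_correct("waiter", ["drink"], ["water"], sources)
print(suggestion, round(score, 3))
```

In a real setting, the candidate list would come from words lexically close to the suspect word, and the lookup tables would be replaced by vectors from trained SkipGram, FastText and BERT models; both choices are outside what the abstract specifies.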
Pages: 21161 - 21175
Number of pages: 15
Related papers
6 items in total
  • [1] “Easy” meta-embedding for detecting and correcting semantic errors in Arabic documents
    Chiraz Ben Othmane Zribi
    [J]. Multimedia Tools and Applications, 2023, 82 : 21161 - 21175
  • [2] Combining methods for detecting and correcting semantic hidden errors in Arabic texts
    Zribi, Chiraz Ben Othmane
    Mejri, Hanene
    Ahmed, Mohamed Ben
    [J]. Computational Linguistics and Intelligent Text Processing, 2007, 4394 : 634 - 645
  • [3] Domain-specific meta-embedding with latent semantic structures
    Liu, Qian
    Lu, Jie
    Zhang, Guangquan
    Shen, Tao
    Zhang, Zhihan
    Huang, Heyan
    [J]. INFORMATION SCIENCES, 2021, 555 : 410 - 423
  • [4] SEMANTIC STRINGS - A NEW TECHNIQUE FOR DETECTING AND CORRECTING USER ERRORS
    BRADFORD, JH
    [J]. INTERNATIONAL JOURNAL OF MAN-MACHINE STUDIES, 1990, 33 (04) : 399 - 407
  • [5] A novel approach for detecting and correcting segmentation and recognition errors in Arabic OCR systems
    Mostafa, K
    Shaheen, SI
    Darwish, AM
    Farag, I
    [J]. MULTIPLE APPROACHES TO INTELLIGENT SYSTEMS, PROCEEDINGS, 1999, 1611 : 530 - 539
  • [6] Towards Cross-Granularity Few-Shot Learning: Coarse-to-Fine Pseudo-Labeling with Visual-Semantic Meta-Embedding
    Yang, Jinhai
    Yang, Hua
    Chen, Lin
    [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021 : 3005 - 3014