Towards Multilingual Automated Classification Systems

Cited by: 2
Authors
Musaev, Aibek [1]
Pu, Calton [2]
Affiliations
[1] Univ Alabama, Dept Comp Sci, Tuscaloosa, AL 35487 USA
[2] Georgia Inst Technol, Sch Comp Sci, Atlanta, GA 30332 USA
Funding
U.S. National Science Foundation
DOI
10.1109/ICDCS.2017.208
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In this paper, we propose and evaluate three approaches for automated classification of texts in over 60 languages without the need for a manually annotated dataset in those languages. All approaches are based on the randomized Explicit Semantic Analysis method, using multilingual Wikipedia articles as their knowledge repository. We evaluate the proposed approaches by classifying a Twitter dataset in English and Portuguese into relevant and irrelevant items with respect to landslides as a natural disaster; the highest F1-score achieved is 0.93. These approaches can be used in various applications where multilingual classification is needed, including multilingual disaster reporting using social media to improve coverage and increase confidence. As an illustration, we present a demonstration that combines data from physical sensors and social networks to detect landslide events reported in English and Portuguese.
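The abstract names Explicit Semantic Analysis (ESA) over Wikipedia articles as the common basis of the three approaches. The following is a minimal sketch of plain ESA only, not the authors' randomized variant and not their multilingual pipeline. The three-entry `CONCEPTS` dictionary, the smoothed TF-IDF weighting, and the rule "relevant if the top-scoring concept is `Landslide`" are all assumptions made for illustration; a real system would draw on full Wikipedia articles in each target language.

```python
import math
from collections import Counter

# Hypothetical toy knowledge repository: concept title -> article text.
# These strings are invented; real ESA would use full Wikipedia articles.
CONCEPTS = {
    "Landslide": "landslide mudslide slope debris rain soil collapse",
    "Election": "election vote landslide victory candidate party",
    "Music": "song album band landslide lyrics guitar",
}


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; no stemming or stop words)."""
    return text.lower().split()


def build_index(concepts):
    """Build an inverted index: term -> {concept: TF-IDF weight}."""
    n = len(concepts)
    tf = {c: Counter(tokenize(t)) for c, t in concepts.items()}
    # Document frequency: in how many concept articles each term occurs.
    df = Counter(term for counts in tf.values() for term in counts)
    index = {}
    for concept, counts in tf.items():
        for term, freq in counts.items():
            idf = math.log((1 + n) / (1 + df[term])) + 1.0  # smoothed IDF
            index.setdefault(term, {})[concept] = freq * idf
    return index


def esa_vector(text, index):
    """Map a text to concept space by summing the weights of its terms."""
    vec = Counter()
    for term in tokenize(text):
        for concept, weight in index.get(term, {}).items():
            vec[concept] += weight
    return vec


def top_concept(text, index):
    """Return the highest-weighted concept, or None if no term is known."""
    vec = esa_vector(text, index)
    if not vec:
        return None
    return max(vec, key=vec.get)


def is_relevant(text, index, target="Landslide"):
    """Assumed decision rule: relevant when the target concept scores highest."""
    return top_concept(text, index) == target


index = build_index(CONCEPTS)
print(is_relevant("heavy rain caused a mudslide on the slope", index))
print(is_relevant("landslide victory in the election vote", index))
```

The word "landslide" appears in all three toy articles, so its weight is low and the surrounding terms decide which concept wins. This is the kind of disambiguation that concept-space representations are meant to provide without labeled training data.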
Pages: 2333-2337
Page count: 5
Related papers
50 in total
  • [21] Towards an Automated Classification Approach for Software Engineering Research
    Kaplan, Angelika
    Keim, Jan
    PROCEEDINGS OF EVALUATION AND ASSESSMENT IN SOFTWARE ENGINEERING (EASE 2021), 2021, : 347 - 352
  • [22] Towards automated segmentation and classification of masses in digital mammograms
    Ball, JE
    Butler, TW
    Bruce, LM
    PROCEEDINGS OF THE 26TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-7, 2004, 26 : 1814 - 1817
  • [23] Towards Automated Classification of Intensive Care Nursing Narratives
    Hiissa, Marketta
    Pahikkala, Tapio
    Suominen, Hanna
    Lehtikunnas, Tuija
    Back, Barbro
    Karsten, Helena
    Salantera, Sanna
    Salakoski, Tapio
    UBIQUITY: TECHNOLOGIES FOR BETTER HEALTH IN AGING SOCIETIES, 2006, 124 : 789 - +
  • [24] Towards the Development of an Automated Blood Vessel Classification System
    Satney, W.
    Als, A.
    Carrington-Dyall, A.
    Scantlebury-Manning, T.
    2016 10TH INTERNATIONAL SYMPOSIUM ON COMMUNICATION SYSTEMS, NETWORKS AND DIGITAL SIGNAL PROCESSING (CSNDSP), 2016,
  • [25] The "Automated Multilingual Resume Builder"
    Terrier, Linda
    Sirdey, Christine Vaillant
    Arino, Mathilde
    RECHERCHE ET PRATIQUES PEDAGOGIQUES EN LANGUES DE SPECIALITE-CAHIERS DE L APLIUT, 2012, 31 (01): : 97 - 117
  • [26] Multilingual Documentation and Classification
    Donnelly, Kevin
    EHEALTH: COMBINING HEALTH TELEMATICS, TELEMEDICINE, BIOMEDICAL ENGINEERING AND BIOINFORMATICS TO THE EDGE: GLOBAL EXPERTS SUMMIT TEXTBOOK, 2008, 134 : 235 - 243
  • [27] Data generation approaches for topic classification in multilingual spoken dialog systems
    Montenegro, C.
    Santana, R.
    Lozano, J. A.
    12TH ACM INTERNATIONAL CONFERENCE ON PERVASIVE TECHNOLOGIES RELATED TO ASSISTIVE ENVIRONMENTS (PETRA 2019), 2019, : 211 - 217
  • [28] Multilingual Question Answering Systems: Question Classification in Spanish based in Learning
    Garcia Cumbreras, Miguel Angel
    Martinez Santiago, Fernando
    Alfonso Urena Lopez, L.
    Montejo Raez, Arturo
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2005, (34):
  • [29] AUTOMATED SYSTEMS FOR ACCESS TO MULTILINGUAL AND MULTISCRIPT LIBRARY-MATERIALS - PROBLEMS AND SOLUTIONS
    Unknown
    IFLA JOURNAL-INTERNATIONAL FEDERATION OF LIBRARY ASSOCIATIONS, 1987, 13 (01): : 70 - 70
  • [30] Towards a classification of natural integrable systems
    Tsiganov, A. V.
    REGULAR & CHAOTIC DYNAMICS, 2006, 11 (03): : 343 - 362