Identification of Indigenous Knowledge Concepts through Semantic Networks, Spelling Tools and Word Embeddings

Cited by: 0
Authors
Souza, Renato Rocha [1]
Dorn, Amelie [1]
Piringer, Barbara [1]
Wandl-Vogt, Eveline [1]
Affiliations
[1] Austrian Acad Sci, Austrian Ctr Digital Humanities & Cultural Herita, Sonnenfelsgasse 19, Vienna, Austria
Funding
Science Foundation Ireland;
Keywords
Digital Humanities; regional languages; lesser-resourced languages; knowledge discovery;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Subject classification code
081203 ; 0835 ;
Abstract
To access indigenous, regional knowledge contained in language corpora, semantic tools and network methods are typically employed. In this paper we present an approach for identifying dialectal variants of words, or words that do not belong to High German, using the example of the non-standard language legacy collection questionnaires of the Bavarian Dialects in Austria (DBO). Based on selected cultural categories relevant to the wider project context, common words from each category were identified and lemmatized using GermaLemma. The semantic vicinity of each word was then explored through word embedding models, followed by lookups in the German wordnet (GermaNet) and the Hunspell tool. Whilst none of these tools has comprehensive coverage of standard German words, they serve as an indication of dialects in specific semantic hierarchies. The methods and tools applied in this study may serve as an example for similar projects dealing with non-standard or endangered language collections that aim to access, analyze and ultimately preserve native regional language heritage.
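The pipeline outlined in the abstract (take a seed word from a cultural category, explore its embedding neighbourhood, then check each neighbour against standard-German resources) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The vectors, the word list and the function names are hypothetical stand-ins: `TOY_VECTORS` replaces a trained word embedding model, and `STANDARD_LEXICON` replaces the GermaNet and Hunspell lookups. The lemmatization step is omitted, so all words are assumed to be lemmas already.

```python
import math

# Hypothetical toy embeddings; a real run would load a trained model instead.
TOY_VECTORS = {
    "kartoffel": [0.9, 0.1, 0.0],
    "erdapfel": [0.85, 0.15, 0.05],
    "grundbirne": [0.8, 0.2, 0.1],
    "brot": [0.1, 0.9, 0.0],
    "haus": [0.0, 0.1, 0.9],
}

# Hypothetical stand-in for a standard-German lookup (GermaNet or Hunspell
# in the abstract). Its coverage is deliberately incomplete, as with real tools.
STANDARD_LEXICON = {"kartoffel", "brot", "haus"}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def neighbours(word, vectors, top_n=3):
    """Return the top_n words closest to `word`, most similar first."""
    target = vectors[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:top_n]]


def dialect_candidates(seed, vectors, lexicon, top_n=3):
    """Keep only the neighbours of `seed` that the standard lexicon lacks."""
    return [w for w in neighbours(seed, vectors, top_n) if w not in lexicon]


candidates = dialect_candidates("kartoffel", TOY_VECTORS, STANDARD_LEXICON, top_n=2)
print(candidates)
```

Because no lexical resource covers all of standard German, a word missing from the lexicon is only a candidate for a dialectal form and would still need manual review.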
Pages: 943-947
Page count: 5