Construction of a Probabilistic hierarchical structure based on a Japanese corpus and a Japanese thesaurus

Cited by: 0
Authors
Terai, Asuka [1]
Liu, Bin [2]
Nakagawa, Masanori [1]
Affiliations
[1] Tokyo Inst Technol, Meguro Ku, 2-12-1 Ookayama, Tokyo 152, Japan
[2] Nissay Informat Technol Co Ltd, Tokyo, Japan
Funding
Japan Society for the Promotion of Science
Keywords
DOI
10.1007/978-3-540-78159-2_13
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The purpose of this study is to construct a probabilistic hierarchical structure of categories based on a statistical analysis of Japanese corpus data, and to verify the validity of the structure through a psychological experiment. First, the co-occurrence frequencies of adjectives and nouns within modification relations were extracted from a Japanese corpus. Second, a probabilistic hierarchical structure was constructed based on the probability P(category | noun), which represents the category membership of the nouns, using categorization information from a thesaurus and a soft clustering method (Rose's method [1]) with co-occurrence frequencies as initial values. This method makes it possible to identify the constructed hierarchical structure. To examine the validity of the constructed hierarchy, a psychological experiment was conducted. The results of the experiment verified the psychological validity of the hierarchical structure.
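The soft-clustering step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified, fixed-temperature soft assignment in the spirit of deterministic-annealing clustering, with no annealing schedule and no hierarchy. The toy counts, the adjective order, the initial centroids, the squared Euclidean distance, and the function name `soft_cluster` are all assumptions made for illustration.

```python
import math


def soft_cluster(counts, init_centroids, temperature=0.1, iterations=50):
    """Return {noun: [P(category | noun), ...]} by fixed-temperature soft clustering.

    counts: {noun: [co-occurrence count with each adjective]} (toy data, assumed)
    init_centroids: one starting vector per category (assumed, e.g. thesaurus-derived)
    """
    # Normalise each noun's counts into a distribution over adjectives.
    vectors = {}
    for noun, row in counts.items():
        total = sum(row)
        vectors[noun] = [c / total for c in row]

    centroids = [list(c) for c in init_centroids]
    membership = {}
    for _ in range(iterations):
        # Soft assignment: Gibbs weights over squared Euclidean distances.
        for noun, vec in vectors.items():
            dists = [
                sum((v - m) ** 2 for v, m in zip(vec, cen)) for cen in centroids
            ]
            d_min = min(dists)  # subtract the minimum for numerical stability
            weights = [math.exp(-(d - d_min) / temperature) for d in dists]
            z = sum(weights)
            membership[noun] = [w / z for w in weights]
        # Centroid update: membership-weighted mean of the noun vectors.
        for k in range(len(centroids)):
            mass = sum(membership[n][k] for n in vectors)
            if mass > 0:
                centroids[k] = [
                    sum(membership[n][k] * vectors[n][j] for n in vectors) / mass
                    for j in range(len(centroids[k]))
                ]
    return membership


# Hypothetical counts; assumed adjective order: sweet, red, fast, loud.
toy_counts = {
    "apple": [8, 6, 0, 1],
    "orange": [9, 2, 1, 0],
    "car": [0, 3, 9, 4],
    "train": [1, 0, 7, 8],
}
# Hypothetical starting centroids for two categories.
init = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]

membership = soft_cluster(toy_counts, init)
for noun, probs in membership.items():
    print(noun, [round(p, 3) for p in probs])
```

Each printed row is a distribution over the two categories for one noun, which plays the role of P(category | noun) here. Lowering `temperature` pushes the assignments toward hard clustering, while raising it makes them more uniform.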
Pages: 132 / +
Page count: 3
Related papers
50 in total
  • [21] A Japanese Word Dependency Corpus
    Mori, Shinsuke
    Ogura, Hideki
    Sasada, Tetsuro
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014, : 753 - 758
  • [22] Construction of a Japanese-Chinese Bilingual Corpus for Learning Japanese Sentence Patterns through Multi-Level Annotation
    Liu, Jun
    Zhuang, Luxuan
    2022 3rd International Conference on Pattern Recognition and Machine Learning, PRML 2022, 2022, : 321 - 327
  • [23] Construction of an Evaluation Corpus for Grammatical Error Correction for Learners of Japanese as a Second Language
    Koyama, Aomi
    Kiyuna, Tomoshige
    Kobayashi, Kenji
    Arai, Mio
    Komachi, Mamoru
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 204 - 211
  • [24] Design and Construction of Japanese Multimodal Utterance Corpus with Improved Emotion Balance and Naturalness
    Horii, Daisuke
    Ito, Akinori
    Nose, Takashi
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 245 - 250
  • [25] Building a corpus of legal argumentation in Japanese judgement documents: towards structure-based summarisation
    Yamada, Hiroaki
    Teufel, Simone
    Tokunaga, Takenobu
    ARTIFICIAL INTELLIGENCE AND LAW, 2019, 27 (02) : 141 - 170
  • [27] Automatic Assessment of Japanese Text Readability Based on a Textbook Corpus
    Sato, Satoshi
    Matsuyoshi, Suguru
    Kondoh, Yohsuke
    SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008, 2008, : 654 - 660
  • [28] Japanese corpus build based on the technology of computer to work and applications
    Pan, Na
    Yu, Xiao
    Applied Mechanics and Materials, 2014, 543-547 : 3272 - 3275
  • [29] A Japanese Particle Corpus Built by Example-Based Annotation
    Hanaoka, Hiroki
    Mima, Hideki
    Tsujii, Jun'ichi
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : 1876 - 1880
  • [30] Information-Structure Annotation of the "Balanced Corpus of Contemporary Written Japanese"
    Miyauchi, Takuya
    Asahara, Masayuki
    Nakagawa, Natsuko
    Kato, Sachi
    COMPUTATIONAL LINGUISTICS, PACLING 2017, 2018, 781 : 155 - 165