Constructing an Enriched Domain Taxonomy for Hindi using Word Embeddings

Authors
Keshava, Vaishakh [1]
Avvaru, Pravalika [2]
Kamath, Sowmya S. [3]
Geetha, V [3]
Affiliations
[1] Intuit Inc, Bengaluru, India
[2] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
[3] Natl Inst Technol Karnataka, Dept Informat Technol, Surathkal, India
Keywords
Asian language processing; taxonomies; semantic processing; natural language processing
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Domain-specific taxonomies are a valuable resource because they support information retrieval activities such as browsing, searching, recommendation, and personalization. Such taxonomies can bridge the gap between users' limited domain-specific querying knowledge and the actual content. For multilingual content, taxonomies can play a pivotal role in improving search performance across language barriers. This paper proposes a domain-agnostic framework for building an evolving, domain-specific taxonomy for Hindi from a set of well-organized data points. The approach builds a hierarchical taxonomy enriched with synonyms and other morphological variants, using WordNet and Word2vec models respectively. The hierarchical structure acts as a base that binds the taxonomy to a given domain, and the enrichment can improve taxonomy coverage within that domain. The focus is also on building a taxonomy that can self-evolve over time, with high precision and recall and minimal manual effort.
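The enrichment idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no algorithmic details, so everything below is an assumption. The `synonyms` dictionary is a hypothetical stand-in for a WordNet lookup, the `embeddings` table is a hypothetical stand-in for vectors from a trained Word2vec model, and the `enrich` function, the toy Hindi terms, the vector values, and the 0.9 threshold are all invented for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def enrich(taxonomy, synonyms, embeddings, threshold=0.9):
    """Attach synonyms and close embedding neighbours to each taxonomy term.

    taxonomy   -- dict mapping a parent term to its list of child terms
    synonyms   -- dict mapping a term to known synonyms (assumed lexicon lookup)
    embeddings -- dict mapping a word to its vector (assumed pretrained model)
    threshold  -- hypothetical similarity cut-off for accepting a variant
    """
    enriched = {}
    for parent, children in taxonomy.items():
        for term in [parent, *children]:
            if term in enriched:
                continue
            variants = set(synonyms.get(term, []))
            vec = embeddings.get(term)
            if vec is not None:
                for other, other_vec in embeddings.items():
                    if other != term and cosine(vec, other_vec) >= threshold:
                        variants.add(other)
            enriched[term] = sorted(variants)
    return enriched


# Toy data, invented for illustration only.
taxonomy = {"फल": ["आम", "केला"]}      # fruit -> mango, banana
synonyms = {"आम": ["आम्र"]}            # hypothetical synonym entry
embeddings = {
    "फल": [0.6, 0.6],
    "आम": [1.0, 0.0],
    "आमों": [0.9, 0.1],                # inflected form of "आम"
    "केला": [0.0, 1.0],
    "केले": [0.1, 0.9],                # inflected form of "केला"
}

enriched = enrich(taxonomy, synonyms, embeddings)
for term, variants in enriched.items():
    print(term, variants)
```

In this sketch the hierarchy itself is left untouched and only the per-term variant lists grow, which mirrors the abstract's description of the hierarchy as a fixed base with enrichment layered on top. A real system would need a principled threshold and filtering of near neighbours that are related but not variants.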
Pages: 127-130
Page count: 4