An augmented semantic search tool for multilingual news analytics

Cited by: 0
Authors
Harikumar, Sandhya [1 ]
Sathyajit, Rohit [1 ]
Karumudi, Gnana Venkata Naga Sai Kalyan [1 ]
Affiliations
[1] Amrita Vishwa Vidyapeetham, Dept Comp Sci & Engn, Amritapuri, India
Keywords
News analytics; multilingual; natural language processing (NLP); Latent Dirichlet Allocation (LDA); semantic information retrieval
DOI
10.3233/JIFS-221184
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
News feeds generate a colossal amount of data in which important information is buried. State-of-the-art methods are still in their infancy when it comes to providing a generic, publicly available solution for skimming the important information in news from various sources and for searching it with specific keywords in different languages. This paper focuses on designing a tool that extracts semantic details from news articles published by various internet sources in various languages. The semantic information is stored in a DBMS for ease of organizing and retrieving the data. A querying facility that searches entire articles by keyword or by date is also proposed, so that users can view the crisp content. News articles in English and in two Indian languages, Hindi and Malayalam, are considered for experimentation. The proposed strategy consists of two main components: generative model creation and a query engine. The generative model extracts important entities and keywords, along with their relevance to the article and to other similar articles, using Latent Dirichlet Allocation (LDA) and Named Entity Recognition (NER). The query engine facilitates on-the-fly retrieval of semantic content from the database based on a user keyword. The search engine, together with database indexing, reduces database access time and therefore retrieval time. Experimental results show that the proposed method is effective in terms of the quality of information and the time consumed for information retrieval.
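The storage and query side described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the DBMS or the schema, so the use of SQLite, the table and column names, the helper functions `build_store`, `search_by_keyword`, and `search_by_date`, and the sample rows with their relevance scores are all assumptions made for illustration. In the paper, the keyword and relevance values would come from the LDA/NER stage.

```python
import sqlite3

# Hypothetical sample rows: (id, language, date, title, [(keyword, relevance)]).
# Keywords and scores are invented for illustration; they stand in for the
# output of the LDA/NER stage, which is not reproduced here.
SAMPLE = [
    (1, "en", "2022-03-10", "Election results announced",
     [("election", 0.9), ("results", 0.4)]),
    (2, "hi", "2022-03-10", "चुनाव परिणाम घोषित",
     [("चुनाव", 0.8)]),
    (3, "en", "2022-03-11", "Budget session begins",
     [("budget", 0.7), ("election", 0.2)]),
]


def build_store(rows):
    """Create an in-memory database with indexes on keyword and date."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE article (
            id INTEGER PRIMARY KEY, lang TEXT, published TEXT, title TEXT
        );
        CREATE TABLE keyword (
            article_id INTEGER REFERENCES article(id),
            term TEXT, relevance REAL
        );
        -- Indexes let keyword and date lookups avoid full table scans.
        CREATE INDEX idx_keyword_term ON keyword(term);
        CREATE INDEX idx_article_published ON article(published);
        """
    )
    for art_id, lang, published, title, keywords in rows:
        conn.execute(
            "INSERT INTO article VALUES (?, ?, ?, ?)",
            (art_id, lang, published, title),
        )
        conn.executemany(
            "INSERT INTO keyword VALUES (?, ?, ?)",
            [(art_id, term, score) for term, score in keywords],
        )
    conn.commit()
    return conn


def search_by_keyword(conn, term):
    """Return (title, lang, relevance) for an exact keyword, best match first."""
    cur = conn.execute(
        "SELECT a.title, a.lang, k.relevance FROM keyword k "
        "JOIN article a ON a.id = k.article_id "
        "WHERE k.term = ? ORDER BY k.relevance DESC",
        (term,),
    )
    return cur.fetchall()


def search_by_date(conn, published):
    """Return the titles of articles published on the given ISO date."""
    cur = conn.execute(
        "SELECT title FROM article WHERE published = ? ORDER BY id",
        (published,),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    store = build_store(SAMPLE)
    print(search_by_keyword(store, "election"))
    print(search_by_date(store, "2022-03-10"))
```

Keeping keywords in their own table with an index on `term` is one plausible way to get the indexed lookups the abstract mentions; exact-match search is a simplification, and a real multilingual tool would likely also need normalization or stemming per language.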
Pages: 8315-8327
Page count: 13
Related papers
50 in total
  • [1] An End-to-End Tool for News Processing and Semantic Search
    Li, Quanzhi
    Avadhanam, Satish
    Zhang, Qiong
    [J]. WWW'20: COMPANION PROCEEDINGS OF THE WEB CONFERENCE 2020, 2020, : 139 - 142
  • [2] Financial news semantic search engine
    Lupiani-Ruiz, Eduardo
    Garcia-Manotas, Ignacio
    Valencia-Garcia, Rafael
    Garcia-Sanchez, Francisco
    Castellanos-Nieves, Dagoberto
    Tomas Fernandez-Breis, Jesualdo
    Bosco Camon-Herrero, Juan
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2011, 38 (12) : 15565 - 15572
  • [3] Khresmoi - Multilingual Semantic Search of Medical Text and Images
    Aswani, Niraj
    Beckers, Thomas
    Birngruber, Erich
    Boyer, Celia
    Burner, Andreas
    Bystron, Jakub
    Choukri, Khalid
    Cruchet, Sarah
    Cunningham, Hamish
    Dedek, Jan
    Dolamic, Ljiljana
    Donner, Rene
    Dungs, Sebastian
    Eggel, Ivan
    Foncubierta, Antonio
    Fuhr, Norbert
    Funk, Adam
    de Herrera, Alba Garcia Seco
    Gaudinat, Arnaud
    Georgiev, Georgi
    Gobeill, Julien
    Goeuriot, Lorraine
    Gomez, Paz
    Greenwood, Mark
    Gschwandtner, Manfred
    Hanbury, Allan
    Hajic, Jan
    Hlavacova, Jaroslava
    Holzer, Markus
    Jones, Gareth
    Jordan, Blanca
    Jordan, Matthias
    Kaderk, Klemens
    Kainberger, Franz
    Kelly, Liadh
    Kriewel, Sascha
    Kritz, Marlene
    Langs, Georg
    Lawson, Nolan
    Markonis, Dimitrios
    Martinez, Ivan
    Momtchev, Vassil
    Masselot, Alexandre
    Mazo, Helene
    Mueller, Henning
    Palotti, Joao
    Pecina, Pavel
    Pentchev, Konstantin
    Peychev, Deyan
    Pletneva, Natalia
    [J]. MEDINFO 2013: PROCEEDINGS OF THE 14TH WORLD CONGRESS ON MEDICAL AND HEALTH INFORMATICS, PTS 1 AND 2, 2013, 192 : 1266 - 1266
  • [4] A Multilingual Test Collection for the Semantic Search of Entity Categories
    Sales, Juliano Efson
    Barzegar, Siamak
    Franco, Wellington
    Bermeitinger, Bernhard
    Cunha, Tiago
    Davis, Brian
    Freitas, Andre
    Handschuh, Siegfried
    [J]. PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 2505 - 2510
  • [5] Building a semantic search tool
    Johnston, OO
    [J]. CANADIAN JOURNAL OF INFORMATION AND LIBRARY SCIENCE-REVUE CANADIENNE DES SCIENCES DE L INFORMATION ET DE BIBLIOTHECONOMIE, 2005, 29 (03): : 376 - 376
  • [6] Multi-document semantic relation extraction for news analytics
    Yongpan Sheng
    Zenglin Xu
    Yafang Wang
    Gerard de Melo
    [J]. World Wide Web, 2020, 23 : 2043 - 2077
  • [7] Multi-document semantic relation extraction for news analytics
    Sheng, Yongpan
    Xu, Zenglin
    Wang, Yafang
    de Melo, Gerard
    [J]. WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2020, 23 (03): : 2043 - 2077
  • [8] GoNTogle: A Tool for Semantic Annotation and Search
    Giannopoulos, Giorgos
    Bikakis, Nikos
    Dalamagas, Theodore
    Sellis, Timos
    [J]. SEMANTIC WEB: RESEARCH AND APPLICATIONS, PT 2, PROCEEDINGS, 2010, 6089 : 376 - +
  • [9] Cubix: A Visual Analytics Tool for Conceptual and Semantic Data
    Melo, Cassio
    Mikheev, Alexander
    Le Grand, Benedicte
    Aufaure, Marie-Aude
    [J]. 12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2012), 2012, : 894 - 897
  • [10] SEMCARE: Multilingual Semantic Search in Semi-Structured Clinical Data
    Lopez-Garcia, Pablo
    Kreuzthaler, Markus
    Schulz, Stefan
    Scherr, Daniel
    Daumke, Philipp
    Marko, Kornel
    Kors, Jan A.
    van Mulligen, Erik M.
    Wang, Xinkai
    Gonna, Hanney
    Behr, Elijah
    Honrado, Angel
    [J]. HEALTH INFORMATICS MEETS EHEALTH, 2016, 223 : 93 - 99