Information Processing and Retrieval from CSV File by Natural Language

Authors
Tapsai, Chalermpol [1]
Affiliations
[1] Suan Sunandha Rajabhat Univ, Coll Innovat & Management, Bangkok, Thailand
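Keywords
information processing; retrieval; CSV; natural language; semantic pattern; ontology
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract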
Comma-Separated Values (CSV) files are widely used as a fundamental data format. Because of their simple structure and ease of creation, many of the data files that are published openly and used by organizations are stored as CSV. However, searching for or retrieving the expected data from CSV files is limited by the traditional keyword-matching technique, which cannot express search conditions or apply any processing to the data found. This paper presents a new model that allows users to retrieve information from CSV files easily by natural language, a language that users are familiar with and use in everyday life. Users can specify conditions for data retrieval and processing to create the information they need. This helps non-technical users retrieve information without having to learn any additional computer languages or programs. The research data comprise natural language messages collected from various sources, both online and offline, covering both formal and semi-formal language levels. By using natural language processing and techniques such as semantic patterns, ontology, and an interactive conversation system, the model can analyze the completeness and meaning of natural language statements. It also allows users to edit incomplete or faulty statements and to improve the model by adding new words, sentence syntaxes, and semantic patterns for more accurate results. The model was evaluated by 98 testers, who input 1,137 natural language statements. The results showed that the model was effective in retrieving and processing data accurately, with precision, recall, and F-score all higher than 0.9. Only 18 statements, or 3.2% of all statements, produced errors in the outputs. These errors were caused by mistakes in the input of three kinds: missing letters that change a word's meaning, ambiguous words, and words in the wrong position in the natural language statement.
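As a rough illustration of the contrast the abstract draws between keyword matching and conditional retrieval, the following minimal sketch maps one fixed sentence pattern onto a filter and an aggregate over CSV rows. It is not the paper's model: the pattern, the function name `answer`, the supported comparisons, and the sample data are all assumptions made for illustration, and the regular expression only stands in for the semantic patterns and ontology the abstract mentions.

```python
import csv
import io
import operator
import re

# Hypothetical pattern (an assumption, not taken from the paper):
#   "<aggregate> of <column> where <column> is <comparison> <number>"
PATTERN = re.compile(
    r"(?P<agg>average|total|count) of (?P<target>\w+) "
    r"where (?P<field>\w+) is (?P<cmp>greater than|less than|equal to) (?P<value>\S+)",
    re.IGNORECASE,
)

COMPARISONS = {
    "greater than": operator.gt,
    "less than": operator.lt,
    "equal to": operator.eq,
}


def answer(statement, csv_text):
    """Return the aggregate requested by `statement`, or None if it does not fit the pattern."""
    match = PATTERN.fullmatch(statement.strip())
    if match is None:
        # An interactive system could ask the user to revise the statement here.
        return None
    compare = COMPARISONS[match["cmp"].lower()]
    threshold = float(match["value"])
    target = match["target"].lower()  # assumes lowercase column headers
    field = match["field"].lower()
    rows = csv.DictReader(io.StringIO(csv_text))
    # Keep the target value of every row whose field satisfies the condition.
    values = [float(row[target]) for row in rows if compare(float(row[field]), threshold)]
    agg = match["agg"].lower()
    if agg == "count":
        return len(values)
    if agg == "total":
        return sum(values)
    return sum(values) / len(values) if values else None


SAMPLE = "name,age,salary\nAnn,30,50000\nBob,45,70000\nCho,52,90000\n"
print(answer("average of salary where age is greater than 40", SAMPLE))  # 80000.0
```

A statement that does not fit the pattern returns `None`, which loosely corresponds to the point where the abstract's interactive conversation system would let the user correct an incomplete or faulty statement.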
Pages: 212-216
Page count: 5