Information Extraction Challenges in Managing Unstructured Data

Cited by: 0
Authors
Doan, AnHai [1]
Naughton, Jeffrey F. [1]
Ramakrishnan, Raghu [1]
Baid, Akanksha [1]
Chai, Xiaoyong [1]
Chen, Fei [1]
Chen, Ting [1]
Chu, Eric [1]
DeRose, Pedro [1]
Gao, Byron [1]
Gokhale, Chaitanya [1]
Huang, Jiansheng [1]
Shen, Warren [1]
Vuong, Ba-Quy [1]
Affiliation
[1] Univ Wisconsin, Madison, WI 53706 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Over the past few years, we have been trying to build an end-to-end system at Wisconsin to manage unstructured data, using extraction, integration, and user interaction. This paper describes the key information extraction (IE) challenges that we have run into, and sketches our solutions. We discuss in particular developing a declarative IE language, optimizing for this language, generating IE provenance, incorporating user feedback into the IE process, developing a novel wiki-based user interface for feedback, best-effort IE, pushing IE into RDBMSs, and more. Our work suggests that IE in managing unstructured data can open up many interesting research challenges, and that these challenges can greatly benefit from the wealth of work on managing structured data that has been carried out by the database community.
Pages: 14 - 20
Page count: 7
Related papers
50 in total
  • [1] Information extraction challenges in managing unstructured data
    University of Wisconsin-Madison, United States
    SIGMOD Rec., 2008, (4): 14 - 20
  • [2] Processing of Unstructured data for Information Extraction
    Ingle, Vaishali A.
    3RD NIRMA UNIVERSITY INTERNATIONAL CONFERENCE ON ENGINEERING (NUICONE 2012), 2012,
  • [3] Information Extraction from Unstructured Recipe Data
    Silva, Nuno
    Ribeiro, David
    Ferreira, Liliana
    PROCEEDINGS OF THE 2019 5TH INTERNATIONAL CONFERENCE ON COMPUTER AND TECHNOLOGY APPLICATIONS (ICCTA 2019), 2019, : 165 - 168
  • [4] Information Extraction and Visualization of Unstructured Textual Data
    Hashmi, Syed Usama
    Bansal, Ajay
    2019 13TH IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC), 2019, : 142 - 145
  • [5] Information Extraction from Unstructured Data using RDF
    Gandhi, Kalgi
    Madia, Nidhi
    PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON ICT IN BUSINESS INDUSTRY & GOVERNMENT (ICTBIG), 2016,
  • [6] Challenges in Information Retrieval from Unstructured Arabic Data
    Khalil, Hussein
    Osman, Taha
    2014 UKSIM-AMSS 16TH INTERNATIONAL CONFERENCE ON COMPUTER MODELLING AND SIMULATION (UKSIM), 2014, : 456 - 461
  • [7] The Partition Heuristic Information Extraction Algorithm of Unstructured Data
    Li, Cong
    Zou, Chengming
    Zhong, Luo
    Zhu, Jinyang
    2013 INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA (CLOUDCOM-ASIA), 2013, : 570 - 576
  • [8] An analytical study of information extraction from unstructured and multidimensional big data
    Adnan, Kiran
    Akbar, Rehan
    JOURNAL OF BIG DATA, 2019, 6 (01)
  • [9] An analytical study of information extraction from unstructured and multidimensional big data
    Kiran Adnan
    Rehan Akbar
    Journal of Big Data, 6
  • [10] Limitations of information extraction methods and techniques for heterogeneous unstructured big data
    Adnan, Kiran
    Akbar, Rehan
    INTERNATIONAL JOURNAL OF ENGINEERING BUSINESS MANAGEMENT, 2019, 11