Automatic Indexing of Financial Documents via Information Extraction

Cited by: 1
Authors
Ramamurthy, Rajkumar [1]
Luebbering, Max [1]
Bell, Thiago [1]
Gebauer, Michael [2]
Ulusay, Bilge [1]
Uedelhoven, Daniel [1]
Khameneh, Tim Dilmaghani [3]
Loitz, Ruediger [3]
Pielka, Maren [1]
Bauckhage, Christian [1]
Sifa, Rafet [1]
Affiliations
[1] Fraunhofer IAIS, St Augustin, Germany
[2] TU Berlin, Berlin, Germany
[3] PricewaterhouseCoopers GmbH WPG, Frankfurt, Germany
Keywords
Financial Document Classification; Document Processing; Big Data; Information Retrieval; Natural Language Processing
DOI
10.1109/SSCI50451.2021.9659977
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The problem of extracting information from large volumes of unstructured documents is pervasive in the financial domain. Enterprises and investors need automatic methods that can extract information from these documents, particularly for indexing and efficient retrieval. To this end, we present a scalable end-to-end document processing system for indexing and information retrieval from large volumes of financial documents. While we show that our system works for the use case of financial document processing, the system itself is agnostic to the document type and the machine learning model type. Thus, it can be applied to any large-scale document processing task involving domain-specific extractors.
Pages: 5