New/s/leak 2.0 - Multilingual Information Extraction and Visualization for Investigative Journalism

Cited by: 3
Authors
Wiedemann, Gregor [1]
Yimam, Seid Muhie [1]
Biemann, Chris [1]
Affiliations
[1] Univ Hamburg, MIN Fac, Dept Informat, Language Technol Grp, Hamburg, Germany
Keywords
Information extraction; Investigative journalism; Data journalism; Named entity recognition; Keyterm extraction
DOI
10.1007/978-3-030-01159-8_30
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In recent years, investigative journalism has been confronted with two major challenges: (1) vast amounts of unstructured data originating from large text collections such as leaks or answers to Freedom of Information requests, and (2) multilingual data due to intensified global cooperation and communication in politics, business and civil society. Faced with these challenges, journalists are increasingly cooperating in international networks. To support such collaborations, we present new/s/leak 2.0, the new version of our open-source software for content-based searching of leaks. It includes three novel main features: (1) automatic language detection and language-dependent information extraction for 40 languages, (2) entity and keyword visualization for efficient exploration, and (3) decentralized deployment for analysis of confidential data in various formats. We illustrate the new analysis capabilities with an exemplary case study.
Pages: 313-322
Number of pages: 10