Story Trees: Representing Documents using Topological Persistence

Cited by: 0
Authors
Haghighatkhah, Pantea [1]
Fokkens, Antske [1, 2]
Sommerauer, Pia [2]
Speckmann, Bettina [1]
Verbeek, Kevin [1]
Affiliations
[1] Eindhoven Univ Technol, Dept Math & Comp Sci, Eindhoven, Netherlands
[2] Vrije Univ Amsterdam, Computat Linguist & Text Min Lab, Amsterdam, Netherlands
Keywords
Topological Data Analysis; Semantic Vectors; Document-level discourse
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Topological Data Analysis (TDA) focuses on the inherent shape of (spatial) data. As such, it may provide useful methods to explore spatial representations of linguistic data (embeddings) which have become central in NLP. In this paper we aim to introduce TDA to researchers in language technology. We use TDA to represent document structure as so-called story trees. Story trees are hierarchical representations created from semantic vector representations of sentences via persistent homology. They can be used to identify and clearly visualize prominent components of a story line. We showcase their potential by using story trees to create extractive summaries for news stories.
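To make the idea of a hierarchy obtained through persistent homology more concrete, the following is a minimal sketch of 0-dimensional persistence on a small point set. It is not the authors' implementation: the function name `zero_dim_persistence`, the use of Euclidean distance, and the toy 2D points standing in for sentence vectors are all assumptions made for illustration. In dimension 0, every point starts as its own component, and components merge as the distance threshold grows; the recorded merge events form a tree over the points.

```python
import math
from itertools import combinations


def zero_dim_persistence(points):
    """Return merge events (distance, surviving_root, absorbed_root).

    Hypothetical helper: computes the 0-dimensional persistence of a
    Vietoris-Rips filtration with a union-find structure. Each merge
    event marks the threshold at which two components become one.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    merges = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Assumption: the component with the smaller root index survives.
            keep, drop = min(ri, rj), max(ri, rj)
            parent[drop] = keep
            merges.append((d, keep, drop))
    return merges


# Toy stand-ins for sentence vectors: two tight pairs that are far apart.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
merges = zero_dim_persistence(points)
for distance, keep, drop in merges:
    print(f"threshold {distance:.1f}: component {drop} merges into {keep}")
```

In this toy input the two close pairs merge first, and the two resulting groups merge only at a much larger threshold. A long-lived component of this kind is what persistence treats as prominent; how the paper maps such components to parts of a story line is described in the paper itself, not in this sketch.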
Pages: 2413-2429
Page count: 17