Tree-Based Representation and Generation of Natural and Mathematical Language

Authors
Scarlatos, Alexander [1 ]
Lan, Andrew [1 ]
Affiliations
[1] Univ Massachusetts, Amherst, MA 01003 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Mathematical language in scientific communications and educational scenarios is important yet relatively understudied compared to natural languages. Recent works on mathematical language focus either on representing stand-alone mathematical expressions, especially in their natural tree format, or mathematical reasoning in pre-trained natural language models. Existing works on jointly modeling and generating natural and mathematical languages simply treat mathematical expressions as text, without accounting for the rigid structural properties of mathematical expressions. In this paper, we propose a series of modifications to existing language models to jointly represent and generate text and math: representing mathematical expressions as sequences of node tokens in their operator tree format, using math symbol and tree position embeddings to preserve the semantic and structural properties of mathematical expressions, and using a constrained decoding method to generate mathematically valid expressions. We ground our modifications in GPT-2, resulting in a model MathGPT, and demonstrate that it outperforms baselines on mathematical expression generation tasks.
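
The tree-based ideas named in the abstract can be illustrated with a small sketch. It is not the MathGPT implementation: the `Node` class, the `linearize` and `open_slots` helpers, and the arity table are hypothetical names introduced here for illustration only. The sketch serializes an operator tree into a depth-first sequence of node tokens, each paired with a tree position (the path of child indices from the root), and then uses operator arities to check whether a token prefix is still a valid partial expression, which is one simple way a decoder could be constrained.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple


@dataclass
class Node:
    """One node of an operator tree (hypothetical helper, not from the paper)."""
    symbol: str
    children: List["Node"] = field(default_factory=list)


def linearize(node: Node, path: Tuple[int, ...] = ()) -> List[Tuple[str, Tuple[int, ...]]]:
    """Depth-first traversal returning (symbol, tree position) pairs.

    The tree position is the tuple of child indices leading from the root
    to the node; the root has the empty tuple.
    """
    out = [(node.symbol, path)]
    for i, child in enumerate(node.children):
        out.extend(linearize(child, path + (i,)))
    return out


def open_slots(symbols: Sequence[str], arity: Dict[str, int]) -> int:
    """Count child slots still unfilled after reading a prefix of node tokens.

    Assumes every symbol missing from `arity` is a leaf (arity 0).
    A result of 0 means the prefix is a complete expression.
    """
    slots = 1  # the root slot
    for s in symbols:
        if slots == 0:
            raise ValueError("symbol after a complete expression")
        slots += arity.get(s, 0) - 1  # fill one slot, open `arity` new ones
    return slots


# Operator tree for x^2 + 1
tree = Node("+", [Node("^", [Node("x"), Node("2")]), Node("1")])
tokens = linearize(tree)

ARITY = {"+": 2, "^": 2}  # assumed arities for this example only
symbols = [sym for sym, _ in tokens]
complete = open_slots(symbols, ARITY) == 0
```

In a constrained decoder built along these lines, an end-of-expression token would only be allowed once `open_slots` reaches 0, and further node tokens would be disallowed after that point.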
Pages: 3714-3730
Page count: 17