Exploring Neural Translation Models for Cross-Lingual Text Similarity

Cited by: 2
Authors
Seki, Kazuhiro [1]
Affiliations
[1] Konan Univ, Kobe, Hyogo, Japan
Keywords
Sequence-to-sequence models; distributed representation; cross-lingual information retrieval
DOI
10.1145/3269206.3269262
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper explores a neural network-based approach to computing similarity of two texts written in different languages. Such similarity can be useful for a variety of applications including cross-lingual information retrieval and cross-lingual text classification. To compute similarity, we focus on neural machine translation models and examine the utility of their intermediate states. Through experiments on an English-Japanese translation corpus, it is demonstrated that the intermediate states of input texts are indeed beneficial for computing cross-lingual text similarity, outperforming other approaches including a strong machine translation-based baseline.
Pages: 1591-1594
Number of pages: 4