Disfluency Correction using Unsupervised and Semi-supervised Learning

Cited by: 0
Authors
Saini, Nikhil [1]
Trivedi, Drumil [1]
Khare, Shreya [2]
Dhamecha, Tejas I. [2]
Jyothi, Preethi [1]
Bharadwaj, Samarth [2]
Bhattacharyya, Pushpak [1]
Affiliations
[1] Indian Inst Technol, Mumbai, Maharashtra, India
[2] IBM Res India, Bengaluru, India
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Spoken language differs from written language in style and structure. Disfluencies that appear in transcriptions from speech recognition systems generally hamper the performance of downstream NLP tasks, so a disfluency correction system that converts disfluent text to fluent text is of great value. This paper introduces a disfluency correction model that translates disfluent text to fluent text, drawing inspiration from recent encoder-decoder unsupervised style-transfer models for text. We also show considerable performance gains from using a small sample of 500 parallel disfluent-fluent sentences in a semi-supervised way. Our unsupervised approach achieves a BLEU score of 79.39 on the Switchboard corpus test set, which improves to 85.28 with semi-supervision. Both are comparable to two competitive fully supervised models.
Pages: 3421-3427
Page count: 7