WEB-DERIVED PRONUNCIATIONS

Cited by: 5
Authors
Ghoshal, Arnab [1]
Jansche, Martin [2]
Khudanpur, Sanjeev [1]
Riley, Michael [2]
Ulinski, Morgan [3]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
[2] Google Inc, New York, NY 10011 USA
[3] Cornell Univ, Ithaca, NY 14853 USA
Keywords
Speech processing;
DOI
10.1109/ICASSP.2009.4960577
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
Pronunciation information is available in large quantities on the Web, in the form of IPA and ad-hoc transcriptions. We describe techniques for extracting candidate pronunciations from Web pages and associating them with orthographic words, filtering out poorly extracted pronunciations, normalizing IPA pronunciations to better conform to a common transcription standard, and generating phonemic transcriptions from ad-hoc ones. We show improvements on a letter-to-phoneme task when using Web-derived rather than Pronlex pronunciations.
Pages: 4289 / +
Number of pages: 2