MYCanCor: A Video Corpus of spoken Malaysian Cantonese

Citations: 0
Author
Liesenfeld, Andreas [1]
Affiliation
[1] Nanyang Technol Univ, Singapore, Singapore
Keywords
Malaysian Cantonese; spoken corpora; naturally-occurring talk-in-interaction
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Subject classification code
081203; 0835
Abstract
The Malaysia Cantonese Corpus (MYCanCor) is a collection of recordings of Malaysian Cantonese speech mainly collected in Perak, Malaysia. The corpus consists of around 20 hours of video recordings of spontaneous talk-in-interaction (56 settings) typically involving 2-4 speakers. A short scene description as well as basic speaker information is provided for each recording. The corpus is transcribed in CHAT (minCHAT) format and presented in traditional Chinese characters (UTF8) using the Hong Kong Supplementary Character Set (HKSCS). MYCanCor is expected to be a useful resource for researchers interested in any aspect of spoken language processing or Chinese multimodal corpora.
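The abstract states that the transcripts use the CHAT (minCHAT) format. As a rough illustration of what reading such a file could involve, here is a minimal sketch that relies only on general CHAT conventions: header lines start with `@`, main (speaker) tiers start with `*` followed by a speaker code, a colon, and a tab, dependent tiers start with `%`, and continuation lines start with a tab. The sample text, the speaker codes `S01` and `S02`, and the function name `parse_main_tiers` are hypothetical and are not taken from MYCanCor; the actual corpus files may differ in detail.

```python
import re
from collections import Counter

# Hypothetical sample in CHAT-style layout; NOT an excerpt from MYCanCor.
SAMPLE = (
    "@UTF8\n"
    "@Begin\n"
    "@Participants:\tS01 Speaker, S02 Speaker\n"
    "*S01:\t你食咗飯未 ?\n"
    "*S02:\t食咗喇 .\n"
    "%com:\tillustrative comment tier\n"
    "*S01:\t好 .\n"
    "@End\n"
)

# A main tier is "*" + speaker code + ":" + tab + utterance text.
MAIN_TIER = re.compile(r"^\*([^:\t]+):\t(.*)$")


def parse_main_tiers(text):
    """Return a list of (speaker, utterance) pairs from CHAT main tiers.

    Header (@) and dependent (%) tiers are skipped. A tab-indented line
    is treated as a continuation only when it directly follows a main tier.
    """
    turns = []
    in_main = False
    for line in text.splitlines():
        match = MAIN_TIER.match(line)
        if match:
            turns.append((match.group(1), match.group(2).strip()))
            in_main = True
        elif line.startswith("\t") and in_main:
            speaker, utterance = turns[-1]
            turns[-1] = (speaker, f"{utterance} {line.strip()}")
        else:
            in_main = False
    return turns


turns = parse_main_tiers(SAMPLE)
turns_per_speaker = Counter(speaker for speaker, _ in turns)
```

For real files, the same function could be applied to text read with `open(path, encoding="utf-8")`; characters from the HKSCS repertoire that are encoded in Unicode are ordinary UTF-8 text and need no special handling at this level.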
Pages: 764-767
Page count: 4
Related papers
  • [1] Automatic Word Segmentation for Spoken Cantonese
    Fung, Roxana
    Bigi, Brigitte
    2015 INTERNATIONAL CONFERENCE ORIENTAL COCOSDA HELD JOINTLY WITH 2015 CONFERENCE ON ASIAN SPOKEN LANGUAGE RESEARCH AND EVALUATION (O-COCOSDA/CASLRE), 2015, : 196 - 201
  • [2] Language Modeling for Speech Recognition of Spoken Cantonese
    Yeung, Yu Ting
    Cao, Houwei
    Zheng, N. H.
    Lee, Tan
    Ching, P. C.
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1570 - 1573
  • [3] Spoken language resources for Cantonese speech processing
    Lee, T
    Lo, WK
    Ching, PC
    Meng, H
    SPEECH COMMUNICATION, 2002, 36 (3-4) : 327 - 342
  • [4] Development of a Cantonese Dysarthric Speech Corpus
    Wong, Ka Ho
    Yeung, Yu Ting
    Chan, Edwin H. Y.
    Wong, Patrick C. M.
    Levow, Gina-Anne
    Meng, Helen
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 329 - 333
  • [5] The Wenzhou Spoken Corpus
    Newman, John
    Lin, Jingxia
    Butler, Terry
    Zhang, Eric
    CORPORA, 2007, 2 (01) : 97 - 109
  • [6] Phonological priming in Cantonese spoken-word processing
    Yip, MCW
    PSYCHOLOGIA, 2001, 44 (03) : 223 - 229
  • [7] Enriching Linguistic Representation in the Cantonese Wordnet and Building the New Cantonese Wordnet Corpus
    Sio, Joanna Ut-Seong
    da Costa, Luis Morgado
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 70 - 78
  • [8] Lexical tone in Cantonese spoken-word processing
    Cutler, A
    Chen, HC
    PERCEPTION & PSYCHOPHYSICS, 1997, 59 (02) : 165 - 179
  • [9] Lexical tone in Cantonese spoken-word processing
    Anne Cutler
    Hsuan-Chih Chen
    Perception & Psychophysics, 1997, 59 : 165 - 179
  • [10] Compounds, competition, and incremental word identification in spoken Cantonese
    Tsang, Cara
    Chambers, Craig G.
    Mozuraitis, Mindaugas
    LANGUAGE COGNITION AND NEUROSCIENCE, 2017, 32 (01) : 69 - 81