Speech Processing for Hindi Dialect Recognition

Cited by: 6
Authors
Sinha, Shweta [1]
Jain, Aruna [1]
Agrawal, Shyam S. [2]
Affiliations
[1] Birla Inst Technol, Ranchi, Bihar, India
[2] KIIT Coll Engn, Gurgaon, India
Keywords
Hindi Dialects; spectral features; prosodic features; Feed forward neural networks;
DOI
10.1007/978-3-319-04960-1_14
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, the authors use a 2-layer feed-forward neural network for Hindi dialect recognition. A dialect is a pattern of pronunciation of a language used by a community of native speakers belonging to the same geographical region. In this work, speech features are explored to recognize four major dialects of Hindi: Khariboli (spoken in West Uttar Pradesh, Delhi, and some parts of Uttarakhand and Himachal Pradesh), Bhojpuri (spoken in East Uttar Pradesh, Bihar, and Jharkhand), Haryanvi (spoken in Haryana and parts of Delhi, Uttar Pradesh, and Uttarakhand), and Bagheli (spoken in Central India). The speech corpus for this work was collected from 15 speakers (both male and female) of each dialect. Syllables of CVC (consonant-vowel-consonant) structure are used as the processing unit. Spectral features (MFCC) and prosodic features (duration and pitch contour) are extracted from the speech to discriminate between the dialects. System performance is observed with spectral features and prosodic features as input. The results show that the system performs best when all spectral and prosodic features are combined to form the input feature set during network training. With this combined feature set, the dialect recognition system achieves a recognition score of 79%.
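The pipeline the abstract describes (spectral and prosodic features combined into one input vector and fed to a 2-layer feed-forward network with four dialect outputs) can be illustrated with a minimal sketch. This is not the authors' implementation. The feature dimensions (13 MFCC coefficients, a 5-point pitch contour), the hidden-layer size, the tanh and softmax activations, the random placeholder values, and all function names are assumptions chosen for illustration, and the network is untrained.

```python
import math
import random

# The four dialects named in the abstract.
DIALECTS = ["Khariboli", "Bhojpuri", "Haryanvi", "Bagheli"]


def combine_features(mfcc, duration, pitch_contour):
    """Concatenate spectral (MFCC) and prosodic (duration, pitch) features."""
    return list(mfcc) + [duration] + list(pitch_contour)


def init_layer(n_in, n_out, rng):
    """Create one dense layer with small random weights and zero biases."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Apply an affine transform: one weighted sum plus bias per output unit."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden_layer, output_layer):
    """Forward pass: hidden layer with tanh, output layer with softmax."""
    hidden = [math.tanh(v) for v in dense(x, hidden_layer)]
    return softmax(dense(hidden, output_layer))


# Placeholder features for a single CVC syllable (hypothetical values,
# not data from the paper).
rng = random.Random(0)
mfcc = [rng.gauss(0.0, 1.0) for _ in range(13)]          # assumed 13 coefficients
duration = 0.25                                          # assumed, in seconds
pitch_contour = [rng.gauss(0.0, 1.0) for _ in range(5)]  # assumed 5 points

x = combine_features(mfcc, duration, pitch_contour)
hidden_layer = init_layer(len(x), 16, rng)               # assumed 16 hidden units
output_layer = init_layer(16, len(DIALECTS), rng)

probs = forward(x, hidden_layer, output_layer)
predicted = DIALECTS[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

Because the weights here are random, the printed prediction carries no meaning. In practice the weights would be fitted on labeled syllables from the corpus before the network is used for recognition.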
Pages: 161-169
Page count: 9