Automatic Estimation of Presentation Skills Using Speech, Slides and Gestures

Cited by: 3
Authors
Hanani, Abualsoud [1]
Al-Amleh, Mohammad [1]
Bazbus, Waseem [1]
Salameh, Saleem [1]
Affiliation
[1] Birzeit Univ, Birzeit, Palestine
Source
Keywords
Presentation skills; Audio features; Gesture; Slides features; Multi-Modality
DOI
10.1007/978-3-319-66429-3_17
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
This paper proposes a system that uses multimodal techniques to estimate oral presentation skills automatically. It is based on a set of features from three sources: audio, gestures, and PowerPoint slides. Machine learning techniques are used to classify each presentation into two classes (high vs. low quality) and into three classes (low, average, and high quality). Around 448 multimodal recordings from the MLA'14 dataset were used to train and evaluate three different 2-class and 3-class classifiers. Classifiers were evaluated on each feature type independently and on all features combined. The best 2-class accuracy is 90.1%, achieved by an SVM trained on audio features; the best 3-class accuracy is 75%, achieved by a random forest trained on slide features. Combining the three feature types into one vector improves the accuracy of all systems by around 5%.
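The feature-level fusion mentioned in the abstract (combining the three feature types into one vector) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the feature dimensions (4, 3, 2) are arbitrary, and a simple nearest-centroid classifier stands in for the SVM and random forest used in the paper. It is not the authors' implementation and does not reproduce their results.

```python
import random


def fuse(audio, gesture, slides):
    """Early fusion: concatenate per-modality features into one vector."""
    return list(audio) + list(gesture) + list(slides)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def fit(X, y):
    """Nearest-centroid stand-in classifier: one centroid per class label."""
    by_label = {}
    for vec, label in zip(X, y):
        by_label.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(model, vec):
    """Return the label whose centroid is closest (squared Euclidean)."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(model, key=lambda label: dist2(model[label], vec))


def accuracy(model, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(model, v) == t for v, t in zip(X, y))
    return hits / len(y)


def make_sample(rng, label):
    """Synthetic sample: every feature is Gaussian, mean shifted by label."""
    shift = float(label)  # 0 = low quality, 1 = high quality (hypothetical)
    audio = [rng.gauss(shift, 1.0) for _ in range(4)]
    gesture = [rng.gauss(shift, 1.0) for _ in range(3)]
    slides = [rng.gauss(shift, 1.0) for _ in range(2)]
    return audio, gesture, slides


rng = random.Random(0)
labels = [i % 2 for i in range(200)]
samples = [make_sample(rng, lab) for lab in labels]
fused = [fuse(a, g, s) for a, g, s in samples]
audio_only = [a for a, _, _ in samples]

# Hold out the last 50 samples for evaluation.
train, test = slice(0, 150), slice(150, 200)
for name, X in [("audio only", audio_only), ("fused", fused)]:
    model = fit(X[train], labels[train])
    print(name, round(accuracy(model, X[test], labels[test]), 3))
```

The same `fit` and `accuracy` calls work on a single modality or on the fused vectors, which mirrors the evaluation setup described in the abstract (each feature type independently, then all combined). A 3-class variant would only need labels drawn from three values instead of two.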
Pages: 182-191
Number of pages: 10