Automatic Estimation of Presentation Skills Using Speech, Slides and Gestures

Cited by: 3

Authors:

Hanani, Abualsoud [1]

Al-Amleh, Mohammad [1]

Bazbus, Waseem [1]

Salameh, Saleem [1]

Affiliations:

[1] Birzeit Univ, Birzeit, Palestine

Source:

SPEECH AND COMPUTER, SPECOM 2017 | 2017 / Vol. 10458

Keywords:

Presentation skills; Audio features; Gesture; Slides features; Multi-Modality;

DOI:

10.1007/978-3-319-66429-3_17

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper proposes a multimodal system for automatically estimating oral presentation skills. It is based on features from three sources: audio, gestures, and PowerPoint slides. Machine learning techniques are used to classify each presentation into two classes (high vs. low quality) and into three classes (low, average, and high quality). Around 448 multimodal recordings from the MLA'14 dataset were used to train and evaluate three different 2-class and 3-class classifiers. Classifiers were evaluated for each feature type independently and for all features combined. The best accuracy for the 2-class systems is 90.1%, achieved by an SVM trained on audio features; for the 3-class systems it is 75%, achieved by a random forest trained on slide features. Combining the three feature types into one vector improves the accuracy of all systems by around 5%.


Pages: 182-191

Page count: 10
