Automatic Estimation of Presentation Skills Using Speech, Slides and Gestures

Cited by: 3
Authors
Hanani, Abualsoud [1 ]
Al-Amleh, Mohammad [1 ]
Bazbus, Waseem [1 ]
Salameh, Saleem [1 ]
Affiliations
[1] Birzeit Univ, Birzeit, Palestine
Keywords
Presentation skills; audio features; gesture; slides features; multi-modality
DOI
10.1007/978-3-319-66429-3_17
CLC number: O42 [Acoustics]
Discipline codes: 070206; 082403
Abstract
This paper proposes an automatic multimodal system for estimating oral presentation skills. It is based on a set of features from three sources: audio, gestures, and PowerPoint slides. Machine learning techniques are used to classify each presentation into two classes (high vs. low quality) and into three classes (low, average, and high quality). Around 448 multimodal recordings from the MLA'14 dataset were used to train and evaluate three different 2-class and 3-class classifiers. The classifiers were evaluated on each feature type independently and on all features combined. The best 2-class accuracy is 90.1%, achieved by an SVM trained on audio features; the best 3-class accuracy is 75%, achieved by a random forest trained on slides features. Combining the three feature types into one vector improves the accuracy of all systems by around 5%.
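The early-fusion setup described in the abstract (per-modality feature vectors concatenated into one vector, then fed to an SVM or a random forest) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature dimensionalities and the labels below are synthetic placeholders, and scikit-learn's `SVC` and `RandomForestClassifier` merely stand in for the paper's classifiers.

```python
# Hypothetical sketch of the multimodal pipeline: concatenate audio, gesture,
# and slide features per presentation, then train 2-class and 3-class models.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 448  # number of multimodal recordings, as in the MLA'14 dataset

# Placeholder feature matrices; real dimensionalities are not given here.
audio = rng.normal(size=(n, 20))
gesture = rng.normal(size=(n, 10))
slides = rng.normal(size=(n, 8))
X = np.hstack([audio, gesture, slides])  # early fusion into one vector

y2 = rng.integers(0, 2, size=n)  # 2-class labels: low vs. high quality
y3 = rng.integers(0, 3, size=n)  # 3-class labels: low / average / high

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
rf = RandomForestClassifier(n_estimators=200, random_state=0)

# 5-fold cross-validated accuracy for each task.
acc2 = cross_val_score(svm, X, y2, cv=5).mean()
acc3 = cross_val_score(rf, X, y3, cv=5).mean()
print(f"2-class SVM accuracy: {acc2:.3f}")
print(f"3-class RF accuracy:  {acc3:.3f}")
```

With the random placeholder labels above, both accuracies hover near chance; the point of the sketch is the fusion-then-classify structure, which mirrors the paper's finding that concatenating the three feature types improves every classifier.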
Pages: 182-191 (10 pages)