An Open Source Software System For Robot Audition HARK and Its Evaluation

Authors
Nakadai, Kazuhiro
Okuno, Hiroshi G.
Nakajima, Hirofumi
Hasegawa, Yuji
Tsujino, Hiroshi
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Robot audition, a robot's ability to listen to several things at once with its own ears, is important for improving human-robot interaction. The critical issue in robot audition is real-time processing in noisy environments, with enough flexibility to support various kinds of robots and hardware configurations. This paper presents open-source robot audition software called "HARK", which includes sound source localization, sound source separation, and automatic speech recognition (ASR). Because separation introduces spectral distortion into the separated sounds, HARK generates a time-frequency map of reliability, called a "missing feature mask", for the features of the separated sounds. The separated sounds are then recognized by ASR based on Missing-Feature Theory (MFT) using these masks. HARK is implemented on middleware called "FlowDesigner" so that intermediate audio data can be shared, which enables real-time processing. HARK's performance in recognizing noisy and simultaneous speech is demonstrated on three humanoid robots with different microphone layouts: Honda ASIMO, SIG2, and Robovie.
Pages: 709-714
Page count: 6