Visual-Auditory saliency detection using event-driven visual sensors

Cited: 0
Authors
Akolkar, Himanshu [1 ]
Valeiras, David Reverter [2 ]
Benosman, Ryad [2 ]
Bartolozzi, Chiara [1 ]
Affiliations
[1] Ist Italiano Tecnol, ICub Facil, I-16163 Genoa, Italy
[2] Univ Paris 06, Vis Inst, F-75012 Paris, France
Keywords
DOI: not available
CLC classification: TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline codes: 0808; 0809
Abstract
This paper presents a novel architecture for audiovisual saliency detection using event-based visual sensors and traditional microphones installed on the head of a humanoid robot. In the context of collision detection, salient sensory events must be detected at the same time in the visual and auditory domains. Real collisions in the visual space can be distinguished from apparent ones (e.g. due to movements of two objects that occlude each other) because they generate a sound at the time of collision. This temporal coincidence is extremely difficult to detect with frame-based sensors, which intrinsically add a fixed delay to the sensory acquisition or can miss the collision altogether. The high temporal resolution of event-driven vision sensors, together with a real-time clustering and tracking algorithm, allows the detection of potential collisions with very low latency. Auditory events corresponding to collisions are detected using simple spectral analysis of auditory signals. Visual events can therefore be temporally integrated with coherently occurring auditory events to detect fast transitions and disentangle real collisions from visual or auditory events that do not correspond to one. The proposed audiovisual collision detection is used in the context of human-robot interaction, to detect people clapping in front of the robot and orient its gaze toward the perceived collision.
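The coincidence logic the abstract describes can be sketched minimally: detect auditory onsets, then keep only visual collision candidates that co-occur with one. Everything below (the energy-threshold onset detector, the 10 ms window, all function names) is an illustrative assumption, not a detail taken from the paper:

```python
import numpy as np

def audio_onsets(signal, rate, frame=256, threshold=0.1):
    # Hypothetical energy-based onset detector (stand-in for the paper's
    # spectral analysis): report times where short-time energy first
    # exceeds `threshold`.
    n = len(signal) // frame
    energy = [np.sum(signal[i * frame:(i + 1) * frame] ** 2) for i in range(n)]
    onsets = []
    for i in range(1, n):
        if energy[i] > threshold and energy[i - 1] <= threshold:
            onsets.append(i * frame / rate)  # onset time in seconds
    return onsets

def real_collisions(visual_ts, audio_ts, window=0.010):
    # Keep only visual collision candidates (timestamps in seconds) that
    # coincide, within `window` seconds, with an auditory onset (e.g. a clap).
    audio_ts = sorted(audio_ts)
    out, i = [], 0
    for t_v in sorted(visual_ts):
        # Advance past auditory onsets too far in the past.
        while i < len(audio_ts) and audio_ts[i] < t_v - window:
            i += 1
        if i < len(audio_ts) and abs(audio_ts[i] - t_v) <= window:
            out.append(t_v)
    return out
```

For example, `real_collisions([0.100, 0.500], [0.103])` keeps only the 0.100 s candidate: the 0.500 s event has no coincident sound, so it would be treated as an occlusion rather than a real collision.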
Pages: 6