Listening with generative models

Cited by: 0
Authors
Cusimano, Maddie [1]
Hewitt, Luke B. [1]
McDermott, Josh H. [1, 2, 3, 4]
Affiliations
[1] MIT, Dept Brain & Cognit Sci, Cambridge, MA 02139 USA
[2] MIT, McGovern Inst, Cambridge, MA USA
[3] MIT, Ctr Brains Minds & Machines, Cambridge, MA USA
[4] Harvard Univ, Speech & Hearing Biosci & Technol Program, Cambridge, MA USA
Keywords
Auditory scene analysis; Bayesian inference; Illusions; Grouping; Perceptual organization; Natural sounds; Probabilistic program; World model; Perception; COCKTAIL PARTY; GESTALT PSYCHOLOGY; NEWBORN-INFANTS; SOUND SOURCES; PERCEPTION; SPEECH; SEPARATION; ORGANIZATION; STATISTICS; STREAM;
DOI
10.1016/j.cognition.2024.105874
Chinese Library Classification
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Perception has long been envisioned to use an internal model of the world to explain the causes of sensory signals. However, such accounts have historically not been testable, typically requiring intractable search through the space of possible explanations. Using auditory scenes as a case study, we leveraged contemporary computational tools to infer explanations of sounds in a candidate internal generative model of the auditory world (ecologically inspired audio synthesizers). Model inferences accounted for many classic illusions. Unlike traditional accounts of auditory illusions, the model is applicable to any sound, and exhibited human-like perceptual organization for real-world sound mixtures. The combination of stimulus-computability and interpretable model structure enabled 'rich falsification', revealing additional assumptions about sound generation needed to account for perception. The results show how generative models can account for the perception of both classic illusions and everyday sensory signals, and illustrate the opportunities and challenges involved in incorporating them into theories of perception.
Pages: 64