Human-like and superhuman perception in the AI 2.0 era: a research review and outlook on trends (in English)

Cited by: 11

Authors:

Yong-hong TIAN ^{[1]}

Xi-lin CHEN ^{[2]}

Hong-kai XIONG ^{[3]}

Hong-liang LI ^{[4]}

Li-rong DAI ^{[5]}

Jing CHEN ^{[1]}

Jun-liang XING ^{[6]}

Xi-hong WU ^{[1]}

Wei-min HU ^{[6]}

Yu HU ^{[5]}

Tie-jun HUANG ^{[1]}

Wen GAO ^{[1]}

Affiliations:

[1] School of Electronics Engineering and Computer Science, Peking University

[2] Institute of Computing Technology, Chinese Academy of Sciences

[3] Department of Electronic Engineering, Shanghai Jiao Tong University

[4] School of Electronic Engineering, University of Electronic Science and Technology of China

[5] Department of Electronic Engineering and Information Sciences, University of Science and Technology of China

[6] Institute of Automation, Chinese Academy of Sciences

[7] School of Optoelectronics, Beijing Institute of Technology

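Source:

Frontiers of Information Technology & Electronic Engineering | 2017 / Vol. 18 / No. 01

Keywords:

Intelligent perception; active vision; auditory perception; speech perception; autonomous learning

DOI:

Not available

Chinese Library Classification (CLC) number:

R338 [Neurophysiology]; TP18 [Artificial intelligence theory]

Discipline classification codes:

0710; 071006; 081104; 0812; 0835; 1405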

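Abstract:

Perception is the interface through which an intelligent system interacts with the real world. Without sophisticated and flexible perceptual capabilities, it is impossible to create advanced artificial intelligence (AI) systems. Recently, Academician Pan Yunhe proposed the concept of AI 2.0, whose most important feature is that future AI systems should possess human-like or even superhuman intelligent perception. This paper briefly reviews the state of research in several areas of intelligent perception, including visual perception, auditory perception, speech perception, and perceptual information processing and learning engines. On this basis, the paper looks ahead to the key directions that call for intensive research and development in intelligent perception in the coming AI 2.0 era: (1) human-like and superhuman active vision; (2) auditory perception of natural acoustic scenes; (3) speech perception and computation in natural interaction environments; (4) autonomous learning for media perception; (5) large-scale perceptual information processing and learning engines; (6) an urban omni-dimensional intelligent perception and reasoning engine. These directions should be given priority in future AI 2.0 research planning.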

References

Pages: 58-68

Number of pages: 11

50 entries in total

[1] WaveNet: a generative model for raw audio. Oord, A., Dieleman, S., Zen, H., et al. 2016
[2] Enhanced computer vision with Microsoft Kinect sensor: a review. Han, J., Shao, L., Xu, D., et al. IEEE Transactions on Systems Man and Cybernetics. 2013
[3] Coded time of flight cameras [J]. Achuta Kadambi, Refael Whyte, Ayush Bhandari, Lee Streeter, Christopher Barsi, Adrian Dorrington, Ramesh Raskar. ACM Transactions on Graphics (TOG). 2013 (6)
[4] A survey of urban reconstruction [J]. Musialski, P., Wonka, P., Aliaga, D. G., Wimmer, M., van Gool, L., Purgathofer, W. Computer Graphics Forum, 2013, 32 (06): 146-177
[5] Speech recognition in adverse conditions: a review [J]. Sven L. Mattys, Matthew H. Davis, Ann R. Bradlow, Sophie K. Scott. Language and Cognitive Processes. 2012 (7-8)
[6] Long short-term memory [J]. Hochreiter, S., Schmidhuber, J. Neural Computation, 1997, 9 (08): 1735-1780
[7] Speech recognition by machines and humans [J]. Lippmann, R. P. Speech Communication, 1997, 22 (01): 1-15
[8] Speech synthesis based on hidden Markov models. Tokuda, K., Nankaku, Y., Toda, T., et al. Proceedings of the IEEE. 2013
[9] Sequence-discriminative training of deep neural networks. Vesely, K., Ghoshal, A., Burget, L., Povey, D. Interspeech. 2013
[10] Person reidentification: past, present and future. Zheng, L., Yang, Y., Hauptmann, A. G. 2016
