End-to-End Deep Neural Network Age Estimation

Cited by: 31

Authors:

Ghahremani, Pegah [1]

Nidadavolu, Phani Sankar [1]

Chen, Nanxin [1]

Villalba, Jesus [1]

Povey, Daniel [1,2]

Khudanpur, Sanjeev [1,2]

Dehak, Najim [1]

Affiliations:

[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA

[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD USA

Source:

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018

Keywords:

Age identification; x-vector; i-vector

DOI:

10.21437/Interspeech.2018-2015

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

In this paper, we apply the recently proposed x-vector neural network architecture to the task of age estimation. This architecture maps a variable-length utterance into a fixed-dimensional embedding that retains the relevant sequence-level information. This is achieved by a temporal pooling layer. From the embedding, a series of layers is applied to make predictions. The full network is trained end-to-end in a discriminative fashion. This kind of network is starting to outperform the state-of-the-art i-vector embeddings in tasks like speaker and language recognition. Motivated by this, we investigated the optimum way to train x-vectors for the age estimation task. Although a regression objective is typical for this task, we found that optimizing a mixture of classification and regression losses provides better results. We trained our models on the NIST SRE08 dataset and evaluated on SRE10. The proposed approach improved mean absolute error (MAE) by 12% relative to the i-vector baseline.
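
The three ideas named in the abstract (temporal pooling to a fixed-size embedding, a mixed classification and regression loss, and MAE as the metric) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The choice of mean and standard deviation as pooling statistics, the treatment of each integer age as one class, the squared-error regression term, and the mixing weight `alpha` are all assumptions made here for illustration.

```python
import math


def stats_pooling(frames):
    """Pool a variable-length list of D-dim frames into a fixed 2*D vector.

    Assumption: the pooled statistics are the per-dimension mean and
    (population) standard deviation over time.
    """
    n_frames = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n_frames for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n_frames)
        for d in range(dim)
    ]
    return means + stds


def mixed_loss(class_logits, age_pred, age_true, alpha=0.5):
    """Weighted sum of a classification loss and a regression loss.

    Assumptions: class index k stands for integer age k, the regression
    term is squared error, and alpha is a hypothetical mixing weight.
    """
    # Cross-entropy via a numerically stable log-sum-exp.
    peak = max(class_logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in class_logits))
    cross_entropy = log_sum_exp - class_logits[int(age_true)]
    squared_error = (age_pred - age_true) ** 2
    return alpha * cross_entropy + (1.0 - alpha) * squared_error


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true ages."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

Utterances of different lengths pool to vectors of the same size: `stats_pooling` returns `2 * D` values whether it receives two frames or two hundred. Setting `alpha=0.0` reduces `mixed_loss` to pure regression, and `alpha=1.0` to pure classification.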


Pages: 277-281

Page count: 5
