Rate-Invariant Analysis of Trajectories on Riemannian Manifolds with Application in Visual Speech Recognition

Cited by: 20
Authors
Su, Jingyong [1]
Srivastava, Anuj [2]
de Souza, Fillipe D. M. [3]
Sarkar, Sudeep [3]
Affiliations
[1] Texas Tech Univ, Dept Math & Stat, Lubbock, TX 79409 USA
[2] Florida State Univ, Dept Stat, Tallahassee, FL 32306 USA
[3] Univ S Florida, Dept Comp Sci & Engn, Tampa, FL 33620 USA
DOI
10.1109/CVPR.2014.86
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In statistical analysis of video sequences for speech recognition, and more generally activity recognition, it is natural to treat temporal evolutions of features as trajectories on Riemannian manifolds. However, different evolution patterns result in arbitrary parameterizations of these trajectories. We investigate a recent framework from the statistics literature [15] that handles this nuisance variability using a cost function/distance for temporal registration and for statistical summarization and modeling of trajectories. It is based on a mathematical representation of trajectories, termed the transported square-root vector field (TSRVF), and the L² norm on the space of TSRVFs. We apply this framework to the problem of speech recognition using both audio and visual components. In each case, we extract features, form trajectories on the corresponding manifolds, and compute parameterization-invariant distances using TSRVFs for speech classification. On the OuluVS database, the classification performance under this metric increases significantly, by nearly 100% under both modalities and for all choices of features. We obtained speaker-dependent classification rates of 70% and 96% for the visual and audio components, respectively.
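The role of the L² norm in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and it makes a simplifying assumption: the trajectories live in flat Euclidean space, where parallel transport is the identity and the TSRVF reduces to the ordinary square-root velocity field, velocity divided by the square root of speed. The function names, test curves, and warp `gamma` are hypothetical. The sketch checks numerically that applying the same time warp to two trajectories leaves the L² distance between their representations unchanged. This is the property that makes such a distance usable for temporal registration. The search for the optimal warp is omitted.

```python
import numpy as np

def srvf(curve, t):
    """Square-root velocity field of a sampled curve in R^n.

    Flat-space stand-in for the TSRVF (assumption: no parallel
    transport is needed because the space is Euclidean).
    """
    vel = np.gradient(curve, t, axis=0)            # finite-difference velocity
    speed = np.linalg.norm(vel, axis=1)
    return vel / np.sqrt(np.maximum(speed, 1e-12))[:, None]

def l2_distance(q1, q2, t):
    """L2 distance between two sampled vector fields (trapezoid rule)."""
    f = np.sum((q1 - q2) ** 2, axis=1)
    return np.sqrt(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t)))

# Two hypothetical trajectories in the plane, parameterized on [0, 1].
def alpha1(s):
    return np.stack([np.cos(np.pi * s), np.sin(np.pi * s)], axis=1)

def alpha2(s):
    return np.stack([s, s ** 2], axis=1)

t = np.linspace(0.0, 1.0, 2001)
gamma = 0.5 * t + 0.5 * t ** 2   # a time warp: gamma(0)=0, gamma(1)=1, increasing

# Distance between the two trajectories as originally parameterized.
d_before = l2_distance(srvf(alpha1(t), t), srvf(alpha2(t), t), t)

# Distance after both trajectories are re-timed by the same warp.
d_after = l2_distance(srvf(alpha1(gamma), t), srvf(alpha2(gamma), t), t)

# Same path, different execution rate: nonzero distance without registration.
d_same_path = l2_distance(srvf(alpha1(t), t), srvf(alpha1(gamma), t), t)

print(f"before warp: {d_before:.4f}  after common warp: {d_after:.4f}")
print(f"same path, unregistered: {d_same_path:.4f}")
```

The last quantity shows the nuisance variability the abstract refers to: one path traversed at two rates is a nonzero distance apart until the trajectories are registered.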
Pages: 620-627
Number of pages: 8
Related papers
  • [1] Rate-Invariant Comparisons of Covariance Paths for Visual Speech Recognition
    Su, Jingyong
    Srivastava, Anuj
    Souza, Fillipe
    Sarkar, Sudeep
    [J]. 2013 FOURTH NATIONAL CONFERENCE ON COMPUTER VISION, PATTERN RECOGNITION, IMAGE PROCESSING AND GRAPHICS (NCVPRIPG), 2013,
  • [2] Rate-Invariant Analysis of Covariance Trajectories
    Zhengwu Zhang
    Jingyong Su
    Eric Klassen
    Huiling Le
    Anuj Srivastava
    [J]. Journal of Mathematical Imaging and Vision, 2018, 60 : 1306 - 1323
  • [3] Rate-Invariant Analysis of Covariance Trajectories
    Zhang, Zhengwu
    Su, Jingyong
    Klassen, Eric
    Le, Huiling
    Srivastava, Anuj
    [J]. JOURNAL OF MATHEMATICAL IMAGING AND VISION, 2018, 60 (08) : 1306 - 1323
  • [4] Action Recognition Using Rate-Invariant Analysis of Skeletal Shape Trajectories
    Ben Amor, Boulbaba
    Su, Jingyong
    Srivastava, Anuj
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (01) : 1 - 13
  • [5] Rate-Invariant Recognition of Humans and Their Activities
    Veeraraghavan, Ashok
    Srivastava, Anuj
    Roy-Chowdhury, Amit K.
    Chellappa, Rama
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2009, 18 (06) : 1326 - 1339
  • [6] Human object interaction recognition using rate-invariant shape analysis of inter joint distances trajectories
    Meng, Meng
    Drira, Hassen
    Daoudi, Mohamed
    Boonaert, Jacques
    [J]. PROCEEDINGS OF 29TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, (CVPRW 2016), 2016, : 999 - 1004
  • [7] Rate-Invariant Modeling in Lie Algebra for Activity Recognition
    Boujebli, Malek
    Drira, Hassen
    Mestiri, Makram
    Farah, Imed Riadh
    [J]. ELECTRONICS, 2020, 9 (11) : 1 - 16
  • [8] Kernel-Based Subspace Learning on Riemannian Manifolds for Visual Recognition
    Liu, Xi
    Ma, Zhengming
    [J]. NEURAL PROCESSING LETTERS, 2020, 51 (01) : 147 - 165
  • [9] Kernel-Based Subspace Learning on Riemannian Manifolds for Visual Recognition
    Xi Liu
    Zhengming Ma
    [J]. Neural Processing Letters, 2020, 51 : 147 - 165
  • [10] AFFINE INVARIANT FEATURES AND THEIR APPLICATION TO SPEECH RECOGNITION
    Qiao, Yu
    Suzuki, Masayuki
    Minematsu, Nobuaki
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4629 - 4632