An Intelligent Retrieval Method for Audio and Video Content: Deep Learning Technology Based on Artificial Intelligence

Cited by: 0
Author
Sun, Maojin [1]
Affiliation
[1] CEICloud Data Storage Technol Beijing Co Ltd, Beijing 101111, Peoples R China
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Feature extraction; Accuracy; 5G mobile communication; Deep learning; Visualization; Audio-visual systems; Information retrieval; Information systems; Audio-video content retrieval; deep learning; feature extraction; cross-modal retrieval; intelligent retrieval;
DOI
10.1109/ACCESS.2024.3450920
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
To address the challenges of efficient intelligent retrieval and cross-modal analysis brought by the surge in audio-video data, this study proposes an intelligent retrieval method for audio-video content based on deep learning techniques, aimed at improving retrieval efficiency and accuracy. This method extracts audio features using the Visual Geometry Group Network (VGG) and employs an adaptive clustering keyframe extraction algorithm (SKM) to extract video features. By integrating cross-learning within an embedding network, it enhances retrieval efficiency and accuracy. The test results on the CMU-MOSEI dataset demonstrate that our method outperforms traditional models such as Principal Component Analysis (PCA), Canonical Correlation Analysis (CCA), and state-of-the-art deep learning models like Deep Canonical Correlation Analysis (DCCA) and Domain-Adversarial Neural Network (DANN) in multimodal data processing and real-world retrieval tasks. In video processing, the average fidelity is 0.693, and the average compression ratio is 0.936, representing improvements of 30.75% and 7.09%, respectively, compared to traditional methods. Through the application of deep learning technology, this study not only optimizes the processing of single modalities but also enhances the handling of cross-modal data through a cross-learning framework.
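The abstract names an adaptive clustering keyframe extraction algorithm (SKM) but gives no details of it. As a rough illustration of the general idea only, the sketch below clusters per-frame feature vectors with plain k-means and a fixed number of clusters, then keeps the frame nearest each centroid as a keyframe. This is a swapped-in simplification, not the adaptive SKM algorithm from the paper. The function names `kmeans_keyframes` and `compression_ratio` are hypothetical, and the compression ratio is assumed to follow the common definition of 1 minus the fraction of frames retained, which the abstract does not confirm.

```python
from math import dist


def _assign(features, centroids):
    """Group frame indices by their nearest centroid (Euclidean distance)."""
    clusters = [[] for _ in centroids]
    for idx, vec in enumerate(features):
        nearest = min(range(len(centroids)), key=lambda c: dist(vec, centroids[c]))
        clusters[nearest].append(idx)
    return clusters


def kmeans_keyframes(features, k, iterations=20):
    """Return sorted keyframe indices, one per non-empty k-means cluster.

    Assumption: `features` is a list of equal-length numeric vectors,
    one per video frame. A fixed k is used here; the paper's method
    is described as adaptive.
    """
    n = len(features)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of frames")
    dims = len(features[0])
    # Deterministic start: evenly spaced frames act as initial centroids.
    centroids = [list(features[(i * n) // k]) for i in range(k)]
    for _ in range(iterations):
        clusters = _assign(features, centroids)
        updated = []
        for c, members in enumerate(clusters):
            if not members:
                # Keep the old centroid when a cluster loses all members.
                updated.append(centroids[c])
                continue
            updated.append(
                [sum(features[i][d] for i in members) / len(members) for d in range(dims)]
            )
        if updated == centroids:
            break
        centroids = updated
    # Final assignment against the final centroids.
    clusters = _assign(features, centroids)
    keyframes = [
        min(members, key=lambda i: dist(features[i], centroids[c]))
        for c, members in enumerate(clusters)
        if members
    ]
    return sorted(keyframes)


def compression_ratio(num_keyframes, num_frames):
    """Assumed definition: fraction of frames discarded by the summary."""
    return 1.0 - num_keyframes / num_frames
```

Under this assumed definition, a reported average compression ratio of 0.936 would correspond to retaining roughly 6.4% of the frames as keyframes. In practice the per-frame vectors would come from a feature extractor, and an adaptive method would also choose the number of clusters from the data.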
Pages: 123430 - 123446
Number of pages: 17
Related papers
50 in total
  • [1] Intelligent Retrieval Method for Multimedia Digital Audio Based on Deep Learning
    Zhang S.
    Lin Y.
    Chen L.
    Jiang C.
    [J]. Journal of Engineering Science and Technology Review, 2023, 16 (06) : 195 - 201
  • [2] Research on Educational Video Retrieval Method Based on Audio Transcription Technology
    Zhao, Muqiang
    Zheng, Wenxi
    Ye, Yan
    Wu, Min
    [J]. PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MECHANICAL, ELECTRONIC, CONTROL AND AUTOMATION ENGINEERING (MECAE 2018), 2018, 149 : 384 - 388
  • [3] INTELLIGENT ROUTE PLANNING METHOD FOR UAV BASED ON SWARM INTELLIGENCE AND DEEP LEARNING TECHNOLOGY
    Yang, Jian
    Huang, Xuejun
    [J]. COMPUTING AND INFORMATICS, 2024, 43 (04) : 874 - 899
  • [4] APPLICATION OF ARTIFICIAL INTELLIGENCE TECHNOLOGY AND DEEP LEARNING IN LABORATORY INTELLIGENT MANAGEMENT PLATFORM
    Lu, Xing
    [J]. SCALABLE COMPUTING-PRACTICE AND EXPERIENCE, 2024, 25 (05) : 3251 - 3258
  • [5] Face Detection in Security Monitoring Based on Artificial Intelligence Video Retrieval Technology
    Dong, Zuolin
    Wei, Jiahong
    Chen, Xiaoyu
    Zheng, Pengfei
    [J]. IEEE ACCESS, 2020, 8 (08) : 63421 - 63433
  • [6] Research on Big Data Artificial Intelligence Technology Based on Deep Learning
    Mei, Guang
    [J]. PROCEEDINGS OF THE WORLD CONFERENCE ON INTELLIGENT AND 3-D TECHNOLOGIES, WCI3DT 2022, 2023, 323 : 243 - 250
  • [7] Deep learning for content-based video retrieval in film and television production
    Muehling, Markus
    Korfhage, Nikolaus
    Mueller, Eric
    Otto, Christian
    Springstein, Matthias
    Langelage, Thomas
    Veith, Uli
    Ewerth, Ralph
    Freisleben, Bernd
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (21) : 22169 - 22194
  • [8] Deep learning for content-based video retrieval in film and television production
    Markus Mühling
    Nikolaus Korfhage
    Eric Müller
    Christian Otto
    Matthias Springstein
    Thomas Langelage
    Uli Veith
    Ralph Ewerth
    Bernd Freisleben
    [J]. Multimedia Tools and Applications, 2017, 76 : 22169 - 22194
  • [9] Intelligent Tutoring System, Based on Video E-learning, for Teaching Artificial Intelligence
    Bailon, Antonio
    Fajardo, Waldo
    Molina-Solana, Miguel
    [J]. TRENDS IN PRACTICAL APPLICATIONS OF AGENTS, MULTI-AGENT SYSTEMS AND SUSTAINABILITY: THE PAAMS COLLECTION, 2015, 372 : 215 - 224
  • [10] Deep Learning-Based Video Retrieval Using Object Relationships and Associated Audio Classes
    Kim, Byoungjun
    Shim, Ji Yea
    Park, Minho
    Ro, Yong Man
    [J]. MULTIMEDIA MODELING (MMM 2020), PT II, 2020, 11962 : 803 - 808