Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation

Cited by: 0

Authors:

Feng, Yifan [1]

Huang, Jiangang [2]

Du, Shaoyi [2]

Ying, Shihui [3]

Yong, Jun-Hai [1]

Li, Yipeng [4]

Ding, Guiguang [1]

Ji, Rongrong [5]

Gao, Yue [1]

Affiliations:

[1] Tsinghua Univ, Sch Software, BLBCI, THUIBCS, BNRist, Beijing 100084, Peoples R China

[2] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Coll Artificial Intelligence, Xian 710049, Peoples R China

[3] Shanghai Univ, Sch Sci, Dept Math, Shanghai 200444, Peoples R China

[4] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China

[5] Xiamen Univ, Media Analyt & Comp Lab, Dept Artificial Intelligence, Sch Informat Inst Artificial Intelligence, Fujian, Xiamen 361005, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2025 / Vol. 47 / No. 04

Funding:

Beijing Natural Science Foundation; National Natural Science Foundation of China;

Keywords:

Feature extraction; YOLO; Semantics; Neck; Computational modeling; Visualization; Correlation; Computer vision; Scattering; Electronic mail; Hypergraph; hypergraph computation; hypergraph neural networks; object detection;

DOI:

10.1109/TPAMI.2024.3524377

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We introduce Hyper-YOLO, a new object detection method that integrates hypergraph computations to capture the complex high-order correlations among visual features. Traditional YOLO models, while powerful, have limitations in their neck designs that restrict the integration of cross-level features and the exploitation of high-order feature interrelationships. To address these challenges, we propose the Hypergraph Computation Empowered Semantic Collecting and Scattering (HGC-SCS) framework, which transposes visual feature maps into a semantic space and constructs a hypergraph for high-order message propagation. This enables the model to acquire both semantic and structural information, advancing beyond conventional feature-focused learning. Hyper-YOLO incorporates the proposed Mixed Aggregation Network (MANet) in its backbone for enhanced feature extraction and introduces the Hypergraph-Based Cross-Level and Cross-Position Representation Network (HyperC2Net) in its neck. HyperC2Net operates across five scales and breaks free from traditional grid structures, allowing for sophisticated high-order interactions across levels and positions. This synergy of components positions Hyper-YOLO as a state-of-the-art architecture in various scale models, as evidenced by its superior performance on the COCO dataset. Specifically, Hyper-YOLO-N significantly outperforms the advanced YOLOv8-N and YOLOv9-T with 12% AP(val) and 9% AP(val) improvements.
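The high-order message propagation mentioned in the abstract can be pictured with a small sketch. The abstract does not say how the hypergraph is constructed or which propagation operator is used, so everything below is an assumption made for illustration: hyperedges are formed with a distance threshold (`eps`), and messages pass from vertices to hyperedges and back by simple averaging. The function names `build_incidence` and `hypergraph_propagate` are hypothetical and are not taken from the paper or its code.

```python
import numpy as np


def build_incidence(features, eps):
    """Assumed construction rule: one hyperedge per vertex, containing
    every vertex whose feature lies within distance eps of it."""
    # Pairwise Euclidean distances between all feature vectors, shape (N, N).
    diff = features[:, None, :] - features[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # incidence[v, e] is 1 when vertex v belongs to hyperedge e.
    return (dist <= eps).astype(features.dtype)


def hypergraph_propagate(features, incidence):
    """Assumed operator: two-stage mean aggregation,
    vertex -> hyperedge, then hyperedge -> vertex."""
    # Stage 1: each hyperedge takes the mean feature of its member vertices.
    edge_degree = incidence.sum(axis=0)
    edge_features = (incidence.T @ features) / edge_degree[:, None]
    # Stage 2: each vertex takes the mean feature of its incident hyperedges.
    vertex_degree = incidence.sum(axis=1)
    return (incidence @ edge_features) / vertex_degree[:, None]


# Three feature vectors: two close together, one far away.
features = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
incidence = build_incidence(features, eps=1.5)
updated = hypergraph_propagate(features, incidence)
```

In this toy input the two nearby vectors share their hyperedges and end up with the same averaged feature, while the distant vector only belongs to its own hyperedge and is left unchanged. A hyperedge can join any number of vertices at once, which is what distinguishes this from pairwise graph message passing.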

Citation

Pages: 2388-2401

Page count: 14
