Dataset Cleaning - A Cross Validation Methodology for Large Facial Datasets using Face Recognition

Cited by: 0
Authors
Varkarakis, Viktor [1]
Corcoran, Peter [1]
Affiliations
[1] Natl Univ Ireland Galway, Sch Engn, Galway, Ireland
Funding
Science Foundation Ireland
Keywords
face datasets; mislabeled identities; noisy samples; clean face dataset; semi-automatic cleaning; CelebA;
DOI
Not available
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
In recent years, large "in the wild" face datasets have been released to facilitate progress in tasks such as face detection and face recognition. Most of these datasets are collected from web pages by automatic procedures, and as a consequence they often contain noisy data. Identity annotations are particularly important in these datasets because they are used to train face recognition algorithms. However, because the datasets are gathered automatically and are very large, many identity folders contain mislabeled samples, which degrades the quality of the datasets. This work presents a semi-automatic method for cleaning large, noisy face datasets using face recognition. The methodology is applied to the CelebA dataset to demonstrate its effectiveness. The list of mislabeled samples found in the CelebA dataset is also made available.
Pages: 6