The Casual Conversations v2 Dataset: A diverse, large benchmark for measuring fairness and robustness in audio/vision/speech models

Cited by: 1
Authors
Porgali, Bilal [1]
Albiero, Vitor [1]
Ryda, Jordan [1]
Ferrer, Cristian Canton [1]
Hazirbas, Caner [1]
Affiliations
[1] Meta AI, Menlo Park, CA 94025 USA
Keywords
DOI
10.1109/CVPRW59228.2023.00006
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces a new large consent-driven dataset aimed at assisting in the evaluation of algorithmic bias and robustness of computer vision and audio speech models with regard to 11 attributes that are self-provided or labeled by trained annotators. The dataset includes 26,467 videos of 5,567 unique paid participants, with an average of almost 5 videos per person, recorded in Brazil, India, Indonesia, Mexico, Vietnam, the Philippines, and the USA, representing diverse demographic characteristics. The participants agreed for their data to be used in assessing the fairness of AI models and provided self-reported age, gender, language/dialect, disability status, physical adornments, physical attributes, and geo-location information, while trained annotators labeled apparent skin tone using the Fitzpatrick Skin Type and Monk Skin Tone scales, and voice timbre. Annotators also labeled the different recording setups and provided per-second activity annotations.
Pages: 10-17
Number of pages: 8
Related papers
2 items in total
  • [1] Towards Measuring Fairness in Speech Recognition: Casual Conversations Dataset Transcriptions
    Liu, Chunxi
    Picheny, Michael
    Sari, Leda
    Chitkara, Pooja
    Xiao, Alex
    Zhang, Xiaohui
    Chou, Mark
    Alvarado, Andres
    Hazirbas, Caner
    Saraf, Yatharth
    2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 6162-6166
  • [2] Google Landmarks Dataset v2: A Large-Scale Benchmark for Instance-Level Recognition and Retrieval
    Weyand, Tobias
    Araujo, Andre
    Cao, Bingyi
    Sim, Jack
    2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020: 2572-2581