Data-centric AI: Techniques and Future Perspectives

Cited by: 9
Authors
Zha, Daochen [1]
Lai, Kwei-Herng [2]
Yang, Fan [3]
Zou, Na [4]
Gao, Huiji [1]
Hu, Xia [2]
Affiliations
[1] Airbnb Inc, San Francisco, CA 94103 USA
[2] Rice Univ, Houston, TX USA
[3] Wake Forest Univ, Winston Salem, NC USA
[4] Texas A&M Univ, College Stn, TX USA
DOI
10.1145/3580305.3599553
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The role of data in AI has been significantly magnified by the emerging concept of data-centric AI. In contrast to the traditional model-centric paradigm, which focuses on developing more effective models given fixed datasets, data-centric AI emphasizes the systematic engineering of data in building AI systems. However, as a new concept, many critical aspects of data-centric AI remain ambiguous, such as its definitions, associated tasks, algorithms, challenges, and benchmarks. This tutorial aims to review and discuss this emerging field, with a particular focus on the three general data-centric AI goals: training data development, inference data development, and data maintenance. The objective of this tutorial is threefold: (1) to formally categorize the field of data-centric AI using a goal-driven taxonomy and discuss the needs and challenges of each goal, (2) to comprehensively review the state-of-the-art techniques, and (3) to discuss the future perspectives and open research directions to inspire further innovations in this field.
Pages: 5839-5840
Page count: 2