Calibrating machine behavior: a challenge for AI alignment

Cited by: 4
Authors
Firt, Erez [1 ,2 ]
Affiliations
[1] Univ Haifa, Ctr Humanities & Artificial Intelligence, Haifa, Israel
[2] Technion, Haifa, Israel
Keywords
Artificial intelligence; Machine ethics; AI alignment; Autonomous AI; Trustworthiness
DOI
10.1007/s10676-023-09716-8
Chinese Library Classification
B82 [Ethics (Moral Philosophy)]
Abstract
When discussing AI alignment, we usually refer to the problem of teaching or training advanced autonomous AI systems to make decisions that are aligned with human values or preferences. Proponents of this approach believe it can be employed as a means of staying in control of sophisticated intelligent systems, thus avoiding certain existential risks. We identify three general obstacles on the path to implementing value alignment: a technological/technical obstacle, a normative obstacle, and a calibration problem. Presupposing, for the purposes of this discussion, that the technical and normative problems are solved, we focus on the problem of how to calibrate a system, for a specific value, to a specific location on a spectrum stretching between righteous and normal or average human behavior. Calibration, or more specifically mis-calibration, also raises the issue of trustworthiness: if we cannot trust AI systems to perform tasks the way we intended, we will not use them on our roads or in our homes. In an era in which we strive to construct autonomous machines endowed with common sense, reasoning abilities, and a connection to the world, so that they can act in alignment with human values, such mis-calibrations can make the difference between trustworthy and untrustworthy systems.
Pages: 8