Optimizing non-decomposable measures with deep networks

Citations: 0
Authors
Amartya Sanyal
Pawan Kumar
Purushottam Kar
Sanjay Chawla
Fabrizio Sebastiani
Affiliations
[1] The University of Oxford
[2] The Alan Turing Institute
[3] Indian Institute of Technology Kanpur
[4] Qatar Computing Research Institute
[5] Istituto di Scienza e Tecnologia dell’Informazione
Source
Machine Learning | 2018 / Vol. 107
Keywords
Optimization; Deep learning; F-measure; Task-specific training
DOI
Not available
Chinese Library Classification (CLC) number
Subject classification number
Abstract
We present a class of algorithms capable of directly training deep neural networks with respect to popular families of task-specific performance measures for binary classification, such as the F-measure, QMean and the Kullback–Leibler divergence, that are structured and non-decomposable. Our goal is to address tasks such as label-imbalanced learning and quantification. Our techniques depart from standard deep learning techniques, which typically train neural networks with decomposable loss functions such as squared or cross-entropy loss. We demonstrate that directly training with task-specific loss functions yields faster and more stable convergence across problems and datasets. Our proposed algorithms and implementations offer several advantages including (i) the use of fewer training samples to achieve a desired level of convergence, (ii) a substantial reduction in training time, (iii) a seamless integration of our implementation into existing symbolic gradient frameworks, and (iv) assurance of convergence to first-order stationary points. It is noteworthy that the algorithms achieve this, especially point (iv), despite being asked to optimize complex objective functions. We implement our techniques on a variety of deep architectures including multi-layer perceptrons and recurrent neural networks and show that on a variety of benchmark and real datasets, our algorithms outperform traditional approaches to training deep networks, as well as popular techniques used to handle label imbalance.
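To illustrate what "non-decomposable" means here, the following minimal sketch compares accuracy, which is an average of per-example terms, with the F-measure, which is not. It assumes the standard F-beta definition in terms of true positives, false positives and false negatives; the function names and the toy labels are hypothetical and do not come from the paper, and this is not the authors' training algorithm.

```python
def accuracy(y_true, y_pred):
    """Fraction of correct predictions; a plain average over examples."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, beta=1.0):
    """Standard F-beta score for binary labels in {0, 1} (assumed definition)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Convention chosen for this sketch: score 0 when there are no positives at all.
    return (1 + b2) * tp / denom if denom > 0 else 0.0


# Hypothetical toy data, split into two equally sized "mini-batches".
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
halves = [(y_true[:4], y_pred[:4]), (y_true[4:], y_pred[4:])]

# Accuracy over the full set equals the mean of the per-batch accuracies.
acc_full = accuracy(y_true, y_pred)
acc_mean = sum(accuracy(t, p) for t, p in halves) / len(halves)

# The F-measure over the full set differs from the mean of per-batch F-measures.
f1_full = f_measure(y_true, y_pred)
f1_mean = sum(f_measure(t, p) for t, p in halves) / len(halves)

print(f"accuracy: full={acc_full:.3f}, batch mean={acc_mean:.3f}")
print(f"F1:       full={f1_full:.3f}, batch mean={f1_mean:.3f}")
```

Because the F-measure depends on counts aggregated over the whole dataset, averaging it over mini-batches does not in general recover the dataset-level value, which is why standard per-example loss minimization does not directly optimize it.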
Pages: 1597-1620
Page count: 23
Related papers
50 in total
  • [21] Implicit Rate-Constrained Optimization of Non-decomposable Objectives
    Kumar, Abhishek
    Narasimhan, Harikrishna
    Cotter, Andrew
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [22] CONSTRUCTION OF NON-DECOMPOSABLE POSITIVE DEFINITE UNIMODULAR QUADRATIC FORMS
    朱福祖
    Science China Mathematics, 1987, (01) : 19 - 31
  • [23] Experiencing Megaco protocol for controlling non-decomposable VoIP gateways
    Conte, A
    Anquetil, LP
    Levy, T
    IEEE INTERNATIONAL CONFERENCE ON NETWORKS 2000 (ICON 2000), PROCEEDINGS: NETWORKING TRENDS AND CHALLENGES IN THE NEW MILLENNIUM, 2000, : 105 - 111
  • [24] ExactBoost: Directly Boosting the Margin in Combinatorial and Non-decomposable Metrics
    Csillag, Daniel
    Piazza, Carolina
    Ramos, Thiago
    Romano, Joao Vitor
    Oliveira, Roberto
    Orenstein, Paulo
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [25] Symmetrized non-decomposable approximations of the non-additive kinetic energy functional
    Polak, Elias
    Englert, Tanguy
    Gander, Martin J.
    Wesolowski, Tomasz A.
    JOURNAL OF CHEMICAL PHYSICS, 2023, 158 (17):
  • [26] Training Over-parameterized Models with Non-decomposable Objectives
    Narasimhan, Harikrishna
    Menon, Aditya Krishna
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [27] Migration of non-decomposable software systems to the Web using screen proxies
    Bodhuin, T
    Guardabascio, E
    Tortorella, M
    10TH WORKING CONFERENCE ON REVERSE ENGINEERING, PROCEEDINGS, 2003, : 165 - 174
  • [28] Simulation of hyper-inverse Wishart distributions for non-decomposable graphs
    Wang, Hao
    Carvalho, Carlos M.
    ELECTRONIC JOURNAL OF STATISTICS, 2010, 4 : 1470 - 1475
  • [29] CONSTRUCTION OF NON-DECOMPOSABLE POSITIVE DEFINITE UNIMODULAR QUADRATIC-FORMS
    ZHU, FZ
    SCIENTIA SINICA SERIES A-MATHEMATICAL PHYSICAL ASTRONOMICAL & TECHNICAL SCIENCES, 1987, 30 (01): : 19 - 31
  • [30] The Isserlis matrix and its application to non-decomposable graphical Gaussian models
    Roverato, A
    Whittaker, J
    BIOMETRIKA, 1998, 85 (03) : 711 - 725