A Topological Machine Learning Pipeline for Classification

Cited by: 6
Authors
Conti, Francesco [1, 2]
Moroni, Davide [2 ]
Pascali, Maria Antonietta [2 ]
Affiliations
[1] Univ Pisa, Dept Math, I-56126 Pisa, Italy
[2] Natl Res Council Italy CNR, Inst Informat Sci & Technol A Faedo, I-56124 Pisa, Italy
Keywords
topological machine learning; persistent homology; classification; vectorization; size functions; regression; selection
DOI
10.3390/math10173086
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
In this work, we develop a pipeline that associates persistence diagrams to digital data via the filtration most appropriate for the type of data considered. Using a grid search approach, the pipeline determines the optimal representation methods and parameters. Developing such a topological pipeline for machine learning involves two crucial steps that strongly affect its performance. First, digital data must be represented as an algebraic object with a suitable associated filtration in order to compute its topological summary, the persistence diagram. Second, the persistence diagram must be transformed by a suitable representation method before it can be fed to a machine learning algorithm. We assess the performance of our pipeline and, in parallel, compare the different representation methods on popular benchmark datasets. This work is a first step toward both an easy, ready-to-use pipeline for data classification using persistent homology and machine learning, and an understanding of the theoretical reasons why, given a dataset and a task to be performed, one pair (filtration, topological representation) is better than another.
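The two steps named in the abstract (data to persistence diagram, then diagram to vector), followed by a grid search over representation parameters, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes toy 2D point clouds, restricts itself to 0-dimensional persistence under the Vietoris-Rips filtration (where death times equal the minimum-spanning-tree edge lengths), uses a simple Betti-curve-style vectorization, and replaces a full machine learning model with a nearest-centroid classifier scored by leave-one-out accuracy. All function names, the toy data, and the parameter grid are hypothetical and chosen only for illustration.

```python
import itertools
import math
import random


def h0_deaths(points):
    """Death times of 0-dimensional classes in the Vietoris-Rips filtration.

    Every point is born at scale 0; a connected component dies at the length
    of the edge that merges it into another one (Kruskal with union-find).
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(length)
    return deaths  # n - 1 finite deaths; the never-dying class is omitted


def betti0_curve(deaths, n_bins, max_scale):
    """Vectorize a diagram: count finite classes still alive at each threshold."""
    thresholds = [max_scale * (k + 1) / n_bins for k in range(n_bins)]
    return [sum(d > t for d in deaths) for t in thresholds]


def loo_accuracy(vectors, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    correct = 0
    for held in range(len(vectors)):
        centroids = {}
        for label in sorted(set(labels)):
            rows = [v for k, (v, y) in enumerate(zip(vectors, labels))
                    if y == label and k != held]
            centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        pred = min(centroids,
                   key=lambda y: math.dist(centroids[y], vectors[held]))
        correct += pred == labels[held]
    return correct / len(vectors)


def sample_cloud(label, rng, n_points=20):
    """Toy data (assumption): label 0 = uniform square, label 1 = two tight blobs."""
    if label == 0:
        return [(rng.random(), rng.random()) for _ in range(n_points)]
    centers = [(0.25, 0.25), (0.75, 0.75)]
    return [(rng.gauss(centers[k % 2][0], 0.05),
             rng.gauss(centers[k % 2][1], 0.05)) for k in range(n_points)]


rng = random.Random(0)
labels = [0, 1] * 10
# Step 1: data -> persistence diagram (here only the H0 death times).
diagrams = [h0_deaths(sample_cloud(y, rng)) for y in labels]

# Step 2 + grid search: try each vectorization setting and keep the best one.
results = {
    (n_bins, max_scale): loo_accuracy(
        [betti0_curve(d, n_bins, max_scale) for d in diagrams], labels)
    for n_bins, max_scale in itertools.product([2, 4, 8], [0.5, 1.0, 2.0])
}
best_params = max(results, key=results.get)
best_score = results[best_params]
print("best (n_bins, max_scale):", best_params, "accuracy:", best_score)
```

The grid here varies only the parameters of one representation. A fuller search in the spirit of the abstract would also iterate over different representation methods and filtrations, which in practice would rely on a dedicated persistent homology library rather than this standard-library sketch.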
Pages: 33