Compression and machine learning: A new perspective on feature space vectors

Cited by: 0
Authors:
Sculley, D. [1]
Brodley, Carla E. [1]
Affiliations:
[1] Tufts Univ, Dept Comp Sci, Medford, MA 02155 USA
Keywords:
DOI: none available
Chinese Library Classification: TP [Automation technology; computer technology]
Subject classification code: 0812
Abstract
The use of compression algorithms in machine learning tasks such as clustering and classification has appeared in a variety of fields, sometimes with the promise of reducing problems of explicit feature selection. The theoretical justification for such methods has been founded on an upper bound on Kolmogorov complexity and an idealized information space. An alternate view shows compression algorithms implicitly map strings into implicit feature space vectors, and compression-based similarity measures compute similarity within these feature spaces. Thus, compression-based methods are not a "parameter free" magic bullet for feature selection and data representation, but are instead concrete similarity measures within defined feature spaces, and are therefore akin to explicit feature vector models used in standard machine learning algorithms. To underscore this point, we find theoretical and empirical connections between traditional machine learning vector models and compression, encouraging cross-fertilization in future work.
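A representative compression-based similarity measure of the kind the abstract discusses is the Normalized Compression Distance (NCD). The sketch below is illustrative, not taken from the paper: it uses Python's `zlib` as a stand-in compressor, while the paper's argument applies to compression-based measures in general.

```python
import zlib


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings.

    Approximates similarity by how much better the concatenation
    x + y compresses than x and y compressed separately: shared
    structure lets the compressor reuse information across them.
    """
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Highly similar strings yield a distance near 0, unrelated strings a distance near 1 (real compressors can slightly exceed 1). Note that the result depends on the chosen compressor and its window size, which is one concrete sense in which such measures are not "parameter free."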
Pages: 332 / +
Page count: 4
Related papers (50 total)
  • [31] New feature Selection method based on neural network and machine learning
    Challita, Nicole
    Khalil, Mohamad
    Beauseroy, Pierre
    [J]. 2016 IEEE INTERNATIONAL MULTIDISCIPLINARY CONFERENCE ON ENGINEERING TECHNOLOGY (IMCET), 2016, : 81 - 84
  • [32] Enhancing Data Space Semantic Interoperability through Machine Learning: a Visionary Perspective
    Boukhers, Zeyd
    Lange, Christoph
    Beyan, Oya
    [J]. COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023, 2023, : 1462 - 1467
  • [33] Machine learning approaches for discrimination of Extracellular Matrix proteins using hybrid feature space
    Ali, Farman
    Hayat, Maqsood
    [J]. JOURNAL OF THEORETICAL BIOLOGY, 2016, 403 : 30 - 37
  • [34] Building feature space of extreme learning machine with sparse denoising stacked-autoencoder
    Cao, Le-le
    Huang, Wen-bing
    Sun, Fu-chun
    [J]. NEUROCOMPUTING, 2016, 174 : 60 - 71
  • [35] Fuzzy Relational Compression Applied on Feature Vectors for Infant Cry Recognition
    Reyes-Galaviz, Orion Fausto
    Reyes-Garcia, Carlos Alberto
    [J]. MICAI 2009: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2009, 5845 : 420 - +
  • [36] Sample compression, support vectors, and generalization in deep learning
    Snyder C.
    Vishwanath S.
    [J]. IEEE Journal on Selected Areas in Information Theory, 2020, 1 (01): : 106 - 120
  • [37] Physical Space Vectors for Permanent Magnet Synchronous Machine
    Tsuji, Mineo
    Hamasaki, Shin-ichi
    Del Pizzo, Andrea
    [J]. 2014 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS, ELECTRICAL DRIVES, AUTOMATION AND MOTION (SPEEDAM), 2014, : 589 - 594
  • [38] Learning to Hash Faces Using Large Feature Vectors
    dos Santos, Cassio E., Jr.
    Kijak, Ewa
    Gravier, Guillaume
    Schwartz, William Robson
    [J]. 2015 13TH INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING (CBMI), 2015,
  • [39] From the Perspective of Explainable Machine Learning: A Student Feature Selection Strategy Based on the Geometric Mean of Feature Importance and Robustness
    Zhou, Jiarui
    Cheng, Yanying
    Dai, Chengxiao
    [J]. PROCEEDINGS OF 2024 INTERNATIONAL CONFERENCE ON COMPUTER AND MULTIMEDIA TECHNOLOGY, ICCMT 2024, 2024, : 499 - 503
  • [40] Incremental learning of discriminant common vectors for feature extraction
    Lu, Gui-Fu
    Zou, Jian
    Wang, Yong
    [J]. APPLIED MATHEMATICS AND COMPUTATION, 2012, 218 (22) : 11269 - 11278