Data-Driven Imputation of Miscibility of Aqueous Solutions via Graph-Regularized Logistic Matrix Factorization

Cited by: 3
Authors
Behnoudfar, Diba [1]
Simon, Cory M. [1]
Schrier, Joshua [2]
Affiliations
[1] Oregon State Univ, Sch Chem Biol & Environm Engn, Corvallis, OR 97331 USA
[2] Fordham Univ, Dept Chem, The Bronx, NY 10458 USA
Source
JOURNAL OF PHYSICAL CHEMISTRY B | 2023 / Volume 127 / Issue 37
Funding
National Science Foundation (USA);
Keywords
2-PHASE SYSTEMS; PHASE-SEPARATION; PREDICTION; COEFFICIENTS;
DOI
10.1021/acs.jpcb.3c03789
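Chinese Library Classification number
O64 [Physical chemistry (theoretical chemistry), chemical physics];
Discipline classification code
070304; 081704;
Abstract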
Aqueous two-phase systems (ATPSs) may form upon mixing two solutions of independently water-soluble compounds. Many separation, purification, and extraction processes rely on ATPSs. Predicting the miscibility of solutions can accelerate and reduce the cost of the discovery of new ATPSs for these applications. Whereas previous machine learning approaches to ATPS prediction used physicochemical properties of each solute as descriptors, in this work, we show how to impute missing miscibility outcomes directly from an incomplete collection of pairwise miscibility experiments. We use graph-regularized logistic matrix factorization (GR-LMF) to learn a latent vector for each solution from (i) the observed entries in the pairwise miscibility matrix and (ii) a graph where each node is a solution and edges are relationships indicating the general category of the solute (i.e., polymer, surfactant, salt, protein). For an experimental data set of the pairwise miscibility of 68 solutions from Peacock et al. [ACS Appl. Mater. Interfaces 2021, 13, 11449-11460], we find that GR-LMF more accurately predicts missing (im)miscibility outcomes of pairs of solutions than ordinary logistic matrix factorization and random forest classifiers that use physicochemical features of the solutes. GR-LMF obviates the need for features of the solutes and solutions to impute missing miscibility outcomes, but it cannot predict the miscibility of a new solution without some observations of its miscibility with other solutions in the training data set.
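The idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the symmetric model P_ij = sigmoid(c_i . c_j + b), the hyperparameters, the plain gradient-descent loop, the function name fit_gr_lmf, and the six-solution toy data set (which is synthetic, not the Peacock et al. data). It shows only how observed entries of a pairwise outcome matrix and a category graph can jointly shape the latent vectors.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fit_gr_lmf(M, mask, A, k=2, gamma=0.1, lam=0.01, lr=0.05, n_iter=2000, seed=0):
    """Hypothetical graph-regularized logistic matrix factorization.

    M    : (n, n) symmetric 0/1 outcome matrix (1 = miscible in this toy).
    mask : (n, n) symmetric 0/1 matrix, 1 where M[i, j] was observed.
    A    : (n, n) symmetric adjacency matrix of the category graph.
    Returns latent vectors C of shape (n, k) and a scalar bias b.
    """
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    C = 0.1 * rng.standard_normal((n, k))
    b = 0.0
    L = np.diag(A.sum(axis=1)) - A           # graph Laplacian
    for _ in range(n_iter):
        P = sigmoid(C @ C.T + b)             # predicted probabilities
        R = mask * (P - M)                   # residuals on observed pairs only
        # Logistic-loss gradient + graph smoothness term + L2 penalty.
        grad_C = R @ C + 2.0 * gamma * (L @ C) + 2.0 * lam * C
        grad_b = R.sum() / 2.0               # each unordered pair counted once
        C -= lr * grad_C
        b -= lr * grad_b
    return C, b

# Synthetic toy data: solutions 0-2 share one category, 3-5 another.
cat = np.array([0, 0, 0, 1, 1, 1])
same = (cat[:, None] == cat[None, :]).astype(float)
M = same.copy()                              # toy rule: same category -> miscible
np.fill_diagonal(M, 0.0)
A = M.copy()                                 # edges join same-category solutions

mask = np.ones_like(M)
np.fill_diagonal(mask, 0.0)
for i, j in [(0, 1), (2, 3)]:                # hide two pairs to impute later
    mask[i, j] = mask[j, i] = 0.0

C, b = fit_gr_lmf(M, mask, A)
P = sigmoid(C @ C.T + b)
print("imputed P[0, 1]:", round(float(P[0, 1]), 3))
print("imputed P[2, 3]:", round(float(P[2, 3]), 3))
```

The two hidden pairs play the role of missing experiments: their probabilities come only from the latent vectors learned on the remaining entries and the graph. A solution with no observed entries at all would receive no signal from the data term, which matches the limitation stated at the end of the abstract.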
Pages: 7964-7973
Page count: 10