Adaptive lasso with weights based on normalized filtering scores in molecular big data

Cited by: 3
Authors
Patil, Abhijeet R. [1 ]
Park, Byung-Kwon [2 ]
Kim, Sangjin [2 ]
Affiliations
[1] Univ Texas El Paso, Computat Sci, El Paso, TX 79968 USA
[2] Dong A Univ, Dept Management Informat Syst, Busan 49236, South Korea
Source
Keywords
Adaptive lasso; feature ranking; sure independence screening; accuracy; geometric mean; LOGISTIC-REGRESSION; VARIABLE SELECTION; RIDGE REGRESSION; SHRINKAGE;
DOI
10.1142/S0219633620400106
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Molecular big data are highly correlated, and many genes are not relevant. The performance of the various classification methods relies mainly on the selection of significant genes. Sparse regularized regression (SRR) models using the least absolute shrinkage and selection operator (lasso) and adaptive lasso (alasso) are widely used for gene selection and classification. Nevertheless, gene selection becomes challenging when the genes are highly correlated. Here, we propose a modified adaptive lasso, with weights taken from ranking-based feature selection (RFS) methods, that is capable of dealing with highly correlated gene expression data. First, RFS methods such as Fisher's score (FS), Chi-square (CS), and information gain (IG) are employed to discard the unimportant genes, and the top significant genes are chosen through the sure independence screening (SIS) criterion. The scores of the ranked genes are then normalized and assigned as the proposed weights in the alasso method to obtain the most significant genes, which were shown to be biologically related to the cancer type and helped attain higher classification performance. Using synthetic data and a real application to microarray data, we demonstrate that the proposed alasso method with RFS methods performs better than other known methods, such as alasso with filtering such as ridge and marginal maximum likelihood estimation (MMLE), and lasso and alasso without filtering. Accuracy, area under the receiver operating characteristic curve (AUROC), and geometric mean (GM-mean) are used to evaluate the performance of the models.
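The screening and weighting steps described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the SIS cutoff is taken to be the commonly used d = floor(n / log n), and the mapping from a normalized score to a penalty weight (the inverse of the score divided by the largest kept score) is an assumption, since the abstract does not give the exact formula.

```python
import math


def fisher_scores(X, y, eps=1e-12):
    """Fisher's score per feature: between-class over within-class scatter.

    X is a list of samples (each a list of feature values); y holds class labels.
    eps only guards against division by zero for constant features.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        overall_mean = sum(col) / len(col)
        between, within = 0.0, 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            mean_c = sum(vals) / len(vals)
            between += len(vals) * (mean_c - overall_mean) ** 2
            within += sum((v - mean_c) ** 2 for v in vals)
        scores.append(between / (within + eps))
    return scores


def sis_top_indices(scores, n_samples):
    """Keep the d highest-scoring features, ranked best first.

    ASSUMPTION: d = floor(n / log n), a common SIS choice; needs n_samples >= 2.
    """
    d = min(len(scores), max(1, int(n_samples / math.log(n_samples))))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:d]


def alasso_weights(scores, kept, eps=1e-8):
    """Turn normalized scores into per-feature L1 penalty weights.

    ASSUMPTION: weight = 1 / (score / max kept score), so that a highly
    ranked gene is penalized less. The paper's exact formula may differ.
    """
    top = max(scores[j] for j in kept)
    if top <= 0:
        return {j: 1.0 for j in kept}  # no ranking information: uniform penalty
    return {j: 1.0 / (scores[j] / top + eps) for j in kept}


# Toy data: 6 samples, 4 "genes"; only gene 0 separates the two classes.
X = [
    [5.0, 1.0, 0.2, 3.0],
    [5.2, 2.0, 0.1, 3.1],
    [4.9, 1.5, 0.3, 2.9],
    [1.0, 1.2, 0.2, 3.0],
    [1.1, 1.8, 0.1, 3.2],
    [0.9, 1.6, 0.3, 2.8],
]
y = [1, 1, 1, 0, 0, 0]

scores = fisher_scores(X, y)
kept = sis_top_indices(scores, n_samples=len(X))
weights = alasso_weights(scores, kept)
```

In an adaptive lasso fit, each weight would multiply the L1 penalty on the corresponding coefficient of a penalized logistic regression restricted to the kept genes. The same skeleton would apply to Chi-square or information gain scores by swapping out `fisher_scores`.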
Pages: 18
Related papers
50 in total
  • [31] Research on big data mining based on improved parallel collaborative filtering algorithm
    Zhu, Li
    Li, Heng
    Feng, Yuxuan
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2019, 22 (02): : S3595 - S3604
  • [32] Design and Analysis of a Recommendation System Based on Collaborative Filtering Techniques for Big Data
    Khouibiri N.
    Farhaoui Y.
    El Allaoui A.
    Intelligent and Converged Networks, 2023, 4 (04): : 296 - 304
  • [33] Towards ontology-based multilingual URL filtering: a big data problem
    Mubashar Hussain
    Mansoor Ahmed
    Hasan Ali Khattak
    Muhammad Imran
    Abid Khan
    Sadia Din
    Awais Ahmad
    Gwanggil Jeon
    Alavalapati Goutham Reddy
    The Journal of Supercomputing, 2018, 74 : 5003 - 5021
  • [34] Terrain classification based on adaptive weights with airborne LiDAR data for mining area
    LI Huiying
    WANG Zhi
    SUN Yafeng
    LI Wenhui
    College of Computer Science and Technology, Jilin University, Changchun, China; College of Resources and Civil Engineering, Northeastern University, Shenyang, China
    Transactions of Nonferrous Metals Society of China, 2011, 21 (S3) : 648 - 653
  • [35] Terrain classification based on adaptive weights with airborne LiDAR data for mining area
    LI Hui-ying
    Transactions of Nonferrous Metals Society of China, 2011, 21 (S3) : 648 - 653
  • [36] Terrain classification based on adaptive weights with airborne LiDAR data for mining area
    Li Hui-ying
    Wang Zhi
    Sun Ya-feng
    Li Wen-hui
    TRANSACTIONS OF NONFERROUS METALS SOCIETY OF CHINA, 2011, 21 : S648 - S653
  • [37] Multi-Class Imbalance Classification Based on Data Distribution and Adaptive Weights
    Li, Shuxian
    Song, Liyan
    Wu, Xiaoyu
    Hu, Zheng
    Cheung, Yiu-ming
    Yao, Xin
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (10) : 5265 - 5279
  • [38] Adaptive filtering of fMRI data based on correlation and bold response similarity
    Rydell, J.
    Knutsson, H.
    Borga, M.
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 2245 - 2248
  • [39] Autofocusing of moving objects in SAR data based on adaptive notch filtering
    Dragoksevic, Marina V.
    IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2008, 44 (01) : 384 - 392
  • [40] Equation-error adaptive IIR filtering based on data reuse
    Kwon, Jun-Chan
    Choi, Young-Seok
    Song, Woo-Jin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2007, 54 (08) : 695 - 699