Selection of breast features for young women in northwestern China based on the random forest algorithm

Cited by: 12
Authors
Zhou, Jie [1]
Mao, Qian [1]
Zhang, Jun [2]
Lau, Newman M. L. [2]
Chen, Jianming [3]
Affiliations
[1] Xian Polytech Univ, Sch Apparel & Art Design, 19 Jinhua South Rd, Xian 710048, Shaanxi, Peoples R China
[2] Hong Kong Polytech Univ, Sch Design, Hong Kong, Peoples R China
[3] Chinese Univ Hong Kong, Dept Biomed Engn, Hong Kong, Peoples R China
Keywords
Breast shape classification; random forest algorithm; feature selection; breast shape recognition; K-MEANS; BRA; SHAPE; DIMENSIONS; SUPPORT;
DOI
10.1177/00405175211040869
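Chinese Library Classification
TB3 [Engineering Materials Science]; TS1 [Textile Industry, Dyeing and Finishing Industry];
Discipline classification code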
0805 ; 080502 ; 0821 ;
Abstract
Studies of breast morphology measure numerous breast features, yet only a few parameters are adopted for classification. How to extract the key variables from the multi-dimensional features in a rational way therefore remains an open issue. This study aimed to simplify dimensionality reduction and thereby improve the objectivity and interpretability of the selected breast features. Because the random forest (RF) algorithm quantifies feature importance during training, it was adopted to determine the optimal breast features for classification and recognition. First, anthropometric data of 360 females from northwestern China aged 19 to 27 years were collected by non-contact three-dimensional body scanning and by contact manual measurement. Then, k-means clustering was applied to categorize breast shapes, and the RF algorithm was used to quantify and rank the importance of 25 breast features. Finally, to verify the effectiveness of the RF algorithm for breast feature selection, the t-distributed stochastic neighbor embedding method was adopted to visualize the distribution of breast shape clusters in two dimensions, and four neural networks were used to recognize the breast morphology. The results demonstrate that using fewer breast features can increase the accuracy of breast shape classification and recognition. The best performance is obtained when the number of breast features is 13; in this case, the average Hamming loss of the four neural networks is the smallest (0.1136). The bust circumference and the horizontal curve of the breasts across the bust points are found to be the most important of the 25 breast features.
The breast curve features are more important than the breast cross-sectional features, while the breast positioning features are the least important. The RF algorithm is also verified to be more effective than traditional dimensionality reduction methods such as principal component analysis, hierarchical clustering, and recursive feature elimination. The approach developed in this paper can be generalized to dimensionality reduction for other aspects of body morphology.
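The pipeline described in the abstract (k-means clustering, RF importance ranking, selection of the top 13 features, and scoring a neural network by Hamming loss) can be sketched roughly as follows. This is only an illustration using scikit-learn on synthetic data, not the authors' implementation: the random data, the choice of three clusters, the forest size, and the single small `MLPClassifier` are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import hamming_loss
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the anthropometric table: 360 subjects x 25 features.
X = rng.normal(size=(360, 25))
# Give the first five columns some group structure (assumption, for illustration only).
X[:, :5] += rng.integers(0, 3, size=(360, 1)) * 3.0
X_std = StandardScaler().fit_transform(X)

# Step 1: k-means assigns each subject to a shape cluster (k=3 is an arbitrary choice).
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_std)

# Step 2: a random forest trained on the cluster labels exposes
# impurity-based feature importances, which give a ranking.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_std, labels)
ranking = np.argsort(forest.feature_importances_)[::-1]
top13 = ranking[:13]  # the abstract reports 13 features as the best subset size

# Step 3: train one small neural network on the selected features
# and score it on held-out subjects with the Hamming loss.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_std[:, top13], labels, test_size=0.25, random_state=0, stratify=labels
)
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
net.fit(X_tr, y_tr)
loss = hamming_loss(y_te, net.predict(X_te))

print("top-13 feature indices:", top13)
print("Hamming loss on held-out data:", loss)
```

For single-label predictions such as these, the Hamming loss equals the misclassification rate, so lower values mean better recognition. In practice the subset size would be swept (for example from 1 to 25) and the loss compared across sizes rather than fixed at 13 in advance.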
Pages: 957-973
Number of pages: 17