Selection of breast features for young women in northwestern China based on the random forest algorithm

Cited by: 12
Authors
Zhou, Jie [1]
Mao, Qian [1]
Zhang, Jun [2]
Lau, Newman M. L. [2]
Chen, Jianming [3]
Affiliations
[1] Xian Polytech Univ, Sch Apparel & Art Design, 19 Jinhua South Rd, Xian 710048, Shaanxi, Peoples R China
[2] Hong Kong Polytech Univ, Sch Design, Hong Kong, Peoples R China
[3] Chinese Univ Hong Kong, Dept Biomed Engn, Hong Kong, Peoples R China
Keywords
Breast shape classification; random forest algorithm; feature selection; breast shape recognition; K-MEANS; BRA; SHAPE; DIMENSIONS; SUPPORT
DOI
10.1177/00405175211040869
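Chinese Library Classification (CLC) number
TB3 [Engineering materials science]; TS1 [Textile industry, dyeing and finishing industry]
Subject classification code
0805; 080502; 0821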
Abstract
Studies of breast morphology measure numerous breast features, yet only a few parameters are used for classification. Extracting the key variables from these multi-dimensional features in a principled way therefore remains an open question. This study aimed to simplify dimensionality reduction and thereby improve the objectivity and interpretability of the selected breast features. Because the random forest (RF) algorithm quantifies feature importance during training, it was adopted to determine the optimal breast features for classification and recognition. First, anthropometric data of 360 women from northwestern China, aged 19 to 27 years, were collected by non-contact three-dimensional body scanning and by contact manual measurement. Then, k-means clustering was applied to categorize breast shapes, and the RF algorithm was used to quantify and rank the importance of 25 breast features. Finally, to verify the validity of the RF algorithm for breast feature selection, the t-distributed stochastic neighbor embedding method was used to visualize the distribution of the breast shape clusters in two dimensions, and four neural networks were used to recognize breast morphology. The results demonstrate that using fewer breast features can effectively increase the accuracy of breast shape classification and recognition. The best performance is obtained with 13 breast features, for which the average Hamming loss of the four neural networks is smallest (0.1136). The bust circumference and the horizontal curve of the breasts across the bust points are found to be the most important of the 25 breast features. Breast curve features are more important than breast cross-sectional features, and breast positioning features are the least important. The RF algorithm is also verified to be more effective than traditional dimensionality reduction methods such as principal component analysis, hierarchical clustering, and recursive feature elimination. The approach developed in this paper can be generalized to dimensionality reduction for other body morphologies.
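The abstract reports an average Hamming loss over four neural networks but does not state how the loss was computed. The following is a minimal sketch, assuming each sample carries a single cluster label, in which case the Hamming loss is the fraction of misclassified samples (one minus accuracy). The function names, the labels, and the four prediction lists are hypothetical illustrations, not data or code from the paper.

```python
from typing import Sequence


def hamming_loss(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label differs from the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if len(y_true) == 0:
        raise ValueError("label sequences must not be empty")
    mismatches = sum(t != p for t, p in zip(y_true, y_pred))
    return mismatches / len(y_true)


def average_hamming_loss(
    y_true: Sequence[int], predictions: Sequence[Sequence[int]]
) -> float:
    """Mean Hamming loss over the predictions of several classifiers."""
    losses = [hamming_loss(y_true, y_pred) for y_pred in predictions]
    return sum(losses) / len(losses)


# Hypothetical breast-shape cluster labels for 8 samples (not from the paper).
y_true = [0, 1, 2, 1, 0, 2, 1, 0]

# Hypothetical outputs of four classifiers on the same 8 samples.
predictions = [
    [0, 1, 2, 1, 0, 2, 1, 0],  # no errors
    [0, 1, 2, 1, 0, 2, 1, 1],  # one error
    [0, 1, 1, 1, 0, 2, 0, 0],  # two errors
    [0, 1, 2, 1, 0, 2, 1, 0],  # no errors
]

avg_loss = average_hamming_loss(y_true, predictions)
print(avg_loss)  # 0.09375
```

Under this reading, comparing feature subsets amounts to retraining the classifiers on each candidate subset and keeping the subset with the lowest average loss; the paper's exact evaluation protocol may differ.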
Pages: 957-973
Page count: 17