A comparison of textual data mining methods for sex identification in chat conversations

Cited by: 0
Authors
Kose, Cemal [1 ]
Ozyurt, Ozcan [1 ]
Ikibas, Cevat [1 ]
Affiliations
[1] Karadeniz Tech Univ, Fac Engn, Dept Comp Engn, TR-61080 Trabzon, Turkey
Source
Keywords
mining chat conversations; sex identification; information extraction; text mining; machine learning;
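DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract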
Mining textual data from chat media is becoming more important because these media contain a vast amount of information that is potentially relevant to a society's current interests, habits, social behaviors, criminal tendencies, and other tendencies. Here, sex identification is taken as a base study for information mining in chat media. To this end, a simple discrimination function and a semantic analysis method are proposed for sex identification in Turkish chat media. The proposed sex identification method is then compared with the Support Vector Machine (SVM) and Naive Bayes (NB) methods. The results show that the proposed system achieves an accuracy of over 90% in sex identification.
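As a rough illustration of one of the baseline methods named in the abstract, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. It is not the authors' implementation, and it does not reproduce their discrimination function or semantic analysis, which the abstract does not specify. The function names `train_naive_bayes` and `predict`, the whitespace tokenization, and the toy messages with placeholder labels "A" and "B" are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(messages, labels):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in zip(messages, labels):
        tokens = text.lower().split()  # assumption: whitespace tokenization
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocabulary = model
    total_messages = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in class_counts.items():
        score = math.log(count / total_messages)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # skip words never seen during training
            smoothed = (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data; "A" and "B" are placeholder class labels.
toy_messages = ["alpha beta beta", "beta gamma", "delta epsilon", "epsilon epsilon zeta"]
toy_labels = ["A", "A", "B", "B"]
model = train_naive_bayes(toy_messages, toy_labels)
```

Calling `predict(model, "beta beta")` selects the class whose training messages used that word most often. A real comparison of the kind the abstract describes would need labeled chat data and a held-out test set to measure accuracy.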
Pages: 638-643
Page count: 6