Big Text advantages and challenges: classification perspective

Cited by: 12
Author
Sokolova M. [1]
Affiliation
[1] School of Epidemiology and Public Health, 308E-600 Peter Morand Cres, Ottawa, K1G 5Z3, ON
Keywords
Big text; Classification; Machine learning; Performance evaluation
DOI
10.1007/s41060-017-0087-5
Abstract
Big Text, i.e., large repositories of textual data, is a part of Big Data. In total, 80–85% of Big Text comes in unstructured form, with a significant contribution from social media. In this position paper, we discuss Big Text advantages and challenges with respect to text classification. We propose a new approach to performance evaluation of classification algorithms when they are applied to Big Text, namely, using corpora comparison in the result evaluation. We also discuss a significant increase in texts with comprehensive information and the challenges Big Text methods face in the analysis of such texts. © 2017, Springer International Publishing AG, part of Springer Nature.
Pages: 1-10
Page count: 9