Twitter mining for ontology-based domain discovery incorporating machine learning

Cited by: 51
Authors
Abu-Salih, Bilal [1]
Wongthongtham, Pornpit [1]
Kit, Chan Yan [2]
Affiliations
[1] Curtin Univ, Perth, WA, Australia
[2] Curtin Univ, Dept Elect & Comp Engn, Perth, WA, Australia
Keywords
Ontology; Machine learning; Twitter mining; Domain discovery; Domain-based trustworthiness; SOCIAL MEDIA; INNOVATION; NETWORKS; LINKING; SMES;
DOI
10.1108/JKM-11-2016-0489
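Chinese Library Classification (CLC) number
G25 [Library science, librarianship]; G35 [Information science, information work]
Discipline classification code
1205; 120501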
Abstract
Purpose: This paper aims to identify the domain of the textual content generated by users of online social network (OSN) platforms. Understanding a user's domain(s) of interest is a significant step towards assessing their domain-based trustworthiness through an accurate understanding of the content they post in their OSNs.
Design/methodology/approach: This study uses a Twitter mining approach for domain-based classification of users and their textual content. The proposed approach incorporates machine learning modules and comprises two analysis phases. The first phase is a time-aware semantic analysis of users' historical content using five commonly used machine learning classifiers; it classifies users into two main categories: politics-related and non-politics-related. In the second phase, the likelihood predictions obtained in the first phase are used to predict the domain of users' future tweets.
Findings: Experiments were conducted to validate the mechanism proposed in the framework. They verify the applicability of the framework to effective domain-based classification of Twitter users and their content, as evidenced by the strong results of several performance evaluation metrics.
Research limitations/implications: This study is limited to an on/off (binary) domain classification of OSN content. The politics domain was selected because of Twitter's popularity as a rich source of political deliberation; such data abundance facilitates data aggregation and improves the results of the data analysis. Furthermore, the currently implemented machine learning approaches assume that uncertainty and incompleteness do not affect the accuracy of the Twitter classification, although data uncertainty and incompleteness may in fact exist. In the future, the authors will formulate data uncertainty and incompleteness as fuzzy numbers, which can be used to address imprecise, uncertain and vague data.
Practical implications: This study proposes a practical framework with significant implications for a variety of business-related applications, such as voice of customer/voice of market, recommendation systems, the discovery of domain-based influencers and opinion mining through tracking and simulation. In particular, a factual grasp of the domains of interest extracted at the user level or post level enhances customer-to-business engagement. This contributes to an accurate analysis of customer reviews and opinions to improve brand loyalty, customer service, etc.
Originality/value: This paper fills a gap in the existing literature by presenting a consolidated framework for Twitter mining that aims to address the deficiencies of current state-of-the-art approaches to topic distillation and domain discovery. The overall approach is promising for strengthening Twitter mining towards a better understanding of users' domains of interest.
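The two-phase idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the five classifiers, so a small hand-written naive Bayes model stands in for them, and the toy tweets, the `user_politics_score` and `predict_domain` helpers and the blending weight `w` are all hypothetical. The sketch also omits the time-aware and ontology-based parts of the framework.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: whitespace tokenisation is enough for this toy example.
    return text.lower().split()

class NaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing (stand-in classifier)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training texts
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(text):
                counts[tok] += 1
                self.vocab.add(tok)
        return self

    def predict_proba(self, text):
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label, counts in self.word_counts.items():
            total = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore tokens never seen in training
                    score += math.log((counts[tok] + 1) / (total + len(self.vocab)))
            log_scores[label] = score
        # Normalise log scores into probabilities.
        top = max(log_scores.values())
        exp = {k: math.exp(v - top) for k, v in log_scores.items()}
        z = sum(exp.values())
        return {k: v / z for k, v in exp.items()}

def user_politics_score(model, history):
    # Phase 1 (hypothetical): average politics likelihood over historical tweets.
    return sum(model.predict_proba(t)["politics"] for t in history) / len(history)

def predict_domain(model, user_score, tweet, w=0.3):
    # Phase 2 (hypothetical): blend the user-level likelihood with the tweet's own.
    p = w * user_score + (1 - w) * model.predict_proba(tweet)["politics"]
    return "politics" if p >= 0.5 else "other"

# Invented training data, for illustration only.
train_texts = [
    "the senate passed the election bill",
    "parliament debates the new tax policy",
    "the minister announced a vote on the budget",
    "great goal in the football match tonight",
    "new phone camera review and battery test",
    "my favourite pasta recipe for dinner",
]
train_labels = ["politics"] * 3 + ["other"] * 3

model = NaiveBayes().fit(train_texts, train_labels)
score = user_politics_score(model, ["vote in the election", "parliament budget vote"])
domain = predict_domain(model, score, "new budget")
```

Here `score` summarises a user's historical content and `domain` is the label predicted for a new tweet; in practice the stand-in model would be replaced by whichever classifiers are being evaluated.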
Pages: 949-981
Page count: 33
Related papers
50 in total
  • [1] An Ontology-based Data Mining Framework in Traffic Domain
    Wang, Ruguang
    Dai, Weidi
    Cheng, Jieru
    [J]. FRONTIERS OF MANUFACTURING AND DESIGN SCIENCE II, PTS 1-6, 2012, 121-126 : 55 - 59
  • [2] A domain ontology-based navigation learning system
    Zheng, Qinghua
    Wang, Yanye
    Huang, Zhibin
    Tian, Feng
    [J]. PROCEEDINGS OF THE 2008 12TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, VOLS I AND II, 2008, : 1065 - +
  • [3] An ontology-based approach for preprocessing in machine learning
    Soto, Patricia Centeno
    Ramzy, Nour
    Ocker, Felix
    Vogel-Heuser, Birgit
    [J]. INES 2021: 2021 IEEE 25TH INTERNATIONAL CONFERENCE ON INTELLIGENT ENGINEERING SYSTEMS, 2021,
  • [4] An ontology-based mining approach for user search intent discovery
    Shen, Yan
    Li, Yuefeng
    Xu, Yue
    Iannella, Renato
    Algarni, Abdulmohsen
    Tao, Xiaohui
    [J]. ADCS 2011 - Proceedings of the Sixteenth Australasian Document Computing Symposium, 2011, : 39 - 46
  • [5] Ontology-Based Categorization of Web Services with Machine Learning
    Funk, Adam
    Bontcheva, Kalina
    [J]. LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010,
  • [6] OnML: an ontology-based approach for interpretable machine learning
    Pelin Ayranci
    Phung Lai
    Nhathai Phan
    Han Hu
    Alexander Kolinowski
    David Newman
    Deijing Dou
    [J]. Journal of Combinatorial Optimization, 2022, 44 : 770 - 793
  • [7] Ontology-based Interpretable Machine Learning for Textual Data
    Lai, Phung
    Phan, NhatHai
    Hu, Han
    Badeti, Anuja
    Newman, David
    Dou, Dejing
    [J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [8] Ontology-based Recommender for Distributed Machine Learning Environment
    Pop, Daniel
    Bogdanescu, Caius
    [J]. 2013 15TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING (SYNASC 2013), 2014, : 537 - 542
  • [9] Machine learning techniques for ontology-based leaf classification
    Fu, H
    Chi, ZR
    Feng, DG
    Song, JT
    [J]. 2004 8TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION, VOLS 1-3, 2004, : 681 - 686
  • [10] OnML: an ontology-based approach for interpretable machine learning
    Ayranci, Pelin
    Lai, Phung
    Phan, Nhathai
    Hu, Han
    Kolinowski, Alexander
    Newman, David
    Dou, Deijing
    [J]. JOURNAL OF COMBINATORIAL OPTIMIZATION, 2022, 44 (01) : 770 - 793