Automatic discovery of the sequential accesses from web log data files via a genetic algorithm

Cited by: 11
Authors
Tug, Emine [1]
Sakiroglu, Merve [1]
Arslan, Ahmet [1]
Affiliations
[1] Selcuk Univ, Dept Comp Sci, Konya 42300, Turkey
Keywords
web mining; genetic algorithm; knowledge discovery; sequential access;
DOI
10.1016/j.knosys.2005.10.008
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper is concerned with finding sequential accesses in web log files using a genetic algorithm (GA). Web log files are independent of the server and are stored in ASCII format. Every transaction, whether completed or not, is recorded in the web log files, and these files are unstructured from the standpoint of knowledge discovery in databases techniques. The data stored in web logs has become important for discovering user behavior as internet use has increased rapidly. Analyzing these log files is one of the important research areas of web mining. In particular, with the advent of CRM (Customer Resource Management) issues in business circles, most modern firms operating web sites for various purposes are now adopting web mining as a strategic way of capturing knowledge about the potential needs of target customers, future trends in the market, and other management factors. Our work, ALMG (Automatic Log Mining via Genetic), mines web log files with a genetic algorithm. A survey of the web mining literature shows that GA is generally used in web content mining and web structure mining. ALMG, by contrast, is a study of web usage mining; this is the point that distinguishes ALMG from similar works in the literature. In another work that we encountered, GA is used to process the data between HTML tags located on the client PC, whereas ALMG extracts information from data located on the server. We consider the use of log files an advantage for our purpose, because we find the character of the requests made to the server rather than detecting a single person's behavior. We developed an application for this purpose. The application first analyzes the web log files and then automatically finds groups of sequentially accessed pages. (c) 2005 Elsevier B.V. All rights reserved.
Pages: 180 - 186
Number of pages: 7
Related papers
50 in total
  • [31] Reliable Biomarker discovery from Metagenomic data via RegLRSD algorithm
    Mustafa Alshawaqfeh
    Ahmad Bashaireh
    Erchin Serpedin
    Jan Suchodolski
    BMC Bioinformatics, 18
  • [32] SePMa: An algorithm that mining sequential processes from hybrid log
    Huang, Xiaoyu
    Zhong, Huiling
    Cai, Wenxue
    EMERGING TECHNOLOGIES IN KNOWLEDGE DISCOVERY AND DATA MINING, 2007, 4819 : 292 - +
  • [33] Knowledge discovery from web usage data: Extraction and applications of sequential and clustering patterns - A survey
    Raju, G. T.
    Satyanarayana, P. S.
    Patnaik, L. M.
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2008, 4 (02) : 381 - 389
  • [34] Automatic Discovery of Personal Name Aliases from the Web
    Bollegala, Danushka
    Matsuo, Yutaka
    Ishizuka, Mitsuru
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2011, 23 (06) : 831 - 844
  • [35] Automatic discovery of attribute words from Web documents
    Tokunaga, K
    Kazama, J
    Torisawa, K
    NATURAL LANGUAGE PROCESSING - IJCNLP 2005, PROCEEDINGS, 2005, 3651 : 106 - 118
  • [36] Obtaining subject data from log files using deep log analysis: case study OhioLINK
    Huntington, Paul
    Nicholas, David
    Jamali, Hamid R.
    Watkinson, Anthony
    JOURNAL OF INFORMATION SCIENCE, 2006, 32 (04) : 299 - 308
  • [37] Optimisation of automatic web services composition using genetic algorithm
    Shirvani M.H.
    Gorji A.B.
    International Journal of Cloud Computing, 2020, 9 (04) : 397 - 411
  • [38] MULTILAYER PERCEPTRON WITH GENETIC ALGORITHM FOR WELL LOG DATA INVERSION
    Huang, Kou-Yuan
    Shen, Liang-Chi
    Chen, Kai-Ju
    Huang, Ming-Che
    2013 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2013, : 1544 - 1547
  • [39] webSPADE: A parallel sequence mining algorithm to analyze web log data
    Demiriz, A
    2002 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2002, : 755 - 758
  • [40] Business Protocol Discovery from Log Files using a TF-IDF-based Technique
    Moudjari, Abdelkader
    Chikhi, Salim
    Draa, Amer
    2015 SEVENTH INTERNATIONAL CONFERENCE ON UBIQUITOUS AND FUTURE NETWORKS, 2015, : 651 - 656