Query Expansion in Information Retrieval using Frequent Pattern (FP) Growth Algorithm for Frequent Itemset Search and Association Rules Mining

Citations: 0
Authors
Afuan, Lasmedi [1]
Ashari, Ahmad [2]
Suyanto, Yohanes [2]
Affiliations
[1] Univ Jenderal Soedirman, Dept Informat, Purwokerto, Central Java, Indonesia
[2] Univ Gadjah Mada, Dept Comp Sci & Elect, Yogyakarta, Indonesia
Keywords
IR; query expansion; association rules; support; confidence; recall; precision
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
The number of documents on the Internet has grown exponentially, making it difficult for users to find the documents or information they need. Special techniques are needed to retrieve documents that are relevant to user queries; one such technique is Information Retrieval (IR). IR is the process of finding data (generally text documents) that match an information need within a collection of documents stored on a computer. A common problem in IR is poorly formulated user queries, which arise because users have difficulty expressing their needs in a query. Researchers have proposed various solutions to this limitation, one of which is Query Expansion (QE). Methods that have been applied to QE include ontologies, Latent Semantic Indexing (LSI), local co-occurrence, relevance feedback, concept-based approaches, and WordNet/synonym mapping. However, these methods still have limitations, one of which is that they do not show how the occurrences of words or phrases in the document collection are related. To overcome this limitation, this study proposes an approach to QE that uses the FP-Growth algorithm to search for frequent itemsets and Association Rules (AR). AR are applied to QE to capture how the occurrence of one word or term relates to that of another in the document collection, and the resulting terms are used to expand the user's query. The main contribution of this study is the use of association rules with FP-Growth over the document collection to find relationships between word occurrences, which are then used to expand the user's original query in IR. QE performance is evaluated using recall, precision, and F-measure. The results indicate that using AR for QE can improve the relevance of the retrieved documents: the average recall, precision, and F-measure were 94.44%, 89.98%, and 92.07%, respectively. Compared with IR without QE, IR with QE increased recall by 25.65%, precision by 1.93%, and F-measure by 15.78%.
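The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it counts co-occurring term pairs by brute force rather than building an FP-tree, so it stands in for FP-Growth only on toy data. The function names (mine_pair_rules, expand_query, evaluate), the thresholds, and the sample corpus are all hypothetical and chosen purely for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(docs, min_support=0.5, min_confidence=0.7):
    """Mine one-to-one association rules (x -> y) between terms.

    Brute-force pair counting; a stand-in for FP-Growth on small inputs.
    support(x, y)  = fraction of documents containing both x and y
    confidence(x -> y) = count(x and y) / count(x)
    """
    n = len(docs)
    term_sets = [set(d) for d in docs]
    item_count = Counter(t for s in term_sets for t in s)
    pair_count = Counter(
        pair for s in term_sets for pair in combinations(sorted(s), 2)
    )
    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue  # the pair is not a frequent itemset
        for x, y in ((a, b), (b, a)):
            confidence = count / item_count[x]
            if confidence >= min_confidence:
                rules.setdefault(x, []).append((y, support, confidence))
    return rules


def expand_query(query_terms, rules):
    """Append rule consequents of each query term, highest confidence first."""
    expanded = list(query_terms)
    for term in query_terms:
        for consequent, _, _ in sorted(rules.get(term, []), key=lambda r: -r[2]):
            if consequent not in expanded:
                expanded.append(consequent)
    return expanded


def evaluate(retrieved, relevant):
    """Return (precision, recall, f_measure) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy corpus: each document is a list of already-tokenized terms (hypothetical data).
docs = [
    ["information", "retrieval", "query"],
    ["information", "retrieval", "index"],
    ["query", "expansion", "retrieval"],
    ["information", "retrieval", "expansion"],
]
rules = mine_pair_rules(docs)
print(expand_query(["query"], rules))
print(evaluate(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"]))
```

In this sketch each document is treated as one transaction, so support and confidence describe how often terms appear together in the same document. For a real collection, the pair counting would be replaced by an actual FP-Growth implementation, which also finds itemsets larger than two terms.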
Pages: 263-267
Number of pages: 5