Mining user privacy concern topics from app reviews

Cited by: 0
Authors
Zhang, Jianzhang [1]
Zhou, Jialong [1]
Hua, Jinping [2]
Niu, Nan [3]
Liu, Chuang [1]
Affiliations
[1] Hangzhou Normal Univ, Dept Management Sci & Engn, Hangzhou, Zhejiang, Peoples R China
[2] Jiangxi Prov Inst Cyber Secur, Nanchang, Jiangxi, Peoples R China
[3] Univ Cincinnati, Dept Elect Engn & Comp Sci, Cincinnati, OH 45221 USA
Funding
National Natural Science Foundation of China
Keywords
Privacy concerns; Topic modeling; App reviews mining; Privacy requirements; Requirements engineering; Mobile apps; Requirements; Perception; Taxonomy
DOI
10.1016/j.jss.2025.112355
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Context: As mobile applications (apps) widely spread throughout our society and daily life, various personal information is constantly demanded by apps in exchange for more intelligent and customized functionality. An increasing number of users are voicing their privacy concerns through app reviews on app stores.
Objective: The main challenge of effectively mining privacy concerns from user reviews lies in that reviews expressing privacy concerns are overridden by a large number of reviews expressing more generic themes and noisy content. In this work, we propose a novel automated approach to overcome that challenge.
Method: Our approach first employs information retrieval and document embeddings to extract candidate privacy reviews in an unsupervised manner, which are further labeled to prepare the annotation dataset. Then, supervised classifiers are trained to automatically identify privacy reviews. Finally, an interpretable topic mining algorithm is designed to detect privacy concern topics contained in the privacy reviews.
Results: Experimental results show that the best performing document embedding achieves an average precision of 96.80% in the top 100 retrieved candidate privacy reviews, outperforming the taxonomy-based baseline, which achieves 73.87%. All trained privacy review classifiers achieve an F1 score above 91%, surpassing the keyword-matching baseline by as much as 7.5% and the large language model baseline by up to 2.74%. For detecting privacy concern topics from privacy reviews, our proposed algorithm achieves both better topic coherence and topic diversity than three strong topic modeling baselines, including LDA.
Conclusion: Empirical evaluation results demonstrate the effectiveness of our approach in identifying privacy reviews and detecting user privacy concerns in app reviews.
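The retrieval step and the precision figure reported for the top 100 candidates can be illustrated with a minimal sketch. It is not the authors' implementation: bag-of-words cosine similarity stands in for the document embeddings, and the reviews, query, labels, and function names are all hypothetical.

```python
import math
from collections import Counter


def bow_vector(text):
    """Lowercased bag-of-words counts; a crude stand-in for a document embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(reviews, query):
    """Return review indices sorted by descending similarity to the query."""
    q = bow_vector(query)
    scores = [cosine(bow_vector(r), q) for r in reviews]
    return sorted(range(len(reviews)), key=lambda i: scores[i], reverse=True)


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are labeled relevant."""
    top = ranked[:k]
    return sum(1 for i in top if i in relevant) / len(top) if top else 0.0


# Hypothetical toy data, not taken from the paper.
reviews = [
    "this app collects my personal data without permission",
    "great game and fun levels",
    "why does it need access to my contacts and location data",
    "crashes after the latest update",
]
query = "personal data permission access location contacts"
privacy_labels = {0, 2}  # indices a human annotator marked as privacy reviews

ranked = rank_candidates(reviews, query)
p_at_2 = precision_at_k(ranked, privacy_labels, 2)
```

In the setting the abstract describes, the top-ranked candidates would then be labeled by hand and used to train the supervised classifiers; the sketch stops at the ranking and its precision.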
Pages: 15