Analyzing privacy policies through syntax-driven semantic analysis of information types

Cited by: 8
Authors
Hosseini, Mitra Bokaei [1]
Breaux, Travis D. [2]
Slavin, Rocky [3]
Niu, Jianwei [3]
Wang, Xiaoyin [3]
Affiliations
[1] St Marys Univ, 1 Camino Santa Maria, San Antonio, TX 78228 USA
[2] Carnegie Mellon Univ, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
[3] Univ Texas San Antonio, 1 UTSA Circle, San Antonio, TX USA
Funding
National Science Foundation (USA);
Keywords
Privacy policy; Ambiguity; Generality; Ontology; REQUIREMENTS; FRAMEWORK; AMBIGUITY; LANGUAGE;
DOI
10.1016/j.infsof.2021.106608
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Context: Several government laws and app markets, such as Google Play, require the disclosure of app data practices to users. These data practices constitute critical privacy requirements statements, since they underpin the app's functionality while describing how various personal information types are collected, used, and with whom they are shared.
Objective: Abstract and ambiguous terminology in requirements statements concerning information types (e.g., "we collect your device information") can reduce shared understanding among app developers, policy writers, and users.
Method: To address this challenge, we propose a syntax-driven method that first parses a given information type phrase (e.g., mobile device identifier) into its constituents using a context-free grammar, and then infers semantic relationships between constituents using semantic rules. The inferred semantic relationships between a given phrase and its constituents generate a hierarchy that models the generality and ambiguity of phrases. Through this method, we infer relations from a lexicon consisting of a set of information type phrases to populate a partial ontology. The resulting ontology is a knowledge graph that can be used to guide requirements authors in the selection of the most appropriate information type terms.
Results: We evaluate the method's performance using two criteria: (1) expert assessment of relations between information types; and (2) non-expert preferences for relations between information types. The results suggest a performance improvement when compared to a previously proposed method. We also evaluate the reliability of the method considering the information types extracted from different data practices (e.g., collection, usage, sharing) in privacy policies for mobile or web-based apps in various app domains.
Contributions: The method achieves an average of 89% precision and 87% recall considering information types from various app domains and data practices. Based on these results, we conclude that the method generalizes reliably in inferring relations and reducing the ambiguity and abstraction in privacy policies.
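The two steps named in the abstract (decomposing an information type phrase into constituents, then inferring relations between the phrase and those constituents) can be illustrated with a minimal Python sketch. This is not the authors' grammar or rule set, which the abstract does not specify. It assumes a deliberately simplified scheme in which the last word of a phrase is its head noun and every preceding word is a modifier. The relation labels `kind_of` and `property_of`, the function names, and the toy "expert" relations are all hypothetical and chosen for illustration only.

```python
from typing import Iterable, List, Set, Tuple

# (specific phrase, relation label, related phrase); labels are illustrative.
Relation = Tuple[str, str, str]


def infer_relations(phrase: str) -> List[Relation]:
    """Infer relations for one phrase under the simplified assumption that
    the last word is the head noun and all earlier words are modifiers."""
    words = phrase.lower().split()
    relations: List[Relation] = []
    # Assumed rule 1: dropping the leading modifier gives a more general phrase.
    for i in range(len(words) - 1):
        specific = " ".join(words[i:])
        general = " ".join(words[i + 1:])
        relations.append((specific, "kind_of", general))
    # Assumed rule 2: the head noun names a property of the modifier span.
    if len(words) > 1:
        relations.append((" ".join(words), "property_of", " ".join(words[:-1])))
    return relations


def build_ontology(lexicon: Iterable[str]) -> Set[Relation]:
    """Collect the relations inferred from every phrase in a lexicon."""
    ontology: Set[Relation] = set()
    for phrase in lexicon:
        ontology.update(infer_relations(phrase))
    return ontology


def precision_recall(inferred: Set[Relation],
                     reference: Set[Relation]) -> Tuple[float, float]:
    """Compare inferred relations with a reference set (e.g. expert judgments)."""
    true_positives = len(inferred & reference)
    precision = true_positives / len(inferred) if inferred else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    for relation in infer_relations("mobile device identifier"):
        print(relation)
    # Toy reference set, invented purely to exercise precision_recall.
    toy_reference = {
        ("device identifier", "kind_of", "identifier"),
        ("device identifier", "kind_of", "information"),
    }
    print(precision_recall(build_ontology(["device identifier"]), toy_reference))
```

Under these assumptions, "mobile device identifier" is linked to the more general "device identifier" and "identifier", which mirrors the generality hierarchy the abstract describes. A real implementation would replace the head-noun heuristic with a context-free grammar over tagged words and a richer set of semantic rules.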
Pages: 18