An Analysis of Structured Data on the Web

Cited by: 20
Authors
Dalvi, Nilesh [1]
Machanavajjhala, Ashwin [1]
Pang, Bo [1]
Affiliations
[1] Yahoo Res, 4301 Great America Pkwy, Santa Clara, CA 95054 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2012 / Vol. 5 / No. 07
Keywords
Structured Data on the Web; Information Spread; Information Connectivity
DOI
10.14778/2180912.2180920
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
In this paper, we analyze the nature and distribution of structured data on the Web. Web-scale information extraction, or the problem of creating structured tables using extraction from the entire web, is gathering lots of research interest. We perform a study to understand and quantify the value of Web-scale extraction, and how structured information is distributed amongst top aggregator websites and tail sites for various interesting domains. We believe this is the first study of its kind, and gives us new insights for information extraction over the Web.
Pages: 680 - 691
Page count: 12
Related papers
50 in total
  • [21] Automatic Extraction of Structured Web Data with Domain Knowledge
    Derouiche, Nora
    Cautis, Bogdan
    Abdessalem, Talel
    2012 IEEE 28TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2012, : 726 - 737
  • [22] Best-Effort Modeling of Structured Data on the Web
    Halevy, Alon
    CONCEPTUAL MODELING - ER 2011, 2011, 6998 : 32 - 32
  • [23] Caught in the Web: DoS Vulnerabilities in Parsers for Structured Data
    Rasheed, Shawn
    Dietrich, Jens
    Tahir, Amjed
    COMPUTER SECURITY - ESORICS 2021, PT I, 2021, 12972 : 67 - 85
  • [24] Building web warehouse for semi-structured data
    Mohania, M
    DATA & KNOWLEDGE ENGINEERING, 2001, 39 (02) : 101 - 103
  • [25] Recent Progress Towards an Ecosystem of Structured Data on the Web
    Gupta, Nitin
    Halevy, Alon Y.
    Harb, Boulos
    Lam, Heidi
    Lee, Hongrae
    Madhavan, Jayant
    Wu, Fei
    Yu, Cong
    2013 IEEE 29TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2013, : 5 - 8
  • [26] A Structured Approach to Data Reverse Engineering of Web Applications
    De Virgilio, Roberto
    Torlone, Riccardo
    WEB ENGINEERING, PROCEEDINGS, 2009, 5648 : 91 - 105
  • [27] Answering Web Queries Using Structured Data Sources
    Paparizos, Stelios
    Ntoulas, Alexandros
    Shafer, John
    Agrawal, Rakesh
    ACM SIGMOD/PODS 2009 CONFERENCE, 2009, : 1127 - 1129
  • [28] Structured web pages management for efficient data retrieval
    Taniar, D
    Jiang, Y
    Rahayu, JW
    Bishay, L
    PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS ENGINEERING, VOL II, 2000, : 97 - 104
  • [29] The Integration of Web-Based Information and the Structured Data in Data Warehousing
    Maslankowski, Jacek
    INFORMATION SYSTEMS: DEVELOPMENT, LEARNING, SECURITY, 2013, 161 : 66 - 75
  • [30] Analysis of structured data on wikipedia
    Moreira, Johny
    Neto, Everaldo Costa
    Barbosa, Luciano
    2021, Inderscience Publishers (15) : 71 - 86