Dynamic Hierarchical Markov Random Fields for integrated web data extraction

Authors
Zhu, Jun [1]
Nie, Zaiqing [2]
Zhang, Bo [1]
Wen, Ji-Rong [2]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[2] Microsoft Res Asia, Web Search & Min Grp, Beijing 100080, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
conditional random fields; Dynamic Hierarchical Markov Random Fields; integrated web data extraction; statistical hierarchical modeling; blocky artifact issue;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Existing template-independent web data extraction approaches adopt highly ineffective decoupled strategies, attempting to do data record detection and attribute labeling in two separate phases. In this paper, we propose an integrated web data extraction paradigm with hierarchical models. The proposed model is called Dynamic Hierarchical Markov Random Fields (DHMRFs). DHMRFs take structural uncertainty into consideration and define a joint distribution of both model structure and class labels. The joint distribution is an exponential family distribution. As a conditional model, DHMRFs relax the independence assumption made in directed models. Since exact inference is intractable, a variational method is developed to learn the model's parameters and to find the MAP model structure and label assignments. We apply DHMRFs to a real-world web data extraction task. Experimental results show that: (1) integrated web data extraction models can achieve significant improvements on both record detection and attribute labeling compared to decoupled models; (2) in diverse web data extraction, DHMRFs can potentially address the blocky artifact issue from which fixed-structured hierarchical models suffer.
Pages: 1583-1614
Page count: 32
Related papers
50 in total
  • [1] Dynamic hierarchical Markov random fields for integrated web data extraction
    Zhu, Jun
    Nie, Zaiqing
    Zhang, Bo
    Wen, Ji-Rong
    [J]. Journal of Machine Learning Research, 2008, 9 : 1583 - 1614
  • [2] Dynamic Markov Random Fields
    Torr, P. H. S.
    [J]. 2008 INTERNATIONAL MACHINE VISION AND IMAGE PROCESSING CONFERENCE, PROCEEDINGS, 2008, : 21 - 26
  • [3] Markov Random Fields for Pattern Extraction in Analog Wafer Test Data
    Schrunner, Stefan
    Bluder, Olivia
    Zernig, Anja
    Kaestner, Andre
    Kern, Roman
    [J]. PROCEEDINGS OF THE 2017 SEVENTH INTERNATIONAL CONFERENCE ON IMAGE PROCESSING THEORY, TOOLS AND APPLICATIONS (IPTA 2017), 2017,
  • [4] Data-driven segmentation of textured images using hierarchical Markov random fields
    Communications Research Lab, Kobe, Japan
    [J]. Syst Comput Jpn, 5 (43-53):
  • [5] Hierarchical semi-Markov conditional random fields for deep recursive sequential data
    Truyen Tran
    Dinh Phung
    Hung Bui
    Venkatesh, Svetha
    [J]. ARTIFICIAL INTELLIGENCE, 2017, 246 : 53 - 85
  • [6] Markov random fields for digital terrain model extraction
    Tupin, F
    Roux, M
    [J]. IEEE/ISPRS JOINT WORKSHOP ON REMOTE SENSING AND DATA FUSION OVER URBAN AREAS, 2001, : 95 - 99
  • [7] EXTRACTION OF MARKOV RANDOM FIELDS FROM A NOISE BACKGROUND
    BOTNEV, VN
    ZABOLOTSKIKH, VG
    [J]. ENGINEERING CYBERNETICS, 1973, 11 (04): : 675 - 679
  • [8] DATA-DRIVEN SEGMENTATION OF TEXTURED IMAGES USING HIERARCHICAL MARKOV RANDOM-FIELDS
    NODA, H
    SHIRAZI, MN
    [J]. SYSTEMS AND COMPUTERS IN JAPAN, 1995, 26 (05) : 43 - 53
  • [9] A HIERARCHICAL MARKOV RANDOM FIELD FOR ROAD NETWORK EXTRACTION AND ITS APPLICATION WITH OPTICAL AND SAR DATA
    Perciano, Talita
    Tupin, Florence
    Hirata, Roberto, Jr.
    Cesar, Roberto M., Jr.
    [J]. 2011 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2011, : 1159 - 1162
  • [10] CONTEXTUAL UNMIXING OF GEOSPATIAL DATA BASED ON MARKOV RANDOM FIELDS AND CONDITIONAL RANDOM FIELDS
    Nishii, Ryuei
    Ozaki, Tomohiko
    [J]. 2009 FIRST WORKSHOP ON HYPERSPECTRAL IMAGE AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING, 2009, : 478 - +