Towards Building a Scholarly Big Data Platform: Challenges, Lessons and Opportunities

Cited by: 0
Authors
Wu, Zhaohui [1]
Wu, Jian [2]
Khabsa, Madian [1]
Williams, Kyle [2]
Chen, Hung-Hsuan [1]
Huang, Wenyi [2]
Tuarob, Suppawong [1]
Choudhury, Sagnik Ray [2]
Ororbia, Alexander [2]
Mitra, Prasenjit [1, 2]
Giles, C. Lee [1, 2]
Affiliations
[1] Penn State Univ, Comp Sci & Engn, University Pk, PA 16802 USA
[2] Penn State Univ, Informat Sci & Technol, University Pk, PA 16802 USA
Keywords
Scholarly Big Data; Information Extraction; Big Data; EXTRACTION; TABLE;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
We introduce a big data platform that provides various services for harvesting scholarly information and enabling efficient scholarly applications. The platform's core architecture is built on a secured private cloud: data is crawled by a scholarly focused crawler that leverages a dynamic scheduler, processed by a MapReduce-based crawl-extraction-ingestion (CEI) workflow, and stored in distributed repositories and databases. Services such as scholarly data harvesting, information extraction, and user information and log data analytics are integrated into the platform and provided through OAI and RESTful APIs. We also introduce a set of scholarly applications built on top of this platform, including citation recommendation and collaborator discovery.
Pages: 117 - 126
Page count: 10
Related papers
50 in total
  • [1] Building a data-sharing platform for schistosomiasis treatment data: opportunities and challenges
    Jule, A.
    Garba, A.
    Guerin, P.
    Lang, T.
    Olliaro, P.
    [J]. TROPICAL MEDICINE & INTERNATIONAL HEALTH, 2015, 20 : 228 - 229
  • [2] Big Data - Opportunities and Challenges
    Bertino, Elisa
    [J]. 2013 IEEE 37TH ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC), 2013 : 479 - 480
  • [3] Challenges and Opportunities with Big Data
    Labrinidis, Alexandros
    Jagadish, H. V.
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (12) : 2032 - 2033
  • [4] BIG DATA PROCESSING: BIG CHALLENGES AND OPPORTUNITIES
    Ji, Changqing
    Li, Yu
    Qiu, Wenming
    Jin, Yingwei
    Xu, Yujie
    Awada, Uchechukwu
    Li, Keqiu
    Qu, Wenyu
    [J]. JOURNAL OF INTERCONNECTION NETWORKS, 2012, 13 (3-4)
  • [5] Building a Big Data Platform for Smart Cities: Experience and Lessons from Santander
    Cheng, Bin
    Longo, Salvatore
    Cirillo, Flavio
    Bauer, Martin
    Kovacs, Ernoe
    [J]. 2015 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2015, 2015 : 592 - 599
  • [6] Big Building Data - a Big Data Platform for Smart Buildings
    Linder, Lucy
    Vionnet, Damien
    Bacher, Jean-Philippe
    Hennebert, Jean
    [J]. CISBAT 2017 INTERNATIONAL CONFERENCE FUTURE BUILDINGS & DISTRICTS - ENERGY EFFICIENCY FROM NANO TO URBAN SCALE, 2017, 122 : 589 - 594
  • [7] Big Data in Healthcare: Opportunities and Challenges
    Craven, Mark
    Page, C. David
    [J]. BIG DATA, 2015, 3 (04) : 209 - 210
  • [8] Geospatial Big Data: Challenges and Opportunities
    Lee, Jae-Gil
    Kang, Minseo
    [J]. BIG DATA RESEARCH, 2015, 2 (02) : 74 - 81
  • [9] Big Data in healthcare: Challenges and Opportunities
    Asri, Hiba
    Mousannif, Hajar
    Al Moatassime, Hassan
    Noel, Thomas
    [J]. 2015 INTERNATIONAL CONFERENCE ON CLOUD TECHNOLOGIES AND APPLICATIONS (CLOUDTECH 15), 2015 : 56 - 62
  • [10] The Promise of Big Data Opportunities and Challenges
    Krumholz, Harlan M.
    [J]. CIRCULATION-CARDIOVASCULAR QUALITY AND OUTCOMES, 2016, 9 (06) : 616 - 617