The "BigSE" Project: Lessons Learned from Validating Industrial Text Mining

Authors
Krishna, Rahul
Yu, Zhe
Agrawal, Amritanshu
Dominguez, Manuel [1]
Wolf, David [1]
Affiliations
[1] LexisNexis, Raleigh, NC 27606 USA
Keywords
E-Discovery; Software Engineering; Testing
DOI
10.1145/2896825.2896836
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
As businesses become increasingly reliant on big data analytics, it becomes increasingly important to test the choices made within the data miners. This paper reports lessons learned from the BigSE Lab, an industrial/university collaboration that augments industrial activity with low-cost testing of data miners (by graduate students). BigSE is an experiment in academic/industrial collaboration. Funded by a gift from LexisNexis, BigSE has no specific deliverables. Rather, it is fueled by a research question: "what can industry and academia learn from each other?" Based on open source data and tools, the output of this work is (a) more exposure by commercial engineers to state-of-the-art methods and (b) more exposure by students to industrial text mining methods (plus research papers that comment on how to improve those methods). The results so far are encouraging. Students at BigSE Lab have found numerous "standard" choices for text mining that could be replaced by simpler and less resource-intensive methods. Further, that work also found additional text mining choices that could significantly improve the performance of industrial data miners.
Pages: 65-71
Number of pages: 7