An analysis of design process and performance in distributed data science teams

Cited by: 10
Authors
Maier, Torsten [1 ]
DeFranco, Joanna [2 ]
McComb, Christopher [3]
Affiliations
[1] Penn State Univ, University Pk, PA 16802 USA
[2] Penn State Univ, Software Engn, University Pk, PA 16802 USA
[3] Penn State Univ, Engn, Main Campus, University Pk, PA 16802 USA
Keywords
Teamwork; Data science; Distributed teams; Global teamwork; Kaggle data set; Software engineering teams; Technical teams; SOCIAL DILEMMAS; COMMUNICATION; COOPERATION; SIZE;
DOI
10.1108/TPM-03-2019-0024
Chinese Library Classification (CLC) number
C93 [Management science];
Discipline classification code
12 ; 1201 ; 1202 ; 120202 ;
Abstract
Purpose: Often, it is assumed that teams are better at solving problems than individuals working independently. However, recent work in engineering, design and psychology contradicts this assumption. This study aims to examine the behavior of teams engaged in data science competitions. Crowdsourced competitions have seen increased use for software development and data science, and platforms often encourage teamwork between participants.
Design/methodology/approach: We specifically examine the teams participating in data science competitions hosted by Kaggle. We analyze the data provided by Kaggle to compare the effect of team size and interaction frequency on team performance. We also contextualize these results through a semantic analysis.
Findings: This work demonstrates that groups of individuals working independently may outperform interacting teams on average, but that small, interacting teams are more likely to win competitions. The semantic analysis revealed differences in forum participation, verb usage and pronoun usage when comparing top- and bottom-performing teams.
Research limitations/implications: These results reveal a perplexing tension that must be explored further: true teams may experience better performance with higher cohesion, but nominal teams may perform even better on average with essentially no cohesion. Limitations of this research include not factoring in team member experience level and reliance on extant data.
Originality/value: These results are potentially of use to designers of crowdsourced data science competitions as well as managers and contributors to distributed software development projects.
Pages: 419-439
Number of pages: 21
Related papers
50 in total
  • [21] A design method for choosing services for large distributed teams
    Hawryszkiewycz, I
    COOP '96 - SECOND INTERNATIONAL WORKSHOP ON THE DESIGN OF COOPERATIVE SYSTEMS, 1996, : 515 - 533
  • [22] KNOWLEDGE SHARING OBSERVATION AND MODELLING IN DISTRIBUTED DESIGN TEAMS
    Horrigue, A. H.
    Choulier, D.
    Boudouh, T.
    9TH INTERNATIONAL DESIGN CONFERENCE - DESIGN 2006, VOLS 1 AND 2, 2006, (36): : 1155 - +
  • [23] Performance Evaluation of European Football Teams Using Data Envelopment Analysis
    El-Demerdash, Basma E.
    El-Khodary, Ihab A.
    Tharwat, Assem A.
    Shaban, Eslam R.
    INTERNATIONAL CONFERENCE ON INFORMATICS AND SYSTEMS (INFOS 2016), 2016, : 325 - 326
  • [24] DESIGN TEAMWORK IN DISTRIBUTED CROSS-CULTURAL TEAMS
    Man, Jinfan
    Lu, Yuan
    Brombacher, Aarnout
    DESIGN FOR HARMONIES, VOL 7: HUMAN BEHAVIOUR IN DESIGN, 2013,
  • [25] Distributed design teams pose no problem with video chatting
    Electron. Des., 2008, 2 (19-20):
  • [26] Enterprise Information Portals in support of business process, design teams and collaborative commerce performance
    Chang, Hsin Hsin
    Wang, I. Chen
    INTERNATIONAL JOURNAL OF INFORMATION MANAGEMENT, 2011, 31 (02) : 171 - 182
  • [27] Key performance measurement capabilities for managing distributed teams
    Ferreira, Pedro Gustavo Siqueira
    de Lima, Edson Pinheiro
    da Costa, Sergio Eduardo Gouvea
    Monteiro, Nathalia Juca
    de Castro e Silva, Andreia
    TOTAL QUALITY MANAGEMENT & BUSINESS EXCELLENCE, 2023, 34 (9-10) : 1071 - 1095
  • [28] A unified framework for design and performance analysis of distributed systems
    Jonkers, H
    Janssen, W
    Verschut, A
    Wierstra, E
    IEEE INTERNATIONAL COMPUTER PERFORMANCE AND DEPENDABILITY SYMPOSIUM -PROCEEDINGS, 1998, : 109 - 118
  • [29] PERFORMANCE ANALYSIS OF DISTRIBUTED DATA-BASE SYSTEMS
    STONEBRAKER, M
    WOODFILL, J
    RANSTROM, J
    KALASH, J
    ARNOLD, K
    ANDERSEN, E
    PERFORMANCE EVALUATION, 1984, 4 (03) : 220 - 220
  • [30] USE OF PAYOFF TREES IN THE DISTRIBUTED DATA PROCESSING DESIGN PROCESS.
    Mariani, Michael P.
    1977,