CGLAD: Using GLAD in Crowdsourced Large Datasets

Cited by: 2
Authors
Rodrigo, Enrique G. [1]
Aledo, Juan A. [2]
Gamez, Jose A. [1]
Affiliations
[1] Castilla La Mancha Univ, Comp Syst Dept, Albacete, Spain
[2] Castilla La Mancha Univ, Dept Math, Albacete, Spain
Keywords
Non-standard classification; Crowdsourcing; Multiple annotators; Weakly supervised
DOI
10.1007/978-3-030-03493-1_81
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, we propose an improvement over the GLAD algorithm that increases the efficiency and accuracy of the model on problems with large datasets. The GLAD algorithm allows practitioners to learn from instances labeled by multiple annotators, taking into account both the quality of the annotations and the difficulty of each instance. However, because of the number of parameters in the model, it does not scale easily to large datasets, especially when execution time is limited. Our proposal, CGLAD, addresses these problems by clustering vectors obtained from the factorization of the annotation matrix. This approach drastically reduces the number of parameters in the model, which makes the GLAD strategy easier to use and more efficient for multiple-annotator problems.
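The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which factorization or clustering method CGLAD uses, nor whether instances, annotators, or both are clustered. The sketch below assumes a truncated SVD, a plain k-means, and clustering of instances only, so that each cluster shares one difficulty parameter. The function names (`cluster_instances`, `glad_correct_prob`, `parameter_counts`) and the label encoding (+1/-1 for the two classes, 0 for a missing annotation) are hypothetical choices made for illustration.

```python
import numpy as np


def cluster_instances(annotations, rank=2, n_clusters=3, n_iter=20, seed=0):
    """Group instances by low-rank vectors of the annotation matrix.

    annotations: (n_instances, n_annotators) array, entries +1/-1 for the
    two classes and 0 where an annotator gave no label (assumed encoding).
    Returns one cluster index per instance.
    """
    rng = np.random.default_rng(seed)
    # Assumed factorization: truncated SVD; one low-rank vector per instance.
    u, s, _ = np.linalg.svd(annotations.astype(float), full_matrices=False)
    vectors = u[:, :rank] * s[:rank]
    # Assumed clustering: plain Lloyd's k-means on those vectors.
    start = rng.choice(len(vectors), size=n_clusters, replace=False)
    centers = vectors[start].copy()
    labels = np.zeros(len(vectors), dtype=int)
    for _ in range(n_iter):
        dists = ((vectors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = vectors[labels == k]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[k] = members.mean(axis=0)
    return labels


def glad_correct_prob(alpha, beta):
    """GLAD probability that an annotator with ability alpha labels an
    instance with inverse difficulty beta (> 0) correctly."""
    return 1.0 / (1.0 + np.exp(-alpha * beta))


def parameter_counts(n_instances, n_annotators, n_clusters):
    """(per-instance GLAD, cluster-shared variant) parameter counts,
    ignoring the class prior."""
    return n_instances + n_annotators, n_clusters + n_annotators
```

Under these assumptions, a dataset with 200 instances and 5 annotators goes from 205 ability and difficulty parameters to 8 when the instances are grouped into 3 clusters; the shared difficulty of instance `j` would then be looked up as `beta[labels[j]]` during inference.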
Pages: 783-791
Number of pages: 9