Loss amount prediction from textual data using a double GLM with shrinkage and selection

Authors
Scott Manski
Kaixu Yang
Gee Y. Lee
Tapabrata Maiti
Affiliations
[1] Pfizer Inc., Department of Statistics and Probability
[2] LinkedIn Corporation, Department of Mathematics
[3] Michigan State University
[4] Michigan State University
Keywords
Insurance analytics; Claims prediction; Loss reserving; Word2vec; Word embedding matrix; Gamma double group lasso
DOI
Not available
Abstract
The Gamma model has been widely used in a variety of fields, including actuarial science, where it has important applications in insurance loss prediction. Meanwhile, high-dimensional models and their applications have become more common in the statistics literature in recent years. The availability of such high-dimensional models has allowed the analysis of non-traditional data, including data containing textual descriptions of the response. In the models used in such applications, the dispersion may be related to a set of covariates, rather than being a single fixed value for the entire population. Following this approach, we incorporate a group-Lasso-type penalty in both the dispersion and the mean parameterization of a Gamma model, and illustrate its use in a predictive analytics application in actuarial science. In particular, we apply the method to an insurance claim prediction problem involving textual data analysis methods. Simulations are conducted to illustrate the variable selection and model fitting performance of our method.
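As a rough illustration of the idea in the abstract, the following is a minimal sketch of a penalized objective for a Gamma model in which both the mean and the dispersion depend on covariates. It is not the authors' implementation. The log links for both parameters, the square-root-of-group-size weights, the averaging of the log-likelihood over observations, and all function and argument names are assumptions made for this example; the paper's exact objective may differ.

```python
import math


def gamma_nll(y, mu, phi):
    """Negative log-density of a Gamma with mean mu and dispersion phi.

    Parameterization (assumed): shape = 1 / phi, scale = mu * phi,
    so that E[y] = mu and Var[y] = phi * mu**2.
    """
    shape = 1.0 / phi
    scale = mu * phi
    log_density = (
        (shape - 1.0) * math.log(y)
        - y / scale
        - shape * math.log(scale)
        - math.lgamma(shape)
    )
    return -log_density


def group_lasso_penalty(coefs, groups, lam):
    """lam * sum over groups of sqrt(group size) * Euclidean norm of the group.

    `groups` is a list of index lists; coefficients left out of every
    group (for example an intercept) are not penalized.
    """
    total = 0.0
    for group in groups:
        norm = math.sqrt(sum(coefs[j] ** 2 for j in group))
        total += math.sqrt(len(group)) * norm
    return lam * total


def penalized_objective(ys, xs, zs, beta, gamma,
                        groups_mean, groups_disp, lam_mean, lam_disp):
    """Average Gamma negative log-likelihood plus two group-lasso penalties.

    xs rows drive the mean through log(mu) = x . beta (assumed link);
    zs rows drive the dispersion through log(phi) = z . gamma (assumed link).
    """
    nll = 0.0
    for y, x, z in zip(ys, xs, zs):
        mu = math.exp(sum(b * v for b, v in zip(beta, x)))
        phi = math.exp(sum(g * v for g, v in zip(gamma, z)))
        nll += gamma_nll(y, mu, phi)
    return (
        nll / len(ys)
        + group_lasso_penalty(beta, groups_mean, lam_mean)
        + group_lasso_penalty(gamma, groups_disp, lam_disp)
    )
```

Minimizing this objective over `beta` and `gamma` with a suitable optimizer would shrink whole groups of coefficients toward zero in both the mean and the dispersion parts, which is the kind of joint selection the abstract describes. The sketch only evaluates the objective and does not include an optimization routine.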
Pages: 503-528
Number of pages: 25