Loss amount prediction from textual data using a double GLM with shrinkage and selection

Cited by: 0
Authors
Scott Manski
Kaixu Yang
Gee Y. Lee
Tapabrata Maiti
Affiliations
[1] Pfizer Inc., Department of Statistics and Probability
[2] LinkedIn Corporation, Department of Mathematics
[3] Michigan State University
[4] Michigan State University
Keywords
Insurance analytics; Claims prediction; Loss reserving; Word2vec; Word embedding matrix; Gamma double group lasso
DOI
Not available
Abstract
The Gamma model has been widely used in a variety of fields, including actuarial science, where it has important applications in insurance loss prediction. Meanwhile, high-dimensional models and their applications have become more common in the statistics literature in recent years. The availability of such high-dimensional models has allowed the analysis of non-traditional data, including data containing textual descriptions of the response. In the models used in such applications, the dispersion may be designed to depend on a set of covariates, as opposed to being a single fixed value for the entire population. Following this approach, we incorporate a group Lasso type penalty in both the dispersion and the mean parameterization of a Gamma model, and illustrate its use in a predictive analytics application in actuarial science. In particular, we apply the method to an insurance claim prediction problem involving textual data analysis methods. Simulations are conducted to illustrate the variable selection and model fitting performance of our method.
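The abstract gives no formulas, so the following is only a minimal sketch of what a group-Lasso-penalized Gamma double GLM objective could look like. It rests on assumptions that are not taken from the paper: log links for both the mean and the dispersion, a Gamma parameterization with shape 1/phi and scale mu*phi, group weights equal to the square root of the group size, and the hypothetical names `gamma_nll`, `group_penalty`, and `objective`.

```python
import math


def gamma_nll(y, mu, phi):
    """Negative log-density of a Gamma variable with mean mu and dispersion phi.

    Assumed parameterization: shape k = 1/phi, scale theta = mu * phi.
    """
    k = 1.0 / phi
    theta = mu * phi
    log_density = k * math.log(y / theta) - math.log(y) - y / theta - math.lgamma(k)
    return -log_density


def group_penalty(coef, groups, lam):
    """Group Lasso penalty: lam * sum over groups of sqrt(size) * L2 norm.

    `groups` is a list of index lists; indices left out (for example an
    intercept) are not penalized.
    """
    total = 0.0
    for idx in groups:
        norm = math.sqrt(sum(coef[j] ** 2 for j in idx))
        total += math.sqrt(len(idx)) * norm
    return lam * total


def objective(beta, gamma, X, Z, y, groups_beta, groups_gamma, lam_mean, lam_disp):
    """Average Gamma negative log-likelihood plus penalties on both submodels.

    Assumed links: log(mu_i) = x_i . beta and log(phi_i) = z_i . gamma.
    """
    nll = 0.0
    for xi, zi, yi in zip(X, Z, y):
        mu = math.exp(sum(b * x for b, x in zip(beta, xi)))
        phi = math.exp(sum(g * z for g, z in zip(gamma, zi)))
        nll += gamma_nll(yi, mu, phi)
    return (
        nll / len(y)
        + group_penalty(beta, groups_beta, lam_mean)
        + group_penalty(gamma, groups_gamma, lam_disp)
    )
```

In a text application, each group could hold the coefficients attached to one block of word-embedding features, so that the penalty keeps or drops the block as a whole in the mean model, the dispersion model, or both. The two tuning parameters `lam_mean` and `lam_disp` are kept separate here so the two submodels can be shrunk by different amounts; whether the paper does the same is not stated in the abstract.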
Pages: 503-528
Page count: 25
Citation
Manski, Scott; Yang, Kaixu; Lee, Gee Y.; Maiti, Tapabrata. Loss amount prediction from textual data using a double GLM with shrinkage and selection. European Actuarial Journal, 2022, 12 (02): 503-528.