Misinterpretations of Significance Testing Results Near the P-Value Threshold in the Urologic Literature

Authors
Manda, Pranay R. [1]
Kuchakulla, Manish [2]
Hochu, Gabrielle [3]
Mudiam, Pranav [4]
Watane, Arjun [5]
Syed, Ali [6]
Ghomeshi, Armin [7]
Ramasamy, Ranjith [8]
Affiliations
[1] Emory Univ, Sch Med, Urol, Atlanta, GA USA
[2] Weill Cornell Med Ctr, Urol, New York, NY USA
[3] Univ Tennessee, Hlth Sci Ctr, Urol, Memphis, TN USA
[4] Univ Calif Berkeley, Data Sci, Berkeley, CA USA
[5] Yale Sch Med, Ophthalmol, New Haven, CT USA
[6] Case Western Reserve Univ, Sch Med, Ophthalmol, Cleveland, OH USA
[7] Florida Int Univ, Herbert Wertheim Coll Med, Psychiat, Miami, FL 33199 USA
[8] Univ Miami, Urol, Miami, FL USA
Keywords
p-value; data; urology; statistical errors; statistics; statistical power
DOI
10.7759/cureus.41556
Chinese Library Classification
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Background: A statistical test ends in a decision to reject or fail to reject a null hypothesis. Reporting a metric as "trending toward significance" is a misinterpretation of the p-value. Studies highlighting the prevalence of statistical errors in the urologic literature remain scarce. We evaluated abstracts from 15 urology journals published from 2000 through 2021 to quantify a common statistical mistake: misconstruing the function of null hypothesis testing by reporting "a trend toward significance."

Materials and methods: We audited 15 urology journals, examining articles published from January 1, 2000, to January 1, 2022. A word recognition function in Microsoft Excel was used to identify the word "trend" in the abstracts. Two authors manually reviewed each use of the word "trend" to determine whether it improperly described non-statistically significant data as trending toward significance. Statistics and data analysis were performed using the Python libraries pandas, scipy.stats, and seaborn.

Results: This study included 101,134 abstracts from 15 urology journals. Within those abstracts, the word "trend" was used 2,509 times; 572 of those uses described non-statistically significant data as trending toward significance. The rate of errors differed significantly among the 15 journals (p < 0.01). The highest rate of improper use of the word "trend" was found in Bladder Cancer, at 1.6% of articles (p < 0.01). The lowest rate was found in European Urology, at 0.3% (p < 0.01). Our analysis found a moderate correlation between the number of articles published and the number of misuses of the word "trend" within each journal and across all journals every year (r = 0.61 and r = 0.70, respectively).

Conclusion: The overall rate of p-value misinterpretation never exceeded 2% of articles in any journal. The difference in misinterpretation rates between journals was statistically significant. Authors should be cautious about using the word "trend" to describe non-significant p-values as being near significant.
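The workflow described in the abstract (flag abstracts containing "trend", compare misuse rates across journals, correlate publication volume with misuses) can be sketched roughly as follows. This is an illustrative sketch only: the abstract does not name the specific tests used, so the chi-square test of independence and the Pearson correlation are assumptions, the regular expression stands in for the Excel word search, and all journal names, texts, and counts are hypothetical toy values, not data from the study.

```python
import re
from scipy import stats

# Hypothetical toy abstracts (journal, text); NOT data from the study.
abstracts = [
    ("Journal A", "There was a trend toward significance (p = 0.07)."),
    ("Journal A", "Survival improved significantly (p = 0.01)."),
    ("Journal B", "We describe trends in prostate cancer incidence."),
    ("Journal B", "No difference was observed between groups."),
]

# Step 1: flag every abstract containing the word "trend" (or "trends", etc.).
# Flagged abstracts still need manual review: the Journal B hit below is a
# legitimate use of the word, not a misinterpretation of a p-value.
TREND = re.compile(r"\btrend", re.IGNORECASE)
flagged = [(journal, text) for journal, text in abstracts if TREND.search(text)]

# Step 2 (assumed test): chi-square test of independence on per-journal counts.
# Hypothetical counts in the form journal -> (misuses, total abstracts).
counts = {"Journal A": (16, 1000), "Journal B": (3, 1000), "Journal C": (9, 1000)}
table = [[misuses, total - misuses] for misuses, total in counts.values()]
chi2, p_value, dof, expected = stats.chi2_contingency(table)

# Step 3 (assumed test): Pearson correlation between the number of articles
# published and the number of misuses. Hypothetical yearly totals.
articles_per_year = [1000, 2500, 4000, 5200, 7000]
misuses_per_year = [4, 11, 13, 25, 24]
r, p_r = stats.pearsonr(articles_per_year, misuses_per_year)

print(f"flagged for manual review: {len(flagged)}")
print(f"chi-square = {chi2:.2f}, dof = {dof}, p = {p_value:.4f}")
print(f"Pearson r = {r:.2f}")
```

The automated search only narrows the pool; deciding whether a given "trend" misdescribes a non-significant result remains a manual judgment, as in the two-author review the abstract describes.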
Pages: 7