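When the response variable is continuous proportional data, that is, when it takes values in the interval (0, 1), classical linear regression and data transformation methods often give unsatisfactory results. In such cases the beta regression model proposed by Ferrari and Cribari-Neto can be used. The traditional beta regression model, however, is limited to parametric linear regression and is not very flexible. This paper proposes a semi-parametric additive beta regression model together with a parameter estimation method, and simulations show that it performs well. In addition, semi-parametric additive beta regression is applied to an empirical analysis of the share of medical expenditure in total household expenditure, exploring the factors that influence this share.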
In regression analysis, classical linear regression and its transformation-based variants are unsatisfactory when the response variable is restricted to the interval (0, 1), that is, when it is proportional or fractional data, which is common in economics, education, medical science, and other fields. One of the most promising approaches is the beta regression proposed by Ferrari and Cribari-Neto. However, traditional beta regression is confined to the linear case and thus lacks flexibility; it also suffers from specification error if the true model is not linear. Borrowing the idea of the generalized additive model (GAM) proposed by Hastie and Tibshirani, this paper proposes a semi-parametric additive beta regression model. The model is assumed to decompose into a parametric part and a nonparametric part. For the nonparametric part, the local scoring algorithm is used to fit the unknown functions, and AIC is used to choose the best smoothing (tuning) parameters. Two simulation examples under different scenarios are conducted, and the results show that the semi-parametric beta regression model performs well: among the models compared, it performs best and is significantly better than the traditional models. The proposed model is then applied to medical expenditure data to explore the factors that determine the share of medical expenditure in patients' overall expenditure. Marital status, age of householder, income, and the numbers of inpatients and outpatients are found to be significant factors for the proportion of medical expenditure in overall expenditure.
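The quantities named in the abstract can be sketched in a few lines. The example below assumes the mean-precision parameterization of Ferrari and Cribari-Neto, in which y follows Beta(mu*phi, (1-mu)*phi), and a logit link, which is an assumption here because the abstract does not name the link function. The smooth terms are passed in as ready-made functions; the local scoring step that estimates them is not shown. All function names are hypothetical and chosen only for illustration.

```python
import math

def inv_logit(eta):
    """Map a predictor on the real line to a mean in (0, 1) (logit link assumed)."""
    return 1.0 / (1.0 + math.exp(-eta))

def beta_logpdf(y, mu, phi):
    """Log-density of Beta(mu*phi, (1-mu)*phi) at y in (0, 1)."""
    a, b = mu * phi, (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))

def additive_predictor(x, z, beta, smooth_fns):
    """Parametric part x'beta plus the sum of smooth terms f_j(z_j)."""
    linear = sum(b_k * x_k for b_k, x_k in zip(beta, x))
    smooth = sum(f_j(z_j) for f_j, z_j in zip(smooth_fns, z))
    return linear + smooth

def log_likelihood(ys, xs, zs, beta, smooth_fns, phi):
    """Sum of beta log-densities over all observations."""
    total = 0.0
    for y, x, z in zip(ys, xs, zs):
        mu = inv_logit(additive_predictor(x, z, beta, smooth_fns))
        total += beta_logpdf(y, mu, phi)
    return total

def aic(loglik, effective_df):
    """AIC = -2 * log-likelihood + 2 * (effective) degrees of freedom."""
    return -2.0 * loglik + 2.0 * effective_df

# Toy usage with made-up data: one intercept, one linear covariate,
# and one smooth term, with sin used as a placeholder for a fitted smoother.
ys = [0.20, 0.55, 0.80]
xs = [[1.0, 0.1], [1.0, 0.4], [1.0, 0.9]]
zs = [[0.0], [1.0], [2.0]]
ll = log_likelihood(ys, xs, zs, beta=[-0.5, 1.0], smooth_fns=[math.sin], phi=10.0)
print(aic(ll, effective_df=4.0))
```

In a smoothing-parameter search, each candidate smoother would produce its own log-likelihood and effective degrees of freedom, and the candidate with the smallest AIC would be kept.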
[1] Ferrari S, Cribari-Neto F. Beta regression for modelling rates and proportions[J]. Journal of Applied Statistics, 2004, 31(7):799-815.
[2] Branscum A J, Johnson W O, Thurmond M C. Bayesian beta regression: Applications to household expenditure data and genetic distance between foot-and-mouth disease viruses[J]. Australian & New Zealand Journal of Statistics, 2007, 49(3):287-301.
[3] Ospina R, Cribari-Neto F, Vasconcellos K L P. Improved point and interval estimation for a beta regression model[J]. Computational Statistics & Data Analysis, 2006, 51(2):960-981.
[4] Simas A B, Barreto-Souza W, Rocha A V. Improved estimators for a general class of beta regression models[J]. Computational Statistics & Data Analysis, 2010, 54(2):348-366.
[5] Ospina R, Ferrari S. A general class of zero-or-one inflated beta regression models[J]. Computational Statistics and Data Analysis, 2012, 56(6):1609-1623.
[6] Cook D O, Kieschnick R, McCullough B D. Regression analysis of proportions in finance with self-selection[J]. Journal of Empirical Finance, 2008, 15(5):860-867.
[7] Pereira G H A, Botter D A, Sandoval M C. The truncated inflated beta distribution[J]. Communications in Statistics-Theory and Methods, 2012, 41(5):907-919.
[8] Pereira T L, Cribari-Neto F. Detecting model misspecification in inflated beta regressions[J]. Communications in Statistics-Simulation and Computation, 2014, 43(3):631-656.
[9] Espinheira P L, Ferrari S L P, Cribari-Neto F. Influence diagnostics in beta regression[J]. Computational Statistics and Data Analysis, 2008, 52(9):4417-4431.
[10] Chien L C. Diagnostic plots in beta-regression models[J]. Journal of Applied Statistics, 2011, 38(8):1607-1622.
[11] Anholeto T, Sandoval D A, Botter D A. Adjusted Pearson residuals in beta regression models[J]. Journal of Statistical Computation and Simulation, 2014, 84(5):999-1014.
[12] Hastie T, Tibshirani R. Generalized additive models[J]. Statistical Science, 1986, 1(3):297-310.
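[13] Shao Zhen, Yang Shanlin, Gao Fei, et al. A semi-parametric forecasting model for medium-term electricity consumption based on variable interval weights[J]. Chinese Journal of Management Science, 2015, 23(3):123-129. (in Chinese)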
[14] Fang Kuangnan, Jiang Yefei, Shia B, et al. Impact of illness and medical expenditure on household consumptions: A survey in western China[J]. PLoS ONE, 2012, 7(12):1-8.
[15] Smith J P. Healthy bodies and thick wallets: The dual relation between health and economic status[J]. Journal of Economic Perspectives, 1999, 13(2):145-166.
[16] Himmelstein D U, Warren E, Thorne D, et al. Market watch: Illness and injury as contributors to bankruptcy[J]. Health Affairs, 2006, 25(5):84-88.
[17] Dercon S, Krishnan P. In sickness and in health: Risk sharing within households in rural Ethiopia[J]. Journal of Political Economy, 2000, 108(4):688-724.
[18] Gertler P, Gruber J. Insuring consumption against illness[J]. American Economic Review, 2002, 92(1):51-76.
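[19] Qi Liangshu. A study of the poverty-reduction, income-raising, and redistribution effects of the New Rural Cooperative Medical Scheme[J]. The Journal of Quantitative & Technical Economics, 2011, (8):35-52. (in Chinese)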
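[20] Qu Weihua, Yan Zhijun. An analysis of the effects of environmental pollution, economic growth, and health care services on public health: A study based on Chinese provincial panel data[J]. Chinese Journal of Management Science, 2015, 23(7):166-176. (in Chinese)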