ABSTRACT

In the analysis of regression models for censored survival data, one often wishes to assess the importance of prognostic factors such as age, gender, or race in predicting survival outcome. This problem arises in most clinical trials research in cancer and AIDS. Typically, proportional hazards regression models based on Cox's partial likelihood (Cox, 1975) are used to address it. Current variable selection techniques include asymptotic procedures based on score tests, Wald tests, and other approximate chi-square procedures. These procedures rely on Cox's partial likelihood and therefore do not use the full likelihood function for variable selection. As is well known, Cox's partial likelihood is an approximation to the full likelihood. Frequentist variable selection based on the full likelihood requires joint estimation of the baseline hazard and the regression coefficients, which in turn requires either a nonparametric estimate of the baseline hazard rate or a fully parametric specification of the survival model. Again, one must rely on asymptotics to obtain variable selection criteria. Joint estimation of the baseline hazard and the regression coefficients can be very difficult, and we have not seen any methods in the statistical literature that address it in the variable selection context.