Proportional hazards model

Proportional hazards models are a class of survival models in statistics. Survival models relate the time that passes, before some event occurs, to one or more covariates that may be associated with that quantity of time. In a proportional hazards model, the unique effect of a unit increase in a covariate is multiplicative with respect to the hazard rate. For example, taking a drug may halve one's hazard rate for a stroke occurring, or, changing the material from which a manufactured component is constructed may double its hazard rate for failure. Other types of survival models such as accelerated failure time models do not exhibit proportional hazards. The accelerated failure time model describes a situation where the biological or mechanical life history of an event is accelerated.

Background

Survival models can be viewed as consisting of two parts: the underlying baseline hazard function, often denoted, describing how the risk of event per time unit changes over time at baseline levels of covariates; and the effect parameters, describing how the hazard varies in response to explanatory covariates. A typical medical example would include covariates such as treatment assignment, as well as patient characteristics such as age at start of study, gender, and the presence of other diseases at start of study, in order to reduce variability and/or control for confounding.
The proportional hazards condition states that covariates are multiplicatively related to the hazard. In the simplest case of stationary coefficients, for example, a treatment with a drug may, say, halve a subject's hazard at any given time, while the baseline hazard may vary. Note however, that this does not double the lifetime of the subject; the precise effect of the covariates on the lifetime depends on the type of. The covariate is not restricted to binary predictors; in the case of a continuous covariate, it is typically assumed that the hazard responds exponentially; each unit increase in results in proportional scaling of the hazard.

The Cox model

The Cox partial likelihood, shown below, is obtained by using Breslow's estimate of the baseline hazard function, plugging it into the full likelihood and then observing that the result is a product of two factors. The first factor is the partial likelihood shown below, in which the baseline hazard has "canceled out". The second factor is free of the regression coefficients and depends on the data only through the censoring pattern. The effect of covariates estimated by any proportional hazards model can thus be reported as hazard ratios.
Sir David Cox observed that if the proportional hazards assumption holds then it is possible to estimate the effect parameter without any consideration of the hazard function. This approach to survival data is called application of the Cox proportional hazards model, sometimes abbreviated to Cox model or to proportional hazards model. However, Cox also noted that biological interpretation of the proportional hazards assumption can be quite tricky.
Let be the realized values of the covariates for subject i. The hazard function for the Cox proportional hazards model has the form
This expression gives the hazard function at time t for subject i with covariate vector X_i.
The likelihood of the event to be observed occurring for subject i at time Y_i can be written as:
where ) and the summation is over the set of subjects j where the event has not occurred before time Y_i. Obviously 0 < L_i ≤ 1. This is a partial likelihood: the effect of the covariates can be estimated without the need to model the change of the hazard over time.
Treating the subjects as if they were statistically independent of each other, the joint probability of all realized events is the following partial likelihood, where the occurrence of the event is indicated by C_i = 1:
The corresponding log partial likelihood is
This function can be maximized over β to produce maximum partial likelihood estimates of the model parameters.
The partial score function is
and the Hessian matrix of the partial log likelihood is
Using this score function and Hessian matrix, the partial likelihood can be maximized using the Newton-Raphson algorithm. The inverse of the Hessian matrix, evaluated at the estimate of β, can be used as an approximate variance-covariance matrix for the estimate, and used to produce approximate standard errors for the regression coefficients.

Tied times

Several approaches have been proposed to handle situations in which there are ties in the time data. Breslow's method describes the approach in which the procedure described above is used unmodified, even when ties are present. An alternative approach that is considered to give better results is Efron's method. Let t_j denote the unique times, let H_j denote the set of indices i such that Y_i = t_j and C_i = 1, and let m_j = |H_j|. Efron's approach maximizes the following partial likelihood.
The corresponding log partial likelihood is
the score function is
and the Hessian matrix is
where
Note that when H_j is empty, the summands in these expressions are treated as zero.

Time-varying predictors and coefficients

Extensions to time dependent variables, time dependent strata, and multiple events per subject, can be incorporated by the counting process formulation of Andersen and Gill. One example of the use of hazard models with time-varying regressors is estimating the effect of unemployment insurance on unemployment spells.
In addition to allowing time-varying covariates, the Cox model may be generalized to time-varying coefficients as well. That is, the proportional effect of a treatment may vary with time; e.g. a drug may be very effective if administered within one month of morbidity, and become less effective as time goes on. The hypothesis of no change with time of the coefficient may then be tested. Details and software are available in Martinussen and Scheike. The application of the Cox model with time-varying covariates is considered in reliability mathematics.
In this context, it could also be mentioned that it is theoretically possible to specify the effect of covariates by using additive hazards, i.e. specifying
If such additive hazards models are used in situations where likelihood maximization is the objective, care must be taken to restrict to non-negative values. Perhaps as a result of this complication, such models are seldom seen. If the objective is instead least squares the non-negativity restriction is not strictly required.

Specifying the baseline hazard function

The Cox model may be specialized if a reason exists to assume that the baseline hazard follows a particular form. In this case, the baseline hazard is replaced by a given function. For example, assuming the hazard function to be the Weibull hazard function gives the Weibull proportional hazards model.
Incidentally, using the Weibull baseline hazard is the only circumstance under which the model satisfies both the proportional hazards, and accelerated failure time models.
The generic term parametric proportional hazards models can be used to describe proportional hazards models in which the hazard function is specified. The Cox proportional hazards model is sometimes called a semiparametric model by contrast.
Some authors use the term Cox proportional hazards model even when specifying the underlying hazard function, to acknowledge the debt of the entire field to David Cox.
The term Cox regression model is sometimes used to describe the extension of the Cox model to include time-dependent factors. However, this usage is potentially ambiguous since the Cox proportional hazards model can itself be described as a regression model.

Relationship to Poisson models

There is a relationship between proportional hazards models and Poisson regression models which is sometimes used to fit approximate proportional hazards models in software for Poisson regression. The usual reason for doing this is that calculation is much quicker. This was more important in the days of slower computers but can still be useful for particularly large data sets or complex problems. Laird and Olivier provide the mathematical details. They note, "we do not assume is true, but simply use it as a device for deriving the likelihood." McCullagh and Nelder's book on generalized linear models has a chapter on converting proportional hazards models to generalized linear models.

Under high-dimensional setup

In high-dimension, when number of covariates p is large compared to the sample size n, the LASSO method is one of the classical model-selection strategies. Tibshirani has proposed a Lasso procedure for the proportional hazard regression parameter. The Lasso estimator of the regression parameter β is defined as the minimizer of the opposite of the Cox partial log-likelihood under an L¹-norm type constraint.
There has been theoretical progress on this topic recently.

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...