Background

When working with survival data it is important to remember that the effects may change in time. This is rarely something you notice in datasets of a few hundreds but as the datasets grow larger the chances of this happening increase. When doing a simple Cox regression with the survival package this can be easily checked for using the cox.zph function.

The problem with non-proportional hazards

The main effect of non-proportional hazards is that you will have a mean estimate over the study time. This mean does not distribute evenly throughout the studied time period but evenly according to the observed time, i.e. if you have the longest observation period being 20 years while > 50% of your patients have a follow-up of less than 10 years the effect. Note that this may further depend on how the events are distributed. Thus it is useful to be able to deal with non-proportional hazards.

Using strata

The Cox model allows you to estimate effects in different strata and then average them together. If your confounder that breaks the proportionality assumption is a non-continuous variable this is a convenient method that allows you to set-up the necessary strata. With one variable this is simple, Surv(time, event) ~ x_1 + x_2 + strata(x_np), but with multiple variables you risk of ending up with small strata and the non-informative error:

attempt to select less than one element

Note that you should not multiple strata but combine the variables, e.g. if you have two variables called x_np1 and x_np2 you would set up your model with the strata as strata(x_np1, x_np2). The strata is then handled as interactions generating nlevels(x_np1) * nlevels(x_np2) which seems also to be the core reason for why this fails.

Using the tt argument

The survival package has an excellent vignette om time-dependent variables and time-dependent coefficients, see vignette("timedep", package = "survival"). Prof. Therneau explains there common pitfalls and how to use a time-transforming option provided by the coxph function through the tt argument. It is a neat an simple solution that transforms those variables that you have indicated for transformation using the tt function. The vignette provides some simple approaches but it also allows for a rather sophisticated use:

library(survival)
coxph(Surv(time, event) ~ age + sex +
type_of_surgery + tt(type_of_surgery) +
tt(surgery_length),
data = my_surgical_data,
tt = list(
function(type_of_surgery, time, ...){
# As type_of_surgery is a categorical variable
# we must transform it into a design matrix that
# we can use for multiplication with time
# Note: the [,-1] is for dropping the intercept
mtrx <- model.matrix(~type_of_surgery)[,-1]
mtrx * time
},
function(surgery_length, time, ...){
# Note that the surgery_length and the time
# should probably have similar ranges. If you
# report operating time in minutes, follow-up
# in years the t will be dwarfed by the
pspline(surgery_length + time)
}
))

The main problem is that it doesn’t scale all that well to larger datasets. A common error unless you have a large amount of memory is:

Could not allocate vector of *** MB

Using the timeSplitter

The tt approach is based upon the idea of time splitting. Time splitting is possible since the Cox proportional hazards model studies the derivative of the survival function, i.e. the hazard, and thus doesn’t care how many observations were present before the current derivative. This allows including patients/observations after 2 years by ignoring them prior to 2 years. The method is referred to as interval time where the Surv(time, event) simply changes into Surv(Start_time, Stop_time, event).

This allows us to adjust for time interactions as the Start_time is independent of the event. In our standard setting the Start_time is always 0 but if we split an observation into multiple time steps and use the interval time the variable will become meaningful in an interaction setting. Note that [!ref] suggested that one uses the End_time after splitting the observation time, while I’ve found that it in practice doesn’t matter that much - it makes intuitive sense to use the Start_time if we make the time interval too large the End_time will convey information about the event status and thereby by nature become significant. The approach explained in more detail below.

Poor-mans time split

An alternative is to split the data into a few intervals, select one interval at the time and perform separate models on each. This will result in multiple estimates per variable, poorer statistical power and most likely an incomplete handling of the non-proportional hazards. Despite these downsides it may still be a viable solution when presenting to a less statistically savvy audience in order to gain acceptance for above methods. The timeSplitter can help generating the datasets necessary, here’s a convenient example using dplyr:

library(plyr)
models <-
timeSplitter(your_data,
by = 4,
event_var = "status",
time_var = "years",
event_start_status = "alive",
time_related_vars = c("age", "date")) |>
dlply("Start_time",
function(data){
coxph(Surv(Start_time, End_time, status) ~ age + sex + treatment, data = data)
})

Other options

These approaches are probably just a subset of possible approaches. I know of the timereg package that has some very fancy time coefficient handling. My personal experience with the package is limited and I’ve been discouraged by the visually less appealing graphs provided in the examples and there isn’t a proper vignette explaining the details (Last checked v 1.8.9). Similarly the flexsurv should be able to deal with the proportional hazards assumption. If you know any other then please send me an e-mail.

How the timeSplitter works

First we generate a short survival dataset with 4 observations.

library(tidyverse)
library(Greg)

test_data <- tibble(id = 1:4,
time = c(4, 3.5, 1, 5),
age = c(62.2, 55.3, 73.7, 46.3),
date = as.Date(
c("2003-01-01",
"2010-04-01",
"2013-09-20",
"2002-02-23")),
stringsAsFactors = TRUE) |>
mutate(time_str = sprintf("0 to %.1f", time))

Using some grid-graphics we can illustrate the dataset graphically:

Now we apply a split that splits the data into 2 year chunks. Note: 2 years as in this example is probably not optimal, only chosen in order to make it easier to display.

library(dplyr)
split_data <- test_data |>
select(id, event, time, age, date) |>
timeSplitter(by = 2, # The time that we want to split by
event_var = "event",
time_var = "time",
event_start_status = "alive",
time_related_vars = c("age", "date"))

knitr::kable(head(split_data, 10))
id event age date Start_time Stop_time
1 alive 62.2 2002.999 0 2.0
1 censored 64.2 2004.999 2 4.0
2 alive 55.3 2010.246 0 2.0
2 dead 57.3 2012.246 2 3.5
3 alive 73.7 2013.718 0 1.0
4 alive 46.3 2002.145 0 2.0
4 alive 48.3 2004.145 2 4.0
4 dead 50.3 2006.145 4 5.0

Now if we plot each individual’s interval times below the original see multiple observation times where only the last observation time is related to the actual event. All prior are assumed to have unchanged event status from the original status.

A real example

I haven’t found any good datasets with non-proportional hazards but the melanoma dataset is largish and allows some exploration.

# First we start with loading the dataset
data("melanoma", package = "boot")

# Then we munge it according to ?boot::melanoma
melanoma <- mutate(melanoma,
status = factor(status,
levels = 1:3,
labels = c("Died from melanoma",
"Alive",
"Died from other causes")),
ulcer = factor(ulcer,
levels = 0:1,
labels = c("Absent", "Present")),
time = time/365.25, # All variables should be in the same time unit
sex = factor(sex,
levels = 0:1,
labels = c("Female", "Male")))

Now we can fit a regular cox regression:

library(survival)
regular_model <- coxph(Surv(time, status == "Died from melanoma") ~
age + sex + year + thickness + ulcer,
data = melanoma,
x = TRUE, y = TRUE)
summary(regular_model)
## Call:
## coxph(formula = Surv(time, status == "Died from melanoma") ~
##     age + sex + year + thickness + ulcer, data = melanoma, x = TRUE,
##     y = TRUE)
##
##   n= 205, number of events= 57
##
##                   coef exp(coef)  se(coef)      z Pr(>|z|)
## age           0.016805  1.016947  0.008578  1.959 0.050094 .
## sexMale       0.448121  1.565368  0.266861  1.679 0.093107 .
## year         -0.102566  0.902518  0.061007 -1.681 0.092719 .
## thickness     0.100312  1.105516  0.038212  2.625 0.008660 **
## ulcerPresent  1.194555  3.302087  0.309254  3.863 0.000112 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##              exp(coef) exp(-coef) lower .95 upper .95
## age             1.0169     0.9833    1.0000     1.034
## sexMale         1.5654     0.6388    0.9278     2.641
## year            0.9025     1.1080    0.8008     1.017
## thickness       1.1055     0.9046    1.0257     1.191
## ulcerPresent    3.3021     0.3028    1.8012     6.054
##
## Concordance= 0.757  (se = 0.031 )
## Likelihood ratio test= 44.4  on 5 df,   p=2e-08
## Wald test            = 40.89  on 5 df,   p=1e-07
## Score (logrank) test = 48.14  on 5 df,   p=3e-09

If we do the same with a split dataset:

spl_melanoma <-
melanoma |>
timeSplitter(by = .5,
event_var = "status",
event_start_status = "Alive",
time_var = "time",
time_related_vars = c("age", "year"))

interval_model <-
update(regular_model,
Surv(Start_time, Stop_time, status == "Died from melanoma") ~ .,
data = spl_melanoma)

summary(interval_model)
## Call:
## coxph(formula = Surv(Start_time, Stop_time, status == "Died from melanoma") ~
##     age + sex + year + thickness + ulcer, data = spl_melanoma,
##     x = TRUE, y = TRUE)
##
##   n= 2522, number of events= 57
##
##                   coef exp(coef)  se(coef)      z Pr(>|z|)
## age           0.016805  1.016947  0.008578  1.959 0.050094 .
## sexMale       0.448121  1.565368  0.266861  1.679 0.093107 .
## year         -0.102566  0.902518  0.061007 -1.681 0.092719 .
## thickness     0.100312  1.105516  0.038212  2.625 0.008660 **
## ulcerPresent  1.194555  3.302087  0.309254  3.863 0.000112 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##              exp(coef) exp(-coef) lower .95 upper .95
## age             1.0169     0.9833    1.0000     1.034
## sexMale         1.5654     0.6388    0.9278     2.641
## year            0.9025     1.1080    0.8008     1.017
## thickness       1.1055     0.9046    1.0257     1.191
## ulcerPresent    3.3021     0.3028    1.8012     6.054
##
## Concordance= 0.757  (se = 0.03 )
## Likelihood ratio test= 44.4  on 5 df,   p=2e-08
## Wald test            = 40.89  on 5 df,   p=1e-07
## Score (logrank) test = 48.14  on 5 df,   p=3e-09

As you can see the difference between the models is negligible:

library(htmlTable)
cbind(Regular = coef(regular_model),
Interval = coef(interval_model),
Difference = coef(regular_model) - coef(interval_model)) |>
txtRound(digits = 5) |>
knitr::kable(align = "r")
Regular Interval Difference
age 0.01681 0.01681 0.00000
sexMale 0.44812 0.44812 0.00000
year -0.10257 -0.10257 0.00000
thickness 0.10031 0.10031 0.00000
ulcerPresent 1.19455 1.19455 0.00000

Now we can look for time varying coefficients using the survival::cox.zph() function:

cox.zph(regular_model) |>
purrr::pluck("table") |>
txtRound(digits = 2) |>
knitr::kable(align = "r")
chisq df p
age 2.51 1.00 0.11
sex 1.50 1.00 0.22
year 1.26 1.00 0.26
thickness 4.40 1.00 0.04
ulcer 3.26 1.00 0.07
GLOBAL 9.97 5.00 0.08

The two variable that give a hint of time variation are age and thickness. It seems reasonable that melanoma thickness is less important as time increases, either the tumor was adequately removed or there was some remaining piece that caused the patient to die within a few years. We will therefore add a time interaction using the : variant (note using the * for interactions gives a separate variable for the time and that is not of interest in this case):

time_int_model <-
update(interval_model,
.~.+thickness:Start_time)
summary(time_int_model)
## Call:
## coxph(formula = Surv(Start_time, Stop_time, status == "Died from melanoma") ~
##     age + sex + year + thickness + ulcer + thickness:Start_time,
##     data = spl_melanoma, x = TRUE, y = TRUE)
##
##   n= 2522, number of events= 57
##
##                           coef exp(coef)  se(coef)      z Pr(>|z|)
## age                   0.014126  1.014226  0.008591  1.644 0.100115
## sexMale               0.511881  1.668427  0.268960  1.903 0.057016 .
## year                 -0.098459  0.906233  0.061241 -1.608 0.107896
## thickness             0.209025  1.232476  0.061820  3.381 0.000722 ***
## ulcerPresent          1.286214  3.619060  0.313838  4.098 4.16e-05 ***
## thickness:Start_time -0.045630  0.955395  0.022909 -1.992 0.046388 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##                      exp(coef) exp(-coef) lower .95 upper .95
## age                     1.0142     0.9860    0.9973    1.0314
## sexMale                 1.6684     0.5994    0.9848    2.8265
## year                    0.9062     1.1035    0.8037    1.0218
## thickness               1.2325     0.8114    1.0918    1.3912
## ulcerPresent            3.6191     0.2763    1.9564    6.6948
## thickness:Start_time    0.9554     1.0467    0.9134    0.9993
##
## Concordance= 0.762  (se = 0.03 )
## Likelihood ratio test= 48.96  on 6 df,   p=8e-09
## Wald test            = 45.28  on 6 df,   p=4e-08
## Score (logrank) test = 56.29  on 6 df,   p=3e-10

As suspected the thickness effect is reduced with time. A linear model is hard to explain from a biological standpoint, we may want to see if we can detect if the interaction follows a non-linear trajectory by adding a quadratic variable:

# First we need to manually add an interaction term
spl_melanoma <- mutate(spl_melanoma,
thickness_start = thickness * Start_time)
anova(time_int_model,
update(time_int_model, .~.+I(thickness_start^2)))
## Analysis of Deviance Table
##  Cox model: response is  Surv(Start_time, Stop_time, status == "Died from melanoma")
##  Model 1: ~ age + sex + year + thickness + ulcer + thickness:Start_time
##  Model 2: ~ age + sex + year + thickness + ulcer + I(thickness_start^2) + thickness:Start_time
##    loglik  Chisq Df Pr(>|Chi|)
## 1 -258.72
## 2 -258.51 0.4178  1      0.518

As you can see this doesn’t support that the variable is non-linear. An alternative would be to use the survival::pspline method:

update(time_int_model, .~.-thickness:Start_time+pspline(thickness_start))
## Call:
## coxph(formula = Surv(Start_time, Stop_time, status == "Died from melanoma") ~
##     age + sex + year + thickness + ulcer + pspline(thickness_start),
##     data = spl_melanoma, x = TRUE, y = TRUE)
##
##                               coef se(coef)      se2    Chisq   DF       p
## age                        0.01291  0.00857  0.00857  2.26993 1.00 0.13191
## sexMale                    0.49378  0.27138  0.27125  3.31055 1.00 0.06884
## year                      -0.09235  0.06070  0.06068  2.31457 1.00 0.12817
## thickness                  0.15334  0.07679  0.07525  3.98710 1.00 0.04585
## ulcerPresent               1.10814  0.32615  0.32381 11.54370 1.00 0.00068
## pspline(thickness_start), -0.04236  0.02321  0.02314  3.32997 1.00 0.06803
## pspline(thickness_start),                             4.19818 2.99 0.23948
##
## Iterations: 4 outer, 14 Newton-Raphson
##      Theta= 0.107
## Degrees of freedom for terms= 1 1 1 1 1 4
## Likelihood ratio test=54  on 8.93 df, p=2e-08
## n= 2522, number of events= 57

If you are only investigating confounders that you want to adjust for we are done. If you actually want to convey the results to your readers then we need to think about how to display the interaction, especially if they turn out to follow a non-linear pattern. If you have two continuous variables I you have basically two options, go with a 3-dimensional graph that where confidence interval are hard to illustrate or categorize one of the variables and use a regular 2-dimensional graph. I usually go for the latter:

# Lets create an evenly distributed categorical thickness variable
# and interactions
spl_melanoma <- mutate(spl_melanoma,
thickness_cat = cut(thickness,
breaks = c(0, 1, 5, Inf),
labels = c("less than 1.0",
"1.0 to 4.9",
"at least 5.0")))
# Now create interaction variables
for (l in levels(spl_melanoma$thickness_cat)[-1]) { spl_melanoma[[sprintf("thickness_%s_time", gsub(" ", "_", l))]] <- (spl_melanoma$thickness_cat == l)*spl_melanoma$Start_time } # Now for the model specification where we use a # pspline for the two interaction variables adv_int_model <- coxph(Surv(Start_time, Stop_time, status == "Died from melanoma") ~ age + sex + year + ulcer + thickness_cat + pspline(thickness_1.0_to_4.9_time) + pspline(thickness_at_least_5.0_time), data = spl_melanoma, x = TRUE, y = TRUE, iter.max = 1000) # To get the estimates we use the predict function new_data <- data.frame(thickness_cat = rep(levels(spl_melanoma$thickness_cat)[-1],
each = 100),
Start_time = 2^seq(-3, 3, length.out = 100),
stringsAsFactors = FALSE) |>
mutate(thickness_1.0_to_4.9_time = (thickness_cat == levels(spl_melanoma$thickness_cat)[2]) * Start_time, thickness_at_least_5.0_time = (thickness_cat == levels(spl_melanoma$thickness_cat)[3]) *
Start_time)
new_data$sex = "Female" new_data$age = median(melanoma$age) new_data$year = median(melanoma$year) new_data$ulcer = "Absent"

newdata = new_data,
type = "terms",
terms = c("thickness_cat",
"pspline(thickness_1.0_to_4.9_time)",
"pspline(thickness_at_least_5.0_time)"),
se.fit = TRUE)

new_data$fit <- rowSums(adv_pred$fit)
new_data$se.fit <- apply(adv_pred$se.fit, 1, function(x) x^2) |>
colSums() |>
sqrt()
new_data <- mutate(new_data,
risk = exp(fit),
upper = exp(fit + 1.96*se.fit),
lower = exp(fit - 1.96*se.fit))
library(ggplot2)
new_data |>
mutate(adapted_risk = sapply(risk, function(x) min(max(2^-4, x), 2^5)),
adapted_upper = sapply(upper, function(x) min(max(2^-4, x), 2^5)),
adapted_lower = sapply(lower, function(x) min(max(2^-4, x), 2^5))) |>
x = new_data\$Start_time,
group = thickness_cat,
col = thickness_cat,
fill = thickness_cat)) +
# The confidence intervals are too wide to display in this case
geom_line() +
scale_x_log10(breaks = 2^(-3:4),
limits = c(2^-3, 8),
expand = c(0, 0)) +
scale_y_log10(breaks = 2^(-4:5),
labels = txtRound(2^(-4:5), 2),
expand = c(0,0)) +
scale_color_brewer(type = "qual", guide = guide_legend(title = "Thickness (mm)")) +
ylab("Hazard ratio") +
xlab("Time (years)") +
theme_bw()

A few tips and notes

Drop unnecessary variables

The main problem is the memory usage with both the tt and the timeSplitter approach. Make therefore sure to drop all variables that you won’t be using before doing your regression. I’ve found that dropping variables not only limits the risk of running out of memory but also considerably speeds up the regressions.

Longer interval lengths will reduce the size of the split dataset but will increase the risk of residual non-proportional hazards. When I consulted a statistician on a dataset containing followup 0 to 21 years, she recommended that I have ½ year intervals. I think this was slightly overdoing it, I guess an alternative would have been to simply redo the cox.zph call in order to see how well the new model takes care of the non-proportionality problem.
Explaining the time coefficient can be demanding. I often rely on the rms::contrast function but this can be tricky since the Start_time can confuse the contrast function.
Warning: don’t use the I() option
Just to be crystal clear, the I() option should never be used. It will provide spuriously low p-values and doesn’t solve the non-proportionality issue. See Therneau’s vignette for more on this.