Non-parametric estimation of sur vival probabilities with a time-dependent exposure switch: application to (simulated) heart transplant data

Background: To summarize the survival experience of patients waiting for heart transplant and to compare it with the post-transplant survival it is not possible to use the Kaplan-Meier estimator considering the intervention status as fixed in time because of the well known “immortal time bias” issue. Methods: We reviewed and applied to a simulated dataset the available methods to perform a non-parametric analysis accounting for the time-varying nature of the transplant status. Specifically we considered the Simon-Makuch estimator and the recently proposed “clock-back” estimator. Results: We showed that the Simon-Makuch estimator for the survival of patients on list is unbiased but the corresponding estimator of the post-transplant survival is not reliable for non-markov contexts like the one considered. Instead, if the semi-Markov assumption could be postulated (the post-transplant mortality depends mainly on the time since transplant and not on the waiting time to the intervention), the "clock-back" estimator produces valid results. Conclusion: We enlightened the importance of testing the process memory assumptions (e.g. Markov properties) in order to choose the approach more reliable. Moreover, we recommend the use of the Simon-Makuch method to study the survival of patients before the intervention and the use of the "clock back" estimator for the post-intervention survival in semi-markovian contexts.


INTRODUCTION
Heart transplant (HTx) is often the only treatment option for patients affected by end-stage cardiomyopathy with a short life expectancy.Due to the large gap between demand and supply, selected candidates are added on a waiting list until a compatible functioning organ becomes available.In this context, an interesting issue could be to estimate the expected survival outcome while waiting on list and that after receiving HTx.Besides the impact of clinical and demographic characteristics, the Kaplan-Meier (KM) estimator is usually employed to summarize the overall survival experience of a homogeneous cohort [1,2].In this case, however, the application of such method is complicated by the time-dependent nature of the treatment variable: all patients are HTx-free at the beginning (i.e. at the time of entry in the list) and they may change their treatment status after some waiting time.As a consequence, an analysis that considers a time-fixed indicator to classify subjects as transplanted or not would be affected by the well-known "time-dependent bias" [3].In fact, the time to death for a HTx recipient is certainly greater than his/her waiting time to HTx ("immortal time bias") while subjects in list with short survival times have more chances to be included in the non-transplanted group since they may not survive long enough to receive HTx ("mortal time bias") [4].Apparently, the problem could be tackled from the multi-state models point of view [8].Patients are involved in a three-state process (Figure 1) where, from the initial state 0 (being on the waiting list), one can possibly move to the final state 2 (death) either directly or by transiting through to the intermediate state 1 (HTx).The multi-state models framework, which represents an extension of the competing risks methodology, allows to predict the probability of each state occupation in time, using some model based estimate of the transition intensities plugged into the Aalen-Johansen estimator for the crude incidence [5].Results provide a nice description of the process but since it is not possible to get rid of the dynamic proportion of patients receiving HTx in time, the method does not allow to directly see the impact of HTx on survival.Historically, the first attempt to address this issue was the "landmark" method which was proposed by Anderson et al. [6] to compare responders and non-responders to some treatment in a situation where patients may achieve the response after some time since therapy start.The idea, with reference to our example, is to set a fixed time point, called "landmark", and classify patient according to their HTx status at that time point.Avoidance of the timedependent bias is counteracted by several drawbacks.First, events and censored observations before landmark are excluded from the analysis causing a loss of efficiency and introducing some selection bias.Starting the curves from landmark has an impact also on the interpretation of the theoretical quantities they represent; in fact, such estimates are now related to a survival probability which is conditional on being failure free at landmark.Moreover, the resulting curves may strongly depend on the choice of the landmark point which generally has to be set based on a trade-off between clinical meaning and empirical motivation (i.e.ensuring a decent amount of patients at risk within groups).Besides these limits, the method has been extensively applied in the cardiovascular literature [7].
An extension of this method was proposed by Simon and Makuch [9] consisting in using a time-updated indicator of treatment status.In practice a patient contributes to the hazard estimation involved in the KM formula for the no HTx group but her/his observation is censored as she/ he changes treatment condition.Hereafter, the survival contribution of this patient is attributed to HTx and left truncation is adopted to account for late entry in the new "at risk" group.The landmark point is still needed only in situations where there are no (or few) patients in the new treatment group at the beginning but can be set as early as possible compatibly with the group size.In our context of interest it happens that for some patient the new organ is immediately available thus allowing for a non-empty HTx group even at time zero.The interpretation of the theoretical quantities behind this estimator has been debated [10].In a recent paper [11] we showed, also through a simulation protocol, that: the method always provides unbiased estimates of the survival probability while waiting in list; concerning HTx group, however, results can be properly interpreted only in situations where the Markov assumption is reasonable.Moreover, we proposed a different approach based on the "clock-back" time scale that is valid under the semi-Markov assumption which is more plausible in many contexts including the case of HTx.The aim of this work is to review the KM estimators according to the Simon-Makuch and clock-back versions and to apply these methods to a simulated data set inspired by real data on HTx.Non-parametric estimation of survival probabilities with time-dependent exposure switch application to (simulated) heart transplant data

Notation, assumptions and data simulation
Figure 1 represents the three-state event process on which the motivating example is based.The random variables involved in the process are denoted as follows: • T 02 : potential time from entry in the list to death while still on list; • T 01 : potential time from entry in the list to HTx; • T 12 : potential time from HTx to death; • T 012 = T 01 + T 12 : potential time from entry in the list to death after HTx; • X = I(T 01 ≤ T 02 ): indicator of the HTx status so that T 02 is observable only when X = 0 while T 01 , T 0 and T 12 are observable only when X = 1; For each patient i = 1;...; n, we observe the following data: • t i : time from entry in the list to death or right censoring either without or after HTx; • ∂ i : death (∂ i = 1) or censoring (∂ i = 0) indicator; • ℇ i : indicator of the HTx status (ℇ i = 1 if the patient is transplanted and ℇ i = 0 otherwise); • T 01 i : time from entry in the list to HTx, observed only if ℇ i = 1.The vector of observed data (t i ;∂ i ;ℇ i ;ℇ i .t 01 i ) is the sampling realization of (Z;∆;E;E .T 01 ).
Finally, we define: • t j ; j = 1; ...; J: ordered observed times from entry in the list; • s j ; j = 1; ...; J s : ordered observed times from HTx (only for patients with ℇ i = 1).Our goal is to obtain unbiased estimates of two quantities.The first is the survival probability of a patient on list as if she/he will never receive HTx: (1) Of note, this quantity is guided only by the hazard of dying while on list, dened as λ 02 (t) = lim ∆t→0 P(t < T 02 < t + ∆t|T 02 > t)/∆t.The second quantity is the survival probability of a patient that receives HTx immediately after entering on list: (2) The notation T 01 = 0 is used to indicate that the survival probability refers to a patient whose waiting time on list is nearly null.Of note, this quantity is guided only by the hazard of dying after HTx, which in general may also defined on the waiting time T 01 .For the case where T 01 = 0 it is dened as λ 12 (t; T 01 = 0) = lim ∆t→0 P(t < T 012 < t + ∆t|T 012 > t)/∆t.
With reference to the motivating example, we make the following assumptions: 1. C⊥T: this is the standard independent censoring assumption; 2. T 02 ⊥T 01 : among candidate patients to HTx with high priority we can assume that the time until HTx corresponds to the time to find a compatible donor and thus it does not depend on the potential time to death while waiting on list; 3. "semi-Markov property": we assume that the hazard after HTx depends only on T 12 and not on T 01 , in other words we assume that the time spent waiting on list does not affect the mortality after HTx.In fact, it is reasonable to think that the time to death after HTx depends largely on the intervention itself rather than on the length of the waiting period, even if the patient condition is changed during this period; Of note, the last assumption is different from the "Markov property" which would imply that post-transplant mortality depends on time since entry in list (T 012 ) but neither on the waiting time to HTx T 01 nor on the time after HTx T 12 .Although often implausible, as in this case, this is a common assumption in the multi-state models framework which is adopted to ease the development of proper estimators [12].
The analyzed data consists of 500 observations, simulated using the inversion method of Bender et al. [13] and according to the above assumptions.The potential times T 02 and T 12 were generated using a Weibull distribution with scale and shape parameters (0.8, 0.4) and (0.3, 0.5), respectively.The potential time T 01 was generated using an Exponential distribution with parameter (2).The theoretical hazard and survival functions corresponding to the potential times are shown in figure 2A-B.Finally, the censoring time was generated from a Uniform distribution with range (0.25,8).In the simulated scenario, the amount of observations with E = 1 (i.e.transplanted patients) was 298 (59.6%) while the total number of events (i.e.deaths) was 309.
The R code used for data generation and to perform the analyses presented is available as a supplementary file.

Kaplan-Meier estimator with time-fixed HTx indicator
The KM estimator of the survival while waiting on list that considers the HTx status as time-fixed and known from the beginning of the follow-up is defined as: (3) The hazard in ( 3) is generally a biased estimator of λ 02 (t) because it involves patients that at time t must satisfy the condition (Z ≤T 01 ), on top of conditions (T 02 > t) (which is the only required by λ 02 (t)) and (C > t) (which can be avoided due assumption 1.).Consequently, estimator (3) is biased with respect to (1).
The analogous KM estimator of the survival related to HTx is: The hazard in ( 4) is generally a biased estimator of λ 12 (t; T 01 = 0) because it involves patients that at time t must satisfy the condition (Z > T 01 ), on top of conditions (T 012 > t) (which is the only required by λ 12 (t; T 01 = 0)) and (C > t) (which can be avoided due to assumption 1.).Consequently, estimator ( 4) is biased with respect to (2).

Simon-Makuch estimator
The estimator of the survival while waiting on list according to the Simon and Makuch method is defined hereby: (5) In practice, all patients not yet transplanted until time t contribute to the estimation.Patients receiving HTx are censored at their time of transplantation t 01 i .The hazard in ( 5) is an unbiased estimator of = lim ∆t→0 P(t < T 02 < t + ∆ t |(T 02 > t) ∩ (T 01 > t) ∩ (C > t))/∆ t , but due to assumptions 1. and 2. the conditions (C > t) and (T 01 > t), respectively, can be avoided.
The estimator of the survival after an immediate HTx according to the Simon and Makuch method is the following: (6) In practice, as patients receive HTx they start to contribute to the estimation.Computationally, their time of transplantation t 01 i is used as left-truncation.The hazard in ( 6) is an unbiased estimator of = lim ∆t→0 P(t < T 012 < t + ∆ t |(T 012 > t) ∩ (T 01 ≤ t) ∩ (T 01 ≤ T 01 ) ∩ (C > t))/∆ t .
Conditions (C > t) and (T 01 T 02 ) can be avoided thanks to assumptions 1. and 2., respectively, but the equality = λ 12 (t; T 01 = 0) would be valid only under the Markov assumption (the hazard would not depend on T 01 and thus also condition (T 01 ≤ t) could be avoided).However, in a semi-markovian context like that considered here, estimator ( 6) is biased with respect to (2).

Clock-back estimator of survival after HTx
The estimator of the survival after an immediate HTx according to the "clock-back" approach is the following: (7) In practice, the observed death/censoring times of transplanted patients are re-scaled by subtracting their waiting time to HTx T 01 i .The hazard in ( 7) is an unbiased estimator of = lim ∆t→0 P(t < T 12 < t + ∆ t |(T 12 > t) ∩ (C > t))/∆ t As usual, condition (C > t) can be avoided thanks to assumption 1.Moreover, due to assumption 3., the hazard under HTx administered at time T 01 = 0 is equal to the hazard under HTx administered at any time T 01 .Consequently, = λ 12 (t; T 01 = 0) and estimator ( 7) is unbiased with respect to (2).

Check the process memory assumptions
To check whether the data satisfy either the Markov, semi-Markov or none of the two assumptions (the latter case is often called "extended semi-markovian" process) we should study the impact of the waiting time T 01 on the hazard of death after HTx [14].If the process is markovian, the hazard of two patients at time t since entering on list is the same even if their T 01 is different.
Thus, we can fit a Cox model only on transplanted patients using their original death/censoring times and using the waiting time T 01 as left-truncation to account for the possible delayed entry in the "at risk" group.We include in the model T 01 as the only covariate.On our data, the resulting effect (i.e.hazard ratio, HR) is equal to 2.2 (95% CI: 1.3;3.7,p=0.003), thus suggesting a violation of the Markov assumption.Of note, this result is valid to the extent that the simple proportional hazard model is reasonable, otherwise a model including a timevarying coefficient is more appropriate.If the process is semi-markovian, the hazard of two patients at time s since HTx is the same even if their T 01 is different.Thus, we can fit a Cox model only on transplanted patients using their re-scaled death/censoring times by subtracting the waiting time T 01 .Again, we include in the model T 01 as the only covariate.On our data, the resulting effect is HR=1.1 (95% CI: 0. Non-parametric estimation of survival probabilities with time-dependent exposure switch application to (simulated) heart transplant data semi-Markov assumption is compatible with our data.Of course, this was expected since data were generated according to a semi-Markov process.

Kaplan-Meier estimator with time-fixed HTx indicator
The survival curves estimated by the KM analysis considering the HTx status fixed and known from origin are depicted in Figure 3.Only patients that are never observed to be transplanted, either because they die or their observation is censored before their potential time to HTx, contribute to the estimation of the solid line curve (according to equation ( 3)).The survival contribution while on list of patients who will receive HTx is ignored causing an upward bias with respect to the true hazard of death while waiting HTx and consequently a downward bias with respect to the corresponding survival.In practice, only patients with short potential death times while on list tend to be included in the estimation because patients with long potential times more likely reach the time until an available organ is found.In contrast, the patients contributing to the estimation of the dashed line curve (according to equation ( 4)) are only those who are observed to be transplanted and even their survival period spent on list is artificially considered as a post-HTx time.This causes a downward bias with respect to the true hazard of death after HTx and consequently an upward bias with respect to the corresponding survival.In fact, patients actually receiving HTx have to survive while on list for a period at least equal to their potential time to HTx, but this "immortal time period" is erroneously attributed to HTx.

Analysis with time-varying HTx indicator
The analysis based on product-limit estimators accounting for the possible shift of the HTx status in time is represented by Figure 4.The solid line curve is the result of the Simon-Makuch approach to estimate the survival on list (equation ( 5)).The estimator involves both the whole contribution of patients never reaching HTx and the pre-HTx contribution of transplanted patients and, as long as the independence between the potential time to death on list and the potential time to HTx can be assumed, it is unbiased with respect to the true survival on list.
The Simon-Makuch approach for the survival after HTx (equation ( 6)) resulted in a biased curve (dash-dot line),  lower than the true one.This is due to the non-markovian nature of the data generating process.In practice, at each time point t the estimator attempts to calculate the hazard of death of an hypothetical patient immediately transplanted after waiting list entry by using the observation at time t since entry on list of patients transplanted at any time before t.However, in a non-markov setting, patients transplanted at different times are generally exposed to different mortality rates at time t, possibly because of the impact of time to HTx and/or because of the different period length spent after HTx.In our context, the true post-HTx hazard of death is a decreasing function (Figure 2A, dashed line) thus, at time t since entry on list, the longer is the time to HTx, the shorter is the time since HTx and the higher is the value of the mortality rate.This results in a overestimation of the true hazard of death of a patient immediately transplanted after waiting list entry with a corresponding underestimation of the survival probability.The dashed line curve obtained applying the "clock-back" estimator (equation ( 7)) lies very close to the true survival after HTx function.This estimator involves a calculation of the hazard of death for a patient immediately transplanted after waiting list entry that uses the observation of all transplanted patients at the same time point t since HTx.Adopting this "clock-back" time scale, the impact of the time since HTx on the mortality rate is automatically accounted for.Moreover, due to the semimarkovian nature of the data generating process, the time to HTx does not affect the post-transplant mortality thus the contributions of all transplanted patients, even with different waiting times, can be involved in the estimation.

DISCUSSION
In this paper we showed how to summarize, using product limit estimators, the survival experience of patients included in a HTx waiting list before and after the intervention.In this setting, the challenging issue is due to the time-varying nature of the HTx administration which generally depends mainly on the availability of a compatible organ.Data were simulated allowing to check the performance of the methods applied.We showed that the naive KM estimator that ignores the possible shift in the HTx status and classify patients either as not transplanted or as transplanted from the beginning of the follow-up leads to biased results.This problem, known as "immortal time bias" is very well known among statisticians but it was found several times in the medical literature [15,16,17,18,19,20,21].To address this issue we analyzed data accounting for the time-varying HTx status.
First, we used the Simon-Makuch method to estimate the survival of patients while on list showing that it produces reliable results.Subsequently, we focused on the post-HTx survival showing that, in the present context, it leads to downward biased results.In fact, as explained in [11], the Simon-Makuch estimator for the survival after the intervention produces valid results only within a Markov data generating process, where the hazard of death is inuenced by the time from origin (here the entry on list) but not by the waiting time to the intervention neither by the time since the intervention was administered.In many situations, including the one we considered, this assumption is not plausible.More likely, the hazard of death after HTx strongly depends on the time since HTx but not (or very weakly) from the waiting time until a compatible organ is found.This situation describes a semi-markovian scenario and valid estimation of the post-HTx survival is possible using the "clock back" method.This estimator is based on the clock back time scale meaning that for transplanted patients the time is measured starting from HTx.
The resulting survival curve represents the probability of remaining alive for an hypothetical patients transplanted immediately after entering on list but, due to the negligible affect of the waiting time to HTx, it could represent the survival probability of a patient transplanted at any time.Obviously, in a more complex situation where both the time since the intervention and the waiting time to the intervention have an impact (i.e.extended semi-Markov process) even the "clock-back" estimator is no longer correct.However, it is possible to check both the Markov and the semi-Markov assumptions from the data using for example an hazard model (e.g.Cox, Poisson regression) applied on the sub-sample of patients receiving the intervention.Beside the common assumption of independent censoring, another crucial (untestable) property must hold in order to obtain reliable results by applying the Simon-Makuch and "clock-back" estimators.Namely, the potential time to death before the intervention should not be related to the potential waiting time.In a context of a homogeneous cohort of highly cardiopathic patients, the time to HTx depends mainly on the availability of a compatible organ and it is likely not related to the condition of the patients while on list.
In conclusion, we reviewed and applied nonparametric methods for the analysis of the survival of patients included in a waiting list for receiving HTx.We enlightened the importance of testing the process memory assumptions (e.g.Markov properties) in order to choose the approach more reliable.In particular, we recommend the use of the Simon-Makuch method to study the survival of patients before the intervention and the use of the "clock back" estimator for the post-intervention survival in semi-markovian contexts where indeed the mortality rate depends on the time since intervention and the impact of the waiting time is negligible.

FIGURE 1 .
FIGURE 1. Multi-state representation of the basic event process involving patients candidate to receive heart transplant.The potential times to transition between states are also shown.
Figure1represents the three-state event process on which the motivating example is based.The random variables involved in the process are denoted as follows:• T 02 : potential time from entry in the list to death while still on list; • T 01 : potential time from entry in the list to HTx; • T 12 : potential time from HTx to death;• T 012 = T 01 + T 12 : potential time from entry in the list to death after HTx; • X = I(T 01 ≤ T 02 ): indicator of the HTx status so that T 02 is observable only when X = 0 while T 01 , T 0 and T 12 are observable only when X = 1;• T = (1-X) x T 02 + X xT 012 : time from entry in the list to death either without or after HTx; • C: time from entry in the list to right censoring; • Z= min(T;C): observable time to death or right censoring; • ∆= I(T ≤ C): observable death or censoring indicator; • E= I(T 01 Z): observable indicator of the HTx status.For each patient i = 1;...; n, we observe the following data:• t i : time from entry in the list to death or right censoring either without or after HTx; • ∂ i : death (∂ i = 1) or censoring (∂ i = 0) indicator; • ℇ i : indicator of the HTx status (ℇ i = 1 if the patient is transplanted and ℇ i = 0 otherwise); • T01 7;1.8, p=0.731), thus suggesting that the Epidemiology Biostatistics and Public Health -2018, Volume 15, Number 3

FIGURE 2 .
FIGURE 2. Hazard (A) and survival (B) functions of the generated potential times of transition between states.

FIGURE 3 .
FIGURE 3. Survival curves according to the Kaplan-Meier estimator with time-fixed HTx indicator.The true potential survival curves are superimposed (gray lines).

FIGURE 4 .
FIGURE 4. Survival curves according to the Simon-Makuch and "clock-back" estimators with time-varying HTx indicator.The true potential survival curves are superimposed (gray lines).