Comparison of multi-state Markov models for cancer progression with different procedures for parameters estimation . An application to breast cancer

Background: the knowledge of sojourn time (the duration of the preclinical screen-detectable period) and screening test sensitivity is crucial for understanding the disease progression and the effectiveness of screening programmes. For this purpose a model of the natural history of the disease is needed. The aim of this work is to provide an illustration of the application of multistate Markov models for breast cancer progression to the data of the Florentine screening programme, in order to estimate the sojourn time and sensitivity for breast cancer screening. MeThods: three different multi-state Markov models of increasing complexity were used with three different estimation procedures based on non-linear least squares, maximum likelihood, and on a Bayesian approach. all the models produced estimates for screening sensitivity and mean sojourn time. The data used in our application seem to lead to a non-identifiability problem, since the estimation procedures for both the Maximum Likelihood and non-Linear Least squares gave estimates that changed with the parameters’ initial values or difficultly converged. In order to take this problem into account we used the Bayesian approach by incorporating prior information on all the parameters. resuLTs: the mean sojourn time varied between 2-7 years and 3-5 years for women aged 50-59 and 60-69, respectively. When the model complexity was increased a higher variability in estimates was observed among the estimation procedures. The results of the screening sensitivity estimates were highly variable, both among estimation techniques and models varying between 63% and 100%, and between 77% and 100% for women aged 50-59 and 60-69, respectively. concLusIons: results are in accord with the literature; those obtained through the Bayesian approach seem to be more reliable.


InTroducTIon
Early detection by mass screening programmes is of major public health importance for control of at least some kinds of progressive chronic disease.By advancing the diagnosis of disease, the hope is to reduce the risk of serious consequences, especially death.For cancer, the definitive demonstration that early detection is of benefit is a reduction in disease-specific mortality in a screened group.
To properly define the more appropriate screening intervals and understand the effect of screening activities on the incidence of cancer, a model of the natural history of the disease is needed.
Following Day et al [1], we assume that the disease progresses as shown in Figure 1.If no intervention were to take place, an individual would enter the preclinical phase of the disease detectable by screening at time T 0 and would begin to manifest symptoms, i.e. the disease would become clinically apparent at time T 1 .For the individual, the sojourn time is defined as T 1 -T 0 .Suppose that a person is screened at time T 2 (T 0 <T 2 <T 1 ) and the preclinical stage is diagnosed.For the individual, the lead time -the interval by which diagnosis is brought forward -is defined as T 1 -T 2 .The probability that the screening test correctly identifies an individual as being in the preclinical phase is termed the "sensitivity" (Se) of the test.
The time that a disease remains in the preclinical detectable phase (PCDP), the sojourn time, is important in determining the potential effectiveness of a screening programme.It provides an absolute upper limit to the lead time obtainable and indicates what interval between screens is likely to achieve a useful lead time.If the sojourn time is long, the maximum attainable lead time is correspondingly long.If the sojourn time is short, however, the potential gains from screening are smaller and screening should take place more frequently in order to increase the probability that preclinical disease is found before it surfaces to the clinical stage.Furthermore, lead time estimates are needed to properly interpret changes in cancer incidence after the start of a screening program and, particularly, to evaluate the presence of overdiagnosis.
Different models have been developed to estimate mean sojourn time (MST) and Se, for breast as well as for other cancer screening programmes.
Day and Walter [1] proposed a simplified model to estimate both the MST and the test Se for breast cancer screening and showed that the sojourn time distribution is well approximated by an exponential distribution.The exponential distribution implies that the average advance in diagnosis due to screening is equal to the MST and that the standard deviation of the sojourn time is equal to its mean.
Using generalized linear models, Paci and Duffy [2] used information on interval cancers between screens by applying an exponential form for the sojourn time distribution, to estimate MST, Se and positive predictive value based on the Florence breast cancer screening programme.Multi-state Markov models -which consider more states in modeling the natural history of the disease -allow for the specification of the transition rates from preclinical to clinical states through different pathways (dimension, nodes involvement, non-progressive disease).
Uhry et al [6] applied multi-state Markov models to estimate natural history parameters that were applied to data from breast screening programmes in three districts in France.Using the node status as a prognostic factor, they applied a multi-state Markov model in order to estimate the Se of the screening procedure and the MST, and to predict the mortality reduction associated with screening, To explore the effect of overdiagnosis on parameter estimates, they added a non-progressive preclinical state.
Using data from Finnish screening program, Wu et al [7] presented a refinement of the multistate Markov model to estimate the progression rate of breast cancer, making allowance for measurement errors and for different methods to detect breast cancer including cancers arising from non-participants.
The aim of this study is to provide an illustration of the application of multi-state Markov models for breast cancer progression to data from the first two rounds of the Florentine screening programme (1991)(1992)(1993).Three different multi-state Markov models that are widely used in the literature were applied, and three different estimation procedures (non-linear least squares, maximum likelihood, Bayesian approach) were used.Limits and advantages of these different methods are discussed.

MeThods
We modeled the natural history of breast cancer -describing the progression of the tumour from the asymptomatic phase to the clinical phase -according to three different multi-state Markov models [3, 4, 5]: two, three, and five-state models.
The Markov model provides a convenient way to model prognosis for clinical problems.The model assumes that the patient is always in one of a finite number of states of health, which are termed Markov states.All events of interest are modeled as transitions from one state to another.
In this paper, multi-state Markov stochastic processes which we apply are defined by the following properties: • The transition rates between the states are invariant with time.

Two-state Markov model
According to Prevost et al [3], we defined a twostate model for the natural history of the disease where the states are preclinical breast cancer that are detectable by screening (asymptomatic) and clinical breast cancer (symptomatic), and λ 12 is the transition rate from the first state to the other (Figure 2).The inverse of the transition rate, 1/λ 12 , is the MST under the exponential distribution.[6] we suppose that the natural history of the disease is described by three health states: no detectable breast cancer, preclinical cancer detectable by screening (asymptomatic) and clinical breast cancer (symptomatic) (Figure 3).The transition rate λ 01 represents the incidence rate of the preclinical disease, and λ 12 is the transition rate from asymptomatic to symptomatic breast cancer.As in the two-state model, the MST is equal to 1/λ 12 under the exponential assumption.The model is presented in detail in the Supplementary Materials.

Five-state Markov model:
The five-state model differentiates the progression of a tumour according to node status, i.e. node negative (PN-), and node positive (PN+).As a consequence, the overall MST in the preclinical phase can be calculated by the sum of the MST in the preclinical phase for PN-(state 1) and MST in the preclinical phase for PN+ (state 2) multiplied by the proportion of cancer moving from state 1 to state 2 (Figure 4).

estimation procedures
For each of the models described, the parameters were estimated using three different techniques: maximum likelihood (ML), non-linear least squares (NLS), and a Bayesian approach (BA) using the Gibbs sampling technique.
ML and NLS are among the most frequently used procedures to estimate the parameters of a large variety of statistical models, including multi-state Markov models.
The likelihood function is the joint probability (density) function of observed data expressed as the function of the unknown parameters.The idea behind the maximum likelihood parameter estimation is to determine the parameters that maximize the probability of  the sample data (likelihood).The ML estimation finds the most "likely" values of the distribution parameters for a set of data.The likelihood derivation for the Markov models under study is based on assumptions regarding the densities of the available data (Supplementary Materials).
NLS regression extends the linear least squares regression in order to be used for a much larger and more general class of functions.Almost any function that can be written in closed form can be incorporated in a nonlinear regression model.The idea of using NLS technique to estimate the Markov model parameters is to calculate the observed and expected transitions, as well as to treat the equation (observed = expected + error) as a non-linear regression equation, with regression coefficients for screening Se and transition parameters to be estimated [4].
Both the NLS and ML procedures generally perform quite well, but sometimes a nonidentifiability problem can occur, i.e. the impossibility of the model parameters to be uniquely determined from the distribution of the observed random variables.The latter may occur if the model does not have sufficient information to estimate all the parameters of interest.This problem may depend on the specific data used to inform the model.The non-identifiability problem may be overcome by adding external information, e.g.data from non-participants [6, 7], by further simplifying the model structure in order to reduce the number of parameters to be estimated, or by incorporating prior information in a Bayesian framework [8,9].
The data used in our application seem to lead to a non-identifiability problem, since both the ML and NLS estimation procedures gave estimates that changed with the parameters' initial values or difficultly converged.In order to take this problem into account, this problem we used the BA by incorporating prior information on all the parameters.
In a Bayesian framework, extra information may be added to the analysis by combining the data -expressed through the likelihood function -with prior information on unknown parameters -expressed through suitable distributionsand thus deriving posterior distributions using the Bayes theorem.The posterior distribution contains updated beliefs about the values of the model parameters, after taking into account the information provided by the data.
Suitable prior distributions for the unknown parameters must be specified.In this analysis we used an uninformative flat prior Gamma(0.001,0.001) for the transition rates.The Gamma distribution is a distribution that is continuous, flexible, and constrained between [0,+∞), and is often used to model transition rates [3,10,11].The Beta distribution is a natural choice to represent the uncertainty on a probability being continuous and constrained in the interval (0,1) [12].Since estimates of Italian breast cancer screening Se are available from several studies [13-15], an informative Beta distribution was assumed as prior distribution for the test Se.The Beta parameters were derived with the method of moments [16] from the point estimates and standard errors from previous published studies [ [13][14][15].The selected parameters were 4.11 and 1.37 which corresponded to an average sensitivity around 75%.
The posterior distribution given the prior distributions, and data likelihood was computed by Gibbs sampling [8], providing random samples from marginal posterior densities for each parameter of interest.For the analyses presented, posterior inferences were based on 100 000 iterations of the Gibbs sampler after a burn-in of 50 000 iterations was discarded.Convergence was assessed by running multiple chains from dispersed starting values [17].
We carried out a sensitivity analysis on the prior distributions in order to evaluate the robustness of the results.In particular, both shape and inverse scale parameters of the Gamma distribution were changed from 0.1 to 0.0001.In the same way the shape parameters of the Beta distribution were changed from 1 to 5; the mean of the Se varied between 0.50 and 0.90.
We calculated the determination coefficient for the modeled outcomes from each model as a measure of goodness of fit comparing the observed number of cases to the estimated numbers [18].
The results for the ML and NLS presented above were used as initial values for the estimates from the BA.
We used the freeware program Just another Gibbs sampler (JAGS) [19] for the Bayesian modeling, and the R software for the ML and the NLS estimation procedures [20].

data
The observed data that were used to inform the three Markov models consist of e 8 7 2 9 -5 detection rates from different screening rounds and interval cancer rates.We used data from the cohort of 50-69 years-old women who participated in the first two screening rounds of the Florentine screening programme in the period from 1991-1993, and who were followed up for breast cancer up to two years after the second round [21,22].
The screening histories of all women from of the cohort were extracted from the local computerized screening database.Screening exposure was defined on the basis of attendance at the first and second rounds.
To identify the types of breast cancer that occurred within the cohort, a linkage with the Tuscan Cancer Registry [23] was made and the cases were classified according to: • screen detected cancers that occurred during the first round (prevalent cases) • screen detected cancers that occurred during the second round (incident cases) • interval cancers that occurred between the first and second round • interval cancers that occurred during the first year after the first round • interval cancers that occurred during the second year after the first round • interval cancers that occurred between the second round and the two subsequent years.Based on the TNM classification, all cases were stratified according to the lymph-node status (PN-, PN+).
Table 1 presents the data from the Florentine screening programme.Twenty-eight thousand three hundred and ninety women participated in the first round of screening with a detection rate of 7.2‰, and 23 445 women participated in the second round with the detection rate of 3.6‰; 32 interval cancers were detected between the first and the second round, and 31 during the two years after the second round.
The detection rate was higher in the age-class 60-69 than in those aged 50-59 both during the first and the second round.The occurrence of interval cancers was quite stable in the two intervals considered and similar across age-classes.

resuLTs
We performed the analyses for the three models using three different estimation procedures on the Florentine dataset stratified for the two age classes (50-59 and 60-69).The results from the two, three, and five-state models are reported in Tables 2, 3, and 4, respectively.
Table 2 presents the results from the twostate model.The model provides a very good fit with determination coefficients around 0.99 for both age classes.
The transition rate from asymptomatic to symptomatic cancer had a very similar result with the three estimation procedures.On the contrary, estimates of screening sensitivity showed meaningful differences among the estimation procedures, especially for the parameters estimated for the 50-59 age class with very high values (around 1) for the optimization procedures.In all of the estimation procedures, MST values were higher for older women.
Table 3 presents the results from the three-state model.This model also fits very well for both age classes with determination coefficients around 0.99.Although the transition rate estimates from no detectable to asymptomatic cancer had very similar results with the three estimation procedures and the two age classes, the transition rates from asymptomatic to symptomatic cancer resulted lower in the Bayesian estimation procedure with respect to the optimization ones.The resulting MST values were therefore 1 and 1.5 years higher in the Bayesian estimation procedure respectively for 50-59 and 60-69 age classes.
Results from the five-state model are shown in Table 4. Also in this case the model had a very good fit with determination coefficients around 0.97 and 0.98 for the 50-59 and 60-69 age classes, respectively.The transition rate from no detectable to asymptomatic PN-cancer had a very similar result in all the estimation procedures and in the two age classes.All the estimates (transition rates and Se) had a higher result using the ML procedure than did the BA and NLS for both age-classes.
The estimates of Se resulted around 100% for both age-classes when the ML procedure was used, while BA and NLS Se estimates were respectively 75% and 63% for ages 50-59, and 77% and 87% for ages 60-69.In all the procedures, sensitivity seems to increase with age.
The MST estimates from the three procedures MulTI-STATE MArkOV MOdElS fOr CANCEr PrOgrESSION three multi-state Markov models for the main purpose of estimating screening Se and MST.These estimates may further be used to make predictions for future trends of the burden of breast cancer and of the associated mortality after the introduction of screening.
The MST varied between 2 and 7 years and between 3 and 5 years for women aged 50-59 and 60-69, respectively.The two-state model gave the same MST estimates across the three estimation procedures for both age-classes.Alternatively, increasing the model complexity a higher variability in estimates was observed among the estimation procedures.
The screening Se varied between 63% and 100% and between 77% and 100% for women aged 50-59 and 60-69, respectively.The screening sensitivity estimates had a highly variable result, both among estimation techniques and models.
The three Markov models in combination with the three estimation procedures had very good fits.The latter was found in similar applications of Markov models [6,24].
All the models produced estimates for MST that were globally consistent with the literature [25,26], with a slight underestimation of MST for the older age-class in the two-state model, and an overestimation of the MST in the young ageclass for the NLS approach (Tables 2, 3, 4).The MST estimates showed an increase with ageing -a point that has already being raised in the The Se estimates from ML and NLS had a result in accord with the literature that reports estimates varying from 0.82 to 1.00 in 50-59 and from 0.88 to 1.00 in those aged 60-69 [2, 6, 7, 26 -28].NLS underestimates Se in the younger age-class.Estimates of the Se based on data from observed interval cancers in a screening programme showed a slightly lower value with respect to the model-based approach -from 70% to 82% [13][14][15].The BA estimates seem more plausible, perhaps because of the use of prior information on such parameters (Tables 2, 3, 4).
Increasing the model's complexity has the advantage of allowing us to describe the natural history of the disease of cancer with increasing detail, i.e. the five-state model has the advantage of estimating cancer cases by node status.However, as highlighted by Uhry et al. [6], it should be considered that the validity of some assumptions in the five-state model should be closely inspected.In particular, the MST in the two preclinical states (PN-and PN+) was assumed to be independent, whereas they might be positively correlated.The use of estimates derived from two separate dataset (screened and unscreened) has been proposed [6] as a way to avoid bias in prediction of the node status when predicting screening scenarios.
One of the several drawbacks of this kind of Markov model is that it aims to jointly estimate Se and MST that are highly correlated [6].In our application the BA overcomes this problem by strengthening the knowledge on screening Se with the integration of prior belief.By increasing the model complexity, the BA showed an Se estimate that comes closer to the data represented in the literature.
In addition, the BA gave robust estimates with all the models.Indeed, for all the models, sensitivity analyses confirmed the stability of posterior estimates over different values of the prior hyper-parameters.
In conclusion, we estimated both natural history parameters and Se using data from the two-yearly mammographic screening programme implemented in Florence, Italy.Results were in accordance with the literature, and the BA was identified as being more appropriate in this setting, given the non-identifiably problem arising from the ML and NLS techniques, and given the more reliable parameter estimates.
Contributorship: LV and GC provided the rationale for this study, designed and performed the statistical analysis.LV, GC and GM wrote the paper.DP mainly collaborated for data acquisition.All authors participated on the critical revision of the manuscript.

Our paper allows for a simultaneous comparison between multi-state Markov models of increasing complexity
and different estimation techniques to fit them.To our knowledge, this is rarely reported in the literature.

Limits and advantages of each estimation technique are
clarified and hopefully will help the readers to choose the procedure that is best suited to their data.

Funding: the research was partially supported by the Istituto Toscano Tumori, (grant proposal 2008).
disClosures: the Authors declare that there is no conflict of interest.
FIGURE 1 This model was modified by Prevost et al [3] and applied to colorectal cancer screening to take into account the possibility that the Se is substantially less than 100%, using a two states Markov model.More complex multi-state Markov models were successively proposed to assess the natural history of the disease [4, 5].

taBlE 3
RESUltS FoR thE thREE-StatE modEl wIth thE thREE EStImatIon pRocEdURES λ 01 : transition rate from no detectable cancer to asymptomatic cancer; λ 12 : transition rate from asymptomatic cancer to symptomatic cancer; Se: sensitivity of the screening test; 95%CI: 95% confidence interval (credible in the Bayesian estimation procedure); MST: mean sojourn time; BA: Bayesian approach; ML: maximum likelihood; NLS: non-linear least squared