A Bayesian approach to correct for unmeasured or semi-unmeasured confounding in survival data using multiple validation data sets

Purpose: The existence of unmeasured confounding can clearly undermine the validity of an observational study. Methods of conducting sensitivity analyses to evaluate the impact of unmeasured confounding are well established. However, application of such methods to survival data (“time-to-event” outcomes) have received little attention in the literature. The purpose of this study is to propose a novel Bayesian method to account for unmeasured confounding for survival data. Methods: The Bayesian method is proposed under an assumption that the supplementary information on unmeasured confounding in the form of internal validation data, external validation data or expert elicited prior distributions is available. The method for incorporating such information to Cox proportional hazard model is described. Simulation studies are performed based on the recently published instrumental variable method to assess the impact of unmeasured confounding and to illustrate the improvement of the proposed method over the naïve model which ignores unmeasured confounding. Results: Simulation studies illustrate the impact of ignoring the unmeasured confounding and the effectiveness of our Bayesian approach. The corrected model had significantly less bias and coverage of 95% intervals much closer to nominal. Conclusion: The proposed Bayesian method provides a useful and flexible tool in incorporating different types of supplemental information on unmeasured confounding to adjust the treatment estimates when the outcome is survival data. It out-performed the naïve model in simulation studies based on a real world study.


INTRODUCTION
With the evolvement of electronic healthcare records, observational databases represent a rich source of routine clinical data which can potentially address a variety of medical questions.Due to lack of randomization, the use of such data for comparative effectiveness research is always challenged by selection bias [1][2][3].
In observational studies, it is critical to improve the study validity by using an appropriate comparator group to avoid potential bias or confounding in the study designs.Furthermore, some statistical methods, e.g.propensity score will help adjust for measured confounding factors, under an assumption of "no unmeasured confounding".If the critical assumption is violated, then under the same study population and design, conclusions may vary significantly depending on the different analytical approaches.In a recent comparative cardiovascular study [4], both the inverse propensity score weighting (IPW) method and the instrumental variable (IV) method were used.As is known to all, IPW requires the "no unmeasured confounding assumption", while IV method does not.Interestingly, the two well-established statistical methods provide inconsistent results.A possible explanation of these inconsistency is the existence of unmeasured confounding, as the authors pointed out "….In the TRANSLATE-ACS study, 2 confounding factors that are potentially incompletely measured in the data are frailty and coronary disease severity…" In our opinion, this study exemplifies the significant impact unmeasured confounding can have on survival data and illustrates the need for conducting sensitivity analysis to address this problem.
Literature has reported methods for adjusting unmeasured confounding in parametric regression models.Frequentist approaches [2,3,5] adjust for unmeasured confounding by using fixed values for the bias parameters (the parameters associated with the unmeasured confounding).By contrast, the Bayesian approach we describe uses of validation data, expert opinion, or a combination of both to build prior probability distributions for these parameters.Monte Carlo sensitivity analysis [6] is possible with frequentist approaches, but the Bayesian paradigm provides an operational approach to combine the different sources of information.For instance, a Bayesian approach to adjust logistic regression for unmeasured confounding with mildly informative priors has been considered [7].At the cost of inflated interval estimate widths, their use of relatively non-informative priors leads to only slight improvement in the bias of parameter estimates.This work was extended [8] by calibrating the priors for the regression parameters with internal validation data.That is, for a sub-sample of the data, the unmeasured confounder is ascertained.This allows for considerably better bias correction than the much less informative priors used in [7].These publications focused on Bayesian methods utilizing extra information for parametric regression models when the outcome is continuous, binary or skewed (e.g., cost data).
Unlike parametric regression models, semiparametric/non-parametric models with unmeasured confounding in comparative observational research have received little attention.In the frequentist paradigm, a simple adjustment using fixed values for the unmeasured confounding parameters in a Cox regression model has been considered [9].The propensity score calibration approach [3] can also be applied to the Cox model.Here, we extend related work in Bayesian hierarchical modeling [8,10,11] by using informative priors and/or validation data to adjust for unmeasured confounding in Cox proportional hazard model.
Our paper is organized as follows.In Section 2 we overview the model and prior distributions for analyzing survival data with an unmeasured confounder.We discuss three approaches to correct for unmeasured confounding: informative priors, internal validation data and external validation data.In Section 3 we report the results of simulation studies showing the effectiveness of the validation data approaches.A discussion along with the concluding remarks is given in Section 4.

METHODS
In observational studies, propensity score [12] can be used as a summary scale to represent multiple confounders and directly adjusted as a regressor in the regression modeling.Theoretically, the Bayesian hierarchical model is able to adjust multiple unmeasured confounders.However, in our simulation studies, we found it would take extreme long to obtain the adjusted estimates if multiple unmeasured confounders existed.Therefore, in our development, we focus on the case of a continuous unmeasured confounder range from 0 to 1, as such single continuous unmeasured confounder could be also viewed as propensity score which summarized multiple unmeasured confounders.We denote the time to the event by T i , the binary treatment indicator by x i , the p-dimensional vector of measured confounders by z i , and the unmeasured confounder by μ i .Without loss of generality we assume μ i ~ N(μ i ,σ).The mean of the continuous unmeasured confounder can be modeled as a function of both the treatment and the measured covariates, specifically, μ i = γ 0 + γ 1 x i + ∑ p (j=1) γ j+1 z ji .This model can be extended to incorporate multiple unmeasured confounders and multiple treatments in a fairly straightforward way.

Likelihood
We will focus on the simplified model μ i = γ 0 + γ 1 x i in the discussion below.It has been shown that the simplified model is actually the worst case scenario.This is because additional covariates may be related to unmeasured confounders, thereby reducing the latter's impact [13].
For the Cox proportional hazards model we follow the development in [14].Packages such as OpenBUGS and JAGS can be used for posterior computation.We assume the survival times are divided into intervals with end points a 1 <a 2 <...<a J <a (J+1) .The Cox proportional hazards mode is for scalar β and η, and a px1 vector of coefficients θ.The baseline hazard function, h 0 (t), is defined to be the step function Where the λj' S are positive scalars and I A is the indicator function defined on the set A. If J is large, this parametric function nicely approximates an arbitrary baseline hazard function [14].Simulation studies indicate J rarely needs to be larger than 10 for good performance.The likelihood for each subject is the joint distribution of the survival outcome and the distribution of the unmeasured confounder.The i th likelihood contribution is (2) where for the survival portion and d(i,j) is Poisson φ(i,j) with a value of either 0 or 1 depending on whether an event occurs or not.This is related to the so called "zeros trick" [15] to get nonstandard distributions.Note that the u i are being modeled but none are observed in the main study data of size n.The full likelihood is the product of the contributions in (2) across the sample size.Further discussion of the likelihood is provided in the appendix.
The coefficient β, which relates the treatment effect to the survival time, is the primary parameter of interest.Without additional information on γ 0 ,γ 1 and η the model is not identified.This extra information may come in the form of informative priors or validation data.For the latter, the unmeasured confounder is observed for a subset of the main study data or from a data source external to the main study.We will discuss the various approaches in the following subsections.

Prior distributions
We assume all regression parameters have normal distributions: The steps, λ j , in the baseline hazard function (1) are positive.Thus, we have selected gamma priors for them: λj ~ gamma(αj,κj ).
Finally, a prior is needed for the standard deviation of the unmeasured confounder, σ μ .Here we assume a conjugate prior, σ u 2 ~ inverse-gamma(α 0 ,b 0 ).If the parameters α 0 and b 0 are chosen to be too small then convergence problems may arise.In this situation, it has been suggested a uniform, half-normal, or half Cauchy prior for the standard deviation be used instead of the inverse-gamma on the variance [16].For our model, if validation data is not available, informative priors for γ 0 ,γ 1 , and σ u are required while diffuse priors can be used for β,θ k and λ j .An informative prior for η can be helpful but does not appear to be required for model identifiability.Eliciting priors for regression parameters is a non-trivial task.It is known that it is much easier to elicit on observable quantities as opposed to unobservable parameters [12].Combining the priors with the likelihood given in (1) yields the joint posterior.RJAGS code for this model is available in the online appendix.

Internal Validation
If the study researchers are able to ascertain the unmeasured confounder for a random subsample of the subjects, the resulting informative priors can help mitigate the problems of the unidentified model that results from such confounding.This type of data is called internal validation and has been used in numerous cases to correct for bias due to misclassification and measurement error [17] and for unmeasured confounding [8].For our case, suppose the total sample size is n+m where n observations have the unmeasured confounder and m have it observed.In general n is much larger than m and m/n is referred to as the validation sample fraction.The resulting likelihood is (3) where, as previously defined, we have φ(i,j)=λ j exp(βx i +θz i '+η(ũ i ) ω(i,j).For the n observations in the main study, the ũ i are completely missing.Hence the "~" distinguishes them from the values of u that are observed in the validation data.Since they are observed in the validation sample of size m, inference can proceed with either informative or diffuse priors.Combining the above likelihood with the priors of Section 2.1 yields the joint posterior.Sampling from this posterior can be accomplished with the RJAGS package in R and the code is available in the online appendix.
The advantage of internal validation data, as can be seen in the likelihood, is the data provides direct information on all parameters of interest using data from the target population.The main limitation is the feasibility of collecting additional data from study participants.The cost to obtain internal validation data may be too high.Contacting individuals whose data are routinely collected for administrative purposes may not be possible because of privacy concerns or laws.

External Validation
Since internal validation may be impossible or too expensive to obtain, another approach is to use external validation data.Examples of external validation data include previous studies conducted by the same researchers where subject-level data is likely to be available.Alternatively, published studies might be used where, for instance, information may be available in the form of summary statistics.If subject-level data is available, then the approach for external validation data is identical to that of internal validation and the likelihood given in (3) is used.If summary statistics from previous literature are available, an informative prior can be constructed in the following way.Assume the summary statistics are as in Table 1.Note this is the sufficient statistics for a linear regression with normal errors and a binary covariate.Our informative prior is constructed by first assuming flat priors on all parameters and constructing posteriors with the data in Table 1.These posteriors are then used as priors in the subsequent analyses.In this case, the prior for σ u is Conditional on σ u 2 , we have where and .These data-based priors are equivalent to the posteriors of the linear regression model with Jeffreys priors as derived in [18].Using a conjugate, non-informative inversegamma prior leads to the inverse gamma prior above, but as discussed earlier, if the validation sample size is small, a half-Cauchy prior is preferred.This loses some computational convenience, but is easily accommodated in standard software packages.
Finally, we note that in some cases, the external validation data may "swamp" the main study data, in which case it would be appropriate to down weight the data via a process such as found in [19].

RESULTS
We used simulated datasets to illustrate the performance of the proposed Bayesian model.The parameters of the simulation are based on the empirical data from [4].The publication is a secondary analysis of data from the Treatment With Adenosine Diphosphate Receptor Inhibitors-Longitudinal Assessment of Treatment Patterns and Events After Acute Coronary Syndrome (TRANSLATE-ACS) study.Included in the study were patients undergoing percutaneous coronary intervention for myocardial infarction and the study dates were April 4, 2010, to October 31, 2012.We used information on the key demographic variables age, gender and race, and the proportion of patients receiving different ADP inhibitors (Table 1, [4]) to inform our simulation setting.To be more specific, x i represents patients receiving different treatments, and z 1 ,z 2 ,z 3 correspond to age, gender and race, respectively.The unmeasured confounder was scaled to range from 0 to 1 to represent either a single unmeasured confounder or a propensity score summarizing multiple confounders.
For every simulation configuration we generated 100 data sets.We examined scenarios with internal and external validation data along with a situation that mimics informative priors.
The model used for the unmeasured confounder for the simulations was u i ~ N (0.36+0.1x 1i ,0.2) Note that only the treatment is included in the unmeasured confounding model, which as previously mentioned, is the worst case scenario in terms of bias.For the Cox model we used exp(-0.3xi +0.01z 1i +0.013z 2i +0.03z 3i +1.06u i ) Such simulation parameters were chosen so that the unmeasured confounder had at least moderate impact on the outcome.If the unmeasured confounder only had mild impact on the outcome, then ignoring it may not lead to significant bias in the treatment effect estimate.We assumed a main study sample size of 1000 and sample sizes for validation data of 50, 100, and 200.We generated an internal validation data set with all variables including the unmeasured confounder and an external validation data set where only summary statistics for the unmeasured confounder and exposure variable were obtained.Thus, no information for is provided for the external validation data.The informative priors were obtained by generating A Bayesian approach to correct for unmeasured or semi-unmeasured confounding in survival data using multiple validation data sets a single data set for each of the validation sample sizes and using this data to inform independent normal priors for each of the regression parameters.These priors were then used for every data set in the simulation as opposed to the validation data methods where a new validation data set was used each time.This was done to allow comparison of the methods with basically the same sample size equivalence but the informative prior is the same every time.We considered censoring rates of 60%, and 90%.
In the real data, the censoring was close to 90%.For the validation data scenarios we used fairly diffuse priors for all regression parameters, specifically, normal with mean 0 and standard deviation of 10.Relatively diffuse priors were used for σ u 2 and the λ j .For all analyses, we used the RJAGS package in R. Two chains were run with a burn-in of 7000 iterations which was followed by 20000 iterations keeping every 10 th for inference.Convergence was checked with Gelman-Rubin statistics and monitoring history plots.Sample plots are provided in the online appendix.
We provide results for the main parameter of interest, β, for 60% censoring in Table 2 and 90% censoring in Table 3.The results are the average bias along with empirical coverage and widths for 95% nominal intervals across the 100 simulated data sets for each different scenario.The bias was calculated as the absolute difference between the posterior mean and the true parameter value.Several interesting points are worth mentioning.The bias for each of the correction methods is smaller than the naïve model for all scenarios considered except one case with 90% censoring where the informative prior case is slightly larger.Coverage is close to nominal in most cases for the corrected models while the naïve model has coverage considerably below nominal for the 60% censoring case.All confidence intervals are fairly wide due to the large censoring rate (60% and 90%).Internal validation has consistently narrower widths than the external and informative cases.In fact, when the internal validation sample size is 200 (which is 20% of the total sample) the widths of the 95% intervals are very close to those in the naïve case.While validation data of size 50, or informative priors based on an equivalent sample size of 50 do seem to correct the bias and provide intervals with nominal coverage, we found that using validation data samples of less than 50 resulted in considerable convergence problems, indicating that perhaps validation  fractions (ratio of validation sample size to main study sample size) much below 5% are not particularly useful.
Another interesting point about the 90% censoring is that even though the naïve model is the most biased, it has coverage that is fairly close to nominal.This is because with high censoring the uncertainty in estimation is fairly high for all the models thus the 95% intervals are all wide and cover the truth most of the time even with the bias.We considered other parameter configurations (motivated by another real world example where unmeasured confounder was weakly associated with treatment choice but strongly impact the outcome), along with the case of a binomial unmeasured confounder and these simulations are described in the online appendix.In general, the results of the simulations with different parameter setting yields very consistent results as the results we presented here.

DISCUSSION
Unmeasured confounding continues to be a major problem in observational studies.New methods that correct for bias due to unmeasured confounding for more complicated models than linear and logistic regression are important areas of research.Several advanced statistical methods -both from frequentist and Bayesian perspectives -have been developed during the past decades in addressing this challenge.However, only a few frequentist methods have been proposed for survival models, and none of them focus on time-to-event data specifically.Here, we have provided an overview of a Bayesian modelbased approach to account for unmeasured confounding when supplementary information is available in the form of internal or external validation data.Subjectively elicited informative prior distributions are another alternative.A flexible parametric approximation of the Cox proportional hazard model from [14] was implemented to construct the posterior likelihood estimation incorporating both main study and supplemental information.Also, we have conducted simulation studies under different scenarios based on empirical data from a published comparative cardiovascular study.We found that even when the unmeasured confounder is only mildly related to the exposure, as long as the unmeasured confounding at least moderately impact the outcome, non-ignorable bias still results in the naïve model.Further, we demonstrated that our Bayesian method can correct for such bias using rather modest amounts of validation data.
There are several limitations of our methods that should be discussed to perhaps motivate further research.First we are assuming the unmeasured confounder is, in fact, measurable in either the current or a similar data set.If it is an unmeasurable latent trait, our methods will not apply.The use of external validation data requires several assumptions.First, the population that the validation data comes from is similar enough to the current data so that the parameters being estimated are approximately the same.Secondly, we require that the same variables be measured.Finally, we focus on relatively simple relationships with the unmeasured confounder.Indeed, we are assuming the information available (either validation data or expert opinion) will rarely be rich enough to model complex relationships.
In this paper we have focused on simulated data where the true value of the parameters is known and the extent of the bias due to the unmeasured confounding can be ascertained precisely.In future research we will provide a thorough analysis of real world data and provide further guidelines on how to address unmeasured confounding in time-to-event data.

Convergence diagnostics
Below are representative convergence plots for the simulation studies illustrating convergence.These are for 60% censoring for the informative prior case (no validation data).

Supplementary material e12634-8
Epidemiology Biostatistics and Public Health -2017, Volume 14, Number 4 A Bayesian approach to correct for unmeasured or semi-unmeasured confounding in survival data using multiple validation data sets

Additional Simulation Results
We now describe some results from additional simulations.The data parameters chosen were motivated by the results in Connors (1996).Note that there is only a very small relationship between the treatment and the unmeasured confounder.The exposure variable, x, and three measured covariates were all generated from binary distributions with probabilities 0.4, 0.08, 0.24 and 0.11 respectively.The model used for the unmeasured confounder for the simulations was u i ~ N (0.6-04x 1i ,0.2) .For the Cox model we used exp(0.3xi -1.2z 1i +0.4z 2 +z3-3u) We assumed a main study sample size of 1000 and validation sample size of 100.We generated an internal validation data set with all variables including the unmeasured confounder and an external validation data set where only summary statistics for the unmeasured confounder and exposure variable were obtained.We considered censoring rates of 30%, 60%, and 90%.We used fairly diffuse priors for all regression parameters, specifically, normal with mean 0 and standard deviation of 10.We used a diffuse inverse gamma prior for and diffuse gamma priors for the .The results in Supplemental Table 1 yield similar relationships as the results discussed in the main manuscript.Using validation data improves bias and coverage for lower censoring rates but only improves bias in higher censoring rate cases due to wide intervals for all methods.A Bayesian approach to correct for unmeasured or semi-unmeasured confounding in survival data using multiple validation data sets We now also consider the case of a binary unmeasured confounder.Specifically, we generate the data via the following: For the Cox model, we used exp(0.3xli -1z li +0.3z 2i +0.005z 2i -1.4u i ) Again, the main study size was 1000 with 100 subjects for the validation samples.The results for 30% and 90% censoring are provided in Supplemental Table 2.For the binary unmeasured confounder case the naïve model does worse than it did with a continuous unmeasured confounder with more bias and lower coverage.The internal and external validation models again reduce the bias and provide coverage at close to nominal but this is with the price of wider intervals.As in previous simulation examples, the case of 90% censoring has extremely wide intervals.

JAGS Code
The JAGS code for the model with a continuous unmeasured confounder and subject level validation data (either internal or external) is provided below.

TABLE 1 .
Summary statistics for unmeasured confounder across binary exposure variable

TABLE 2 .
Simulation results across 100 data sets with censoring rate of 60%.Empirical coverage is for 95% nominal intervals.

TABLE 3 .
Simulation results across 100 data sets with censoring rate of 90%.Empirical coverage is for 95% nominal intervals.
Footnote: Naïve model ignores unmeasured confounders in the analyses.

TABLE 2 .
Simulation results for β with binary unmeasured confounder.