Clinical research of kidney diseases V: extended analytic models
1 Divisione di Nefrologia e Dialisi, Azienda Istituti Ospitalieri di Cremona, Cremona, Italy 2 Clinical Epidemiology Unit 3 Division of Community Health and Humanities, Faculty of Medicine, Memorial University of Newfoundland, Canada
Correspondence and offprint requests to: Pietro Ravani, Divisione di Nefrologia, Azienda Istituti Ospitalieri di Cremona, Italy, Largo priori 1, Cremona, 26100, Italy. E-mail: pietro.ravani{at}med.mun.ca
| Introduction |
|---|
In some study designs the same epidemiological unit is observed more than once. For example, in a cross-sectional study of the radial artery flow rate, several outcome values can be recorded on the same subject under different experimental conditions (e.g. exposure to different vasoactive substances). Longitudinal studies typically monitor participants over time, and both predictors (e.g. blood pressure) and outcomes (e.g. left ventricular mass index) are measured on different occasions in the same subject. In other designs, observations can fall into groups (clustered data), such as single measurements taken on a paired organ (e.g. the eye or the kidney) or single observations on different members of the same hospital/region or family. More complex designs may lead to a combination of clustering and repeated/longitudinal measurements (Table 1). All these designs generate correlated outcome data (Figure 1).
Multiple measurements on the same subjects are correlated because their values tend to be closer to each other than those obtained from different individuals. Similarly, single assessments of paired organs or members of the same hospital/region or family are correlated because different organs of the same subject and different individuals of the same community share biologic experiences, environmental exposures and genetic background. For some outcomes such as disease recurrence, previous experience may induce negative correlation. In all cases, once a measurement value has been obtained further values within the same individual/cluster can be more accurately guessed and the corresponding measurement errors are no longer due to chance alone. In these situations traditional regression methods are inappropriate as they assume independent errors [1,2]. Two major analytical approaches exist for the analysis of correlated generalized linear (continuous, binary and count outcomes) and time-to-event data: random effect modelling and variance-corrected methods. The main assumption of these approaches is that the responses are correlated within cluster/subject, but independent between clusters/subjects.
| Extended generalized linear models |
|---|
Fixed and random effects modelling
Schrier et al. studied the effect of rigorous (120/80 mmHg) versus standard (135–140/85–90 mmHg) blood pressure (BP) control on left ventricular mass index (LVMI) in hypertensive subjects with autosomal-dominant polycystic kidney disease (ADPKD) and left ventricular hypertrophy [3]. Echocardiograms were performed at baseline and at 1 and 7 years. Two analytical approaches are appropriate for this type of data: a standard linear ANCOVA model of the difference in LVMI at the study end by BP target, taking into account baseline levels (plus other inputs) and disregarding intermediate measurements; or a model for correlated outcome data, including longitudinal data. The two models answer different questions, respectively: 'What is the difference in LVMI after 7 years of rigorous treatment as compared to standard treatment?' and 'What is the (yearly) change in LVMI under rigorous and standard treatment as compared to baseline values?' A mixed model for longitudinal data analysis was chosen, revealing that rigorous BP control was significantly more effective in reducing LVMI. Unfortunately, the paper fails to report how important biases were prevented or controlled (sampling, measurement, information, performance) and omits issues such as sample size estimation, model checking and coefficient estimates. For example, the reader would be interested in the effect of the assignment group (i.e. difference in the intercept at baseline); of standard treatment (slope); of rigorous intervention (difference in slope); and of other covariates and interaction terms (possibly impacting the intercept and/or slope). In addition to these effects, generalized mixed models (GMM) estimate other parameters called random effects. These effects account for the variation in the response that the predictors of interest fail to explain (Figure 2).
GMM include models not only for normally distributed responses as in this example (identity link function), but also for correlated responses with binomial (logit link) or Poisson (log link) distribution [1,2].
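As a sketch of the kind of model described above, a linear mixed model with a random intercept could be written as follows for subject j measured on occasion i. The symbols, the covariate names and the choice of a single random intercept are assumptions made for illustration; they are not the specification reported in [3].

```latex
\begin{aligned}
\mathrm{LVMI}_{ij} &= \beta_0 + \beta_1\,\mathrm{rigorous}_j + \beta_2\,\mathrm{year}_{ij}
                     + \beta_3\,(\mathrm{rigorous}_j \times \mathrm{year}_{ij})
                     + \zeta_j + \varepsilon_{ij},\\
\zeta_j &\sim N(0,\sigma^2_{\zeta}), \qquad \varepsilon_{ij} \sim N(0,\sigma^2_{\varepsilon}).
\end{aligned}
```

In this notation the fixed effects correspond to the quantities listed above: \beta_1 is the difference in the intercept at baseline between assignment groups, \beta_2 the slope under standard treatment and \beta_3 the difference in slope under rigorous treatment. The term \zeta_j is the random effect of subject j and \varepsilon_{ij} is the residual for occasion i within subject j.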
To understand the philosophy of this approach, it is useful to look at the study outcome variability as a mixture of different components [4]. The regression coefficients of the model covariates estimate the explained variability of the response (systematic component). These are called fixed effects, because they are associated with fixed factors (or continuous inputs) whose levels of interest are actually measured or measurable. Fixed effects are unknown constant population parameters (e.g. the true effect of BP target on left ventricular mass in people with ADPKD, hypertension and LV hypertrophy). The levels of interest of the fixed covariates are known or chosen by design (e.g. gender or exposure levels, or categories of a continuous covariate such as BP targets). However, other covariates are often measured in some studies. They are called random classification variables because their levels can be thought of as being randomly sampled from a population of levels, such as individuals A, B, C in repeated/longitudinal designs, or Drs A, B, C or hospitals A, B, C in clustered studies. Not all possible levels of these random factors are present in a single study, but researchers still intend to make inferences about the entire population of levels. In the BP trial, study participants are a random factor. To distinguish between random and fixed factors, it is useful to answer the following question: were the study to be repeated, would the same groups/levels be used again? If yes (e.g. gender, treatment A versus B, age groups), this implies fixed effects. If not (e.g. centres, regions, subjects), it implies random effects. However, the same variable may be treated as a fixed factor in some studies and as a random factor in others, depending on the study question (e.g. health policy effects).
A model containing a random effect adds another layer to its random part: one is the variation explained by the random factors; the other is what remains unexplained by the combination of fixed and random factors. Random effects are unobserved random changes of the response by levels of the random factors or deviations from the relationship described by the fixed factors. For example, suppose that an outcome such as peripheral blood flow be measured twice in the same subject before an experiment is undertaken (Figure 3). Response values in the same subject tend to be closer than values obtained from different individuals. Consequently, two error components exist rather than one. One is due to subject (between subject variation) and it is the random effect shared within individual but varying across them (heterogeneity). The other is due to the measurement occasion nested within subject (within subject variation). This variability due to measurement can be estimated when more than one measurement is performed in the same subject, although it exists independent of the number of measurements performed. A random effect model estimates both these variance components. When the variance of the random effect is significantly different from zero, the null hypothesis of the absence of correlation is rejected. The proportion of the total variance due to subject estimates the correlation in the data and the accuracy of the measurement tool (Figure 3).
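The two variance components and the proportion of variance due to subject can be illustrated with a small calculation. The sketch below uses the method-of-moments (one-way ANOVA) estimator for balanced data, which is a simpler stand-in for the likelihood-based estimation a mixed-model program would normally use. The function name and the blood flow values are hypothetical.

```python
from statistics import mean


def variance_components(groups):
    """Method-of-moments variance components for balanced repeated measures.

    groups: one list of measurements per subject, all of the same length (>= 2).
    Returns (between_subject_variance, within_subject_variance, icc).
    """
    k = len(groups[0])
    if k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need the same number (>= 2) of measurements per subject")
    n = len(groups)
    subject_means = [mean(g) for g in groups]
    grand_mean = mean(subject_means)
    # Mean square between subjects and mean square within subjects
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    msw = sum((y - m) ** 2 for g, m in zip(groups, subject_means) for y in g) / (n * (k - 1))
    between = max((msb - msw) / k, 0.0)  # truncate negative estimates at zero
    icc = between / (between + msw)      # share of total variance due to subject
    return between, msw, icc


# Hypothetical data: peripheral blood flow measured twice in four subjects
blood_flow = [[10, 12], [20, 18], [30, 31], [15, 15]]
between, within, icc = variance_components(blood_flow)
print(f"between-subject variance: {between:.2f}")
print(f"within-subject variance:  {within:.3f}")
print(f"intra-class correlation:  {icc:.3f}")
```

In these made-up data almost all of the variability is between subjects, so the two measurements taken on the same subject are strongly correlated.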
Variance correction
Freedman et al. tested the association between log-coronary calcification score and log-albumin:creatinine ratio in 588 white participants with type 2 diabetes from 325 families [5].
Generalized estimating equations with exchangeable correlation and the sandwich estimator of the variance were used to model the association of interest, controlling for the effects of extraneous variables and for the correlation in the data due to familial clusters. They found that the adjusted log-calcification score was 0.1716 higher (standard error 0.0592, P = 0.0037) per unit increase in log-albumin:creatinine ratio (going back to the original scales, this means that the expected calcification score was equal to the exponentiated intercept times the albumin:creatinine ratio raised to the power of 0.1716).
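The reported coefficient can be unpacked with a few lines of arithmetic. The sketch below assumes a Wald test with a normal approximation, which may not be exactly how the authors computed their P-value; the variable names are illustrative.

```python
import math

beta = 0.1716  # reported coefficient on the log-log scale [5]
se = 0.0592    # reported robust standard error [5]

# Wald statistic and two-sided P-value under a normal approximation (assumption)
z = beta / se
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

# Approximate 95% confidence interval for the coefficient
ci_low, ci_high = beta - 1.96 * se, beta + 1.96 * se

# On the original scales the model is a power law, so doubling the
# albumin:creatinine ratio multiplies the expected score by 2 ** beta
ratio_per_doubling = 2 ** beta

print(f"z = {z:.2f}, P = {p_two_sided:.4f}")
print(f"95% CI for the coefficient: {ci_low:.4f} to {ci_high:.4f}")
print(f"score ratio per doubling of the predictor: {ratio_per_doubling:.3f}")
```

The multiplicative reading does not depend on the base of the logarithm, provided the same base was used for both variables.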
Generalized estimating equations (GEE) represent another possible approach to the analysis of correlated data. This method corrects the model variance (i.e. the random part of the model) for the dependences in the data. To put it simply, the way the correlation in the data has arisen is taken into account in the estimation of the parameters (effects) and their standard errors. GEE have the same structure as standard regression models, i.e. a systematic component and a random component without specification of any additional random layer (Table 2). Like generalized linear models [2] and GMM, GEE require the specification of one of the link functions from the generalized linear family (identity, logit, log). However, their estimation method requires the additional specification of a working correlation for the observed responses (Table 3). For example, Freedman et al. used an exchangeable correlation in their model [5]. This means that the standard errors of the GEE coefficients were corrected assuming that one single correlation coefficient (parameter) would describe the association between any pair of responses (subjects) from the same cluster (family).
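To make the idea of a working correlation concrete, the sketch below builds three structures that are commonly offered by GEE software: independence, exchangeable and first-order autoregressive. This selection and the function names are assumptions for illustration and are not taken from Table 3.

```python
def independence(n):
    """No within-cluster correlation: the identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def exchangeable(n, rho):
    """One common correlation rho for every pair of responses in a cluster."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def autoregressive(n, rho):
    """Correlation decays as rho ** |i - j| with the distance between occasions."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


# A cluster (e.g. a family) with three members and an assumed correlation of 0.3
for row in exchangeable(3, 0.3):
    print(row)
```

An exchangeable structure suits clusters whose members have no natural order, such as relatives, whereas an autoregressive structure is more plausible for measurements ordered in time.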
Which structure best describes the real data correlation is not always obvious, although the research design may help decide. However, GEE analysis requires only a rough estimate of this structure to get started. The final parameter estimates (fixed effects, their standard errors and the correlation coefficients) are not usually dependent on the accuracy of the initial assumptions about the correlation matrix. In fact they are consistent (i.e. converge to the true value) as the number of clusters/subjects increases even if the initial structure is incorrectly specified, unless the fraction of missing data is large or the data are not missing at random. Although the correlation structure is not necessarily the same for all clusters/subjects, GEE assume one set of parameters common to all clusters/subjects to avoid estimating too many parameters. Given the importance of the chosen correlation structure and the possibility of misspecification in real-life situations, a special method called the sandwich estimator is used to estimate the standard errors of the coefficients in these models. This method corrects the variance by incorporating the dependences into the computations, removing one cluster at a time, and provides an honest estimate for correlated data whenever the observations left out at any step are independent of the observations left in. The standard errors of the coefficients are usually (but not always) larger, depending on the sign of the correlation in the data. Put simply, the statistical testing is usually more conservative (the confidence intervals wider) as compared to the corresponding generalized linear model applied to the same data as though each observation were independent (independent correlation structure). This empirical method is called robust because the variance estimation is consistent even if the chosen correlation structure is incorrect (robustness to misspecification) [6–8].
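The effect of the variance correction can be seen in the simplest possible case: estimating a single mean from clustered data. The sketch below compares the naive standard error with the cluster-robust (sandwich) standard error for an intercept-only model with an independence working correlation and no small-sample correction. It uses the usual closed-form sandwich expression rather than a leave-one-cluster-out computation, and the data are made up.

```python
import math


def naive_and_robust_se(clusters):
    """Return (mean, naive SE, cluster-robust SE) for an intercept-only model."""
    values = [y for cluster in clusters for y in cluster]
    n = len(values)
    m = sum(values) / n
    # Naive variance: treats all n observations as independent
    naive_var = sum((y - m) ** 2 for y in values) / (n - 1) / n
    # Sandwich variance: residuals are summed within cluster before squaring,
    # so within-cluster dependence enters the estimate
    robust_var = sum(sum(y - m for y in cluster) ** 2 for cluster in clusters) / n ** 2
    return m, math.sqrt(naive_var), math.sqrt(robust_var)


# Same six values arranged so that cluster members are alike (positive
# correlation) or unlike (negative correlation)
alike = [[10, 11], [20, 21], [30, 31]]
unlike = [[10, 30], [11, 31], [20, 21]]

for label, data in (("alike", alike), ("unlike", unlike)):
    m, naive, robust = naive_and_robust_se(data)
    print(f"{label}: mean={m:.1f} naive SE={naive:.3f} robust SE={robust:.3f}")
```

With positively correlated clusters the robust standard error exceeds the naive one; with negatively correlated clusters it is smaller, which is why the corrected standard errors are usually, but not always, larger.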
Model choice
The choice of the analytical tool for correlated generalized linear data can be guided by different considerations. As opposed to GMM, GEE are based on only one level of clustering, are not designed for inferences about the covariance structure (the working correlation structure is formulated with no distributional assumptions) and do not give predicted response values for each cluster. Using GMM involves making extra assumptions, but gives more efficient estimates, and allows estimating contributions to variability from different sources, including multilevel correlations.
Finally, GEE are marginal models as they assume a model holding over all clusters (population average). Therefore, the coefficients represent the average change in the response over the entire population for a unit change in the predictor. GMM are conditional models in that they assume a model specific to each cluster/subject. Therefore, the coefficients represent the average change in the response for each cluster/individual, given a unit change in the predictor. Although population effects can be derived averaging cluster effects, conditional models are most useful when the objective is to make inferences about clusters/individuals rather than the population.
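The difference between conditional and marginal coefficients can be checked numerically for a binary outcome. The sketch below assumes a logistic model with a normally distributed random intercept; the parameter values and function names are arbitrary illustrations. It averages the cluster-specific probabilities over the random effect and recomputes the log odds ratio at the population level.

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def marginal_probability(linear_predictor, sd_random, steps=4000, width=8.0):
    """Average expit(linear_predictor + u) over u ~ N(0, sd_random**2).

    Uses the trapezoid rule on the standard normal scale over [-width, width].
    """
    h = 2.0 * width / steps
    total = 0.0
    for k in range(steps + 1):
        z = -width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * density * expit(linear_predictor + sd_random * z)
    return total * h


beta_conditional = 1.0  # assumed cluster-specific log odds ratio
sd_random = 2.0         # assumed standard deviation of the random intercept

p_unexposed = marginal_probability(0.0, sd_random)
p_exposed = marginal_probability(beta_conditional, sd_random)
beta_marginal = logit(p_exposed) - logit(p_unexposed)

print(f"conditional log odds ratio: {beta_conditional:.3f}")
print(f"marginal log odds ratio:    {beta_marginal:.3f}")
```

Under these assumptions the population-averaged coefficient is closer to zero than the cluster-specific one, and the gap widens as the variance of the random effect grows. With an identity link the two coefficients coincide.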
| Extended survival models |
|---|
Correlated survival times
Correlation in the occurrence and timing of repeated events may occur when individuals experiencing a single event belong to groups or clusters, or where the subject experiences some event more than once due to a recurrent event process [9]. The correlation in the survival times may result from differences in the general tendency to fail across individuals and varying tendency to fail further once the recurrence process has started (Figure 4). Heterogeneity across subjects (unshared frailty) may be due to unknown, unmeasured or unmeasurable effects (different lifestyles, genetic traits, environmental factors and experiences), which influence the likelihood to succumb to disease. As a result, some individuals are more (and others less) prone to disease, experiencing their first, second, third, etc., recurrent episode more (less) quickly than others. Event dependence within a subject emerges when the threshold for further events changes once previous events have occurred (e.g. the baseline risk of thrombosis of the second and third bypass graft is progressively higher or lower than that of the first). Further events become more or less likely according to whether the process induces a biological weakening or strengthening of the organism and whether the subject is more or less frail (shared frailty). In either case the risk for an event is a function of previous occurrences. Medical research and clinical experience suggest that both individual unshared tendencies and varying shared susceptibility to fail during the recurrent process are likely to be the rule, rather than the exception, in the study of multiple events, and that each may enhance the effect of the other [9].
This correlation among events violates the assumption that the timing of events is independent and has two important consequences: the estimates of the coefficients and their standard errors are both biased (wrong) and inefficient (imprecise) in typical repeated events contexts. Variations of the Cox model (and other survival models), namely frailty (or random effect) models and variance-corrected methods, have been proposed to account for the correlation among event times.
Risk sets for survival analysis
Data layouts for survival analysis are complex as they define the risk set based on the three components of the response variable (time start, time stop and censor status), and possible distinction of different basal risk categories [2]. For an appropriate definition of the risk set different aspects should be considered [9]: classification of type and order of the failure events (whether the events are of different or the same type, whether they occur with or without natural order); definition of the time at risk (when the risk starts and ends); consideration of the mechanisms through which the predictor is involved in the process (whether/how the same predictor affects more outcomes) and definition of what is being modelled (the time to each event, the total course of a recurrent process or the time segments to each recurring event).
For unordered events of the same type (such as lesions of the eye [10]) and of different type (such as uraemia and mortality in a follow-up study of chronic kidney disease patients [11]) a risk set called marginal has been suggested (stratified for event of different type to allow different basal risks). For ordered events (such as catheter infections or dysfunctions, repeated peritonitis or transplant rejection episodes) four options have been proposed. Table 4 reports key characteristics of these risk sets.
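The four layouts for ordered events differ only in how each subject's follow-up is cut into rows and in whether rows are stratified by event number. The sketch below shows one way to build them for a single subject. The function name, the row format (start, stop, status, stratum) and the convention of analysing at most max_events events per subject are assumptions for illustration.

```python
def risk_set_rows(event_times, end_of_follow_up, max_events):
    """Build (start, stop, status, stratum) rows for one subject under four layouts."""
    times = sorted(event_times)[:max_events]
    starts = [0.0] + times
    # Consecutive intervals ending in an event, numbered by event order
    intervals = [(starts[k - 1], t, 1, k) for k, t in enumerate(times, start=1)]
    # Final censored interval if fewer than max_events events were observed
    if len(times) < max_events and end_of_follow_up > starts[-1]:
        intervals.append((starts[-1], end_of_follow_up, 0, len(times) + 1))

    # Counting process: consecutive intervals, one common baseline risk
    counting = [(a, b, s, 1) for a, b, s, _ in intervals]
    # Conditional, time from entry: same intervals, stratified by event number
    from_entry = list(intervals)
    # Conditional, gap time: clock reset to zero after each event
    gap_time = [(0.0, b - a, s, k) for a, b, s, k in intervals]
    # Marginal: at risk for every event number from time zero
    marginal = [
        (0.0, times[k - 1], 1, k) if k <= len(times) else (0.0, end_of_follow_up, 0, k)
        for k in range(1, max_events + 1)
    ]
    return {"counting": counting, "from_entry": from_entry,
            "gap_time": gap_time, "marginal": marginal}


# Hypothetical subject: events at 3 and 7 months, followed for 10 months
layouts = risk_set_rows([3.0, 7.0], 10.0, max_events=3)
for name, rows in layouts.items():
    print(name, rows)
```

In the conditional layouts a subject enters the risk set for the second event only after the first has occurred, whereas in the marginal layout the subject is at risk for all event numbers from the start of follow-up.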
Variance-corrected models
Variance-corrected models represent one way to deal with the problems produced by heterogeneity across individuals and failure-time dependences. Variations within the family of variance-corrected models are based on different definitions of the risk sets including whether they allow for event-specific baseline hazards using stratification. In these models (marginal, counting process and conditional risk sets) the robust (cluster) variance estimator is used as in GEE analysis. Variance-corrected models do not incorporate any random effect into the estimates themselves.
Frailty models
In contrast to the variance-corrected models, frailty models do incorporate the heterogeneity between clusters/subjects into the estimated portion of the model by making assumptions about its distribution [16–19]. This latent random effect varies across individuals but is assumed to be constant over time and shared by a single individual (or all members of a cluster). As a result, under frailty models the event times are assumed to be independent conditional on the patient's underlying frailty and inference can be made in the standard fashion. Frailty models estimate the variance of this random effect. When this variance is significantly different from zero, the model supports the hypothesis of a significant heterogeneity in the data.
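One common way to write such a model, given here as a sketch, is a Cox model whose hazard is multiplied by a cluster-specific frailty. The notation and the gamma distribution for the frailty are assumptions for illustration; other distributions are also used.

```latex
h_{ij}(t \mid \omega_j) = \omega_j \, h_0(t) \, \exp\!\left(x_{ij}^{\top}\beta\right),
\qquad \omega_j \sim \mathrm{Gamma}\ \text{with}\ E(\omega_j) = 1,\ \mathrm{Var}(\omega_j) = \theta .
```

Here h_0(t) is the baseline hazard, x_{ij} the covariates of observation i in cluster (or subject) j, and \omega_j the frailty shared by all observations in cluster j. When \theta = 0 every \omega_j equals 1 and the model reduces to the standard Cox model, which is why a frailty variance significantly different from zero is read as evidence of heterogeneity.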
The risk set of the standard (unconditional) frailty model is the same as the conditional risk set from entry (Table 4), but without stratification. This has been viewed as a limitation in the presence of event dependence, which is controlled instead by stratified variance-corrected methods and therefore these may be preferred in the presence of event dependence without heterogeneity. Since repeated events processes are usually characterized by both event dependence and heterogeneity (or it is often unclear which feature of the data mostly underlies the correlation), a stratified (conditional) frailty model has recently been proposed with the same risk set as the gap time risk set [17].
Model choice
The choice of the analytical tool for correlated survival data is dictated by the type and order of the failure events and the clinical question to be answered (Table 4).
For multiple events of different type (e.g. the same person may be observed to develop kidney failure, a stroke and then die) the variance-corrected marginal model is often a good choice. This is true when the model includes factors plausibly involved in the mechanism of more than one event type (e.g. hypertension). Frailty models can be used to specify and account for the sources of correlation in the data [16–19].
For ordered recurrent events of the same type (e.g. episodes of thrombosis in a vascular access) there are more choices, though most often the order condition and the difference in the baseline risks are important issues to be accounted for. The counting process is useful if there is no reason to believe that the baseline risk varies, as it is not stratified. The marginal risk model may be more appropriate to model repeated hospitalizations (where the reason for hospitalization has no natural order) than repeated bypass graft thrombosis or peritonitis episodes. Conversely, when the clinical course of repeated events supports the conditional assumption, one can either model the entire time course of the disease (from entry) or model the time segments between failures (from previous event). However, variance-corrected methods may still provide biased results in the presence of heterogeneity since they do not incorporate any random effect in the model.
Heterogeneity and event dependence can be considered components of a latent random effect inducing biased estimates if not taken into account. Both sources of correlation in the data may simultaneously underlie most of the recurrent event processes, although one may prevail over the other. In the presence of event dependence without heterogeneity the true variance of the frailty is zero. In these cases stratified variance-corrected methods perform well, whereas the traditional (un-stratified) frailty model may detect the presence of a random effect that is probably the consequence of event dependence rather than heterogeneity. In the presence of heterogeneity without event dependence, stratification may not be necessary since the baseline risk should not change by event number. In this case variance-corrected models may be inefficient and the unconditional frailty model would perform better. Yet, since repeated events data are very likely to exhibit both sources of correlation, a modelling strategy that is robust to heterogeneity and event dependence may be necessary [17].
Time-dependent effects and time-varying covariates
Another issue to consider when defining a risk set is related to the values and effects of the input variables. The term time dependent is more appropriately used to define the effect associated with an input, and the term time varying is used for a covariate with updated values over time. For example, an input variable measured at baseline (e.g. recent myocardial infarction) can have different effects during different follow-up periods that can be modelled as a step-function of time (e.g. the relative risk for death from the infarct declines over time). In a follow-up study, baseline values of renal function were associated with increased risk for death only during the first year of observation and not thereafter [11]. These estimated time-dependent effects must satisfy the proportionality assumption when using the Cox model. Conversely, a variable measured only once (at baseline) may interact with time and thus have an effect that changes with time, as was found for serum albumin in the HEMO study [20]. By definition, this effect will not satisfy the proportionality assumption. Another possibility is that the risk set contains updated values of a variable. For example, in a study of Urotensin II (a vasoactive substance) in chronic kidney disease patients, end-stage renal disease status (not yet on dialysis versus already on dialysis) had a different effect on cardiovascular events [21]. This input variable was treated as a time-varying covariate as subjects could change their status during follow-up. These input-specific effects must also satisfy the proportionality assumption.
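A time-varying covariate is usually handled by splitting each subject's follow-up into rows, one per period with constant covariate values. The sketch below does this for a single change of status, such as the start of dialysis. The function name and the row format (start, stop, event, on_dialysis) are hypothetical.

```python
def split_at_change(follow_up_end, event, change_time=None):
    """Split one subject's follow-up at the time a binary covariate switches on.

    Returns rows (start, stop, event, on_dialysis). The event indicator is
    attached only to the last row, because that is when follow-up ends.
    """
    if change_time is None or change_time >= follow_up_end:
        # Status never changed during follow-up
        return [(0.0, follow_up_end, event, 0)]
    if change_time <= 0:
        # Status already changed at entry
        return [(0.0, follow_up_end, event, 1)]
    return [
        (0.0, change_time, 0, 0),                # before the change: no event yet
        (change_time, follow_up_end, event, 1),  # after the change
    ]


# Hypothetical subject: starts dialysis at 2 years, has an event at 5 years
print(split_at_change(5.0, 1, change_time=2.0))
# Hypothetical subject: never starts dialysis, censored at 5 years
print(split_at_change(5.0, 0))
```

Each row contributes to the risk set only over its own interval, so the subject counts as not on dialysis up to the change and as on dialysis afterwards.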
| Special topics |
|---|
Random effect modelling and variance-corrected methods are general approaches to model quantitative responses, categorical data, counts and survival times. The advantage of these methods is that they are natural and very flexible extensions of standard techniques, easily applicable to different circumstances. Special methods exist for specific analytical issues. Examples are repeated measures ANOVA for continuous responses and categorical exposures [22], and multivariate ANOVA (MANOVA) to study the simultaneous change of several quantitative outcomes in response to an exposure [23]. Several other models for longitudinal designs are available, such as Markov chain models to study the probability of a state change in a population [24], and time series to study observations at successive time intervals [25].
| Conclusion |
|---|
Clinical researchers are interested in describing how the study outcome varies in response to the effect of some exposure of interest (i.e. a therapy, a diagnostic test, a risk factor, etc.). In general, study design and sample size estimation, study conduct and statistical analysis should be consistent with the research hypothesis and objective. Examples of discrepancies among these domains are not uncommon in the medical literature. Often the correct approach to the assessment of the outcome variability requires multiple measurements of either or both predictors and outcome. This should be considered in the formulation of the study question and the design phase, including the choice of the correct analytical strategy to assess the correlation in the data and the role of its sources.
| Acknowledgments |
|---|
P.R. held a young investigator award from the Italian Society of Nephrology for the year 2005–2006 and received funding from the EU (Marie Curie Actions-OIF, proposal 021676) for the year 2006–2007.
Conflict of interest statement. None declared.
| References |
|---|
1. Ravani P, Parfrey P, Gadag V, et al. Clinical research of kidney diseases III: principles of regression and modelling. Nephrol Dial Transplant (2007) 22:3422–3430.
2. Ravani P, Parfrey P, Murphy S, et al. Clinical research of kidney diseases IV: standard regression models. Nephrol Dial Transplant (2008) 23:475–482. NDT Advance Access published on January 8, 2008.
3. Schrier R, McFann K, Johnson A, et al. Cardiac and renal effects of standard versus rigorous blood pressure control in autosomal-dominant polycystic kidney disease: results of a seven-year prospective randomized study. J Am Soc Nephrol (2002) 13:1733–1739.
4. West B, Welch K, Galecki AT. Linear mixed models: an overview. In: Linear Mixed Models: A Practical Guide Using Statistical Software (2007) New York: Chapman & Hall/CRC. 9–49.
5. Freedman BI, Langefeld CD, Lohman KK, et al. Relationship between albuminuria and cardiovascular disease in Type 2 diabetes. J Am Soc Nephrol (2005) 16:2156–2161.
6. Lin DY, Wei LJ. The robust inference for the Cox proportional hazards model. J Am Stat Assoc (1989) 84:1074–1078.
7. White H. Maximum likelihood estimation of misspecified models. Econometrica (1982) 50:1–25.
8. Zeger SL, Liang K-Y. Longitudinal data analysis for discrete and continuous outcomes. Biometrics (1986) 42:121–130.
9. Therneau TM, Grambsch PM. Multiple events per subject and frailty models. In: Modeling Survival Data: Extending the Cox Model (2000) New York: Springer. 159–260.
10. Lee EW, Wei LJ, Amato D. Cox-type regression analysis for large number of small groups of correlated failure time observations. In: Survival Analysis, State of the Art (1992) The Netherlands: Kluwer. 237–247.
11. Ravani P, Tripepi G, Malberti F, et al. Asymmetrical dimethylarginine predicts progression to dialysis and death in patients with chronic kidney disease: a competing risks modeling approach. J Am Soc Nephrol (2005) 16:2449–2455.
12. Wei LJ, Lin DY, Weissfeld L. Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Am Stat Assoc (1989) 84:1065–1073.
13. Lunn M, McNeil D. Applying Cox regression to competing risks. Biometrics (1995) 51:524–532.
14. Andersen PK, Gill RD. Cox's regression model for counting processes: a large sample study. Ann Stat (1982) 10:1100–1120.
15. Prentice RL, Williams BJ, Peterson AV. On the regression analysis of multivariate failure time data. Biometrika (1981) 68:373–379.
16. Huang X, Wolfe RA. A frailty model for informative censoring. Biometrics (2002) 58:510–520.
17. Box-Steffensmeier JM, De Boef S. Repeated events survival models: the conditional frailty model. Stat Med (2006) 25:3518–3533.
18. Liu L, Wolfe RA, Huang X. Shared frailty models for recurrent events and a terminal event. Biometrics (2004) 60:747–756.
19. Mahe C, Chevret S. Analysis of recurrent failure times data: should the baseline hazard be stratified? Stat Med (2001) 20:3807–3815.
20. Eknoyan G, Beck GJ, Cheung AK, et al. Hemodialysis (HEMO) Study Group. Effect of dialysis dose and membrane flux in maintenance hemodialysis. N Engl J Med (2002) 347:2010–2019.
21. Ravani P, Tripepi G, Pecchini P, et al. Urotensin II is an inverse predictor of death and fatal cardiovascular events in chronic kidney disease. Kidney Int (2008) 73:95–101.
22. Dittrich E, Puttinger H, Schillinger M, et al. Effect of radio contrast media on residual renal function in peritoneal dialysis patients—a prospective study. Nephrol Dial Transplant (2006) 21:1334–1339.
23. van Vilsteren MC, de Greef MH, Huisman RM. The effects of a low-to-moderate intensity pre-conditioning exercise programme linked with exercise counselling for sedentary haemodialysis patients in The Netherlands: results of a randomized clinical trial. Nephrol Dial Transplant (2005) 20:141–146.
24. Weijnen TJ, van Hamersvelt HW, Just PM, et al. Economic impact of extended time on peritoneal dialysis as a result of using polyglucose: the application of a Markov chain model to forecast changes in the development of the ESRD programme over time. Nephrol Dial Transplant (2003) 18:390–396.
25. Espinosa M, Martín-Malo A, Ojeda R, et al. Marked reduction in the prevalence of hepatitis C virus infection in hemodialysis patients: causes and consequences. Am J Kidney Dis (2004) 43:685–689.
Accepted in revised form: 20.2.08
[Figure caption, partial] … This includes two components: the variability due to subject j (random effect of subject j, equal to the difference between the subject mean, µj, and LP) and the variability due to measurement on occasion i (effect of occasion i nested in subject j, equal to the difference between µj and each response measured on j, yij). Usually it is assumed that both these components are normally distributed (N) with mean zero and some non-zero variance. The intra-class correlation coefficient …
[Figure caption, partial] … (i.e. baseline risk across subjects) and within subject dependence of the failure events …