NDT Advance Access originally published online on November 9, 2005
Nephrology Dialysis Transplantation 2006 21(3):743-748; doi:10.1093/ndt/gfi255
© The Author [2005]. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org


Original Articles: Dialysis and Transplantation

Lack of a centre effect in UK renal units: application of an artificial neural network model

Navdeep Tangri1, David Ansell2 and David Naimark3

1 Department of Internal Medicine, McGill University, Montreal QC, 3 Department of Nephrology, University of Toronto, Sunnybrook and Women's College Hospital, Toronto, ON, Canada and 2 United Kingdom Renal Registry, Bristol, UK

Correspondence and offprint requests to: Dr Navdeep Tangri, Department of Internal Medicine, McGill University, Montreal QC, Canada. Email: ntangri{at}yahoo.com



   Abstract
Background. Dialysis centre effect has been suggested to influence survival in end-stage renal disease (ESRD) patients. Few studies over the past decade have commented on the existence of the centre effect using logistic regression models.

Methods. We used high-quality, prospectively collected data from the UK Renal Registry (UKRR) and created an artificial neural network model to predict mortality within 1 year in this cohort. We used a multitude of demographic variables, including co-morbidities, as well as relevant laboratory data to create a prognostic model.

Results. A highly efficient model for predicting 1 year mortality was created after restricting the model to use demographic and case-enriched data [area under the receiver operating characteristic curve (AUROC) = 0.974]. The addition of the dialysis centre code and centre size as input variables did not add to the efficiency of the model (AUROC = 0.962). Moreover, dialysis centre code or size alone was not predictive of mortality when applied to an artificial neural network architecture (AUROC = 0.649 and 0.628, respectively).

Conclusion. Residual effects in previous studies may have been due to the non-linear nature of the data and complex intervariable relationships. Centre size and other centre-related factors have no impact on survival in ESRD.

Keywords: artificial neuronal network; centre effect; centre size; dialysis centres; patient survival; renal failure



   Introduction
End-stage renal disease (ESRD) remains a significant and growing health burden across the world. Mortality rates from ESRD have remained high despite advances in dialysis technology over the past three decades as well as aggressive cardiovascular risk reduction. Previous studies [1] have suggested that approximately one in eight patients with ESRD die within the first 3 months after starting renal replacement therapy (RRT). Many factors have been shown to be associated with the mortality rate in ESRD patients. Usually, investigators have used regression methods in order to determine the magnitude of the association between putative predictive factors and mortality risk [2,3].

One factor which has shown a consistent association with mortality across studies is the so-called ‘centre effect’. The latter is conventionally defined as the residual difference in mortality probability that exists between renal centres after adjustment for other risk factors in a regression model [2]. Determining the veracity of the centre effect is important; if it is a real and significant contributor to mortality risk, then units with particularly high or low mortality rates after statistical adjustment could be compared in order to identify process differences. On the other hand, if the effect vanishes after appropriate statistical adjustment, then the effort required for the scrutiny of these renal centres could be saved.

Logistic regression models, however, do have limitations; they do not fit complex, non-linear relationships very well and can only consider pre-specified interactions between factors [4]. This limitation may be particularly problematic because of the complex interaction of patient-specific and treatment-related factors in ESRD patients [5]. This raises the question of whether the centre effect observed in these logistic regression analyses is, in fact, real or merely a statistical artefact created by the failure of the models to fit complex data adequately.

In order to address this question, we have applied a relatively new statistical methodology, the artificial neural network (ANN), to the data from a large, high-quality, renal registry: the UK Renal Registry (UKRR). ANNs have the advantage of being able to detect complex, non-linear relationships between ‘inputs’ to the network (i.e. the putative risk factors) and the ‘output’ of the network (i.e. patient survival), and can simultaneously consider all possible interactions between those risk factors. The UKRR contains validated, prospective data from a substantial fraction of the ESRD population in the UK [4]. Application of this new methodology to this quality data source offers a unique opportunity to test the veracity of the centre effect.



   Subjects and methods
Data were obtained from the UKRR for the year 2000. The data contained within the UKRR were derived from renal units servicing 51% of the UK adult population. The UKRR receives prospective patient information electronically from 28 out of 63 renal centres in England and Wales servicing a population base of 31.4 million people. These 28 centres consisted of 92 hub and satellite renal units. Laboratory data were obtained at quarterly intervals in 2000. Data arriving at the UKRR are subjected to algorithms which identify suspicious or incongruous values, and these outliers are verified by contacting the renal centre and correcting them if necessary [6].

For the present study, data were abstracted on the 18 015 prevalent ESRD patients who had laboratory results present in the UKRR for the first quarter of 2000. Demographic and clinical input variables included: age, sex, average pre-treatment systolic blood pressure (SBP), diastolic blood pressure (DBP) and weight for haemodialysis patients, and clinic SBP, DBP and weight for peritoneal dialysis patients, dialysis modality and the presence or absence of vasculopathy, diabetes, neoplasia, lung disease, smoking, liver disease and erythropoietin use. Laboratory input variables consisted of those measured in the first quarter of 2000 and included: haemoglobin (Hb), ferritin, albumin, HbA1c, total cholesterol, Na, Ca, phosphate, parathyroid hormone (PTH), bicarbonate, creatinine, urea and the urea reduction ratio (Table 1).


Table 1. Laboratory and demographic variables in the artificial neural network

 
The outcome event to be predicted by the ANN was mortality occurring between January 1, 2000 and December 31, 2000. Cases were defined as patients who had a date of death between January 1, 2000 and December 31, 2000, while controls were defined as patients with a date of death occurring after December 31, 2000. Clinical and laboratory values were compared between cases and controls by means of t-tests and χ² tests for continuous and categorical variables, respectively. Event rate per centre was defined as the number of mortalities divided by the number of patients at that centre at the beginning of the study period.
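The event rate definition can be illustrated with a minimal sketch. The record layout (a centre code paired with a died-during-the-year flag) and the toy data are assumptions made for illustration; they are not taken from the UKRR.

```python
from collections import Counter

def event_rates(patients):
    """Return {centre: deaths / patients present at the start of the period}.

    `patients` is an iterable of (centre_code, died) pairs, one per patient
    who was at that centre at the beginning of the study period.
    """
    at_start = Counter()
    deaths = Counter()
    for centre, died in patients:
        at_start[centre] += 1
        if died:
            deaths[centre] += 1
    return {centre: deaths[centre] / at_start[centre] for centre in at_start}

# Hypothetical toy records: (centre code, died during the year?)
toy = [("A", True), ("A", False), ("A", False), ("A", False),
       ("B", False), ("B", False)]
rates = event_rates(toy)
```

In this toy example centre A has one death among four patients and centre B has none among two.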

Two logistic regression models were constructed using SPSS v 13.0 (Chicago, IL): the first with all 43 predictor variables (including centre size) and the intercept (without interactions), and a second model that included only the intercept. Twice the difference of the negative log likelihoods for the two models was compared with a χ² distribution with 42 degrees of freedom in order to determine the significance of the residual error after fitting demographic, clinical and laboratory variables.

In an effort to improve the ANN's ability to detect cases, the data set was enriched to an ~50:50 mix of cases and controls by replicating the data from cases ~8-fold (Figure 1). This manoeuvre would be expected to decrease the generalizability of the resulting model, but enhance its ability to detect relationships between the predictor factors and the outcome. Individuals with missing demographic data were removed, while missing laboratory data were imputed by replacing blanks with the mean value across subjects for continuous variables and with the proportion of positive values for binary variables. To enhance ANN training by eliminating inputs with the value ‘0’, all input factors were transformed to values between 1 and 2 using the equation x' = [(x – min(x))/(max(x) – min(x))] + 1, where min(x) and max(x) are the minimum and maximum of the input variable ‘x’ across all cases.
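The three preprocessing steps described above (case enrichment, mean imputation and rescaling to the interval 1 to 2) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, "8-fold" is assumed to mean eight copies of each case in total, and mapping a constant column to 1 is an assumption the text does not address. For a binary variable coded 0/1, the mean of the observed values equals the proportion of positive values, so one imputation function covers both rules.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def scale_one_to_two(values):
    """Apply x' = (x - min(x)) / (max(x) - min(x)) + 1 to one input variable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column is mapped to 1 to avoid dividing by 0.
        return [1.0 for _ in values]
    return [(v - lo) / (hi - lo) + 1 for v in values]

def enrich(records, is_case, copies=8):
    """Repeat every case `copies` times in total; keep each control once."""
    enriched = []
    for record in records:
        enriched.extend([record] * (copies if is_case(record) else 1))
    return enriched

# Hypothetical albumin column with one missing value.
albumin = [30.0, None, 40.0, 50.0]
scaled = scale_one_to_two(impute_mean(albumin))

# Hypothetical cohort of one case and one control.
cohort = [{"died": True}, {"died": False}]
enriched = enrich(cohort, is_case=lambda r: r["died"])
```

Here the missing albumin value is replaced by 40.0, and the scaled column runs from 1 (minimum) to 2 (maximum), so no input is ever 0.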


Fig. 1. Method of data extraction for the ANN predictive model.

 
A random 85% subset of the data was used to train a multilayer perceptron ANN with a 38-76-76-1 nodal architecture using backpropagation with Neuroshell 2 v 3.0 (Ward Systems Group, Frederick, MD). The training of the ANN was stopped when the average difference between the known outcome of the training cases (1 for event and 0 for no event) and the predicted outcomes from the ANN (numbers between 0 and 1) converged to a pre-set minimum.
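As a rough illustration of the data partition and the size of the network, the sketch below draws a random 85%/15% split and counts the adjustable parameters of a 38-76-76-1 network. It does not reproduce Neuroshell 2; the function names and the seed are hypothetical, and the parameter count assumes fully connected layers with one bias per node, which the text does not state.

```python
import random

def split_train_validation(records, train_fraction=0.85, seed=0):
    """Shuffle the records and cut them into training and validation subsets."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def count_parameters(layers):
    """Weights plus biases of a fully connected feed-forward network.

    Assumption: every node in a layer connects to every node in the next
    layer and each non-input node has one bias term.
    """
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layers, layers[1:]))

train, validation = split_train_validation(range(100))
n_parameters = count_parameters([38, 76, 76, 1])
```

Under these assumptions the architecture has 2964 + 5852 + 77 = 8893 adjustable parameters.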

The trained ANN was then used to make predictions on a validation set consisting of the remaining 15% of cases in the data set. The accuracy of the predictions was assessed by the area under the receiver operating characteristic curve (AUROC). An AUROC of 1.0 implies perfect discrimination between cases and controls in the validation set, while a value of 0.5 indicates no predictive ability. The AUROC and SE (AUROC) and comparisons between ROC curves were computed using the maximum likelihood, semi-parametric method of Metz with Rockit v. 0.9.1 [7]. In order to assess the relative strength of the association between 1 year mortality and a factor or factors, we trained two ANNs: one which included the factor or factors of interest and another without them. A statistically significant drop in the predictive performance of the ANN with the omission of a factor was taken as evidence of a significant independent association between the putative factor and mortality risk.
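For readers unfamiliar with the AUROC, the sketch below computes it with the simple non-parametric rank (Mann-Whitney) estimate: the proportion of case-control pairs in which the case receives the higher predicted score, with ties counted as one half. This is a different estimator from the semi-parametric maximum likelihood method of Metz used in the study, and the scores shown are hypothetical.

```python
def auroc(case_scores, control_scores):
    """Non-parametric AUROC: P(case score > control score), ties count 0.5."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical ANN outputs (numbers between 0 and 1).
perfect = auroc([0.9, 0.8], [0.1, 0.2])      # every case outranks every control
partial = auroc([0.8, 0.4], [0.6, 0.2])      # three of four pairs ranked correctly
uninformative = auroc([0.5], [0.5])          # tied scores carry no information
```

The three calls give 1.0, 0.75 and 0.5, matching the interpretation of the scale given above.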

A bootstrap analysis was then performed to generate 20 ANNs from repeated random 85% subsets of the data in order to assess the robustness of the AUROC values for the ANN predictions. All significance values for the differences between pairs of ROC curves were estimated with another semi-parametric method of Metz using Clabroc Version 1.2.1 [8].
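The repeated-subset procedure can be sketched as a loop that redraws the 85% training subset, refits the model and records the validation score each time. The `fit_and_score` callable is a hypothetical stand-in for training an ANN and returning its validation AUROC; the dummy version below only makes the sketch runnable and carries no modelling content.

```python
import random
import statistics

def repeated_subsample_scores(records, fit_and_score, repeats=20,
                              train_fraction=0.85, seed=0):
    """Score a model on `repeats` independent random train/validation splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        shuffled = list(records)
        rng.shuffle(shuffled)
        cut = round(train_fraction * len(shuffled))
        scores.append(fit_and_score(shuffled[:cut], shuffled[cut:]))
    return scores

def dummy_fit_and_score(train, validation):
    """Hypothetical placeholder: returns the validation fraction, not an AUROC."""
    return len(validation) / (len(train) + len(validation))

scores = repeated_subsample_scores(list(range(200)), dummy_fit_and_score)
mean_score = statistics.mean(scores)
sd_score = statistics.stdev(scores)
```

With a real scoring function, `mean_score` and `sd_score` would correspond to the mean and SD of the AUROC across the 20 networks.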



   Results
Patient characteristics
Baseline demographic and laboratory characteristics of the patients are reported in Tables 2 and 3, and treatment variables in Table 4. Cases were on average 13 years older and carried a significantly larger burden of co-morbid conditions than the controls. There were no clinically significant differences in erythropoietin use. However, transplant patients comprised >60% of the cases and only 33% of the controls. Cases had significantly lower albumin and Hb values and urea reduction ratios at baseline compared with controls.


Table 2. Demographic and clinical characteristics of prevalent patients in the UKRR January 1, 2000

 

Table 3. Laboratory variables of prevalent patients in the UKRR January 1, 2000

 

Table 4. Treatment variables of prevalent patients in the UKRR January 1, 2000

 
Renal units including hubs and satellites were assigned unique codes for anonymity. There were 92 renal unit codes in the prevalent data. The unit size, i.e. the number of patients per unit code, ranged from one to 1262 with a mean of 195, median of 68 and interquartile range of 230. Individual centre codes were added in the analysis for centres with >200 patients and all remaining centres were grouped together. Centre size was also added as an independent variable.
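The centre coding rule can be illustrated with a short sketch: units with more than 200 patients keep their own code, the rest share a single pooled code, and unit size is carried as a separate input. The pooled label `OTHER`, the unit codes and the choice to give each patient the size of their own unit (even when the code is pooled) are assumptions made for illustration.

```python
from collections import Counter

def recode_centres(centre_codes, threshold=200, other_label="OTHER"):
    """Return (recoded centre per patient, unit size per patient)."""
    sizes = Counter(centre_codes)
    recoded = [code if sizes[code] > threshold else other_label
               for code in centre_codes]
    size_per_patient = [sizes[code] for code in centre_codes]
    return recoded, size_per_patient

# Hypothetical units: one large (250 patients) and two small (40 and 10).
codes = ["U01"] * 250 + ["U02"] * 40 + ["U03"] * 10
recoded, size_per_patient = recode_centres(codes)
```

In this toy example the 250 patients of U01 keep their code, while the 50 patients of U02 and U03 are pooled.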

Overall event rate in our entire cohort was 10.4% or 1705 out of 16 383. There was a substantial variation in the event rate within the different centres, ranging from 0 to 50% (Figure 2). Twice the negative log likelihood of a logistic model with all 42 demographic, clinical and laboratory variables and the intercept included was 507.4, while that for a model with only the intercept was 703.2 (P<0.0001). This implies that there was a significant, residual, unexplained error after all of the independent, predictor variables were included in a logistic regression model.
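The likelihood ratio comparison can be reproduced from the two reported values. The sketch below takes the difference of the two −2 log likelihoods and evaluates the upper tail of a χ² distribution with 42 degrees of freedom using the closed form that holds for even degrees of freedom; the function name is hypothetical and the calculation is a check of the reported arithmetic, not the SPSS procedure itself.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with a positive even `df`.

    For df = 2m the tail has the closed form
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term = 1.0   # k = 0 term of the sum
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

minus_2ll_full = 507.4        # all predictor variables plus intercept (from the text)
minus_2ll_intercept = 703.2   # intercept only (from the text)
statistic = minus_2ll_intercept - minus_2ll_full
p_value = chi2_upper_tail_even_df(statistic, 42)
```

The statistic is 195.8, and the resulting tail probability is far below 0.0001, in line with the reported P-value.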


Fig. 2. Event rates as a percentage of cases divided by total patients per renal unit including hubs and satellites.

 
ANN results
The ANN training set consisted of 1386 patients of which 694 were cases, while the test set contained 244 patients of which 126 were cases. The ANN performed very well when laboratory and demographic data were used in the enriched data set. The mean AUROC across the 20 bootstrap samples for the model with all variables was 0.972 with an SD of 0.010 (Table 5).


Table 5. AUROC and P-values for 20 ANNs created using a bootstrap analysis

 
Performance was significantly affected when either laboratory or demographic data were removed, but remained unaffected by addition or removal of centre code and size (Figure 3). The model generated the highest AUROC when using all available laboratory and demographic data and excluding centre variables. Laboratory or demographic data alone yielded a lower AUROC than both variables combined together. Centre-related variables were very poor sole predictors of outcome as evidenced by the low AUROC (0.628 and 0.649). We observed no significant difference in the predictive efficiency of the ANN whether centre size was included in or omitted from the model (AUROCs 0.962 and 0.974, respectively). Furthermore, lack of a significant association between centre size and mortality was evident because of the relatively poor performance of the model when centre size was the only input (AUROC = 0.628).


Fig. 3. AUROC curves for various predictive models incorporating different input variables.

 


   Discussion
We found, using a statistical methodology which has been shown to be effective in complex pattern recognition tasks applied to a high-quality, prospective and verified data set, that the renal centre characteristics show little association with mortality in prevalent ESRD patients. This finding contrasts with the results of previous studies in which the centre effect has been implicated as the explanation for the residual mortality difference among centres after adjustment for case mix. For example, in Scotland, Khan et al. [2] hypothesized a centre effect was responsible for the mortality difference between centres [relative risk (RR) = 0.40, P<0.001]. Marcelli et al. [3] in a similar study found that European and US ESRD patients have mortality differences following case mix adjustment and they attributed it to the centre effect. Both studies used logistic regression models and attributed residual unexplained differences to the centre effect.

Centre size had also been suggested previously as a possible contributor to the centre effect [9]. Schaubel et al. [10] compared mortality rates from peritoneal dialysis from different centres in the Canadian Organ Replacement Register. Using Poisson regression and after covariate adjustment, they found that mortality was lower in centres with >500 patients compared with those with <99 (RR = 0.83, P<0.05). Our analysis contrasts with these findings as well. We did not observe a significant association between centre size and mortality.

In order to maximize our ability to detect differences in the predictive performance of our ANN model after the omission of one or more factors, we required that the ‘baseline’ ANN, which contained all of the factors, and which served as the basis for comparison, should perform extremely well. In order to achieve this, we created a somewhat artificial analytical data set: cases were presented to the ANN disproportionately, subjects with missing data were removed and missing data were imputed. Given these manipulations, we emphasize that our model should not be generalized as an outcome predictor in other populations of dialysis patients. It is unlikely that applying one of our trained ANN models to a ‘natural’ population of ESRD patients would produce AUROC values as high as those observed in the present analysis.

One concern regarding our analysis is the admixture of both dialysis and renal transplant patients. There is significant heterogeneity in the mortality rates of these two groups, i.e. the former have a significantly higher rate than the latter. Although this analysis includes both types of renal replacement patients, modality was included as a factor in the ANN model and the resulting high AUROC value was robust over 20 bootstrap samples. An additional concern may be the apparently low rate of utilization of erythropoietin in the study population. This low rate is artefactual since some centres have incomplete data returns due to non-electronic prescription of erythropoietin from within their renal information system. However, the predictive performance of the base-case ANN was quite good despite these incomplete data. Furthermore, the admixture and apparent low erythropoietin usage issues were present to the same degree in both the ANN models with and without centre code and centre size information. Since it was the difference in the predictive performance of these models that was the primary outcome of this analysis, these deficiencies in the data were essentially cancelled out.

Our analysis has limitations: we were not able to account for patients who changed centres throughout the year and incorporate these status changes into either the calculation of event rates or the ANN analyses. Changes of dialysis modality were likewise not considered in the ANN analysis. In addition, our data were obtained from prevalent ESRD patients. The fact that our sample contained a mix of patients with a variable length of time on dialysis, and thus did not represent an incident cohort, would bias our analysis if our aim had been to create a valid clinical prediction tool. However, our goal was to detect the effect of dialysis centre. In this case, variation in the length of time on dialysis, and consequently of subsequent mortality risk, is actually desirable in order to facilitate the detection of associations. Likewise, it was advantageous for our analysis that event rates among centres varied widely. Finally, ANNs have the disadvantage of being unable to produce model parameter estimates that have the same intuitive meaning as the coefficients from a logistic regression analysis (i.e. the log odds ratio).

The strengths of our approach are that we applied the sophisticated technology of ANNs to a high-quality prospectively collected and validated data set. Neural networks have been shown to have significant strengths in medical and non-medical prognostication [11,12]. We were also able to select and analyse our data in such ways as to create maximum sensitivity for detection of the effect of putative predictor variables. The addition of centre codes and centre size did not make our model more efficient. We speculate that the previous unexplained ‘centre effect’ found by other investigators may have been an artefact due to the tendency of logistic regression models to ‘under-fit’ complex, non-linear scenarios. Further research is needed, perhaps with other pattern recognition approaches in similar prospectively collected data, to confirm the lack of the centre effect in other patient populations.

Conflict of interest statement. None declared.



   References

  1. Metcalfe W, Khan IH, Prescott GJ, Simpson K, MacLeod AM. Can we improve early mortality in patients receiving renal replacement therapy? Kidney Int 2000; 57: 2539–2545
  2. Khan IH, Campbell MK, Cantarovich D, Catto GR, Delcroix C, Edward N, Fontenaille C, Fleming LW, Gerlag PG, van Hamersvelt HW, Henderson IS, Koene RA, Papadimitriou M, Ritz E, Russell IT, Stier E, Tsakiris D, MacLeod AM. Survival on renal replacement therapy in Europe: is there a ‘centre effect’? Nephrol Dial Transplant 1996; 11: 300–307
  3. Marcelli D, Stannard D, Conte F, Held PJ, Locatelli F, Port FK. ESRD patient mortality with adjustment for comorbid conditions in Lombardy (Italy) versus the United States. Kidney Int 1996; 50: 1013–1018
  4. Huisman RM, Nieuwenhuizen MG, Th de Charro F. Practical Statistics for Medical Research. Chapman and Hall, London; 1991: 187–189
  5. Cox. Regression models and life tables. J R Stat Soc 1972; 34: 187–219
  6. Ansell D, Feest T. UK Renal Registry Report 2001. 2001. http://www.renalreg.com/Report%202000/Cover_2000_Frame.htm
  7. Metz CE. Statistical analysis of ROC data in evaluating diagnostic performance. In: Multiple regression analysis: applications in the health sciences (D Herbert and R Myers, eds.). New York: American Institute of Physics, 1986, pp. 365, http://xray.bsd.uchicago.edu/krl/roc_soft.htm. ROCKIT v.0.9.1. 2001. Computer Program
  8. Metz CE. Statistical analysis of ROC data in evaluating diagnostic performance. In: Multiple regression analysis: applications in the health sciences (D Herbert and R Myers, eds.). New York: American Institute of Physics, 1986, pp. 365, http://xray.bsd.uchicago.edu/krl/roc_soft.htm. CLABROC v.1.2.1. 2001. Computer Program
  9. Huisman et al. Patient-related and centre-related factors influencing technique survival of peritoneal dialysis in The Netherlands. Nephrol Dial Transplant 2002; 17: 1655–1660
  10. Schaubel DE, Blake PG, Fenton SS. Effect of renal center characteristics on mortality and technique failure on peritoneal dialysis. Kidney Int 2001; 60: 1517–1524
  11. Lisboa PJ. A review of evidence of health benefit from artificial neural networks in medical intervention. Neural Network 2002; 15:11–39
  12. Itchhaporia D, Snow PB, Almassy RJ, Oetgen WJ. Artificial neural networks: current status in cardiovascular medicine. J Am Coll Cardiol 1999; 28: 515–521
Received for publication: 1.6.05
Accepted in revised form: 13.10.05





