NDT Advance Access originally published online on September 22, 2007
Nephrology Dialysis Transplantation 2007 22(10):2785-2794; doi:10.1093/ndt/gfm433
Clinical research of kidney diseases II: problems of study design
1Clinical Epidemiology Unit, Faculty of Medicine, Memorial University of Newfoundland, Canada and 2Divisione di Nefrologia, Azienda Istituti Ospitalieri di Cremona, Italy
Correspondence and offprint requests to: Pietro Ravani, MD, Divisione di Nefrologia, Azienda Istituti Ospitalieri di Cremona, Italy, Largo priori 1, Cremona, 26100, Italy. Email: p.ravani{at}ospedale.cremona.it
| Introduction |
|---|
The aim of study design in any field of clinical inquiry is to limit bias and maximize reliability [1]. The present article introduces the types of study design currently recommended for assessing prognosis, therapy and diagnostic tests with nephrology examples. The concept of clinical relevance as opposed to statistical significance of study results is also briefly discussed.
| Study design |
|---|
Hierarchy of evidence
Fundamental to evidence-based health care is the concept of hierarchy of evidence, deriving from different study designs addressing a given research question (Figure 1). Evidence grading is based on the idea that different designs vary in their susceptibility to bias and, therefore, in their ability to predict the true effectiveness of health care practices. For assessment of interventions, randomized controlled trials (RCTs) or systematic reviews of good-quality RCTs are at the top of the evidence pyramid, followed by longitudinal cohort, case-control and cross-sectional studies [2,3]. However, the choice of the study design depends on the question at hand, the nature of the exposure and the frequency of the disease.
Intervention questions are ideally addressed with experiments (RCTs), since observational data are prone to unpredictable bias and confounding that only the randomization process will control [1]. Appropriately designed RCTs also allow stronger causal inference for disease mechanisms. However, ideal RCTs cannot be implemented in the same way to answer all intervention questions. Some therapies cannot even be masked or randomly assigned (e.g. dialysis modalities). In circumstances where the intervention is clearly identified and easily applied, such as the use of a new oral medication to reduce proteinuria, both internal and external validity can be reasonably maximized using standard approaches (limited exclusion criteria, multiple blinding, minimization of missing data and dropouts). In contrast, when the intervention is aimed at achieving a clinical target, such as haemoglobin or blood pressure levels, many treatment adjustment decisions are often left to the discretion of the treatment team during the trial, blinding may be difficult to maintain and patients are often exposed to multiple strategies (e.g. iron supplementation, erythropoietic agents, anti-hypertensive medications in studies of haemoglobin targets). In such cases, practitioners may be left with uncertainty as to what aspect of the intervention led to the observed trial results. For example, if higher cardiovascular event rates were associated with aiming for higher haemoglobin targets, it might be unclear whether this was due to the dose of erythropoietic agents employed, the amount of iron given or indeed the interaction between these factors and characteristics of the trial subjects. Those at higher baseline cardiovascular risk might be more difficult to get to target and particularly susceptible to the adverse effects associated with higher doses of iron and erythropoietic agents given in an effort to achieve those targets. However, understanding these relationships as a result of a trial, particularly if confirmed in further research, helps inform practitioners on how best to individualize the application of therapy.
Prognostic and aetiologic questions are best addressed with longitudinal cohort studies, in which exposure is measured first and participants are followed forward in time. At least two (and possibly more) waves of measurements over time are undertaken. Initial assessment of an input–output relationship may derive from case-control studies, where the direction of the study is reversed. Participants are identified by the presence or absence of disease and exposure is assessed retrospectively. Cross-sectional studies may be appropriate for an initial evaluation of the accuracy of new diagnostic tests as compared to a gold standard. Further assessments of diagnostic programmes are performed with longitudinal studies (observational and experimental). Common biases afflicting observational designs are summarized in Table 1.
Additional biases in longitudinal designs
In prognostic studies, as well as in most RCTs, the outcome measure is usually time to an event of interest that can be death, a better or worse disease stage, or a complication or recovery from an illness condition. Among the possible threats to internal validity of a study [1], loss to follow-up, drop-outs and attrition bias can induce important errors in the measurements of this outcome variable and, consequently, in the derived risk estimates (Figure 2).
The risk of any event is a probability (thus with no dimension and with possible values ranging from 0 to 1), and cannot be directly measured in any single person, since an individual either does or does not develop that event. Rather, the risk is estimated as the proportion of subjects developing the event of interest (D) among a larger group of people (N) who are disease-free at the beginning of the study, and thus at risk over a certain period of time. The resulting incidence proportion (D/N) estimates the individual risk of getting the disease in that period. For example, an observed risk of End-Stage Renal Disease (ESRD) of 0.1 in 10 years in a group of subjects means that each subject of that group has a probability of 10% of developing ESRD in 10 years. It is clear that the definition of the time interval over which the risk applies is fundamental to the interpretation of risk and to proper planning of a prognostic study. In fact, a risk can be thought of as the speed with which the phenomenon can occur in the population. If the risk of ESRD is 0.1 in 10 years in one group and 0.1 in 20 years in another, the speed is twice as high in the first group.
The speed of the disease process has implications for the study design. In fact the faster the evolution of the disease, the shorter the study can be, and the likelihood that any individual leaves the study before the end of the observation period without experiencing the event of interest is lower. Studies of acute illnesses such as pyelonephritis, or complications such as contrast media nephropathy, are usually of short duration. In these studies, strategies to reduce the risk of losing patients during follow-up are likely to be successful. When the probability of leaving the study earlier without event is low, the outcome measure is a valid estimate of the true risk, because the denominator of the ratio is not substantially affected (Figure 2, left panel). When the study is longer (e.g. time to ESRD or cardiovascular complications), incidence rates are estimated rather than incidence proportions, because more people can be lost to follow-up for several reasons (unknown, competing risks, moving), and new people are often enrolled to maintain the size of the cohort. These incidence rates have as numerator the number of events (D) and person-time as denominator (Figure 2, right panel). Incidence rates have a range of values from 0 to infinity (depending on the unit of time chosen) and the dimension of 1/time.
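The distinction between the two measures can be made concrete with a short sketch. The counts and person-years below are hypothetical, and the conversion from a rate to a risk assumes a constant rate over the whole period (a simple exponential model), which is only one of the assumptions a survival analysis might make.

```python
import math


def incidence_proportion(events: int, at_risk: int) -> float:
    """D/N: dimensionless, between 0 and 1, tied to a stated time period."""
    return events / at_risk


def incidence_rate(events: int, person_time: float) -> float:
    """D/person-time: dimension 1/time, from 0 upwards."""
    return events / person_time


def risk_from_rate(rate: float, duration: float) -> float:
    """Risk over `duration`, ASSUMING the rate is constant (exponential model)."""
    return 1.0 - math.exp(-rate * duration)


# Hypothetical cohort: 100 disease-free subjects, 10 reach ESRD within 10 years.
risk_10y = incidence_proportion(10, 100)

# Hypothetical person-time: 950 person-years accumulated before event or exit.
rate_per_year = incidence_rate(10, 950.0)
estimated_risk_10y = risk_from_rate(rate_per_year, 10.0)

print(f"incidence proportion over 10 years: {risk_10y:.3f}")
print(f"incidence rate: {rate_per_year:.4f} per person-year")
print(f"10-year risk implied by the rate: {estimated_risk_10y:.3f}")
```

The two risk figures need not coincide: the incidence proportion ignores when subjects leave observation, whereas the rate uses only the time each subject actually contributed.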
The risk can be estimated from the incidence rate using special techniques called survival analyses. However, these techniques do not provide valid estimates of the true risk if the reasons for leaving the study prematurely are related to the exposure (side effects of treatments for example), or the event (earlier manifestations of the final outcome, such as mild symptoms of cardiovascular events). This phenomenon is called informative censoring in survival analysis terminology. Attrition bias may result not only from differential drop-out rates, but also from differential distribution of the reasons for withdrawal. Strategies should be considered for limiting loss to follow-up during the study implementation and careful data reporting once the study is completed [4]. This is problematic in prognostic studies, but may also occur in RCTs. For example, the CHOIR trial compared normalization of haemoglobin with partial correction of anaemia using erythropoietin in patients with chronic kidney disease. Limitations of this study were the extremely high overall drop-out rate and failure to report the reasons for participant withdrawal by exposure level [5]. Biased estimates may also occur if the characteristics of the participants entering the study or the study conditions change over time. For example, a recent study of factors impacting outcomes in atheroembolic renal disease analysed data collected over 20 years [6]. It is possible that milder forms of the disease were more likely to be recognized late in the study as a result of the awareness and experience of the investigators (Will Rogers phenomenon).
Lead-time bias and length-time bias are errors related to the natural history of the disease and timing of diagnosis (Figure 3). Lead-time bias occurs when diagnosis is made earlier than usual in a group of patients, independently of disease progression, such as in early referrals [7]. Measuring survival from dialysis initiation makes prognosis appear better in those who started dialysis with better renal function [8]. Length-time bias occurs when there is a differential distribution of subgroups by level of exposure to a risk factor, where the subgroups have the same disease, but different rates of progression (from biologic onset to death). A higher speed of progression may reduce the likelihood of timely diagnosis with consequent under-representation of faster progressors and overestimation of the survival times depending on the study design. For example, those with persistent heavy proteinuria would be expected to have a shorter length of time between disease onset and ESRD than those with lesser degrees of proteinuria. In a prognostic study of a proteinuric disease, length-time bias might occur if prevalent cases were recruited. Prognosis would appear more benign than in reality, since such prevalent case samples contain a smaller proportion of subjects with heavy proteinuria than samples of incident patients. Similarly, screening programmes for chronic diseases tend to detect more subjects with slowly progressive forms and longer pre-clinical phases. Length-time bias may partly explain the apparent survival advantage observed in non-experimental studies comparing screening programmes to routine clinical care [9].
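Length-time bias can be illustrated with a small simulation. Everything in it is a hypothetical assumption chosen for illustration: half of the patients are fast progressors (mean 2 years from onset to ESRD) and half are slow progressors (mean 10 years), onset is spread uniformly over 50 years, and a prevalent sample is taken by surveying everyone who has the disease but has not yet reached ESRD at year 50.

```python
import random


def fraction_fast_progressors(n_patients=20000, survey_time=50.0, seed=0):
    """Share of fast progressors among incident cases and among prevalent cases.

    All parameters are hypothetical and serve only to illustrate the bias.
    """
    rng = random.Random(seed)
    incident_fast = 0
    prevalent_total = 0
    prevalent_fast = 0
    for _ in range(n_patients):
        onset = rng.uniform(0.0, survey_time)        # year of disease onset
        is_fast = rng.random() < 0.5                 # e.g. heavy proteinuria
        mean_years = 2.0 if is_fast else 10.0        # mean time from onset to ESRD
        years_to_esrd = rng.expovariate(1.0 / mean_years)
        incident_fast += is_fast                     # every case is an incident case
        # A case is prevalent at the survey only if ESRD has not yet occurred.
        if onset + years_to_esrd > survey_time:
            prevalent_total += 1
            prevalent_fast += is_fast
    return incident_fast / n_patients, prevalent_fast / prevalent_total


incident_share, prevalent_share = fraction_fast_progressors()
print(f"fast progressors among incident cases:  {incident_share:.2f}")
print(f"fast progressors among prevalent cases: {prevalent_share:.2f}")
```

Under these assumptions the prevalent sample contains a smaller share of fast progressors than the incident sample, so a prognosis estimated from prevalent cases would look more benign.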
This may also be problematic in RCTs of prevalent rather than incident patients, because the prevalent group would have lower overall basal risk of the event of interest, and consequently lower study power, increasing the risk of false negative results [1]. For example, in the CREATE trial of different haemoglobin targets in chronic kidney disease, the annual event rate was lower than expected (6% vs 15%). Volunteer bias and Hawthorne effect (whereby the control group performs better than expected) may have played a role. However, the study also enrolled prevalent subjects, whereas the sample size was estimated from event rates in incident studies [10].
Experimental designs for intervention questions
The RCT design is appropriate for assessment of clinical effects of drugs, procedures, or care processes, definition of target levels in risk factor modification (e.g. blood pressure, lipid levels and proteinuria), and assessment of the impact of screening programmes [1]. Comparison to a placebo may be appropriate if no current standard therapy exists. When accepted therapies exist (e.g. statins as lipid lowering agents, ACE-I for chronic kidney disease progression), the comparison is an active control group that receives usual or recommended therapy.
Figure 1 shows an example of the most common type of RCT, the two group parallel-arm trial. However, trials can compare any number of groups. In factorial trials at least two active therapies (A; B) and their combination (AB) are compared with a control (C). Factorial designs can be efficient since more therapies are simultaneously tested in the same study. However, the efficiency and the appropriate sample size are affected by the impact of multiple testing on both type I and type II error, and whether there is an interaction between the effects of the therapies. In the absence of interaction, the effect of A, for example, can be determined by comparing A + AB to B + C. Interactions where use of A enhances the effectiveness of B, for example, do not reduce the power of the study. However, if there is antagonism between treatments, the sample size can be inadequate [1].
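The pooled comparison described above (A + AB versus B + C) can be sketched as follows. The event counts are invented for illustration and are constructed so that there is no interaction on the risk-difference scale; they do not come from any trial cited here.

```python
# Hypothetical 2x2 factorial trial: (events, participants) per arm.
arms = {
    "A": (30, 200),    # therapy A only
    "B": (38, 200),    # therapy B only
    "AB": (28, 200),   # both therapies
    "C": (40, 200),    # control
}


def pooled_risk(names):
    """Risk in the pooled arms: total events over total participants."""
    events = sum(arms[name][0] for name in names)
    participants = sum(arms[name][1] for name in names)
    return events / participants


def arm_risk(name):
    events, participants = arms[name]
    return events / participants


# Main effect of A under the no-interaction assumption:
# everyone who received A (A, AB) versus everyone who did not (B, C).
effect_of_a = pooled_risk(["A", "AB"]) - pooled_risk(["B", "C"])

# Interaction contrast: effect of A in the presence of B minus effect of A
# in the absence of B. A value near zero is consistent with no interaction.
interaction = (arm_risk("AB") - arm_risk("B")) - (arm_risk("A") - arm_risk("C"))

print(f"risk difference for A (pooled): {effect_of_a:.3f}")
print(f"interaction contrast: {interaction:.3f}")
```

With these counts the pooled risk difference for A is about -0.05 and the interaction contrast is about zero. If the contrast were clearly non-zero, the pooled comparison would no longer estimate a single effect of A.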
The HEMO study used a two-by-two factorial design, and tested two interventions, with no interaction assumption [11]. The trial failed to show the existence of a 25% reduction in the risk of death for either intervention: higher vs standard dialysis dose or use of high vs low flux membranes [11]. The AASK trial had a two-by-three factorial design (six groups) testing the effect on renal function decline (primary outcome) and on composite end-points (time to renal function halving, ESRD, or death) of two blood pressure levels by three anti-hypertensive treatments (Ramipril, Metoprolol, Amlodipine) with no interaction assumption [12]. Since there were multiple possible comparisons, three primary treatment comparisons were pre-specified: lower vs usual blood pressure goals, Ramipril vs Metoprolol and Amlodipine vs Metoprolol. The only significant findings reported in this study should be considered with caution since (i) they were effects on the secondary outcome and the study power is estimated on the primary outcome measure; (ii) the level of significance of one of these effects (Ramipril vs Metoprolol) was only P = 0.04 (non-significant after considering multiple testing) and (iii) Ramipril vs Amlodipine had not been pre-specified [12].
The cross-over design is an alternative solution when the outcome is reversible. In this design, each participant serves as their own control by receiving each treatment in a randomly specified sequence. A washout period is used between treatments, to prevent carryover of the effect of the first treatment to the subsequent periods. The design is efficient in that treatments are compared within individuals, reducing the variation or noise due to subject differences. However, limitations include possible differential carryover (one of the treatments tends to have a longer effect once stopped); period effects (different response of disease to early versus later therapy), and a greater impact of missing data because they compromise within subject comparison and therefore variance reduction [3]. For example, Schjoedt et al. [13] used a cross-over design, to test whether spironolactone reduces proteinuria in diabetic subjects with nephrotic syndrome. Patients were treated in random order with spironolactone 25 mg once daily and matched placebo for 2 months, in addition to ongoing antihypertensive treatment, including an angiotensin-converting enzyme inhibitor or an angiotensin II receptor blocker. No washout period was planned between the two treatment periods, although the hypothesis of no carryover does not seem to be biologically tenable, considering the mechanism of action of the drug. Instead, the investigators searched for evidence of carryover. This was excluded based on statistical testing. However, the assumption underlying this approach (no carryover in absence of statistical support) is questionable, since such tests have limited power [1,14].
Finally, RCTs may attempt to show that one treatment is non-inferior (sometimes incorrectly called equivalent) rather than to establish its superiority to a comparable intervention [15]. These studies are often done when new agents are being added to a class (e.g. another ACE inhibitor), or when a new therapy is already known to be cheaper or safer than an existing standard. In such RCTs, the study size is estimated based on a pre-specified maximum difference that would still be considered irrelevant. For example, the claim might be made that a new ACE inhibitor is non-inferior to Enalapril, if the mean 24 h blood pressure difference between them was no more than 3 mmHg. Non-inferiority trials have been criticized, as imperfections in study execution, which tend to prevent detection of a difference between treatments, actually work in favour of a conclusion of non-inferiority. Thus, in distinction to the usual superiority trial, poorly done studies may lead to the desired outcome for the study sponsor.
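A minimal sketch of how such a claim is commonly checked, using the 3 mmHg margin from the example above. The observed differences and standard errors are hypothetical, and the rule used here (the upper limit of a two-sided 95% confidence interval must lie below the margin) is one conventional choice rather than the method of any study cited in this article.

```python
def non_inferiority_check(mean_diff, std_error, margin, z=1.96):
    """Return (is_non_inferior, (ci_low, ci_high)).

    `mean_diff` is new minus standard mean 24 h blood pressure in mmHg,
    so positive values mean the new drug controls pressure less well.
    Non-inferiority is claimed only if the whole interval lies below `margin`.
    """
    ci_low = mean_diff - z * std_error
    ci_high = mean_diff + z * std_error
    return ci_high < margin, (ci_low, ci_high)


MARGIN_MMHG = 3.0  # pre-specified margin from the example in the text

# Hypothetical precise trial: same point estimate, small standard error.
precise_ok, precise_ci = non_inferiority_check(1.0, 0.8, MARGIN_MMHG)

# Hypothetical imprecise trial: same point estimate, larger standard error.
imprecise_ok, imprecise_ci = non_inferiority_check(1.0, 1.5, MARGIN_MMHG)

print(precise_ok, precise_ci)      # interval stays below the margin
print(imprecise_ok, imprecise_ci)  # interval crosses the margin
```

The two hypothetical trials share the same point estimate but reach different conclusions, which shows that the decision rests on the confidence limit rather than on the observed difference alone.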
Designs for diagnostic questions
When assessing a diagnostic test the reference or gold standard tests for the suspected target disorders are often either inaccessible to clinicians or avoided for reasons of cost or risk. Therefore the relationship between more easily measured phenomena (patient history, physical and instrumental examination, and levels of constituents of body fluids and tissues) and the final diagnosis is an important subject of clinical research. Unfortunately, even the most promising diagnostic tests are never completely accurate. For tests with continuous outcome values, such as serum sodium concentration, clinicians need to know reference (normal) values to identify disease. From an epidemiological perspective, these reference values are best defined based on the diagnostic relevance rather than distributional assumptions (Gaussian for example). In other words, by chance, a fraction of a population without disease will have a test result that differs from the mean by some amount. However, the test becomes useful to clinicians when unusually high or low values are generally associated with some clinical condition of importance. For example, reference values of troponin T and I have been established in outcome studies of subjects with suspected myocardial infarction, rather than by assessing their distribution in the general population.
Clinical implications of test results should ideally be assessed in four types of diagnostic studies. Table 2 shows examples from troponin studies in coronary syndromes. As a first step, one might compare test results among those known to have established disease, to results from those disease-free [16]. Cross-sectional studies can address this question (Figure 1). However, since the direction of interpretation is from diagnosis back to the test, the results do not assess test performance. To examine test performance (Table 3) requires data on whether those with positive test results are more likely to have the disease than those with normal results [17]. When the test variable is not binary (i.e. when it can assume more than two values) it is possible to assess the trade-off between sensitivity and specificity at different test result cut-off points. In such instances, classification into just two groups is wasteful of information. Distinction of at least three classes is more useful. For example, a Dutch study identified three levels of serum creatinine in hypertensive subjects (≤70, 71–110, >110 µmol/l) associated with likelihood ratios of renal artery stenosis of 0.31, 0.77 and 4, respectively [18]. This means that the third category gives reasonable evidence for stenosis, the first against stenosis and the intermediate is uninformative, as likelihood ratios between 0.5 and 2 are considered uninformative. The Receiver Operating Characteristics (ROC) plot is one way to investigate to what extent the test results differ among people who do or do not have the diagnosis of interest without requiring any data grouping [19]. The ROC curve is a plot obtained by computing sensitivity and specificity for every distinct observed test value and plotting sensitivity against 1 − specificity. Diagnostic test accuracy is assessed by estimating the area under the ROC curve (AUC), which corresponds to the probability that a random person with the disease has a higher test value than a random person without disease. In other words, if the test has an AUC of 0.8 and results are used to distinguish which of two persons has the disease, the test will be right 80% of the time. The area is 1 for perfect tests and 0.5 for uninformative tests.
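Both ideas in this paragraph can be sketched in a few lines. The likelihood ratios are those quoted above; the pre-test probability of 20% and the test values are hypothetical. The AUC function implements the pairwise definition given in the text directly, with ties counted as half, which is a common convention and an assumption here.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Bayes' rule on the odds scale: post-test odds = pre-test odds x LR."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


def auc(diseased_values, healthy_values):
    """Probability that a random diseased person has a higher test value
    than a random non-diseased person (ties count as half)."""
    wins = 0.0
    for d in diseased_values:
        for h in healthy_values:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased_values) * len(healthy_values))


# Likelihood ratios of renal artery stenosis by serum creatinine class [18],
# applied to a HYPOTHETICAL pre-test probability of 20%.
PRE_TEST = 0.20
for label, lr in [("<=70", 0.31), ("71-110", 0.77), (">110", 4.0)]:
    print(f"creatinine {label} umol/l: "
          f"post-test probability {post_test_probability(PRE_TEST, lr):.2f}")

# HYPOTHETICAL test values for a handful of subjects.
example_auc = auc([3, 4, 5], [1, 2, 3])
print(f"AUC: {example_auc:.3f}")
```

With a pre-test probability of 20%, a likelihood ratio of 4 gives pre-test odds of 0.25, post-test odds of 1 and therefore a post-test probability of 50%.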
In all these diagnostic studies, it is crucial to ensure independent blind assessment of results of the test being assessed and the gold standard to which it is compared, without the completion of either being contingent on results of the other.
Longitudinal studies are required to assess diagnostic tests aimed at predicting future prognosis or development of established disease [17]. The most stringent evaluation of a diagnostic test is to determine whether those tested have more rapid and accurate diagnosis, and as a result better health outcomes, than those not tested. The RCT design is the proper tool to answer this type of question [10,20].
A final issue of great interest for nephrologists is the applicability of findings from different settings to the renal population. The performance of cardiac markers such as troponin, for diagnosis of acute coronary syndromes, is less accurate in patients with kidney disease than in those with more normal kidney function [21], although their prognostic value is generally maintained [22].
| Maximizing the validity of non-experimental studies |
|---|
When randomization is not feasible, knowledge of the most important sources of bias is needed to increase the validity of any study. Randomization may be impossible for a variety of reasons: study participants cannot be assigned to intervention groups by chance, either for ethical reasons (e.g. in a study of smoking) or because of participant willingness (e.g. comparing haemo- to peritoneal dialysis); the exposure is fixed (e.g. gender); or the disease is rare and participants cannot be enrolled in a timely manner. When strategies are in place to prevent bias, non-experimental studies have been shown to yield similar results to rigorous RCTs [23]. These strategies are summarized in Table 4. However, in non-experimental studies too, strategies that maximize internal validity tend to reduce generalizability of the results, and vice versa [1]. For example, among the most common confounding factors in hard outcome studies, age, gender and race can be more easily defined and more consistently and accurately measured than other cardiovascular risk factors (hypertension, dyslipidaemia, smoking, physical exercise, body mass index) and important comorbid conditions (diabetes, cardiovascular disease, malignancies). This has implications for the cost and complexity of efforts to increase internal validity by controlling for confounders (e.g. collection of detailed information about smoking and multiple measurements of cholesterol levels over time might be required). Furthermore, reducing confounding by participant selection based on strict eligibility criteria limits applicability of the results (e.g. exclusion of some ethnic groups, patients with systemic diseases or worse prognosis).
| Research questions in genetic epidemiology |
|---|
Genetic disorders often present additional challenges to those who design clinical studies and may require adaptation of methods or specific solutions. Definition of the start time in longitudinal studies, and identification of patients and controls to compare outcomes are three key issues. The first is usually addressed using the birth date as time zero for survival analysis. The second can be solved by enrolling incident patients to prevent survivor bias. For example, diagnostic and prognostic questions were addressed in a study comparing time to ESRD in the two main genetic forms of adult (autosomal dominant) polycystic kidney disease, ADPKD1 and ADPKD2 [24]. The two main challenges of that study were the definition of the families representative of the population at risk and the identification of carriers. Probands were identified by all nephrologists in the community and pedigrees were constructed to identify all individuals at 50% risk of having autosomal-dominant polycystic kidney disease (ADPKD). Genetic testing was considered the gold standard to identify cases, and when genetic testing was not possible, renal ultrasound using Ravine's criteria was adopted [25]. This test was a reliable indicator of inherited ADPKD in adults who were 30 years or older [24]. Depending on pedigree position, obligate carrier status was demonstrated in some individuals. Thus it was possible to identify most families with ADPKD in the community, to enrol incident family members who carried the ADPKD mutation, and to make a reliable prediction of outcome.
Recruitment of participants when the disorder is rare is a problem, because the low frequency of genetic diseases often requires the use of case control designs (retrospective) or longitudinal historical cohort studies [26]. An obvious limitation of this type of studies is that changes in diagnostic criteria and health care over time can influence apparent prognosis.
A final issue is the identification of the appropriate controls to compare outcomes. For some genetic diseases that are not immediately lethal, such as ADPKD, outcomes can be assessed by randomized trials or prospective studies [24]. In rare disorders, matching techniques are often used in choosing appropriate comparison groups when patients cannot be randomly assigned to therapy. For example, a cohort study was conducted to assess the benefits of an implantable cardioverter defibrillator (ICD) in Arrhythmogenic Right Ventricular Cardiomyopathy (ARVC), an autosomal dominant condition that causes sudden cardiac death [27]. The survival of patients with the disorder who received the ICD (cases) was compared with a non-randomly assigned control group, both for practical (low frequency of the disease) and ethical reasons (absence of alternative treatments to prolong survival of affected individuals). To prove that the intervention improved survival, a control group was assembled from family members who carried the ARVC mutation and did not have an ICD implant, matched for age, gender and family. To increase comparability, the controls had to be first or second degree relatives of the cases receiving ICD implantation to reduce genetic variation; had to be of the same gender because survival was worse in males; and had to have survived up to the age at which the ICD was implanted in the cases. This strategy demonstrated that the survival benefit of ICD was such as to make it a dominant strategy, despite the bias associated with the enrolment of some historical controls.
| Clinical relevance vs statistical significance |
|---|
The concepts of clinical relevance and statistical significance are often confused. Clinical relevance refers to the amount of benefit or harm resulting from an exposure or intervention sufficient to change clinical practice or health policy. In planning study sample size, the researcher has to determine the minimum level of effect that would have clinical relevance [1]. The level of statistical significance chosen is the accepted probability of making a type I error, i.e. claiming an effect when in fact there is none. By convention, this probability is usually 0.05 (but can be as low as 0.01). The P-value or the limits of the appropriate confidence interval (a 95% interval is equivalent to a significance level of 0.05 for example) is examined, to see if the results of the study might be explained by chance. If P < 0.05, the null hypothesis of no effect is rejected in favour of the study hypothesis, although it remains possible that the observed results are simply due to chance. However, since statistical significance depends on both the magnitude of effect and the sample size, trials with very large sample sizes can theoretically detect statistically significant but very small effects that are of no clinical relevance.
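The dependence of statistical significance on sample size can be shown with a simple two-sample z-test. All figures are hypothetical: a true difference of 0.5 mmHg in mean blood pressure, a standard deviation of 10 mmHg in each group, and a minimum clinically relevant difference of 3 mmHg.

```python
import math


def two_sided_p_value(mean_diff, sd, n_per_group):
    """Two-sided P-value from a z-test comparing two group means,
    assuming a known, common standard deviation `sd`."""
    std_error = sd * math.sqrt(2.0 / n_per_group)
    z = abs(mean_diff) / std_error
    # For a standard normal variable, 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


OBSERVED_DIFF = 0.5      # mmHg, hypothetical observed difference
SD = 10.0                # mmHg, hypothetical standard deviation
MIN_RELEVANT_DIFF = 3.0  # mmHg, hypothetical threshold of clinical relevance

p_small_trial = two_sided_p_value(OBSERVED_DIFF, SD, n_per_group=100)
p_large_trial = two_sided_p_value(OBSERVED_DIFF, SD, n_per_group=100_000)

print(f"n = 100 per group:     P = {p_small_trial:.3f}")
print(f"n = 100000 per group:  P = {p_large_trial:.2e}")
print("clinically relevant:", OBSERVED_DIFF >= MIN_RELEVANT_DIFF)
```

The same 0.5 mmHg difference is non-significant in the small trial and highly significant in the very large one, yet in both cases it falls well short of the assumed threshold of clinical relevance.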
Figure 4 summarizes the two problems related to the confusion surrounding clinical relevance and statistical significance. Two aspects must be considered: the effect measure chosen to demonstrate the importance of the effect (Figure 4, left panel) and the distinction between the chosen level of clinical relevance and statistical significance (Figure 4, right panel). This is important, since results may be statistically positive (do not support the null hypothesis) but clinically ambiguous (do not support the clinical hypothesis).
| Reporting |
|---|
Adequate reporting is critical to the proper interpretation and evaluation of any study results. Guidelines for reporting primary (CONSORT, STROBE and STARD for example) and secondary studies (QUORUM) are in place to help both investigators and consumers of clinical research [29–32]. Scientific reports may not fully reflect how the investigators conducted their studies, but the quality of the scientific report is a reasonable marker for how the overall project was conducted. The interested reader is referred to the above-referenced citations, for more details of what to look for in reports from prognostic, diagnostic and intervention studies.
| Acknowledgements |
|---|
P.R. held a young investigator award from the Italian Society of Nephrology for the year 2005-2006 and received funding from the EU (Marie Curie Actions-OIF, proposal #021676) for the year 2006-2007.
Conflict of interest statement. None declared.
| Notes |
|---|
See http://www.oxfordjournals.org/our_journals/ndtplus/
| References |
|---|
- Ravani P, Curtis B, Parfrey PS, Barrett BJ. Clinical research of kidney diseases I: researchable questions and valid answers. Nephrol Dial Transplant (2007) XX:zzz–zzz.
- last accessed March 23, 2007. http://www.cebm.net/levels_of_evidence.asp.
- last accessed March 23, 2007. http://www.cebm.utoronto.ca/index.htm.
- Keough-Ryan T, Hutchinson T, MacGibbon B, Senecal M. Studies of prognostic factors in end-stage renal disease: an epidemiological and statistical critique. Am J Kidney Dis (2002) 39:1196–1205.[CrossRef][Web of Science]
- Singh AK, Szczech L, Tang KL, et al. CHOIR investigators: correction of anemia with epoetin alfa in chronic kidney disease. N Engl J Med (2006) 16:2085–2098.
- Scolari F, Ravani P, Gaggi R, et al. The challenge of diagnosing atheroembolic renal disease: clinical features and prognostic factors. Circulation. (in press).
- Lameire N, Wauters JP, Teruel JL, Van Biesen W, Vanholder R. An update on the referral pattern of patients with end-stage renal disease. Kidney Int (2002) 80:27–34.
- Lameire N, Biesen WV, Vanholder R. Initiation of dialysis–is the problem solved by NECOSAD? Nephrol Dial Transplant (2002) 17:1550–1552.
[Free Full Text] - Hewitson P, Glasziou P, Irwig L, Towler B, Watson E. Screening for colorectal cancer using the faecal occult blood test, Hemoccult. Cochrane Database Syst Rev (2007) 24:CD001216.
- Drueke T, Locatelli F, Clyne N, et al. Normalization of hemoglobin level in patients with chronic kidney disease and anemia. New Engl J Med (2006) 355:2071–2208.
- Eknoyan G, Beck GJ, Cheung AK, et al. Hemodialysis (HEMO) Study Group: effect of dialysis dose and membrane flux in maintenance hemodialysis. N Engl J Med (2002) 347:2010–2019.
- Wright JT Jr, Bakris G, Greene T, et al. African American Study of Kidney Disease and Hypertension Study Group: effect of blood pressure lowering and antihypertensive drug class on progression of hypertensive kidney disease: results from the AASK trial. JAMA (2002) 288:2421–2431.
- Schjoedt KJ, Rossing K, Juhl TR, et al. Beneficial impact of spironolactone on nephrotic range albuminuria in diabetic nephropathy. Kidney Int (2006) 70:536–542.
- Sibbald B, Roberts C. Understanding controlled trials: crossover trials. Br Med J (1998) 316:1719.
- Salvadori M, Holzer H, de Mattos A, et al. The ERL B301 Study Groups: enteric-coated mycophenolate sodium is therapeutically equivalent to mycophenolate mofetil in de novo renal transplant patients. Am J Transplant (2004) 4:231–236.
- Majeed R, Jaleel A, Siddiqui IA, Sandila P, Baseer A. Comparison of troponin T and enzyme levels in acute myocardial infarction and skeletal muscle injury. J Ayub Med Coll Abbottabad (2002) 14:5–7.
- Antman EM, Grudzien C, Sacks DB. Evaluation of a rapid bedside assay for detection of serum cardiac troponin T. JAMA (1995) 273:1279–1282.
- Habbema JDF, Eijkemans R, Krijnen, Knottnerus JA. Analysis of data on the accuracy of diagnostic tests. In: The Evidence Base of Clinical Diagnosis (2002) London: BMJ Books; 117–144.
- Sackett DL, Haynes RB, Guyatt GH, Tugwell P. The interpretation of diagnostic data. In: Clinical Epidemiology, a Basic Science for Clinical Medicine. Sackett DL, Haynes RB, Guyatt GH, Tugwell P, eds. (1991) Toronto, CA: Little, Brown and Company; 117–119.
- Alp NJ, Bell JA, Shahi M. A rapid troponin-I-based protocol for assessing acute chest pain. Q J Med (2001) 94:687–694.
- Mockel M, Schindler R, Knorr L, et al. Prognostic value of cardiac troponin T and I elevations in renal disease patients without acute coronary syndromes: a 9-month outcome analysis. Nephrol Dial Transplant (1999) 14:1489–1495.
- Aviles RJ, Askari AT, Lindahl B, et al. Troponin T levels in patients with acute coronary syndromes, with or without renal dysfunction. N Engl J Med (2002) 346:2047–2052.
- Concato J, Shah N, Horwitz RI. Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med (2000) 342:1887–1892.
- Dicks E, Ravani P, Langman D, Davidson WS, Pei Y, Parfrey PS. Incident renal events and risk factors in autosomal dominant polycystic kidney disease: a population and family-based cohort followed for 22 years. Clin J Am Soc Nephrol (2006) 1:710–717.
- Ravine D, Gibson RN, Walker RG, Sheffield LJ, Kincaid-Smith P, Danks DM. Evaluation of ultrasonographic diagnostic criteria for autosomal dominant polycystic kidney disease. Lancet (1994) 343:824–826.
- Moore SJ, Green JS, Fan Y, et al. Clinical and genetic epidemiology of Bardet-Biedl syndrome in Newfoundland: a 22-year prospective, population-based, cohort study. Am J Med Genet (2005) 132:352–360.
- Hodgkinson KA, Parfrey PS, Bassett AS, et al. The impact of implantable cardioverter-defibrillator therapy on survival in autosomal-dominant arrhythmogenic right ventricular cardiomyopathy (ARVD5). J Am Coll Cardiol (2005) 45:400–408.
- Altman DG, Andersen PK. Calculating the number needed to treat for trials where the outcome is time to an event. Br Med J (1999) 319:1492–1495.
- Moher D, Schulz KF, Altman DG, for the CONSORT Group. The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. Lancet (2001) 357:1191–1194.
- http://www.strobe-statement.org/ (last accessed March 23, 2007).
- http://www.consort-statement.org/stardstatement.htm (last accessed March 23, 2007).
- Moher D, Cook DJ, Eastwood S, Olkin I, Rennie D, Stroup DF, for the QUOROM Group. Improving the quality of reports of meta-analyses of randomized controlled trials: the QUOROM statement. Lancet (1999) 354:1896–1900.
Related articles in NDT:
- Clinical research of kidney diseases II: problems of study design
- Pietro Ravani, Patrick S. Parfrey, Elizabeth Dicks, and Brendan J. Barrett
NDT 2008 10.1093/ndt/gfn199.