In biostatistics, spectrum bias refers to the phenomenon that the performance of a diagnostic test may vary across clinical settings because each setting has a different mix of patients.[1] Because performance may depend on the mix of patients, performance at one clinic may not predict performance at another.[2] These differences are interpreted as a kind of bias. Mathematically, spectrum bias is a sampling bias rather than a traditional statistical bias; this has led some authors to refer to the phenomenon as spectrum effects,[3] whilst others maintain it is a bias if the true performance of the test differs from that which is 'expected'.[2] Usually the performance of a diagnostic test is measured in terms of its sensitivity and specificity, and it is changes in these that are considered when referring to spectrum bias. However, other performance measures, such as the likelihood ratios, may also be affected.[2]
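The measures above follow directly from the counts in a two-by-two diagnostic table. A minimal sketch, using made-up illustrative counts (not data from any cited study):

```python
# Illustrative (hypothetical) counts from a 2x2 diagnostic table:
#                 disease present   disease absent
# test positive         tp                fp
# test negative         fn                tn
tp, fn, fp, tn = 90, 10, 20, 80

sensitivity = tp / (tp + fn)  # P(test positive | disease present)
specificity = tn / (tn + fp)  # P(test negative | disease absent)

# Likelihood ratios combine both measures and are also
# susceptible to spectrum bias.
lr_positive = sensitivity / (1 - specificity)
lr_negative = (1 - sensitivity) / specificity

print(sensitivity, specificity, lr_positive, lr_negative)
```

With these counts the sensitivity is 0.9, the specificity 0.8, and the positive likelihood ratio 4.5; a different case-mix would yield different counts and hence different values.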

Generally, spectrum bias is considered to have three causes.[2] The first is a change in the case-mix of those patients with the target disorder (disease), which affects the sensitivity. The second is a change in the case-mix of those without the target disorder (disease-free), which affects the specificity. The third is a change in the prevalence, which affects both the sensitivity and specificity.[4] This final cause is not widely appreciated, but there is mounting empirical evidence[4][5] as well as theoretical arguments[6] suggesting that it does indeed affect a test's performance.
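The first cause can be illustrated with simple arithmetic: when disease subgroups differ in how easily they are detected, a clinic's observed sensitivity is the case-mix-weighted average of the subgroup sensitivities. A minimal sketch, with hypothetical subgroup values chosen for illustration:

```python
# Hypothetical subgroup sensitivities: the test detects severe
# disease more readily than mild disease.
sens_mild, sens_severe = 0.60, 0.95

def overall_sensitivity(frac_severe):
    """Observed sensitivity is the case-mix-weighted average
    of the subgroup sensitivities."""
    return frac_severe * sens_severe + (1 - frac_severe) * sens_mild

# A referral centre seeing mostly severe cases...
referral = overall_sensitivity(0.8)   # 0.88
# ...versus a primary-care clinic seeing mostly mild cases.
primary = overall_sensitivity(0.2)    # 0.67
```

The same test therefore appears markedly more sensitive at the referral centre, even though nothing about the test itself has changed; an analogous weighted average over disease-free subgroups drives the second cause.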

Examples where the sensitivity and specificity change between different sub-groups of patients may be found with the carcinoembryonic antigen test[7] and urinary dipstick tests.[8]

Diagnostic test performance reported in some studies may be artificially overestimated when a case-control design compares a healthy population ('fittest of the fit') with a population with advanced disease ('sickest of the sick'); that is, two extreme populations are compared, rather than typical healthy and diseased populations.[9]
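This inflation can be sketched with a hypothetical continuous marker that is called positive above a fixed threshold, modelling each population as a normal distribution (the distribution parameters below are illustrative assumptions, not taken from any cited study):

```python
from statistics import NormalDist

# Hypothetical continuous marker; test is positive above this threshold.
threshold = 1.0

healthy_typical  = NormalDist(mu=0.5, sigma=1.0)  # typical disease-free patients
healthy_fit      = NormalDist(mu=0.0, sigma=1.0)  # 'fittest of the fit'
diseased_typical = NormalDist(mu=1.5, sigma=1.0)  # typical diseased patients
diseased_severe  = NormalDist(mu=2.5, sigma=1.0)  # 'sickest of the sick'

def sensitivity(diseased):
    # Fraction of diseased patients whose marker exceeds the threshold.
    return 1 - diseased.cdf(threshold)

def specificity(healthy):
    # Fraction of disease-free patients whose marker stays below it.
    return healthy.cdf(threshold)

# Two-gate (case-control) design using the two extreme populations:
print(sensitivity(diseased_severe), specificity(healthy_fit))
# The same test evaluated on a typical clinical spectrum:
print(sensitivity(diseased_typical), specificity(healthy_typical))
```

Because the extreme groups overlap the threshold far less than the typical ones, both sensitivity and specificity come out higher in the two-gate design, overstating how the test would perform in routine practice.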

If properly analyzed, recognition of heterogeneity of subgroups can lead to insights about the test's performance in varying populations.[3]

References
  1. Ransohoff DF, Feinstein AR (1978). "Problems of spectrum and bias in evaluating the efficacy of diagnostic tests". N. Engl. J. Med. 299 (17): 926–30. doi:10.1056/NEJM197810262991705. PMID 692598.
  2. Willis BH (2008). "Spectrum bias – why clinicians need to be cautious when applying diagnostic test studies". Family Practice. 25 (5): 390–96. doi:10.1093/fampra/cmn051. PMID 18765409.
  3. Mulherin SA, Miller WC (2002). "Spectrum bias or spectrum effect? Subgroup variation in diagnostic test evaluation". Ann. Intern. Med. 137 (7): 598–602. doi:10.7326/0003-4819-137-7-200210010-00011. PMID 12353947.
  4. Willis BH (2012). "Evidence that disease prevalence may affect the performance of diagnostic tests with an implicit threshold: a cross sectional study". BMJ Open. 2 (1): e000746. doi:10.1136/bmjopen-2011-000746. PMC 3274715. PMID 22307105.
  5. Leeflang MM, Bossuyt PM, Irwig L (2009). "Diagnostic test accuracy may vary with prevalence: implications for evidence-based diagnosis". J. Clin. Epidemiol. 62 (1): 5–12.
  6. Goehring C, Perrier A, Morabia A (2004). "Spectrum bias: a quantitative and graphical analysis of the variability of medical diagnostic test performance". Statistics in Medicine. 23 (1): 125–35. doi:10.1002/sim.1591. PMID 14695644.
  7. Fletcher RH (1986). "Carcinoembryonic antigen". Ann. Intern. Med. 104 (1): 66–73. doi:10.7326/0003-4819-104-1-66. PMID 3510056.
  8. Lachs MS, Nachamkin I, Edelstein PH, Goldman J, Feinstein AR, Schwartz JS (1992). "Spectrum bias in the evaluation of diagnostic tests: lessons from the rapid dipstick test for urinary tract infection". Ann. Intern. Med. 117 (2): 135–40. doi:10.7326/0003-4819-117-2-135. PMID 1605428.
  9. Rutjes AWS, Reitsma JB, Vandenbroucke JP, Glas AS, Bossuyt PMM (2005). "Case-control and two-gate designs in diagnostic accuracy studies". Clin. Chem. 51 (8): 1335–41.