Pre- and post-test probability
Pre-test probability and post-test probability are the probabilities of the presence of a condition before and after a diagnostic test, respectively. Post-test probability, in turn, can be positive or negative, depending on whether the test falls out as a positive test or a negative test, respectively. In some cases, it is used for the probability of developing the condition of interest in the future.
Test, in this sense, can refer to any medical test and, in a broad sense, also to questions and even assumptions. The ability of a test to make a difference between pre- and post-test probabilities of various conditions is a major factor in the indication of medical tests.
Pre-test probability
The pre-test probability of an individual can be chosen as one of the following:
- The prevalence of the disease, which may have to be chosen if no other characteristic is known for the individual, or which can be chosen for ease of calculation even if other characteristics are known, although such omission may cause inaccurate results
- The post-test probability of the condition resulting from one or more preceding tests
- A rough estimation, which may have to be chosen if more systematic approaches are not possible or efficient
Estimation of post-test probability
In reality, the subjective probability of the presence of a condition is never exactly 0 or 100%. Still, there are several systematic methods to estimate that probability. Such methods are usually based on previously having performed the test on a reference group in which the presence or absence of the condition is known, in order to establish data on test performance. These data are subsequently used to interpret the test result of any individual tested by the method. An alternative or complement to reference group-based methods is comparing a test result to a previous test on the same individual, which is more common in tests for monitoring.
The most important systematic reference group-based methods to estimate post-test probability include the ones summarized and compared in the following table, and further described in individual sections below.
Method | Establishment of performance data | Method of individual interpretation | Ability to accurately interpret subsequent tests | Additional advantages |
By predictive values | Direct quotients from reference group | Most straightforward: Predictive value equals probability | Usually low: Separate reference group required for every subsequent pre-test state | Available both for binary and continuous values |
By likelihood ratio | Derived from sensitivity and specificity | Post-test odds given by multiplying pretest odds with the ratio | Theoretically limitless | Pre-test state does not have to be same as in reference group |
By relative risk | Quotient of risk among exposed and risk among unexposed | Pre-test probability multiplied by the relative risk | Low, unless subsequent relative risks are derived from same multivariate regression analysis | Relatively intuitive to use |
By diagnostic criteria and clinical prediction rules | Variable, but usually most tedious | Variable | Usually excellent for all tests included in criteria | Usually most preferable if available |
By predictive values
Predictive values can be used to estimate the post-test probability of an individual if the pre-test probability of the individual can be assumed roughly equal to the prevalence in a reference group on which both test results and knowledge on the presence or absence of the condition are available.
If the test result is of a binary classification into either positive or negative tests, then the following table can be made:
Test result | Condition present | Condition absent |
Positive | True positives | False positives |
Negative | False negatives | True negatives |
Pre-test probability can be calculated from the diagram as follows:
Pretest probability = (True positives + False negatives) / Total sample
Also, in this case, the positive post-test probability is numerically equal to the positive predictive value, and the negative post-test probability is numerically complementary to the negative predictive value (negative post-test probability = 1 - negative predictive value), again assuming that the individual being tested does not have any other risk factors that result in that individual having a different pre-test probability than the reference group used to establish the positive and negative predictive values of the test.
In the diagram above, this positive post-test probability, that is, the post-test probability of a target condition given a positive test result, is calculated as:
Positive posttest probability = True positives / (True positives + False positives)
Similarly, the post-test probability of disease given a negative result is calculated as:
Negative posttest probability = False negatives / (False negatives + True negatives)
The validity of the equations above also depends on the sample from the population not having substantial sampling bias that makes the groups of those who have the condition and those who do not substantially disproportionate to the corresponding prevalence and "non-prevalence" in the population. In effect, the equations above are not valid for a mere case-control study that separately collects one group with the condition and one group without it.
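The three quotients above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the class name `TwoByTwo` and the counts are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a reference group in which condition status is known."""
    true_pos: int
    false_pos: int
    false_neg: int
    true_neg: int

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg

    def pretest_probability(self) -> float:
        # Prevalence in the reference group: all with the condition / total sample.
        return (self.true_pos + self.false_neg) / self.total

    def positive_posttest_probability(self) -> float:
        # Numerically equal to the positive predictive value.
        return self.true_pos / (self.true_pos + self.false_pos)

    def negative_posttest_probability(self) -> float:
        # Complement of the negative predictive value.
        return self.false_neg / (self.false_neg + self.true_neg)


# Hypothetical counts (assumption, for illustration only).
group = TwoByTwo(true_pos=30, false_pos=70, false_neg=10, true_neg=890)
print(f"pre-test probability:           {group.pretest_probability():.3f}")
print(f"positive post-test probability: {group.positive_posttest_probability():.3f}")
print(f"negative post-test probability: {group.negative_posttest_probability():.3f}")
```

As in the text, these values only apply to an individual whose pre-test probability roughly matches the prevalence in the reference group.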
By likelihood ratio
The above methods are inappropriate if the pre-test probability differs from the prevalence in the reference group used to establish, among others, the positive predictive value of the test. Such a difference can occur if another test preceded, or if the person involved in the diagnostics considers that another pre-test probability must be used because of knowledge of, for example, specific complaints, other elements of a medical history, or signs in a physical examination. That pre-test probability can be obtained either by treating each finding as a test in itself, with its own sensitivity and specificity, or at least by making a rough estimation of the individual pre-test probability.
In these cases, the prevalence in the reference group is not completely accurate in representing the pre-test probability of the individual, and, consequently, the predictive value is not completely accurate in representing the post-test probability of the individual of having the target condition.
In these cases, a post-test probability can be estimated more accurately by using a likelihood ratio for the test. The likelihood ratio is calculated from the sensitivity and specificity of the test, and thereby it does not depend on prevalence in the reference group, and, likewise, it does not change with changed pre-test probability, in contrast to positive or negative predictive values. Also, in effect, the validity of a post-test probability determined from a likelihood ratio is not vulnerable to sampling bias in regard to those with and without the condition in the population sample, so the performance data can be established from a case-control study that separately gathers those with and without the condition.
Estimation of post-test probability from pre-test probability and likelihood ratio goes as follows:
- Pretest odds = Pretest probability / (1 - Pretest probability)
- Posttest odds = Pretest odds * Likelihood ratio
- Posttest probability = Posttest odds / (Posttest odds + 1)
The post-test probability can, in turn, be used as pre-test probability for additional tests if it continues to be calculated in the same manner.
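The odds conversion and the reuse of a post-test probability as the next pre-test probability can be sketched as follows. The function names and the two likelihood ratios are hypothetical, and the chaining step assumes that the tests neither interfere nor overlap with each other.

```python
def probability_to_odds(probability: float) -> float:
    """Convert a probability in [0, 1) to odds."""
    if not 0.0 <= probability < 1.0:
        raise ValueError("probability must be in [0, 1)")
    return probability / (1.0 - probability)


def odds_to_probability(odds: float) -> float:
    """Convert non-negative odds back to a probability."""
    return odds / (odds + 1.0)


def posttest_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Post-test odds = pre-test odds * likelihood ratio, then back to probability."""
    posttest_odds = probability_to_odds(pretest_probability) * likelihood_ratio
    return odds_to_probability(posttest_odds)


# Hypothetical pre-test probability and likelihood ratios of two consecutive tests.
probability = 0.10
for likelihood_ratio in (4.0, 2.5):
    probability = posttest_probability(probability, likelihood_ratio)
    print(f"after LR {likelihood_ratio}: {probability:.3f}")
```

Because the odds are simply multiplied, applying the ratios 4.0 and 2.5 in sequence gives the same result as applying a single ratio of 10.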
It is possible to calculate likelihood ratios for tests with continuous values or more than two outcomes in a way similar to the calculation for dichotomous outcomes. For this purpose, a separate likelihood ratio is calculated for every level of test result; these are called interval or stratum-specific likelihood ratios.
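As a rough sketch, a stratum-specific likelihood ratio can be computed as the share of people with the condition whose result falls in a stratum, divided by the corresponding share among people without the condition. The function name, stratum labels and counts below are hypothetical.

```python
def stratum_likelihood_ratios(with_condition: dict, without_condition: dict) -> dict:
    """Return one likelihood ratio per result stratum.

    Both arguments map a stratum label to the number of people in that
    stratum. Every stratum needs at least one person without the condition,
    otherwise the ratio is undefined.
    """
    total_with = sum(with_condition.values())
    total_without = sum(without_condition.values())
    ratios = {}
    for stratum, count_with in with_condition.items():
        share_with = count_with / total_with
        share_without = without_condition[stratum] / total_without
        ratios[stratum] = share_with / share_without
    return ratios


# Hypothetical reference-group counts per result level (assumption).
diseased = {"low": 10, "medium": 30, "high": 60}
healthy = {"low": 120, "medium": 60, "high": 20}

ratios = stratum_likelihood_ratios(diseased, healthy)
for stratum, ratio in ratios.items():
    print(f"{stratum}: LR = {ratio:.2f}")
```

A ratio above 1 raises the post-test odds, a ratio below 1 lowers them, and a ratio of 1 leaves them unchanged.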
Example
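An individual was screened with the test of fecal occult blood (FOB) to estimate the probability of that person having the target condition of bowel cancer, and the test came out positive. Before the test, that individual had a pre-test probability of having bowel cancer of, for example, 3%, as could have been estimated by evaluation of, for example, the medical history, examination and previous tests of that individual.
The sensitivity, specificity etc. of the FOB test were established with a population sample of 203 people, and came out as follows:
Test result | Bowel cancer present | Bowel cancer absent | Total |
Positive | 2 (true positives) | 18 (false positives) | 20 |
Negative | 1 (false negatives) | 182 (true negatives) | 183 |
Total | 3 | 200 | 203 |
Sensitivity = 2 / 3 = 66.67%, and specificity = 182 / 200 = 91%.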
From this, the likelihood ratios of the test can be established:
- Likelihood ratio positive = sensitivity / (1 - specificity) = 66.67% / (1 - 91%) = 7.4
- Likelihood ratio negative = (1 - sensitivity) / specificity = (1 - 66.67%) / 91% = 0.37
- Pretest probability = 0.03
- Pretest odds = 0.03 / (1 - 0.03) = 0.0309
- Positive posttest odds = 0.0309 * 7.4 = 0.229
- Positive posttest probability = 0.229 / (0.229 + 1) = 0.186 or 18.6%
The prevalence in the population sample is calculated to be:
- Prevalence = (2 + 1) / 203 = 0.0148 or 1.48%
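The arithmetic of this example can be reproduced with a short script. It takes the sensitivity (66.67%), specificity (91%) and pre-test probability (3%) stated above as inputs; the function names are hypothetical. The negative post-test probability is not stated in the example and is included here only as an illustration of the same steps.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple:
    """Return (positive likelihood ratio, negative likelihood ratio)."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def posttest_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Probability -> odds, multiply by the likelihood ratio, odds -> probability."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (posttest_odds + 1.0)


# Values taken from the FOB example in the text.
sensitivity, specificity, pretest = 0.6667, 0.91, 0.03

lr_pos, lr_neg = likelihood_ratios(sensitivity, specificity)
positive_posttest = posttest_probability(pretest, lr_pos)
negative_posttest = posttest_probability(pretest, lr_neg)

print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
print(f"positive post-test probability = {positive_posttest:.3f}")
print(f"negative post-test probability = {negative_posttest:.3f}")
```

The script carries full precision between steps, so its intermediate values can differ slightly in the last digit from the rounded figures in the list above.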
Specific sources of inaccuracy
Specific sources of inaccuracy when using likelihood ratio to determine a post-test probability include interference with determinants or previous tests or overlap of test targets, as explained below:
Interference with test
Post-test probability, as estimated from the pre-test probability with likelihood ratio, should be handled with caution in individuals with other determinants than the general population, as well as in individuals who have undergone previous tests, because such determinants or tests may also influence the test itself in unpredictable ways, still causing inaccurate results. An example with the risk factor of obesity is that additional abdominal fat can make it difficult to palpate abdominal organs and decrease the resolution of abdominal ultrasonography. Similarly, remnant barium contrast from a previous radiography can interfere with subsequent abdominal examinations, in effect decreasing the sensitivities and specificities of such subsequent tests. On the other hand, the effect of interference can potentially improve the efficacy of subsequent tests as compared to usage in the reference group, such as some abdominal examinations being easier when performed on underweight people.
Overlap of tests
Furthermore, the validity of calculations upon any pre-test probability that itself is derived from a previous test depends on the two tests not significantly overlapping in regard to the target parameter being tested, such as blood tests of substances belonging to one and the same deranged metabolic pathway. An example of the extreme of such an overlap is where the sensitivity and specificity have been established for a blood test detecting "substance X", and likewise for one detecting "substance Y". If, in fact, "substance X" and "substance Y" are one and the same substance, then making two consecutive tests of one and the same substance may not have any diagnostic value at all, although the calculation appears to show a difference. In contrast to interference as described above, increasing overlap of tests only decreases their efficacy. In the medical setting, diagnostic validity is increased by combining tests of different modalities to avoid substantial overlap, for example by combining a blood test, a biopsy and a radiograph.
Methods to overcome inaccuracy
To avoid such sources of inaccuracy when using likelihood ratios, the optimal method would be to gather a large reference group of equivalent individuals, in order to establish separate predictive values for use of the test in such individuals. However, with more knowledge of an individual's medical history, physical examination, previous tests etc., that individual becomes more differentiated, and it becomes increasingly difficult to find a reference group to establish tailored predictive values, making an estimation of post-test probability by predictive values invalid.
Another method to overcome such inaccuracies is by evaluating the test result in the context of diagnostic criteria, as described in the section on diagnostic criteria and clinical prediction rules.
By relative risk
Post-test probability can sometimes be estimated by multiplying the pre-test probability with a relative risk given by the test. In clinical practice, this is usually applied in evaluation of a medical history of an individual, where the "test" usually is a question regarding various risk factors, for example, sex, tobacco smoking or weight, but it can potentially be a substantial test such as putting the individual on a weighing scale. When using relative risks, the resultant probability is usually related to the individual developing the condition over a period of time, rather than being the probability of the individual having the condition in the present, but it can indirectly be an estimation of the latter.
Hazard ratios can be used somewhat similarly to relative risks.
One risk factor
To establish a relative risk, the risk in an exposed group is divided by the risk in an unexposed group.
If only one risk factor of an individual is taken into account, the post-test probability can be estimated by multiplying the relative risk with the risk in the control group. The control group usually represents the unexposed population, but if a very low fraction of the population is exposed, then the prevalence in the general population can often be assumed equal to the prevalence in the control group. In such cases, the post-test probability can be estimated by multiplying the relative risk with the risk in the general population.
For example, the incidence of breast cancer in a woman in the United Kingdom at age 55 to 59 is estimated at approximately 280 cases per 100,000 per year, and the risk factor of having been exposed to high-dose ionizing radiation to the chest confers a relative risk of breast cancer between 2.1 and 4.0, compared to unexposed. Because a low fraction of the population is exposed, the prevalence in the unexposed population can be assumed equal to the prevalence in the general population. Consequently, it can be estimated that a woman in the United Kingdom who is aged between 55 and 59 and who has been exposed to high-dose ionizing radiation should have a risk of developing breast cancer over a period of one year of between 588 and 1,120 in 100,000.
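The single-factor estimate is one multiplication, sketched below with the figures from the breast cancer example (280 per 100,000 per year, relative risk 2.1 to 4.0). The function name is hypothetical, and the sketch assumes, as the text does, that the general-population risk can stand in for the risk among the unexposed.

```python
def risk_with_factor(baseline_risk: float, relative_risk: float) -> float:
    """Estimate the risk in the presence of one risk factor."""
    return baseline_risk * relative_risk


# Baseline incidence per year, taken from the example in the text.
baseline = 280 / 100_000

# Lower and upper bound of the relative risk given in the text.
lower = risk_with_factor(baseline, 2.1)
upper = risk_with_factor(baseline, 4.0)

print(f"{lower * 100_000:.0f} to {upper * 100_000:.0f} per 100,000 per year")
```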
Multiple risk factors
Theoretically, the total risk in the presence of multiple risk factors can be roughly estimated by multiplying with each relative risk, but this is generally much less accurate than using likelihood ratios, and is usually done only because it is much easier to perform when only relative risks are given, compared to, for example, converting the source data to sensitivities and specificities and calculating by likelihood ratios. Likewise, relative risks are often given instead of likelihood ratios in the literature because the former are more intuitive. Sources of inaccuracy of multiplying relative risks include:
- Relative risks are affected by the prevalence of the condition in the reference group, with the result that post-test probabilities become less valid with increasing difference between the prevalence in the reference group and the pre-test probability for any individual. Any known risk factor or previous test of an individual almost always confers such a difference, decreasing the validity of using relative risks in estimating the total effect of multiple risk factors or tests. Most physicians do not appropriately take such differences in prevalence into account when interpreting test results, which may cause unnecessary testing and diagnostic errors.
- A separate source of inaccuracy of multiplying several relative risks, considering only positive tests, is that it tends to overestimate the total risk as compared to using likelihood ratios. This overestimation can be explained by the inability of the method to compensate for the fact that the total risk cannot be more than 100%. This overestimation is rather small for small risks, but becomes higher for higher values. For example, the risk of developing breast cancer at an age younger than 40 years in women in the United Kingdom can be estimated at approximately 2%. Also, studies on Ashkenazi Jews have indicated that a mutation in BRCA1 confers a relative risk of 21.6 of developing breast cancer in women under 40 years of age, and a mutation in BRCA2 confers a relative risk of 3.3 of developing breast cancer in women under 40 years of age. From these data, it may be estimated that a woman with a BRCA1 mutation would have a risk of approximately 40% of developing breast cancer at an age younger than 40 years, and a woman with a BRCA2 mutation would have a risk of approximately 6%. However, in the rather improbable situation of having both a BRCA1 and a BRCA2 mutation, simply multiplying with both relative risks would result in a risk of over 140% of developing breast cancer before 40 years of age, which cannot possibly be accurate in reality.
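The overshoot in the BRCA example is easy to reproduce. The sketch below multiplies the baseline risk of 2% by the relative risks quoted in the text; the function name is hypothetical, and the code only demonstrates the naive multiplication, not a corrected method.

```python
from math import prod


def multiplied_risk(baseline_risk: float, relative_risks: list) -> float:
    """Naively multiply a baseline risk by every relative risk in the list."""
    return baseline_risk * prod(relative_risks)


# Figures quoted in the text.
baseline = 0.02
rr_brca1 = 21.6
rr_brca2 = 3.3

only_brca1 = multiplied_risk(baseline, [rr_brca1])
only_brca2 = multiplied_risk(baseline, [rr_brca2])
both = multiplied_risk(baseline, [rr_brca1, rr_brca2])

for label, risk in (("BRCA1", only_brca1), ("BRCA2", only_brca2), ("both", both)):
    flag = " (impossible: above 100%)" if risk > 1.0 else ""
    print(f"{label}: {risk:.1%}{flag}")
```

Nothing in the multiplication bounds the result, which is why the combined estimate exceeds 100%.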
A method to compensate for both sources of inaccuracy above is to establish the relative risks by multivariate regression analysis. However, to retain its validity, relative risks established as such must be multiplied with all the other risk factors in the same regression analysis, and without any addition of other factors beyond the regression analysis.
In addition, multiplying multiple relative risks has the same risk of missing important overlaps of the included risk factors, similarly to when using likelihood ratios. Also, different risk factors can act in synergy, with the result that, for example, two factors that both individually have a relative risk of 2 have a total relative risk of 6 when both are present, or can inhibit each other, somewhat similarly to the interference described for using likelihood ratios.
By diagnostic criteria and clinical prediction rules
Most major diseases have established diagnostic criteria and/or clinical prediction rules. The establishment of diagnostic criteria or clinical prediction rules consists of a comprehensive evaluation of many tests that are considered important in estimating the probability of a condition of interest, sometimes also including how to divide it into subgroups, and when and how to treat the condition. Such establishment can include usage of predictive values, likelihood ratios as well as relative risks.
For example, the ACR criteria for systemic lupus erythematosus define the diagnosis as presence of at least 4 out of 11 findings, each of which can be regarded as a target value of a test with its own sensitivity and specificity. In this case, the tests for these target parameters have been evaluated when used in combination in regard to, for example, interference between them and overlap of target parameters, thereby striving to avoid inaccuracies that could otherwise arise if attempting to calculate the probability of the disease using likelihood ratios of the individual tests. Therefore, if diagnostic criteria have been established for a condition, it is generally most appropriate to interpret any post-test probability for that condition in the context of these criteria.
Also, there are risk assessment tools for estimating the combined risk of several risk factors, such as the online tool from the Framingham Heart Study for estimating the risk of coronary heart disease outcomes using multiple risk factors, including age, gender, blood lipids, blood pressure and smoking. Such tools are much more accurate than multiplying the individual relative risks of each risk factor.
Still, an experienced physician may estimate the post-test probability by a broad consideration including criteria and rules in addition to other methods described previously, including both individual risk factors and the performances of tests that have been carried out.
Clinical use of pre- and post-test probabilities
A clinically useful parameter is the absolute difference between pre- and post-test probability, calculated as:
Absolute difference = | (pre-test probability) - (post-test probability) |
A major factor for such an absolute difference is the power of the test itself, such as can be described in terms of, for example, sensitivity and specificity or likelihood ratio. Another factor is the pre-test probability, with a lower pre-test probability resulting in a lower absolute difference, with the consequence that even very powerful tests achieve a low absolute difference for very unlikely conditions in an individual, but on the other hand, that even tests with low power can make a great difference for highly suspected conditions.
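The interplay between test power and pre-test probability can be illustrated numerically. The sketch below uses the likelihood-ratio method described earlier; the likelihood ratios (10 for a "powerful" test, 2 for a "weak" one) and the pre-test probabilities are hypothetical values chosen for illustration.

```python
def posttest_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Probability -> odds, multiply by the likelihood ratio, odds -> probability."""
    posttest_odds = pretest_probability / (1.0 - pretest_probability) * likelihood_ratio
    return posttest_odds / (posttest_odds + 1.0)


def absolute_difference(pretest_probability: float, likelihood_ratio: float) -> float:
    """Absolute difference between pre- and post-test probability."""
    return abs(pretest_probability - posttest_probability(pretest_probability, likelihood_ratio))


# Hypothetical scenarios: (description, pre-test probability, likelihood ratio).
scenarios = [
    ("powerful test, very unlikely condition", 0.001, 10.0),
    ("powerful test, suspected condition", 0.30, 10.0),
    ("weak test, suspected condition", 0.30, 2.0),
]

for description, pretest, likelihood_ratio in scenarios:
    difference = absolute_difference(pretest, likelihood_ratio)
    print(f"{description}: absolute difference = {difference:.3f}")
```

With these assumed numbers, the weak test applied to a suspected condition shifts the probability far more than the powerful test applied to a very unlikely one.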
The probabilities in this sense may also need to be considered in context of conditions that are not primary targets of the test, such as profile-relative probabilities in a differential diagnostic procedure.
The absolute difference can be put in relation to the benefit for an individual that a medical test achieves, such as can roughly be estimated as:
bn = Δp * ri * (bi - hi) - ht, where:
- bn is the net benefit of performing a medical test
- Δp is the absolute difference between pre- and posttest probability of conditions that the test is expected to achieve.
- ri is the rate of how much probability differences are expected to result in changes in interventions.
- bi is the benefit of changes in interventions for the individual
- hi is the harm of changes in interventions for the individual, such as side effects of medical treatment
- ht is the harm caused by the test itself
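Assuming the net benefit takes the form bn = Δp * ri * (bi - hi) - ht, with the terms as defined in the list above, the estimate can be sketched as below. The function name, the parameter names and all input values are hypothetical, and benefit and harm are expressed in the same arbitrary unit.

```python
def net_benefit(
    delta_p: float,
    intervention_rate: float,
    intervention_benefit: float,
    intervention_harm: float,
    test_harm: float,
) -> float:
    """Rough net benefit of a test: bn = delta_p * ri * (bi - hi) - ht."""
    return delta_p * intervention_rate * (intervention_benefit - intervention_harm) - test_harm


# Hypothetical inputs (assumption, arbitrary benefit units).
estimate = net_benefit(
    delta_p=0.2,                # expected absolute probability difference
    intervention_rate=0.5,      # how often such a difference changes management
    intervention_benefit=10.0,  # benefit of the changed intervention
    intervention_harm=2.0,      # harm of the changed intervention
    test_harm=0.3,              # harm caused by the test itself
)
print(f"estimated net benefit: {estimate:.2f}")
```

A negative result would suggest that, under these assumptions, the harm of the test outweighs what it is expected to achieve.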
Additional factors that influence a decision whether a medical test should be performed or not include: cost of the test, availability of additional tests, potential interference with subsequent tests, time taken for the test or other practical or administrative aspects. Also, even if not beneficial for the individual being tested, the results may be useful for the establishment of statistics in order to improve health care for other individuals.