Z-factor


The Z-factor is a measure of statistical effect size. It has been proposed for use in high-throughput screening to judge whether the response in a particular assay is large enough to warrant further attention.

Background

In high-throughput screens, experimenters often compare a large number of single measurements of unknown samples to positive and negative control samples. The particular choice of experimental conditions and measurements is called an assay. Large screens are expensive in time and resources. Therefore, prior to starting a large screen, smaller test screens are used to assess the quality of an assay, in an attempt to predict if it would be useful in a high-throughput setting. The Z-factor is an attempt to quantify the suitability of a particular assay for use in a full-scale, high-throughput screen.

Definition

The Z-factor is defined in terms of four parameters: the means and standard deviations of both the positive and negative controls. Given these values, the Z-factor is defined as:
In practice, the Z-factor is estimated from the sample means and sample standard deviations

Interpretation

The following interpretations for the Z-factor are taken from:
Z-factorInterpretation
1.0Ideal. Z-factors can never exceed 1.
between 0.5 and 1.0An excellent assay. Note that if, 0.5 is equivalent to a separation of 12 standard deviations between and.
between 0 and 0.5A marginal assay.
less than 0There is too much overlap between the positive and negative controls for the assay to be useful.

Note that by the standards of many types of experiments, a zero Z-factor would suggest a large effect size, rather than a borderline useless result as suggested above. For example, if σpn=1, then μp=6 and μn=0 gives a zero Z-factor. But for normally-distributed data with these parameters, the probability that the positive control value would be less than the negative control value is less than 1 in 105. Extreme conservatism is used in high throughput screening due to the large number of tests performed.

Limitations

The constant factor 3 in the definition of the Z-factor is motivated by the normal distribution, for which more than 99% of values occur within 3 standard deviations of the mean. If the data follow a strongly non-normal distribution, the reference points may be misleading. Another issue is that the usual estimates of the mean and standard deviation are not robust; accordingly many users in the high-throughput screening community prefer "Robust Z-prime". Extreme values in either the positive or negative controls can adversely affect the Z-factor, potentially leading to an apparently unfavorable Z-factor even when the assay would perform well in actual screening
In addition, the application of the single Z-factor-based criterion to two or more positive controls with different strengths in the same assay will lead to misleading results
. The absolute sign in the Z-factor makes it inconvenient to derive the statistical inference of Z-factor mathematically
. A recently proposed statistical parameter, strictly standardized mean difference, can address these issues
. One estimate of SSMD is robust to outliers.