Delta method


In statistics, the delta method is a result concerning the approximate probability distribution for a function of an asymptotically normal statistical estimator from knowledge of the limiting variance of that estimator.

History

The delta method was derived from propagation of error, and the idea behind it was known in the early 19th century. Its statistical application can be traced as far back as 1928, to work by T. L. Kelley. A formal description of the method was presented by J. L. Doob in 1935. Robert Dorfman also described a version of it in 1938.

Univariate delta method

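While the delta method generalizes easily to a multivariate setting, the technique is most easily motivated in univariate terms. Roughly, if there is a sequence of random variables $X_n$ satisfying
$$\sqrt{n}\,[X_n - \theta] \xrightarrow{D} \mathcal{N}(0, \sigma^2),$$
where $\theta$ and $\sigma^2$ are finite-valued constants and $\xrightarrow{D}$ denotes convergence in distribution, then
$$\sqrt{n}\,[g(X_n) - g(\theta)] \xrightarrow{D} \mathcal{N}\big(0, \sigma^2 [g'(\theta)]^2\big)$$
for any function $g$ whose first derivative at $\theta$, $g'(\theta)$, exists and is non-zero.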
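The statement can be checked numerically. The following is a minimal Monte Carlo sketch, not a proof. It assumes, purely for illustration, that $X_n$ is the mean of $n$ independent Exponential(1) draws (so $\theta = 1$ and $\sigma^2 = 1$) and that $g(x) = x^2$ (so $g'(\theta) = 2$). The function name `simulate_scaled_errors`, the sample sizes, and the seed are hypothetical choices.

```python
import math
import random
import statistics


def simulate_scaled_errors(g, theta, n, reps, seed=0):
    """Return `reps` simulated values of sqrt(n) * (g(X_n) - g(theta)).

    Assumption for this sketch: X_n is the mean of n Exponential(1) draws,
    which has theta = 1 and sigma^2 = 1.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        x_n = sum(rng.expovariate(1.0) for _ in range(n)) / n
        values.append(math.sqrt(n) * (g(x_n) - g(theta)))
    return values


def g(x):
    # Illustrative smooth function with g'(theta) = 2 * theta != 0 at theta = 1.
    return x * x


theta, sigma2 = 1.0, 1.0
g_prime_theta = 2.0 * theta

# Variance predicted by the delta method: sigma^2 * g'(theta)^2.
predicted_var = sigma2 * g_prime_theta ** 2

draws = simulate_scaled_errors(g, theta, n=200, reps=3000)
empirical_var = statistics.variance(draws)

print("predicted:", predicted_var)
print("empirical:", empirical_var)
```

The empirical variance should be close to the predicted one but not equal to it, both because of simulation noise and because $n$ is finite.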

Proof in the univariate case

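Demonstration of this result is fairly straightforward under the assumption that $g'$ is continuous. To begin, we use the mean value theorem:
$$g(X_n) = g(\theta) + g'(\tilde{\theta})(X_n - \theta),$$
where $\tilde{\theta}$ lies between $X_n$ and $\theta$.
Note that since $X_n \xrightarrow{P} \theta$ and $|\tilde{\theta} - \theta| \le |X_n - \theta|$, it must be that $\tilde{\theta} \xrightarrow{P} \theta$, and since $g'$ is continuous, applying the continuous mapping theorem yields
$$g'(\tilde{\theta}) \xrightarrow{P} g'(\theta),$$
where $\xrightarrow{P}$ denotes convergence in probability.
Rearranging the terms and multiplying by $\sqrt{n}$ gives
$$\sqrt{n}\,[g(X_n) - g(\theta)] = g'(\tilde{\theta})\,\sqrt{n}\,[X_n - \theta].$$
Since
$$\sqrt{n}\,[X_n - \theta] \xrightarrow{D} \mathcal{N}(0, \sigma^2)$$
by assumption, it follows immediately from Slutsky's theorem that
$$\sqrt{n}\,[g(X_n) - g(\theta)] \xrightarrow{D} \mathcal{N}\big(0, \sigma^2 [g'(\theta)]^2\big).$$
This concludes the proof.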

Proof with an explicit order of approximation

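Alternatively, one can add one more step at the end to obtain the order of approximation:
$$\begin{aligned}
\sqrt{n}\,[g(X_n) - g(\theta)] &= g'(\tilde{\theta})\,\sqrt{n}\,[X_n - \theta] \\
&= \sqrt{n}\,[X_n - \theta]\,\big[g'(\tilde{\theta}) + g'(\theta) - g'(\theta)\big] \\
&= \sqrt{n}\,[X_n - \theta]\,g'(\theta) + \sqrt{n}\,[X_n - \theta]\,\big[g'(\tilde{\theta}) - g'(\theta)\big] \\
&= \sqrt{n}\,[X_n - \theta]\,g'(\theta) + O_p(1) \cdot o_p(1) \\
&= \sqrt{n}\,[X_n - \theta]\,g'(\theta) + o_p(1).
\end{aligned}$$
This shows that the error in the approximation converges to 0 in probability.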

Multivariate delta method

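By definition, a consistent estimator $B$ converges in probability to its true value $\beta$, and often a central limit theorem can be applied to obtain asymptotic normality:
$$\sqrt{n}\,(B - \beta) \xrightarrow{D} \mathcal{N}(0, \Sigma),$$
where $n$ is the number of observations and $\Sigma$ is a covariance matrix. Suppose we want to estimate the variance of a scalar-valued function $h$ of the estimator $B$. Keeping only the first two terms of the Taylor series, and using vector notation for the gradient, we can approximate $h(B)$ as
$$h(B) \approx h(\beta) + \nabla h(\beta)^{T} (B - \beta),$$
which implies the variance of $h(B)$ is approximately
$$\begin{aligned}
\operatorname{Var}\big(h(B)\big) &\approx \operatorname{Var}\big(h(\beta) + \nabla h(\beta)^{T} (B - \beta)\big) \\
&= \operatorname{Var}\big(h(\beta) + \nabla h(\beta)^{T} B - \nabla h(\beta)^{T} \beta\big) \\
&= \operatorname{Var}\big(\nabla h(\beta)^{T} B\big) \\
&= \nabla h(\beta)^{T} \operatorname{Cov}(B)\, \nabla h(\beta) \\
&= \nabla h(\beta)^{T}\, \frac{\Sigma}{n}\, \nabla h(\beta).
\end{aligned}$$
One can use the mean value theorem to see that this does not rely on taking a first-order approximation.
The delta method therefore implies that
$$\sqrt{n}\,\big(h(B) - h(\beta)\big) \xrightarrow{D} \mathcal{N}\big(0, \nabla h(\beta)^{T}\, \Sigma\, \nabla h(\beta)\big),$$
or in univariate terms,
$$\sqrt{n}\,\big(h(B) - h(\beta)\big) \xrightarrow{D} \mathcal{N}\big(0, \sigma^2 [h'(\beta)]^2\big).$$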
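The variance approximation $\nabla h(\beta)^{T} (\Sigma / n) \nabla h(\beta)$ can be sketched in a few lines of code. The example below is illustrative only. It assumes the gradient is approximated by central finite differences, and that the estimate and covariance matrix shown are made-up numbers standing in for $\beta$ and $\Sigma$, which in practice are unknown and replaced by estimates. The names `numerical_gradient`, `delta_method_variance`, and `ratio` are hypothetical.

```python
def numerical_gradient(h, beta, eps=1e-6):
    """Approximate the gradient of h at beta by central finite differences."""
    grad = []
    for i in range(len(beta)):
        up = list(beta)
        down = list(beta)
        up[i] += eps
        down[i] -= eps
        grad.append((h(up) - h(down)) / (2 * eps))
    return grad


def delta_method_variance(h, beta, sigma, n):
    """Return grad^T (sigma / n) grad, the delta-method variance of h(B)."""
    grad = numerical_gradient(h, beta)
    k = len(beta)
    quad_form = sum(
        grad[i] * sigma[i][j] * grad[j] for i in range(k) for j in range(k)
    )
    return quad_form / n


def ratio(b):
    # Illustrative scalar-valued function: the ratio of two components.
    return b[0] / b[1]


# Made-up values used in place of the unknown beta and Sigma.
beta_hat = [2.0, 4.0]
sigma_hat = [[1.0, 0.5], [0.5, 2.0]]

approx_var = delta_method_variance(ratio, beta_hat, sigma_hat, n=100)
print(approx_var)
```

For the ratio, the analytic gradient is $(1/\beta_2, -\beta_1/\beta_2^2)$, so the finite-difference step could be replaced by an exact gradient where one is available.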

Example: the binomial proportion

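Suppose $X_n$ is binomial with parameters $p \in (0, 1]$ and $n$. Since
$$\sqrt{n}\,\left[\frac{X_n}{n} - p\right] \xrightarrow{D} \mathcal{N}\big(0, p(1 - p)\big),$$
we can apply the delta method with $g(\theta) = \log(\theta)$ to see
$$\sqrt{n}\,\left[\log\left(\frac{X_n}{n}\right) - \log(p)\right] \xrightarrow{D} \mathcal{N}\big(0, p(1 - p)\,[1/p]^2\big).$$
Hence, even though for any finite $n$ the variance of $\log(X_n / n)$ does not actually exist (since $X_n$ can be zero), the asymptotic variance of $\log(X_n / n)$ does exist and is equal to
$$\frac{1 - p}{np}.$$
Note that since $p > 0$, $\Pr(X_n / n > 0) \to 1$ as $n \to \infty$, so with probability converging to one, $\log(X_n / n)$ is finite for large $n$.
Moreover, if $\hat{p}$ and $\hat{q}$ are estimates of different group rates from independent samples of sizes $n$ and $m$ respectively, then the logarithm of the estimated relative risk $\hat{p} / \hat{q}$ has asymptotic variance equal to
$$\frac{1 - p}{p\,n} + \frac{1 - q}{q\,m}.$$
This is useful for constructing a hypothesis test or a confidence interval for the relative risk.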
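As a minimal sketch of how such a confidence interval might be computed, the code below uses the variance formula for the log relative risk. It assumes the unknown rates are replaced by their sample estimates (a plug-in step) and uses a normal critical value of 1.96 for an approximate 95% interval. The function name `log_relative_risk_ci` and the counts are hypothetical.

```python
import math


def log_relative_risk_ci(x, n, y, m, z=1.96):
    """Approximate CI for the relative risk via the delta method.

    x of n events in group 1, y of m events in group 2. The unknown
    rates p and q are replaced by their estimates (plug-in assumption).
    Requires x > 0 and y > 0 so that the logarithms are finite.
    """
    p_hat = x / n
    q_hat = y / m
    log_rr = math.log(p_hat / q_hat)
    # Asymptotic variance of the log relative risk.
    var_log_rr = (1 - p_hat) / (p_hat * n) + (1 - q_hat) / (q_hat * m)
    half_width = z * math.sqrt(var_log_rr)
    # Build the interval on the log scale, then map back with exp.
    interval = (math.exp(log_rr - half_width), math.exp(log_rr + half_width))
    return log_rr, var_log_rr, interval


# Made-up counts: 30 events out of 100 versus 15 events out of 100.
log_rr, var_log_rr, (lower, upper) = log_relative_risk_ci(30, 100, 15, 100)
print(math.exp(log_rr), var_log_rr, lower, upper)
```

Working on the log scale and exponentiating the endpoints keeps the interval for the relative risk strictly positive.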

Alternative form

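The delta method is often used in a form that is essentially identical to that above, but without the assumption that $X_n$ or $B$ is asymptotically normal. Often the only context is that the variance is "small". The results then just give approximations to the means and covariances of the transformed quantities. For example, the formulae presented in Klein are:
$$\begin{aligned}
\operatorname{Var}(h_r) &= \sum_i \left(\frac{\partial h_r}{\partial B_i}\right)^2 \operatorname{Var}(B_i) + \sum_i \sum_{j \neq i} \left(\frac{\partial h_r}{\partial B_i}\right) \left(\frac{\partial h_r}{\partial B_j}\right) \operatorname{Cov}(B_i, B_j) \\
\operatorname{Cov}(h_r, h_s) &= \sum_i \left(\frac{\partial h_r}{\partial B_i}\right) \left(\frac{\partial h_s}{\partial B_i}\right) \operatorname{Var}(B_i) + \sum_i \sum_{j \neq i} \left(\frac{\partial h_r}{\partial B_i}\right) \left(\frac{\partial h_s}{\partial B_j}\right) \operatorname{Cov}(B_i, B_j)
\end{aligned}$$
where $h_r$ is the $r$th element of $h(B)$ and $B_i$ is the $i$th element of $B$.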
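These element-wise formulae can be read as the entries of a single matrix product. The restatement below introduces the symbol $J$ for the Jacobian of $h$, which is notation assumed here and not taken from the source.

```latex
% J is the Jacobian of h, with entries J_{ri} = \partial h_r / \partial B_i.
\operatorname{Cov}\big(h(B)\big) \approx J \, \operatorname{Cov}(B) \, J^{T},
\qquad
\big[ J \, \operatorname{Cov}(B) \, J^{T} \big]_{rs}
  = \sum_i \sum_j
    \frac{\partial h_r}{\partial B_i} \,
    \operatorname{Cov}(B_i, B_j) \,
    \frac{\partial h_s}{\partial B_j}.
```

Splitting the double sum into the terms with $i = j$, where $\operatorname{Cov}(B_i, B_i) = \operatorname{Var}(B_i)$, and the terms with $j \neq i$ recovers the covariance formula, and setting $r = s$ gives the variance formula.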

Second-order delta method

When $g'(\theta) = 0$, the delta method cannot be applied. However, if $g''(\theta)$ exists and is not zero, the second-order delta method can be applied. By the Taylor expansion, $n\,[g(X_n) - g(\theta)] = \frac{1}{2}\, n\,[X_n - \theta]^2\, g''(\theta) + o_p(1)$, so that the variance of $g(X_n)$ depends on moments of $X_n$ up to the 4th.
The second-order delta method is also useful for obtaining a more accurate approximation of the distribution of $g(X_n)$ when the sample size is small.
For example, when $X_n$ follows the standard normal distribution, $g(X_n)$ can be approximated as the weighted sum of a standard normal and a chi-square with 1 degree of freedom.
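A small simulation illustrates the degenerate case $g'(\theta) = 0$. It is a sketch under assumed choices: $g(x) = x^2$ at $\theta = 0$, so that $g'(0) = 0$ and $g''(0) = 2$, with $X_n$ the mean of $n$ standard normal draws ($\sigma^2 = 1$). Under these assumptions $n\,[g(X_n) - g(\theta)] = n X_n^2$ follows a chi-square distribution with 1 degree of freedom, which has mean 1 and variance 2. The function name `simulate_second_order` and the sample sizes are hypothetical.

```python
import random
import statistics


def simulate_second_order(n, reps, seed=0):
    """Return `reps` simulated values of n * (g(X_n) - g(0)) for g(x) = x^2.

    Assumption for this sketch: X_n is the mean of n standard normal
    draws, so theta = 0, sigma^2 = 1, g'(0) = 0 and g''(0) = 2.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        x_n = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        values.append(n * x_n * x_n)
    return values


draws = simulate_second_order(n=50, reps=5000)
mean_draws = statistics.mean(draws)
var_draws = statistics.variance(draws)

# Chi-square with 1 degree of freedom has mean 1 and variance 2.
print("mean:", mean_draws)
print("variance:", var_draws)
```

Note the scaling by $n$ rather than $\sqrt{n}$: with the first-order scaling, the quantity $\sqrt{n}\,[g(X_n) - g(\theta)]$ would collapse to zero in this example.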