All models are wrong


"All models are wrong" is a common aphorism in statistics; it is often expanded as "All models are wrong, but some are useful". It is usually considered to be applicable to not only statistical models, but to scientific models generally. The aphorism recognizes that statistical/scientific models always fall short of the complexities of reality but can still be of use.
The aphorism is generally attributed to the statistician George Box, although the underlying concept predates Box's writings.

Quotations of George Box

The first record of Box saying "all models are wrong" is in a 1976 paper published in the Journal of the American Statistical Association. The 1976 paper contains the aphorism twice. The two sections of the paper that contain the aphorism are copied below.
Box repeated the aphorism in a paper that was published in the proceedings of a 1978 statistics workshop. The paper contains a section entitled "All models are wrong but some are useful". The section is copied below.
Box repeated the aphorism twice more in his 1987 book, Empirical Model-Building and Response Surfaces. The first repetition is on p. 74: "Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful." The second repetition is on p. 424, which is excerpted below.
A second edition of the book was published in 2007, under the title Response Surfaces, Mixtures, and Ridge Analyses. The second edition also repeats the aphorism twice, in contexts identical with those of the first edition.
Box repeated the aphorism two more times in his 1997 book, Statistical Control: By Monitoring and Feedback Adjustment. The first repetition is on p. 6, which is excerpted below.
The second repetition is on p. 9: "So since all models are wrong, it is very important to know what to worry about; or, to put it in another way, what models are likely to produce procedures that work in practice".
A second edition of the book was published in 2009, under the title Statistical Control By Monitoring and Adjustment. The second edition also repeats the aphorism two times. The first repetition is on p. 61, which is excerpted below.
The second repetition is on p. 63; its context is essentially the same as that of the second repetition in the first edition.
Box's widely cited book Statistics for Experimenters does not include the aphorism in its first edition. The second edition includes the aphorism three times: on p. 208, p. 384, and p. 440. On p. 440, the relevant sentence is this: "The most that can be expected from any model is that it can supply a useful approximation to reality: All models are wrong; some models are useful".
In addition to stating the aphorism verbatim, Box sometimes stated the essence of the aphorism with different words. One example is from 1978, while Box was President of the American Statistical Association. At the annual meeting of the Association, Box delivered his Presidential Address, wherein he stated this: "Models, of course, are never true, but fortunately it is only necessary that they be useful".

Discussions

There have been varied discussions about the aphorism. A selection from those discussions is presented below.
In 1983, the statisticians Peter McCullagh and John Nelder published their much-cited book on generalized linear models. The book includes a brief discussion of the aphorism. A second edition of the book, published in 1989, contains a very similar discussion of the aphorism. The discussion from the first edition is as follows.
In 1995, the statistician Sir David Cox commented as follows.
In 1996, an Applied Statistician's Creed was proposed. The Creed includes, in its core part, the aphorism.
In 2002, K. P. Burnham and D. R. Anderson published their much-cited book on statistical model selection. The book states the following.
The statistician J. Michael Steele has commented on the aphorism as follows.
In 2008, the statistician Andrew Gelman responded to that, saying in particular the following.
In 2013, the philosopher of science Peter Truran published an essay related to the aphorism. The essay notes, in particular, the following.
Truran's essay further notes that Newton's theory of gravitation has been supplanted by Einstein's theory of relativity and yet Newton's theory remains generally "empirically adequate". Indeed, Newton's theory generally has excellent predictive power. Yet Newton's theory is not an approximation of Einstein's theory. For illustration, consider an apple falling from a tree. Under Newton's theory, the apple falls because Earth exerts a force on the apple, called "the force of gravity". Under Einstein's theory, Earth does not exert any force on the apple. Hence, Newton's theory might be regarded as being, in some sense, completely wrong but extremely useful.
In 2014, the statistician David Hand made the following statement.
In 2016, P. J. Bickel and K. A. Doksum published the second volume of their book on mathematical statistics. The volume includes the quote from Box's Presidential Address, given above. It states that the quote is the best formulation of the "guiding principle of modern statistics".
Additionally, in 2011, a workshop on model selection was held in the Netherlands. The name of the workshop was "All models are wrong...".

Historical antecedents

Although the aphorism seems to have originated with George Box, the underlying concept goes back decades, perhaps centuries. Some examples are given below.
In 1960, Georg Rasch said the following.
In 1947, the mathematician John von Neumann said that "truth … is much too complicated to allow anything but approximations".
In 1942, the French philosopher-poet Paul Valéry said the following.
In 1939, the founder of statistical process control, Walter Shewhart, said the following.
In 1923, a related idea was articulated by the artist Pablo Picasso.