Maximal information coefficient

In statistics, the maximal information coefficient (MIC) is a measure of the strength of the linear or non-linear association between two variables X and Y.
MIC belongs to the maximal information-based nonparametric exploration (MINE) class of statistics. In a simulation study, MIC outperformed some selected low-power tests; however, concerns have been raised about its reduced statistical power in detecting some associations at low sample sizes, compared with powerful methods such as distance correlation and Heller–Heller–Gorfine (HHG). Comparisons with these methods, in which MIC was outperformed, were made by Simon and Tibshirani and by Gorfine, Heller, and Heller. MIC is claimed to approximately satisfy a property called equitability, which is illustrated by selected simulation studies. It was later proved that no non-trivial coefficient can exactly satisfy the equitability property as defined by Reshef et al., although this result has been challenged. Reshef et al. address some criticisms of MIC in further studies published on arXiv.

Overview

The maximal information coefficient uses binning to apply mutual information to continuous random variables. Binning has long been used for this purpose; what MIC adds is a methodology for selecting the number of bins and for taking a maximum over many possible grids.
The rationale is that the bins for both variables should be chosen so that the mutual information between the binned variables $X_b$ and $Y_b$ is maximal. That is achieved whenever $H(X_b) = H(Y_b) = H(X_b, Y_b)$. Thus, when the mutual information is maximal over a binning of the data, we should expect the following two properties to hold, as far as the nature of the data allows. First, the bins would have roughly the same size, because the entropies $H(X_b)$ and $H(Y_b)$ are maximized by equal-sized binning. Second, each bin of X would roughly correspond to a bin of Y.
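The two properties can be checked numerically on a toy example. The sketch below is only an illustration, not code from the original paper: the helper names `entropy` and `binned_mutual_information` are hypothetical, and the data are hand-picked bin labels rather than binned real measurements. It computes the mutual information of two label sequences from the identity I = H(X) + H(Y) - H(X, Y).

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of a histogram given as an iterable of counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def binned_mutual_information(x_bins, y_bins):
    """Mutual information of paired bin labels: H(X) + H(Y) - H(X, Y)."""
    h_x = entropy(Counter(x_bins).values())
    h_y = entropy(Counter(y_bins).values())
    h_xy = entropy(Counter(zip(x_bins, y_bins)).values())
    return h_x + h_y - h_xy

# Four equal-sized bins per axis, in one-to-one correspondence.
x_bins = [0, 0, 1, 1, 2, 2, 3, 3]
y_matched = [3, 3, 2, 2, 1, 1, 0, 0]
mi_matched = binned_mutual_information(x_bins, y_matched)

# Same X binning, but the Y bin says nothing about the X bin.
y_unrelated = [0, 1, 0, 1, 0, 1, 0, 1]
mi_unrelated = binned_mutual_information(x_bins, y_unrelated)
```

With equal-sized, one-to-one bins the three entropies coincide at 2 bits, so `mi_matched` is 2 bits; in the unrelated case the joint entropy equals the sum of the marginal entropies and `mi_unrelated` is 0.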
Because the variables X and Y are real-valued, it is almost always possible to create exactly one bin for each data point, which would yield a very high value of the mutual information (MI). To avoid this kind of trivial partitioning, the authors of the paper propose taking a number of bins $n_x$ for X and $n_y$ for Y whose product is relatively small compared with the size N of the data sample. Concretely, they propose:
$n_x \times n_y \leq N^{0.6}$
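To get a feel for how restrictive such a bound is, the grid sizes it permits can be enumerated directly. The sketch below assumes the bound takes the form "bins on X times bins on Y is at most N to the power 0.6", with at least two bins per axis; the function name `allowed_grid_sizes` and the `exponent` parameter are hypothetical and introduced only for illustration.

```python
def allowed_grid_sizes(n_samples, exponent=0.6):
    """List (n_x, n_y) pairs with n_x, n_y >= 2 and n_x * n_y <= n_samples ** exponent.

    The exponent 0.6 is an assumed default for this sketch, not a fixed constant.
    """
    budget = n_samples ** exponent
    max_bins = int(budget) // 2  # the other axis needs at least 2 bins
    return [
        (n_x, n_y)
        for n_x in range(2, max_bins + 1)
        for n_y in range(2, int(budget) // n_x + 1)
        if n_x * n_y <= budget
    ]

# For N = 100 the budget is 100 ** 0.6, roughly 15.8 cells.
sizes = allowed_grid_sizes(100)
```

For 100 samples this leaves only 16 grid shapes, from 2 by 2 up to 2 by 7 (or 7 by 2), so a one-bin-per-point partition is ruled out by a wide margin.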
In some cases it is possible to achieve a good correspondence between the binned variables $X_b$ and $Y_b$ with numbers as low as $n_x = 2$ and $n_y = 2$, while in other cases the number of bins required may be higher. The maximum of $I(X_b; Y_b)$ is bounded by the entropies $H(X_b)$ and $H(Y_b)$, which are in turn bounded by the number of bins on each axis; the mutual information value therefore depends on the number of bins selected for each variable. To compare mutual information values obtained with partitions of different sizes, the mutual information value is normalized by dividing it by the maximum achievable value for the given partition size. A similar adaptive binning procedure for estimating mutual information had been proposed previously.
Entropy is maximized by uniform probability distributions or, in this case, by bins with the same number of elements. Joint entropy, in turn, is minimized by a one-to-one correspondence between bins. Substituting such values into the formula
$I(X;Y) = H(X) + H(Y) - H(X,Y)$
shows that the maximum value achievable by the MI for a given pair of bin counts $(n_x, n_y)$ is $\log \min(n_x, n_y)$. This value is therefore used as the normalizing divisor for each pair of bin counts.
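The bound behind this normalization can be spelled out in a few steps. The derivation below is a sketch in notation assumed here: $X_b$ and $Y_b$ are the binned variables and $n_x$, $n_y$ the bin counts on each axis. It uses only the facts that the joint entropy is never smaller than either marginal entropy and that a variable with $n$ possible values has entropy at most $\log n$.

```latex
\begin{align*}
I(X_b; Y_b) &= H(X_b) + H(Y_b) - H(X_b, Y_b) \\
            &\le H(X_b) + H(Y_b) - \max\bigl(H(X_b), H(Y_b)\bigr)
             = \min\bigl(H(X_b), H(Y_b)\bigr) \\
            &\le \min(\log n_x, \log n_y) = \log \min(n_x, n_y)
\end{align*}
```

Dividing by $\log \min(n_x, n_y)$ therefore maps every grid's mutual information into the interval $[0, 1]$, which is what makes values from grids of different sizes comparable.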
Last, the normalized maximal mutual information value for different combinations of $n_x$ and $n_y$ is tabulated, and the maximum value in the table is selected as the value of the statistic.
Trying all possible binning schemes that satisfy $n_x \times n_y \leq N^{0.6}$ is computationally infeasible even for small N. In practice, therefore, the authors apply a heuristic that may or may not find the true maximum.
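The overall procedure can be illustrated with a deliberately crude stand-in for the search. The sketch below is not the authors' heuristic: instead of optimizing bin boundaries, it tries only equal-frequency bins on both axes for each permitted grid size, so it can only underestimate the true maximum. All function names, and the assumed budget of N to the power 0.6 cells, are hypothetical choices made for this illustration.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of a histogram given as an iterable of counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def equal_frequency_bins(values, n_bins):
    """Label each value 0..n_bins-1 by rank so bins hold roughly equal counts.

    Tied values may land in different bins; this sketch does not handle ties.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_bins // len(values)
    return labels

def normalized_mi(x_bins, y_bins, n_x, n_y):
    """Mutual information of the bin labels divided by log2(min(n_x, n_y))."""
    h_x = entropy(Counter(x_bins).values())
    h_y = entropy(Counter(y_bins).values())
    h_xy = entropy(Counter(zip(x_bins, y_bins)).values())
    return (h_x + h_y - h_xy) / math.log2(min(n_x, n_y))

def naive_mic(x, y, exponent=0.6):
    """Maximum normalized MI over equal-frequency grids with n_x * n_y <= N ** exponent."""
    budget = len(x) ** exponent
    best = 0.0
    for n_x in range(2, int(budget) // 2 + 1):
        x_bins = equal_frequency_bins(x, n_x)
        for n_y in range(2, int(budget) // n_x + 1):
            y_bins = equal_frequency_bins(y, n_y)
            best = max(best, normalized_mi(x_bins, y_bins, n_x, n_y))
    return best

x = list(range(64))
linear_score = naive_mic(x, [2 * v + 1 for v in x])          # noiseless line
parabola_score = naive_mic(x, [(v - 31.5) ** 2 for v in x])  # noiseless parabola
```

Both noiseless relationships score approximately 1 here: the line already at a 2 by 2 grid, the parabola at a 4 by 2 grid in which the two outer quarters of X map to the upper half of Y. On noisy data this restricted search would generally fall below what a search over bin boundaries could reach.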
