Entropic vector


The entropic vector or entropic function is a concept arising in information theory. It represents the possible values of Shannon's information entropy that subsets of one set of random variables may take. Understanding which vectors are entropic is a way to represent all possible inequalities between entropies of various subsets. For example, for any two random variables $X$ and $Y$, their joint entropy $H(X,Y)$ (the entropy of the pair) is at most the sum of the entropies of $X$ and of $Y$:
$H(X,Y) \leq H(X) + H(Y).$
Other information-theoretic measures such as conditional information, mutual information, or total correlation can be expressed in terms of joint entropy and are thus related by the corresponding inequalities.
Many inequalities satisfied by entropic vectors can be derived as linear combinations of a few basic ones, called Shannon-type inequalities.
However, it has been proven that, already for four variables, no finite set of linear inequalities is sufficient to characterize all entropic vectors.

Definition

Shannon's information entropy of a random variable $X$ is denoted $H(X)$.
For a tuple of random variables $X_1, X_2, \ldots, X_n$, we denote the joint entropy of a subset $X_{i_1}, X_{i_2}, \ldots, X_{i_k}$ as $H(X_{i_1}, X_{i_2}, \ldots, X_{i_k})$, or more concisely as $H(X_S)$, where $S = \{i_1, i_2, \ldots, i_k\} \subseteq \{1, \ldots, n\}$.
Here $X_S$ can be understood as the random variable representing the tuple $(X_{i_1}, X_{i_2}, \ldots, X_{i_k})$.
For the empty subset $S = \emptyset$, $X_S$ denotes a deterministic variable with entropy 0.
A vector $h$ in $\mathbb{R}^{2^n}$ indexed by the subsets of $\{1, \ldots, n\}$ is called an entropic vector of order $n$ if there exists a tuple of random variables $(X_1, \ldots, X_n)$ such that $h(S) = H(X_S)$ for each subset $S \subseteq \{1, \ldots, n\}$.
The set of all entropic vectors of order $n$ is denoted by $\Gamma_n^*$.
Zhang and Yeung proved that it is not closed (for $n \geq 3$), but its closure, $\overline{\Gamma_n^*}$, is a convex cone and hence characterized by the linear inequalities it satisfies.
Describing the region $\overline{\Gamma_n^*}$ is thus equivalent to characterizing all possible inequalities on joint entropies.

Example

Let $X, Y$ be two independent random variables with discrete uniform distribution over the set $\{0, 1\}$. Then
$H(X) = H(Y) = 1$ (since each is uniformly distributed over a two-element set), and
$H(X,Y) = H(X) + H(Y) = 2$ (since they are independent).
The corresponding entropic vector is thus:
$h = (h_\emptyset, h_{\{1\}}, h_{\{2\}}, h_{\{1,2\}}) = (0, 1, 1, 2) \in \Gamma_2^*.$
On the other hand, the vector $(0, 1, 1, 3)$ is not entropic, because any pair of random variables (independent or not) must satisfy $H(X,Y) \leq H(X) + H(Y)$.
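To make the definition concrete, the following minimal Python sketch (the function name entropic_vector and all variable names are chosen here for illustration, not taken from any standard library) computes $H(X_S)$ for every subset $S$ directly from a joint probability mass function and reproduces the vector $(0, 1, 1, 2)$ above.

```python
from itertools import combinations, product
from math import log2

def entropic_vector(pmf, n):
    """Return a dict mapping each subset S of {1, ..., n} (as a frozenset)
    to the joint entropy H(X_S), in bits, of the coordinates indexed by S.
    `pmf` maps outcomes (x_1, ..., x_n) to probabilities."""
    h = {}
    for k in range(n + 1):
        for S in combinations(range(1, n + 1), k):
            # Marginal distribution of the coordinates indexed by S
            marginal = {}
            for outcome, p in pmf.items():
                key = tuple(outcome[i - 1] for i in S)
                marginal[key] = marginal.get(key, 0.0) + p
            h[frozenset(S)] = sum(-p * log2(p) for p in marginal.values() if p > 0)
    return h

# Two independent fair bits: the joint distribution is uniform on {0,1}^2.
pmf = {xy: 0.25 for xy in product((0, 1), repeat=2)}
for S, value in sorted(entropic_vector(pmf, 2).items(),
                       key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(S), round(value, 6))   # [] 0.0, [1] 1.0, [2] 1.0, [1, 2] 2.0
```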

Characterizing entropic vectors: the region $\Gamma_n^*$

Shannon-type inequalities and $\Gamma_n$

For a tuple of random variables $X_1, X_2, \ldots, X_n$, their entropies satisfy the monotonicity property:
$H(X_S) \leq H(X_T)$, for any subsets $S \subseteq T$ of $\{1, \ldots, n\}$.
In particular, $H(X_S) \geq H(X_\emptyset) = 0$, for any $S \subseteq \{1, \ldots, n\}$.
The Shannon inequality says that an entropic vector is submodular:
$H(X_S) + H(X_T) \geq H(X_{S \cup T}) + H(X_{S \cap T})$, for any subsets $S, T$ of $\{1, \ldots, n\}$.
It is equivalent to the inequality stating that the conditional mutual information is non-negative:
$I(X_{S \setminus T}; X_{T \setminus S} \mid X_{S \cap T}) \geq 0.$
Many inequalities can be derived as linear combinations of Shannon inequalities; they are called Shannon-type inequalities, or basic information inequalities of Shannon's information measures. The set of vectors that satisfy them is called $\Gamma_n$; it contains $\Gamma_n^*$.
Software has been developed to automate the task of proving Shannon-type inequalities.
Given an inequality, such software is able to determine whether it is a valid Shannon-type inequality.
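To illustrate the idea behind such a check (this is only a sketch, not the actual ITIP or Xitip software; the function is_shannon_type and its calling convention are assumptions made here for exposition), the question can be phrased as a linear program: an inequality $\sum_S c_S\, h(S) \geq 0$ is Shannon-type exactly when its minimum over the polyhedral cone $\Gamma_n$ is zero rather than unbounded below.

```python
import numpy as np
from scipy.optimize import linprog

def is_shannon_type(n, coeff):
    """Decide whether sum_S coeff[S] * h(S) >= 0 holds on the whole Shannon cone Gamma_n.
    `coeff` maps frozensets of indices in {1, ..., n} to real coefficients."""
    N = 1 << n                 # subsets of {1, ..., n} encoded as bitmasks 0 .. 2^n - 1
    rows = []                  # each constraint is written as  row . h <= 0

    # Monotonicity: h(S) <= h(T) whenever S is a subset of T
    for T in range(N):
        for S in range(T):
            if S & T == S:
                r = np.zeros(N); r[S] += 1.0; r[T] -= 1.0
                rows.append(r)

    # Submodularity: h(S) + h(T) >= h(S union T) + h(S intersect T)
    for S in range(N):
        for T in range(S + 1, N):
            r = np.zeros(N)
            r[S] -= 1.0; r[T] -= 1.0; r[S | T] += 1.0; r[S & T] += 1.0
            rows.append(r)

    c = np.zeros(N)            # objective: minimize sum_S coeff[S] * h(S) over Gamma_n
    for subset, value in coeff.items():
        c[sum(1 << (i - 1) for i in subset)] += value

    bounds = [(0.0, 0.0)] + [(0.0, None)] * (N - 1)    # h(empty set) = 0, h(S) >= 0
    res = linprog(c, A_ub=np.array(rows), b_ub=np.zeros(len(rows)), bounds=bounds)
    return res.status == 0 and abs(res.fun) < 1e-9     # optimum 0  <=>  holds on Gamma_n

# H(X1, X2) <= H(X1) + H(X2) is a Shannon-type inequality:
print(is_shannon_type(2, {frozenset({1}): 1.0, frozenset({2}): 1.0, frozenset({1, 2}): -1.0}))
```

Practical tools are more efficient, for example by working with the smaller set of elemental inequalities, but the underlying linear-programming idea is the same.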

Non-Shannon-type inequalities

The question of whether Shannon-type inequalities are the only ones, that is, whether they completely characterize the region $\overline{\Gamma_n^*}$, was first asked by Te Sun Han in 1981 and more precisely by Nicholas Pippenger in 1986.
It is not hard to show that this is true for two variables, that is, $\Gamma_2^* = \Gamma_2$.
For three variables, Zhang and Yeung proved that $\Gamma_3^* \neq \Gamma_3$; however, it is still asymptotically true, meaning that the closure is equal: $\overline{\Gamma_3^*} = \Gamma_3$.
In 1998, Zhang and Yeung showed that $\overline{\Gamma_n^*} \neq \Gamma_n$ for all $n \geq 4$, by proving that the following inequality on four random variables $X_1, X_2, X_3, X_4$ is true for any entropic vector, but is not Shannon-type:
$2 I(X_3; X_4) \leq I(X_1; X_2) + I(X_1; X_3, X_4) + 3 I(X_3; X_4 \mid X_1) + I(X_3; X_4 \mid X_2).$
Further inequalities and infinite families of inequalities have been found.
These inequalities provide outer bounds for $\overline{\Gamma_n^*}$ better than the Shannon-type bound $\Gamma_n$.
In 2007, Matus proved that no finite set of linear inequalities is sufficient, already for $n = 4$ variables. In other words, the region $\overline{\Gamma_4^*}$ is not polyhedral.
Whether they can be characterized in some other way remains an open problem.
Analogous questions for von Neumann entropy in quantum information theory have been considered.
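As an informal numerical sanity check of the Zhang–Yeung inequality above (a sketch only, not a proof; the helper names H and zhang_yeung_gap are chosen here for illustration), the expression can be evaluated on randomly generated joint distributions of four binary random variables, and it stays non-negative on every sample, as the inequality predicts.

```python
import itertools, random
from math import log2

def H(pmf, S):
    """Joint entropy, in bits, of the coordinates listed in S under the joint pmf."""
    marginal = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[i - 1] for i in S)
        marginal[key] = marginal.get(key, 0.0) + p
    return sum(-p * log2(p) for p in marginal.values() if p > 0)

def zhang_yeung_gap(pmf):
    """Right-hand side minus left-hand side of the Zhang-Yeung inequality
    2 I(X3;X4) <= I(X1;X2) + I(X1;X3,X4) + 3 I(X3;X4|X1) + I(X3;X4|X2)."""
    I34   = H(pmf, (3,)) + H(pmf, (4,)) - H(pmf, (3, 4))
    I12   = H(pmf, (1,)) + H(pmf, (2,)) - H(pmf, (1, 2))
    I1_34 = H(pmf, (1,)) + H(pmf, (3, 4)) - H(pmf, (1, 3, 4))
    I34_1 = H(pmf, (1, 3)) + H(pmf, (1, 4)) - H(pmf, (1, 3, 4)) - H(pmf, (1,))
    I34_2 = H(pmf, (2, 3)) + H(pmf, (2, 4)) - H(pmf, (2, 3, 4)) - H(pmf, (2,))
    return I12 + I1_34 + 3 * I34_1 + I34_2 - 2 * I34

random.seed(0)
outcomes = list(itertools.product((0, 1), repeat=4))
for _ in range(1000):
    weights = [random.random() for _ in outcomes]
    total = sum(weights)
    pmf = {o: w / total for o, w in zip(outcomes, weights)}
    assert zhang_yeung_gap(pmf) >= -1e-9        # holds, up to floating-point error
print("Zhang-Yeung inequality held on all sampled distributions")
```

Such sampling can only support the inequality empirically, not prove it; the actual proof is information-theoretic.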

Inner bounds

Some inner bounds of $\overline{\Gamma_n^*}$ are also known.
One example is that $\overline{\Gamma_4^*}$ contains all vectors in $\Gamma_4$ which additionally satisfy the following inequality (together with those obtained by permuting the four variables), known as Ingleton's inequality for entropy:
$I(X_1; X_2) \leq I(X_1; X_2 \mid X_3) + I(X_1; X_2 \mid X_4) + I(X_3; X_4).$

Entropy and groups

Group-characterizable vectors and quasi-uniform distributions

Consider a group $G$ and subgroups $G_1, G_2, \ldots, G_n$ of $G$.
Let $G_S$ denote $\bigcap_{i \in S} G_i$ for $S \subseteq \{1, \ldots, n\}$; this is also a subgroup of $G$.
It is possible to construct a probability distribution for $n$ random variables $X_1, \ldots, X_n$ such that
$H(X_S) = \log \frac{|G|}{|G_S|}.$
Thus any information-theoretic inequality implies a group-theoretic one. For example, the basic inequality $H(X_1, X_2) \leq H(X_1) + H(X_2)$ implies that
$|G| \cdot |G_1 \cap G_2| \geq |G_1| \cdot |G_2|.$
It turns out the converse is essentially true.
More precisely, a vector $h \in \mathbb{R}^{2^n}$ is said to be group-characterizable if it can be obtained from a tuple of subgroups as above.
The set of group-characterizable vectors is denoted $\Upsilon_n$.
As said above, $\Upsilon_n \subseteq \Gamma_n^*$.
On the other hand, $\Gamma_n^*$ is contained in the topological closure of the convex closure of $\Upsilon_n$.
In other words, a linear inequality holds for all entropic vectors if and only if it holds for all vectors of the form $h(S) = \log \frac{|G|}{|G_S|}$, where $S$ goes over the subsets of some tuple of subgroups $G_1, \ldots, G_n$ of a group $G$.
Group-characterizable vectors that come from an abelian group satisfy Ingleton's inequality.
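One standard quasi-uniform construction achieving $H(X_S) = \log_2 \frac{|G|}{|G_S|}$ can be made explicit with a small sketch (illustrative Python; the group, subgroups, and function names are chosen here as an example): draw an element of a finite group $G$ uniformly at random and let $X_i$ be the coset of $G_i$ containing it.

```python
from itertools import product
from math import log2

# Example group: G = Z_2 x Z_2 under componentwise addition mod 2,
# together with two of its subgroups.
G  = list(product((0, 1), repeat=2))
G1 = [(0, 0), (1, 0)]
G2 = [(0, 0), (0, 1)]

def add(g, h):
    return ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)

def coset(g, subgroup):
    """The coset of `subgroup` containing the element g."""
    return frozenset(add(g, s) for s in subgroup)

def entropy(values):
    """Entropy, in bits, of the value obtained by mapping a uniform index through `values`."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return sum(-c / len(values) * log2(c / len(values)) for c in counts.values())

# X_i = coset of G_i containing a uniformly random element of G.
h1  = entropy([coset(u, G1) for u in G])                   # log2(|G| / |G1|)       = 1
h2  = entropy([coset(u, G2) for u in G])                   # log2(|G| / |G2|)       = 1
h12 = entropy([(coset(u, G1), coset(u, G2)) for u in G])   # log2(|G| / |G1 ∩ G2|)  = 2
print(h1, h2, h12)   # 1.0 1.0 2.0
```

In this abelian example the resulting group-characterizable vector is $(0, 1, 1, 2)$, the same entropic vector as in the two-independent-bits example above.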

Kolmogorov complexity

Kolmogorov complexity satisfies essentially the same inequalities as entropy.
Namely, denote the Kolmogorov complexity of a finite string $x$ as $K(x)$ (that is, the length of a shortest program that outputs $x$).
The joint complexity of two strings $x, y$, defined as the complexity of an encoding of the pair $(x, y)$, can be denoted $K(x, y)$.
Similarly, the conditional complexity can be denoted $K(x \mid y)$ (the length of a shortest program that outputs $x$ given $y$).
Andrey Kolmogorov noticed that these notions behave similarly to Shannon entropy, for example:
$K(x, y) \leq K(x) + K(y) + O(\log(|x| + |y|)).$
In 2000, Hammer et al. proved that indeed an inequality holds for entropic vectors if and only if the corresponding inequality in terms of Kolmogorov complexity holds up to logarithmic terms for all tuples of strings.