Menzerath's law


Menzerath's law, or Menzerath–Altmann law, is a linguistic law according to which the increase of the size of a linguistic construct results in a decrease of the size of its constituents, and vice versa.
For example, the longer a sentence (measured in clauses), the shorter its clauses (measured in words); likewise, the longer a word (measured in syllables), the shorter its syllables (measured in sounds).
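According to Altmann, it can be mathematically stated as:

$$y = a \cdot x^{b} \cdot e^{-c x}$$

where:
$y$ is the constituent size (e.g. syllable length);
$x$ is the size of the linguistic construct under inspection (e.g. the number of syllables per word);
$a$, $b$, $c$ are parameters.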
The law can be explained by the assumption that linguistic segments contain information about their own structure. The further assumption that the length of this structural information is independent of the length of the segment's other content yields an alternative formula, which has also been tested empirically with success.
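As an illustration of how the parameters of Altmann's form, y = a · x^b · e^(−c·x), might be estimated from data, the following is a minimal sketch. It assumes strictly positive observations and uses ordinary least squares on the log-transformed relation ln y = ln a + b ln x − c x; this is one common fitting approach chosen for the example, not necessarily the procedure used in the literature. The function name `fit_menzerath_altmann` and the parameter values used to generate the demonstration data are hypothetical.

```python
import numpy as np


def fit_menzerath_altmann(x, y):
    """Fit y = a * x**b * exp(-c * x) by least squares on log-transformed data.

    Taking logarithms gives ln y = ln a + b * ln x - c * x, which is linear
    in the unknowns (ln a, b, c). Returns the tuple (a, b, c).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.any(x <= 0) or np.any(y <= 0):
        raise ValueError("x and y must be strictly positive")
    # Design matrix columns: intercept (ln a), ln x (coefficient b), -x (coefficient c).
    design = np.column_stack([np.ones_like(x), np.log(x), -x])
    coeffs, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)
    ln_a, b, c = coeffs
    return float(np.exp(ln_a)), float(b), float(c)


# Synthetic, noise-free data generated from made-up parameters (illustration only).
true_a, true_b, true_c = 3.0, -0.3, 0.02
x_demo = np.arange(1, 9, dtype=float)  # construct sizes 1..8
y_demo = true_a * x_demo**true_b * np.exp(-true_c * x_demo)

a_hat, b_hat, c_hat = fit_menzerath_altmann(x_demo, y_demo)
print(a_hat, b_hat, c_hat)
```

Because the demonstration data are generated without noise, the fit should recover the generating parameters up to floating-point error. With real measurements, fitting in log space weights relative rather than absolute errors, which may or may not be appropriate for a given data set.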
Beyond quantitative linguistics, Menzerath's law can be discussed in any
multi-level complex system. Given three levels, let $x$ be the
number of middle-level units contained in a high-level unit and
$y$ the average number of low-level units contained in a middle-level unit;
Menzerath's law then claims a negative
correlation between $x$ and $y$.
Menzerath's law has been shown to hold for both the
base-exon-gene levels in the human genome
and the base-chromosome-genome levels in genomes from a collection of species. In addition, it has been shown to accurately predict the distribution of protein lengths, measured in number of amino acids, in the proteomes of ten organisms.
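The three-level formulation can be made concrete with a small sketch. Here each high-level unit is represented as a list whose entries are the sizes of its middle-level units, counted in low-level units (for instance, a sentence as a list of clause lengths in words). The data are invented toy values, and the helper name `pearson` is hypothetical; the point is only to show how the two quantities (the number of middle-level units per high-level unit, and the average size of those middle-level units) and their correlation would be computed.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sqrt(sum((a - mx) ** 2 for a in xs))
    sy = sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)


# Toy data (invented for illustration): each inner list is one high-level unit,
# and each number is the size of a middle-level unit in low-level units.
units = [
    [12],
    [9, 8],
    [7, 6, 7],
    [5, 6, 5, 4],
    [4, 5, 4, 4, 3],
]

x = [len(u) for u in units]   # middle-level units per high-level unit
y = [mean(u) for u in units]  # average low-level units per middle-level unit

r = pearson(x, y)
print(x, y, r)
```

A negative value of `r` is what the law predicts; in practice one would compute it over many units and usually average `y` within each value of `x` before fitting or correlating.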