The histone code is a hypothesis that the transcription of genetic information encoded in DNA is in part regulated by chemical modifications to histone proteins, primarily on their unstructured ends. Together with similar modifications such as DNA methylation, it is part of the epigenetic code. Histones associate with DNA to form nucleosomes, which themselves bundle to form chromatin fibers, which in turn make up the more familiar chromosome. Histones are globular proteins with a flexible N-terminal tail that protrudes from the nucleosome. Many of the histone tail modifications correlate well with chromatin structure, and both histone modification state and chromatin structure correlate well with gene expression levels. The critical concept of the histone code hypothesis is that histone modifications recruit other proteins, which specifically recognize the modified histone through protein domains specialized for that purpose, rather than simply stabilizing or destabilizing the interaction between the histone and the underlying DNA. These recruited proteins then act to alter chromatin structure actively or to promote transcription. For details of gene expression regulation by histone modifications see table below.
The hypothesis
The hypothesis is that chromatin-DNA interactions are guided by combinations of histone modifications. While it is accepted that modifications to histone tails alter chromatin structure, the precise mechanisms by which these modifications influence DNA-histone interactions are not yet fully understood. However, some specific examples have been worked out in detail. For example, phosphorylation of serine residues 10 and 28 on histone H3 is a marker for chromosomal condensation. Similarly, the combination of phosphorylation of serine residue 10 and acetylation of lysine residue 14 on histone H3 is a tell-tale sign of active transcription.
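The two worked examples above share one mark (phosphorylated serine 10) yet signal different outcomes depending on the partner mark, which is the core of the combinatorial idea. The following is a minimal sketch of that reading, not a biological model: the mark labels (`H3S10ph`, `H3S28ph`, `H3K14ac`), the lookup table and the `interpret` function are illustrative names assumed here, and the table contains only the two combinations stated in the text.

```python
# Toy lookup of mark combinations on histone H3.
# Assumption: only the two combinations named in the text are encoded;
# the labels and the function name are illustrative, not a standard API.
COMBINATION_READOUT = {
    frozenset({"H3S10ph", "H3S28ph"}): "chromosomal condensation",
    frozenset({"H3S10ph", "H3K14ac"}): "active transcription",
}


def interpret(marks):
    """Return the readouts whose full mark combination is present."""
    present = frozenset(marks)
    return [
        meaning
        for combination, meaning in COMBINATION_READOUT.items()
        if combination <= present  # every mark of the combination is present
    ]


print(interpret({"H3S10ph", "H3K14ac"}))  # ['active transcription']
print(interpret({"H3S10ph"}))             # [] (a single mark matches no combination)
```

In this sketch a single mark matches nothing by itself; only a complete combination produces a readout, which mirrors the claim that marks are interpreted together rather than one at a time.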
Modifications
Well-characterized modifications to histones include:
Methylation: Both lysine and arginine residues are known to be methylated. Methylated lysines are the best understood marks of the histone code, as specific methylated lysines correlate well with gene expression states. Methylation of lysines H3K4 and H3K36 is correlated with transcriptional activation, while demethylation of H3K4 is correlated with silencing of the genomic region. Methylation of lysines H3K9 and H3K27 is correlated with transcriptional repression. In particular, H3K9me3 is highly correlated with constitutive heterochromatin. Methylation of histone lysines also has a role in DNA repair. For instance, H3K36me3 is required for homologous recombinational repair of DNA double-strand breaks, and H4K20me2 facilitates repair of such breaks by non-homologous end joining.
Acetylation (by histone acetyltransferases, HATs) and deacetylation (by histone deacetylases, HDACs): Acetylation tends to define the 'openness' of chromatin, as acetylated histones cannot pack together as tightly as deacetylated histones.
Phosphorylation
Ubiquitination
However, there are many more histone modifications, and sensitive mass spectrometry approaches have recently greatly expanded the catalog. A very basic summary of the histone code for gene expression status is given below:
Unlike this simplified model, any real histone code has the potential to be massively complex; each of the four standard histones can be simultaneously modified at multiple different sites with multiple different modifications. To give an idea of this complexity, histone H3 contains nineteen lysines known to be methylated, and each can be un-, mono-, di- or tri-methylated. If the modifications are independent, this allows a potential 4^19, or roughly 275 billion, different lysine methylation patterns, far more than the maximum number of histones in a human genome. And this does not include lysine acetylation, arginine methylation or threonine/serine/tyrosine phosphorylation, not to mention modifications of other histones. Every nucleosome in a cell can therefore have a different set of modifications, raising the question of whether common patterns of histone modifications exist. A study of about 40 histone modifications across human gene promoters found over 4000 different combinations in use, over 3000 of them occurring at only a single promoter. However, patterns were discovered, including a set of 17 histone modifications that are present together at over 3000 genes. Therefore, patterns of histone modifications do occur, but they are very intricate, and we currently have a detailed biochemical understanding of the importance of only a relatively small number of modifications. Structural determinants of histone recognition by readers, writers and erasers of the histone code are revealed by a growing body of experimental data.
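The combinatorial estimate above (four methylation states at each of nineteen lysines, assumed independent) can be checked directly. This is a short sketch of the arithmetic only; the constant and function names are illustrative, and the independence of sites is the same simplifying assumption the text makes.

```python
# Each methylatable lysine is assumed to take one of four states,
# independently of every other site (the simplification used in the text).
METHYLATION_STATES = ("un", "mono", "di", "tri")
METHYLATABLE_LYSINES_H3 = 19  # figure quoted in the text


def count_methylation_patterns(n_sites, n_states=len(METHYLATION_STATES)):
    """Number of distinct patterns when every site varies independently."""
    return n_states ** n_sites


patterns = count_methylation_patterns(METHYLATABLE_LYSINES_H3)
print(f"{patterns:,}")  # 274,877,906,944
```

The count grows by a factor of four with every additional site, which is why including other modification types and other histones pushes the theoretical space far beyond what any cell could sample.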