Satellite DNA


Satellite DNA consists of very large arrays of tandemly repeated, non-coding DNA. It is the main component of functional centromeres and forms the main structural constituent of heterochromatin.
The name "satellite DNA" arises because repetition of a short DNA sequence tends to produce a base composition (the relative frequency of adenine, cytosine, guanine and thymine) different from that of bulk DNA, and therefore a different density. As a result, the repeats form a second, or "satellite", band when genomic DNA is separated on a density gradient.
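The link between base composition and density can be illustrated with a short sketch. It assumes a commonly cited empirical linear relation for double-stranded DNA in a caesium chloride gradient (density ≈ 1.660 + 0.098 × GC fraction, in g/cm³), which is not stated in this article. The two sequences and the function names are hypothetical and chosen only for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def buoyant_density(gc: float) -> float:
    """Approximate CsCl buoyant density (g/cm^3) from GC fraction.

    Assumption: the empirical linear relation 1.660 + 0.098 * GC.
    """
    return 1.660 + 0.098 * gc


# Hypothetical sequences, not taken from any real genome.
bulk = "ATGCATGCAT" * 10      # 40% GC, standing in for bulk DNA
satellite = "AATAT" * 20      # AT-rich tandem repeat, 0% GC

for name, seq in (("bulk", bulk), ("satellite", satellite)):
    gc = gc_fraction(seq)
    print(f"{name}: GC = {gc:.2f}, density = {buoyant_density(gc):.4f} g/cm^3")
```

Under these assumptions the AT-rich repeat is predicted to be less dense than the bulk sequence, so it would band separately on the gradient. A GC-rich repeat would be predicted to band on the denser side instead.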

Satellite DNA families in humans

Satellite DNA, together with minisatellite and microsatellite DNA, constitutes the tandem repeats.
The major satellite DNA families in humans are called:
Satellite family | Size of repeat unit (bp) | Location in human chromosomes
α | 170 | All chromosomes
β | 68 | Centromeres of chromosomes 1, 9, 13, 14, 15, 21, 22 and Y
Satellite 1 | 25-48 | Centromeres and other regions in heterochromatin of most chromosomes
Satellite 2 | 5 | Most chromosomes
Satellite 3 | 5 | Most chromosomes

Length

A repeated pattern can range from 1 base pair to several thousand base pairs in length, and a single satellite DNA block can span several megabases without interruption. Long repeat units have been described that contain domains of shorter repeated segments and mononucleotide runs, arranged in clusters of microsatellites, in which the differences among individual copies of the longer repeat units were concentrated. Most satellite DNA is localized to the telomeric or centromeric regions of the chromosome. The nucleotide sequence of the repeats is fairly well conserved across species, but variation in the length of the repeat is common. For example, minisatellite DNA is a short region of repeating elements longer than 9 nucleotides, whereas microsatellites are considered to have repeating elements of 1-8 nucleotides. Differences in the number of repeats present in a region are the basis for DNA fingerprinting.
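The idea that individuals differ in the number of repeat copies at a locus can be sketched as follows. This is a minimal illustration, not a description of any laboratory fingerprinting protocol: the function name, the flanking sequences and the two alleles are all hypothetical.

```python
def longest_tandem_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`.

    Every start position is tried, so runs are found even when the unit
    overlaps itself (for example "ACA").
    """
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    step = len(unit)
    for start in range(len(sequence) - step + 1):
        copies = 0
        pos = start
        while sequence.startswith(unit, pos):
            copies += 1
            pos += step
        best = max(best, copies)
    return best


# Two hypothetical alleles of the same locus, differing only in copy number.
allele_a = "GGT" + "CA" * 12 + "TTG"
allele_b = "GGT" + "CA" * 15 + "TTG"

print(longest_tandem_run(allele_a, "CA"), longest_tandem_run(allele_b, "CA"))
```

Comparing such copy numbers across several loci is what distinguishes one individual's profile from another's in this simplified picture.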

Origin

Microsatellites are thought to have originated by polymerase slippage during DNA replication. This idea comes from the observation that microsatellite alleles are usually length-polymorphic; specifically, the length differences observed between microsatellite alleles are generally multiples of the repeat unit length.
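The observation above amounts to a simple arithmetic check: if alleles arise by gaining or losing whole repeat units, every allele length should differ from the shortest one by a multiple of the unit length. The sketch below assumes a hypothetical set of allele lengths and a hypothetical 4 bp repeat unit; the function name is illustrative only.

```python
from typing import Iterable


def lengths_consistent_with_slippage(allele_lengths: Iterable[int],
                                     unit_length: int) -> bool:
    """True if all alleles differ from the shortest by whole repeat units."""
    lengths = list(allele_lengths)
    if unit_length < 1 or not lengths:
        raise ValueError("need a positive unit length and at least one allele")
    shortest = min(lengths)
    return all((length - shortest) % unit_length == 0 for length in lengths)


# Hypothetical allele lengths in base pairs for a tetranucleotide repeat.
observed = [142, 146, 150, 158]
print(lengths_consistent_with_slippage(observed, 4))
```

A set such as 142 bp and 145 bp would fail this check for a 4 bp unit, which would point to some change other than the gain or loss of whole repeats.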

Pathology

Microsatellite expansion is often found in transcription units. The expanded repeat often disrupts proper protein synthesis, leading to diseases such as myotonic dystrophy.

Structure

Satellite DNA adopts higher-order three-dimensional structures in eukaryotic organisms. This was demonstrated in the land crab Gecarcinus lateralis, in which 3% of the genome is a GC-rich satellite made up of repeats of a ~2100 base pair sequence called RU. The RU was arranged in long tandem arrays with approximately 16,000 copies per genome. Several RU sequences were cloned and sequenced, revealing conserved regions of conventional DNA sequence over stretches greater than 550 bp, interspersed with five "divergent domains" within each copy of RU.
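Taken at face value, the approximate figures quoted above fix the total amount of RU sequence and, by extension, a rough genome size. The following sketch only multiplies out those rounded numbers; the implied genome size is a back-of-the-envelope consequence of them, not a measured value reported here, and the variable names are illustrative.

```python
# Approximate figures quoted in the text above.
ru_length_bp = 2100        # length of one RU copy (~2100 bp)
copies_per_genome = 16_000 # approximate number of RU copies
genome_fraction = 0.03     # RU satellite as a fraction of the genome (3%)

# Total DNA occupied by the RU satellite.
satellite_bp = ru_length_bp * copies_per_genome

# Genome size implied if that total really is 3% of the genome.
implied_genome_bp = satellite_bp / genome_fraction

print(f"RU satellite total: {satellite_bp / 1e6:.1f} Mb")
print(f"Implied genome size: {implied_genome_bp / 1e9:.2f} Gb")
```

Because all three inputs are approximate, the result should be read as an order-of-magnitude consistency check rather than an estimate.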
Four divergent domains consisted of microsatellite repeats, biased in base composition, with purines on one strand and pyrimidines on the other. Some contained mononucleotide repeats of C:G base pairs approximately 20 bp in length. These strand-biased domains ranged in length from approximately 20 bp to greater than 250 bp. The most prevalent repeated sequences in the embedded microsatellite regions were CT:AG, CCT:AGG, and CCCT:AGGG. These repeating sequences were shown to adopt triple-stranded DNA structures under superhelical stress or at slightly acidic pH.
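The strand bias described above, with pyrimidines on one strand and purines on the other, follows directly from base pairing and can be checked with a few lines of code. This is a minimal sketch; the function names are hypothetical and the input is simply the CCT:AGG repeat named in the text, repeated five times for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def purine_fraction(seq: str) -> float:
    """Fraction of purines (A or G) in a strand."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "AG" for base in seq) / len(seq)


top = "CCT" * 5                      # pyrimidine-only strand
bottom = reverse_complement(top)     # its purine-only partner

print(top, purine_fraction(top))
print(bottom, purine_fraction(bottom))
```

The same check applied to CT or CCCT repeats gives AG and AGGG repeats on the opposite strand, matching the pairs listed above.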
Between the strand-biased microsatellite repeats and C:G mononucleotide repeats, all sequence variations retained one or two base pairs with A interrupting the pyrimidine-rich strand and T interrupting the purine-rich strand. This sequence feature appeared between microsatellite repeats and C:G mononucleotides in all strand-biased domains sequenced. These interruptions in compositional bias adopted highly distorted conformations, as shown by their response to nuclease enzymes, presumably due to steric effects of the larger purines protruding into the complementary strand of smaller pyrimidine rings. The sequence TTAA:TTAA was found in the longest such domain of RU, which produced the strongest of all responses to nucleases. That particular strand-biased divergent domain was subcloned and its altered helical structure was studied in greater detail.
A fifth divergent domain in the RU sequence was characterized by variations of a symmetrical DNA sequence motif of alternating purines and pyrimidines shown to adopt a left-handed Z-DNA/stem-loop structure under superhelical stress. The conserved symmetrical Z-DNA was abbreviated Z4Z5NZ15NZ5Z4, where Z represents alternating purine/pyrimidine sequences. Except for the Z15 sequence motif, Z-DNA sequences were variable among different copies of the RU while the alternating purine/pyrimidine symmetrical Z-DNA sequence motif was preserved. A stem-loop structure was centered in the Z15 element at the highly conserved palindromic sequence CGCACGTGCG:CGCACGTGCG and was flanked by extended palindromic Z-DNA sequences over a 35 bp region. Many RU variants showed deletions of at least 10 bp outside the Z4Z5NZ15NZ5Z4 structural element, while others had additional Z-DNA sequences lengthening the alternating purine and pyrimidine domain to over 50 bp.
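Two sequence properties mentioned above, strict purine/pyrimidine alternation and palindromic symmetry, can each be tested directly on the conserved sequence CGCACGTGCG. The sketch below is illustrative only: the function names are hypothetical, and passing both tests says nothing by itself about whether a sequence actually forms Z-DNA or a stem-loop under given conditions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PURINES = set("AG")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_alternating_purine_pyrimidine(seq: str) -> bool:
    """True if every base is of the opposite class to its neighbour."""
    classes = [base in PURINES for base in seq.upper()]
    return all(a != b for a, b in zip(classes, classes[1:]))


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)


core = "CGCACGTGCG"  # conserved sequence quoted in the text
print(is_alternating_purine_pyrimidine(core), is_palindromic(core))
```

A palindromic sequence in this sense can pair with itself, which is the property that allows a stem-loop to form; the strand-biased repeat CCTCCT, by contrast, fails both tests.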
Elsewhere in the RU, additional tandem repeats of the CGCAC:GTGCG sequence motif were found inserted into the longest of the four strand-biased pyrimidine:purine divergent domains, the one studied in detail as discussed above.
One extended RU sequence was shown to have six tandem copies of a 142 bp amplified (AMPL) sequence motif inserted into a region bordered by inverted repeats, where most copies contained just one AMPL sequence element. There were no nuclease-sensitive altered structures or significant sequence divergence in the relatively conventional AMPL sequence. A truncated RU sequence (TRU), 327 bp shorter than most clones, arose from a single base change that created a second EcoRI restriction site.
Another crab, the hermit crab Pagurus pollicaris, was shown to have a family of AT-rich satellites with inverted repeat structures that comprised 30% of the entire genome. A further cryptic satellite from the same crab, with the sequence CCTA:TAGG, was found inserted into some of the palindromes.