Extrachromosomal DNA


Extrachromosomal DNA is any DNA that is found outside the chromosomes, either inside or outside the nucleus of a cell. Most DNA in an individual genome is found in chromosomes contained in the nucleus. Multiple forms of extrachromosomal DNA exist, and some serve important biological functions or play a role in disease, such as ecDNA in cancer.
In prokaryotes, nonviral extrachromosomal DNA is found primarily in plasmids, whereas in eukaryotes it is found primarily in organelles. Mitochondrial DNA is a main source of extrachromosomal DNA in eukaryotes. The fact that this organelle contains its own DNA supports the hypothesis that mitochondria originated as bacterial cells engulfed by ancestral eukaryotic cells. Extrachromosomal DNA is often used in research on replication because it is easy to identify and isolate.
Although extrachromosomal circular DNA (eccDNA) is found in normal eukaryotic cells, extrachromosomal DNA (ecDNA) is a distinct entity that has been identified in the nuclei of cancer cells and has been shown to carry many copies of driver oncogenes. ecDNA is considered to be a primary mechanism of gene amplification, resulting in many copies of driver oncogenes and very aggressive cancers.
Extrachromosomal DNA in the cytoplasm has been found to be structurally different from nuclear DNA. Cytoplasmic DNA is less methylated than DNA found within the nucleus. It was also confirmed that the sequences of cytoplasmic DNA differ from those of nuclear DNA in the same organism, showing that cytoplasmic DNA is not simply fragments of nuclear DNA. In cancer cells, ecDNA has been shown to be largely confined to the nucleus.
In addition to cellular DNA found outside the nucleus, viral genomes introduced by infection are also an example of extrachromosomal DNA.

Prokaryotic

Although prokaryotic organisms do not possess a membrane-bound nucleus like the eukaryotes, they do contain a nucleoid region in which the main chromosome is found. Extrachromosomal DNA exists in prokaryotes outside the nucleoid region as circular or linear plasmids. Bacterial plasmids are typically short sequences, ranging from 1 kilobase (kb) to a few hundred kb, and contain an origin of replication which allows the plasmid to replicate independently of the bacterial chromosome. The total number of a particular plasmid within a cell is referred to as the copy number and can range from as few as two copies per cell to as many as several hundred copies per cell. Circular bacterial plasmids are classified according to the special functions that the genes encoded on the plasmid provide. Fertility plasmids, or F plasmids, allow conjugation to occur, whereas resistance plasmids, or R plasmids, contain genes that confer resistance to a variety of antibiotics such as ampicillin and tetracycline. There are also virulence plasmids, which contain the genetic elements necessary for bacteria to become pathogenic, and degradative plasmids, which harbor the genes that allow bacteria to degrade a variety of substances such as aromatic compounds and xenobiotics. Bacterial plasmids can also function in pigment production, nitrogen fixation and resistance to heavy metals in the bacteria that possess them.
Naturally occurring circular plasmids can be modified to contain multiple resistance genes and several unique restriction sites, making them valuable tools as cloning vectors in biotechnology applications. Circular bacterial plasmids are also the basis for the production of DNA vaccines. Plasmid DNA vaccines are genetically engineered to contain a gene that encodes an antigen or a protein produced by a pathogenic virus, bacterium or other parasite. Once delivered into the host, the products of the plasmid genes stimulate both the innate and the adaptive immune response of the host. The plasmids are often coated with some type of adjuvant prior to delivery to enhance the immune response from the host.
Linear bacterial plasmids have been identified in several species of spirochete bacteria, including members of the genus Borrelia, in several species of the Gram-positive soil bacteria of the genus Streptomyces, and in the Gram-negative species Thiobacillus versutus, a bacterium that oxidizes sulfur. The linear plasmids of prokaryotes have either a hairpin loop or a covalently bonded protein at the telomeric ends of the DNA molecule. The Borrelia linear plasmids, which end in adenine-thymine rich hairpin loops, range in size from 5 kb to over 200 kb and contain the genes responsible for producing a group of major surface proteins, or antigens, that allow the bacteria to evade the immune response of the infected host. The linear plasmids which contain a protein that has been covalently attached to the 5’ end of the DNA strands are known as invertrons; they carry inverted terminal repeats and can range in size from 9 kb to over 600 kb. The linear plasmids with a covalently attached protein may assist with bacterial conjugation and integration of the plasmids into the genome. These types of linear plasmids represent the largest class of extrachromosomal DNA, as they are not only present in certain bacterial cells, but all linear extrachromosomal DNA molecules found in eukaryotic cells also take on this invertron structure with a protein attached to the 5’ end.

Eukaryotic

Mitochondrial

The mitochondria present in eukaryotic cells contain multiple copies of mitochondrial DNA, referred to as mtDNA, which is housed within the mitochondrial matrix. In multicellular animals, including humans, the circular mtDNA chromosome contains 13 genes that encode proteins that are part of the electron transport chain and 24 genes that produce RNA necessary for the production of mitochondrial proteins; these 24 genes comprise 2 rRNA genes and 22 tRNA genes. The size of an animal mtDNA molecule is roughly 16.6 kb, and although it encodes its own tRNAs, rRNAs and mRNAs, proteins produced from nuclear genes are still required for the mtDNA to replicate and for mitochondrial proteins to be translated. There is only one region of the mitochondrial chromosome that does not contain a coding sequence: the 1 kb region known as the D-loop, to which nuclear regulatory proteins bind. The number of mtDNA molecules per mitochondrion varies from species to species as well as between cells with different energy demands. For example, muscle and liver cells contain more copies of mtDNA per mitochondrion than blood and skin cells do. Because the mtDNA lies close to the electron transport chain in the mitochondrial inner membrane, where reactive oxygen species are produced, and because the mtDNA molecule is not bound or protected by histones, mtDNA is more susceptible to DNA damage than nuclear DNA. Where mtDNA damage does occur, the DNA can either be repaired via base excision repair pathways, or the damaged mtDNA molecule is destroyed.
The standard genetic code by which nuclear genes are translated is nearly universal, meaning that each 3-base sequence of DNA codes for the same amino acid regardless of the species from which the DNA comes. However, this universality does not extend to the mitochondrial DNA found in fungi, animals, protists and plants. While most of the 3-base sequences in the mtDNA of these organisms do code for the same amino acids as those of the nuclear genetic code, some mtDNA sequences code for amino acids different from those of their nuclear DNA counterparts.
Genetic code | Translation table | DNA codon involved | RNA codon involved | Translation with this code | Translation with the universal code
--- | --- | --- | --- | --- | ---
Vertebrate mitochondrial | 2 | AGA | AGA | Ter | Arg
Vertebrate mitochondrial | 2 | AGG | AGG | Ter | Arg
Vertebrate mitochondrial | 2 | ATA | AUA | Met | Ile
Vertebrate mitochondrial | 2 | TGA | UGA | Trp | Ter
Yeast mitochondrial | 3 | ATA | AUA | Met | Ile
Yeast mitochondrial | 3 | CTT | CUU | Thr | Leu
Yeast mitochondrial | 3 | CTC | CUC | Thr | Leu
Yeast mitochondrial | 3 | CTA | CUA | Thr | Leu
Yeast mitochondrial | 3 | CTG | CUG | Thr | Leu
Yeast mitochondrial | 3 | TGA | UGA | Trp | Ter
Yeast mitochondrial | 3 | CGA | CGA | absent | Arg
Yeast mitochondrial | 3 | CGC | CGC | absent | Arg
Mold, protozoan, and coelenterate mitochondrial | 4 and 7 | TGA | UGA | Trp | Ter
Invertebrate mitochondrial | 5 | AGA | AGA | Ser | Arg
Invertebrate mitochondrial | 5 | AGG | AGG | Ser | Arg
Invertebrate mitochondrial | 5 | ATA | AUA | Met | Ile
Invertebrate mitochondrial | 5 | TGA | UGA | Trp | Ter
Echinoderm and flatworm mitochondrial | 9 | AAA | AAA | Asn | Lys
Echinoderm and flatworm mitochondrial | 9 | AGA | AGA | Ser | Arg
Echinoderm and flatworm mitochondrial | 9 | AGG | AGG | Ser | Arg
Echinoderm and flatworm mitochondrial | 9 | TGA | UGA | Trp | Ter
Ascidian mitochondrial | 13 | AGA | AGA | Gly | Arg
Ascidian mitochondrial | 13 | AGG | AGG | Gly | Arg
Ascidian mitochondrial | 13 | ATA | AUA | Met | Ile
Ascidian mitochondrial | 13 | TGA | UGA | Trp | Ter
Alternative flatworm mitochondrial | 14 | AAA | AAA | Asn | Lys
Alternative flatworm mitochondrial | 14 | AGA | AGA | Ser | Arg
Alternative flatworm mitochondrial | 14 | AGG | AGG | Ser | Arg
Alternative flatworm mitochondrial | 14 | TAA | UAA | Tyr | Ter
Alternative flatworm mitochondrial | 14 | TGA | UGA | Trp | Ter
Chlorophycean mitochondrial | 16 | TAG | UAG | Leu | Ter
Trematode mitochondrial | 21 | TGA | UGA | Trp | Ter
Trematode mitochondrial | 21 | ATA | AUA | Met | Ile
Trematode mitochondrial | 21 | AGA | AGA | Ser | Arg
Trematode mitochondrial | 21 | AGG | AGG | Ser | Arg
Trematode mitochondrial | 21 | AAA | AAA | Asn | Lys
Scenedesmus obliquus mitochondrial | 22 | TCA | UCA | Ter | Ser
Scenedesmus obliquus mitochondrial | 22 | TAG | UAG | Leu | Ter
Thraustochytrium mitochondrial | 23 | TTA | UUA | Ter | Leu
Pterobranchia mitochondrial | 24 | AGA | AGA | Ser | Arg
Pterobranchia mitochondrial | 24 | AGG | AGG | Lys | Arg
Pterobranchia mitochondrial | 24 | TGA | UGA | Trp | Ter

The coding differences are thought to be a result of chemical modifications in the transfer RNAs that interact with the messenger RNAs produced as a result of transcribing the mtDNA sequences.
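
The effect of these reassignments can be illustrated with a short Python sketch. It builds the standard code from the compact amino-acid string used for translation table 1, applies the four vertebrate mitochondrial changes listed in the table above, and translates the same sequence with both codes. The names `STANDARD`, `VERTEBRATE_MITO` and `translate`, and the example sequence, are illustrative assumptions and do not come from any established library.

```python
from itertools import product

BASES = "TCAG"

# Standard genetic code (translation table 1), one letter per codon,
# with codons enumerated in TCAG order; "*" marks a termination codon.
STANDARD_AA = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AA)
}

# Vertebrate mitochondrial code (translation table 2): only the four
# codons that differ from the standard code are overridden.
VERTEBRATE_MITO = {**STANDARD, "AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"}


def translate(dna, table):
    """Translate a DNA coding sequence codon by codon until a stop codon."""
    dna = dna.upper()
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical sequence chosen to contain three reassigned codons:
# ATG ATA TGA AGA TTT
sequence = "ATGATATGAAGATTT"
print(translate(sequence, STANDARD))         # MI  (TGA read as stop)
print(translate(sequence, VERTEBRATE_MITO))  # MMW (AGA read as stop)
```

With the standard code, translation stops at TGA after two residues; with the vertebrate mitochondrial code, ATA is read as Met, TGA as Trp, and AGA terminates translation. The other codes in the table could be expressed the same way, as small sets of overrides on the standard dictionary.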

Chloroplast

Eukaryotic chloroplasts, as well as the other plant plastids, also contain extrachromosomal DNA molecules. Most chloroplasts house all of their genetic material in a single ring-shaped chromosome; however, in some species there is evidence of multiple smaller ring-shaped plasmids. A recent theory that questions the current standard model of ring-shaped chloroplast DNA (cpDNA) suggests that cpDNA may more commonly take a linear shape. A single molecule of cpDNA can contain anywhere from 100 to 200 genes and varies in size from species to species. The size of cpDNA in higher plants is around 120–160 kb. The genes found on the cpDNA code for mRNAs that are responsible for producing necessary components of the photosynthetic pathway, as well as for tRNAs, rRNAs, RNA polymerase subunits, and ribosomal protein subunits. Like mtDNA, cpDNA is not fully autonomous and relies upon nuclear gene products for replication and production of chloroplast proteins. Chloroplasts contain multiple copies of cpDNA, and the number can vary not only from species to species or cell type to cell type, but also within a single cell depending upon the age and stage of development of the cell. For example, the cpDNA content of chloroplasts in young cells, during the early stages of development when the chloroplasts are in the form of indistinct proplastids, is much higher than when the cell has matured and expanded and contains fully mature plastids.

Circular

Extrachromosomal circular DNA (eccDNA) is present in all eukaryotic cells, is usually derived from genomic DNA, and consists of repetitive sequences of DNA found in both coding and non-coding regions of chromosomes. EccDNA can vary in size from less than 2000 base pairs to more than 20,000 base pairs. In plants, eccDNA contains repeated sequences similar to those found in the centromeric regions of the chromosomes and in repetitive satellite DNA. In animals, eccDNA molecules have been shown to contain repetitive sequences that are seen in satellite DNA, 5S ribosomal DNA and telomere DNA. Certain organisms, such as yeast, rely on chromosomal DNA replication to produce eccDNA, whereas in other organisms, such as mammals, eccDNA formation can occur independently of the replication process. The function of eccDNA has not been widely studied, but it has been proposed that the production of eccDNA elements from genomic DNA sequences adds to the plasticity of the eukaryotic genome and can influence genome stability, cell aging and the evolution of chromosomes.
A distinct type of extrachromosomal DNA, denoted ecDNA, is commonly observed in human cancer cells. ecDNA found in cancer cells contains one or more genes that confer a selective advantage. ecDNA is much larger than eccDNA and is visible by light microscopy. ecDNA in cancers generally ranges in size from 1–3 Mb and beyond. Large ecDNA molecules have been found in the nuclei of human cancer cells and have been shown to carry many copies of driver oncogenes, which are transcribed in tumor cells. Based on this evidence, it is thought that ecDNA contributes to cancer growth.

Viral

Viral DNA is an example of extrachromosomal DNA. Understanding viral genomes is very important for understanding the evolution and mutation of a virus. Some viruses, such as HIV and oncogenic viruses, incorporate their own DNA into the genome of the host cell. Viral genomes can be made up of single-stranded or double-stranded DNA and can be found in both linear and circular form.
One example of a viral infection in which the viral genome persists as extrachromosomal DNA is the human papillomavirus (HPV). The HPV DNA genome undergoes three distinct stages of replication: establishment, maintenance and amplification. HPV infects epithelial cells in the anogenital tract and oral cavity. Normally, HPV is detected and cleared by the immune system; the recognition of viral DNA is an important part of immune responses. For the virus to persist, its circular genome must be replicated and inherited during cell division.

Recognition by host cell

Cells can recognize foreign cytoplasmic DNA. Understanding the recognition pathways has implications for the prevention and treatment of diseases. Cells have sensors that can specifically recognize viral DNA, such as those of the Toll-like receptor (TLR) pathway.
The Toll pathway was first recognized in insects as a pathway that allows certain cell types to act as sensors capable of detecting a variety of bacterial or viral genomes and pathogen-associated molecular patterns (PAMPs). PAMPs are known to be potent activators of innate immune signaling. There are approximately 10 human Toll-like receptors. Different human TLRs detect different PAMPs: lipopolysaccharides are detected by TLR4, viral dsRNA by TLR3, viral ssRNA by TLR7 and TLR8, and viral or bacterial unmethylated DNA by TLR9. TLR9 has evolved to detect CpG DNA commonly found in bacteria and viruses and to initiate the production of interferon (IFN) and other cytokines.
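
The receptor-to-ligand pairings listed above can be written as a small lookup table. The following Python sketch encodes only the five receptors named in this section; the dictionary `TLR_LIGANDS` and the helper `receptors_for` are hypothetical names used for illustration, and the substring search is a convenience rather than a model of how receptors recognize their ligands.

```python
# Ligands (PAMPs) for the human Toll-like receptors named in this section.
# Receptors not mentioned in the text are deliberately left out.
TLR_LIGANDS = {
    "TLR3": "viral dsRNA",
    "TLR4": "lipopolysaccharides",
    "TLR7": "viral ssRNA",
    "TLR8": "viral ssRNA",
    "TLR9": "viral or bacterial unmethylated DNA",
}


def receptors_for(pattern):
    """Return the sorted receptor names whose listed ligand mentions `pattern`."""
    needle = pattern.lower()
    return sorted(
        name for name, ligand in TLR_LIGANDS.items()
        if needle in ligand.lower()
    )


print(receptors_for("ssRNA"))  # ['TLR7', 'TLR8']
print(receptors_for("DNA"))    # ['TLR9']
```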

Inheritance

Inheritance of extrachromosomal DNA differs from the inheritance of nuclear DNA found in chromosomes. Unlike chromosomes, ecDNA does not contain centromeres and therefore exhibits a non-Mendelian inheritance pattern that gives rise to heterogeneous cell populations. In humans, virtually all of the cytoplasm is inherited from the egg of the mother. For this reason, organelle DNA, including mtDNA, is inherited from the mother. Mutations in mtDNA or other cytoplasmic DNA will also be inherited from the mother. This uniparental inheritance is an example of non-Mendelian inheritance. Plants also show uniparental mtDNA inheritance. Most plants inherit mtDNA maternally, with one noted exception being the redwood Sequoia sempervirens, which inherits mtDNA paternally.
There are two theories as to why paternal mtDNA is not transmitted to the offspring. One is simply that paternal mtDNA is present at a much lower concentration than maternal mtDNA and is thus not detectable in the offspring. A second, more complex theory involves the digestion of paternal mtDNA to prevent its inheritance. It is theorized that the uniparental inheritance of mtDNA, which has a high mutation rate, might be a mechanism to maintain the homoplasmy of cytoplasmic DNA.
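
The way in which centromere-free ecDNA gives rise to heterogeneous cell populations can be sketched with a toy simulation. The Python example below assumes a deliberately simplified model that is not taken from the text: every ecDNA copy is replicated once per cell cycle, each copy is then passed to one of the two daughter cells with equal probability, and no selection acts on copy number. The function names `divide` and `grow` and all parameter values are hypothetical.

```python
import random
from statistics import mean, pvariance


def divide(copies, rng):
    """Replicate every ecDNA copy, then assign each copy to a daughter at random.

    A chromosome with a centromere would instead give exactly `copies`
    to each daughter.
    """
    total = 2 * copies
    to_first = sum(rng.random() < 0.5 for _ in range(total))
    return to_first, total - to_first


def grow(initial_copies, generations, seed=0):
    """Follow all descendants of one cell and return their ecDNA copy numbers."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    cells = [initial_copies]
    for _ in range(generations):
        cells = [daughter for cell in cells for daughter in divide(cell, rng)]
    return cells


cells = grow(initial_copies=10, generations=8)
print("cells:", len(cells))
print("mean copies per cell:", mean(cells))
print("variance of copies per cell:", pvariance(cells))
print("range:", min(cells), "to", max(cells))
```

Under these assumptions the mean copy number per cell stays at its starting value, because replication doubles the total and division halves it on average, while the spread between cells widens with each generation. Real tumors add selection and other effects that this sketch ignores.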

Clinical significance

Extrachromosomal elements, sometimes called EEs, have been associated with genomic instability in eukaryotes. Small polydispersed DNAs (spcDNAs), a type of eccDNA, are commonly found in conjunction with genome instability. SpcDNAs are derived from repetitive sequences such as satellite DNA, retrovirus-like DNA elements, and transposable elements in the genome. They are thought to be the products of gene rearrangements.
Extrachromosomal DNA found in cancer has historically been referred to as double minute chromosomes, which present as paired chromatin bodies under light microscopy. Double minute chromosomes represent about 30% of the ecDNA spectrum found in cancer, which also includes single bodies, and have been found to have the same gene content as single bodies. The ecDNA notation encompasses all forms of the large, oncogene-containing extrachromosomal DNA found in cancer cells. This type of ecDNA is commonly seen in cancer cells of various histologies, but virtually never in normal cells. ecDNA is thought to be produced through double-strand breaks in chromosomes or over-replication of DNA in an organism. Studies show that in cases of cancer and other genomic instability, higher levels of EEs can be observed.
Mitochondrial DNA can play a role in the onset of disease in a variety of ways. Point mutations in mtDNA, or alternative arrangements of its genes, have been linked to several diseases that affect the heart, central nervous system, endocrine system, gastrointestinal tract, eye, and kidney. A reduction in the amount of mtDNA present in the mitochondria can lead to a whole subset of diseases known as mitochondrial depletion syndromes, which affect the liver, central and peripheral nervous systems, smooth muscle and hearing in humans. Studies that attempt to link mtDNA copy number to the risk of developing certain cancers have produced mixed, and sometimes conflicting, results. Both increased and decreased mtDNA levels have been associated with an increased risk of developing breast cancer. A positive association between increased mtDNA levels and an increased risk of developing kidney tumors has been observed, but there does not appear to be a link between mtDNA levels and the development of stomach cancer.
Extrachromosomal DNA is found in Apicomplexa, a group of protozoa. The malaria parasite and AIDS-related pathogens are both members of the Apicomplexa group. Mitochondrial DNA has been found in the malaria parasite. Two forms of extrachromosomal DNA are found in malaria parasites: a 6-kb linear DNA and a 35-kb circular DNA. These DNA molecules have been researched as potential nucleotide target sites for antibiotics.

Role of ecDNA in cancer

Gene amplification is among the most common mechanisms of oncogene activation. One of the primary functions of ecDNA in cancer is to enable the tumor to rapidly reach high copy numbers, while also promoting rapid, massive cell-to-cell genetic heterogeneity. The most commonly amplified oncogenes in cancer are found on ecDNA, which has been shown to be highly dynamic, re-integrating into non-native chromosomes as homogeneously staining regions and altering copy number and composition in response to various drug treatments.
The circular shape of ecDNA differs from the linear structure of chromosomal DNA in meaningful ways that influence cancer pathogenesis. Oncogenes encoded on ecDNA have massive transcriptional output, ranking in the top 1% of genes in the entire transcriptome. In contrast to bacterial plasmids or mitochondrial DNA, ecDNA is chromatinized, containing high levels of active histone marks but a paucity of repressive histone marks. The ecDNA chromatin architecture lacks the higher-order compaction that is present on chromosomal DNA and is among the most accessible DNA in the entire cancer genome.