LINE1


LINE1 (L1) elements are class I transposable elements found in the DNA of some organisms; they belong to the group of long interspersed nuclear elements (LINEs). L1s comprise approximately 17% of the human genome. The majority of L1s in the human genome are inactive; however, about 80-100 have retained the ability to retrotranspose, with considerable variation between individuals. These active L1s can disrupt the genome through insertions, deletions, rearrangements, and copy number variations. L1 activity has contributed to the instability and evolution of genomes, and is tightly regulated in the germline by DNA methylation, histone modifications, and piRNA. L1s can further affect genome variation through mispairing and unequal crossing-over during meiosis, owing to their repetitive DNA sequences.
L1 gene products are also required by many nonautonomous Alu and SVA SINE retrotransposons. Mutations induced by L1 and its nonautonomous counterparts have been found to cause a variety of heritable and somatic diseases.
Human L1 has been reported to have transferred to the genome of Neisseria gonorrhoeae, the bacterium that causes gonorrhea.

Structure

A typical L1 element is approximately 6,000 base pairs long and consists of two non-overlapping open reading frames (ORFs), ORF1 and ORF2, which are flanked by untranslated regions (UTRs) and target site duplications. In humans, ORF2 is thought to be translated by an unconventional termination/reinitiation mechanism, while mouse L1s contain an internal ribosome entry site upstream of each ORF.
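The layout described above (a 5' UTR, two non-overlapping ORFs, then a 3' UTR) can be illustrated with a minimal forward-strand ORF scan. This is only a sketch: the sequence `toy_element`, the function names `find_orfs` and `non_overlapping`, and the `min_codons` threshold are all hypothetical, and the toy sequence is invented and far shorter than a real ~6,000 bp L1 element.

```python
# Toy illustration only: this sequence is invented and is NOT a real L1 element.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) half-open coordinates of forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon (inclusive).
    Within each reading frame, scanning resumes after the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == START:
                j = i + 3
                # Walk codon by codon until a stop codon or the end of the sequence.
                while j + 3 <= len(seq) and seq[j:j + 3] not in STOPS:
                    j += 3
                if j + 3 <= len(seq):  # an in-frame stop codon was found
                    end = j + 3
                    if (end - i) // 3 >= min_codons:
                        orfs.append((i, end))
                    i = end
                    continue
            i += 3
    return sorted(orfs)


def non_overlapping(orfs):
    """True if each ORF ends at or before the start of the next one."""
    return all(a[1] <= b[0] for a, b in zip(orfs, orfs[1:]))


# Hypothetical miniature element: 5' UTR, ORF1, spacer, ORF2, 3' UTR.
toy_element = (
    "CCGGCC"                # stand-in for the 5' UTR
    "ATGAAACCCGGGTAA"       # stand-in for ORF1
    "CCGCC"                 # inter-ORF spacer
    "ATGCCCGGGAAATTTTGA"    # stand-in for ORF2
    "GGCCAAAAAA"            # stand-in for the 3' UTR
)

orfs = find_orfs(toy_element)
print(orfs)                   # [(6, 21), (26, 44)]
print(non_overlapping(orfs))  # True
```

The scan ignores the reverse strand and any translation-level detail such as the termination/reinitiation mechanism; it only shows what "two non-overlapping ORFs flanked by untranslated regions" means in coordinate terms.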

5' UTR

The 5' UTR of the L1 element contains a strong, internal RNA polymerase II promoter in the sense orientation.
The 5' UTR of mouse L1s contains a variable number of GC-rich, tandemly repeated monomers of around 200 bp, followed by a short non-monomeric region.
Human 5' UTRs are ~900 bp in length and do not contain repeated motifs. All families of human L1s harbor a binding motif for the transcription factor YY1 at their extreme 5' end. Younger families also have two binding sites for SOX-family transcription factors, and both the YY1 and SOX sites have been shown to be required for initiation and activation of human L1 transcription.
Both mouse and human 5' UTRs also contain a weak antisense promoter of unknown function.

ORF1

The first ORF encodes a protein of approximately 40 kDa (338 amino acids in humans) that lacks homology with any protein of known function. In vertebrates, it contains a conserved C-terminal domain and a highly variable N-terminal coiled-coil domain that mediates the formation of ORF1 trimeric complexes. ORF1 trimers have RNA-binding and nucleic acid chaperone activities that are necessary for retrotransposition.

ORF2

The second ORF of L1 encodes a protein with endonuclease and reverse transcriptase activities. The encoded protein has a molecular weight of approximately 150 kDa.

Roles in disease

Cancer

L1 activity has been observed in numerous types of cancer, with particularly extensive insertions found in colorectal and lung cancers. It is currently unclear whether these insertions are causes or secondary effects of cancer progression. However, in at least two cases somatic L1 insertions have been found to be causative of cancer by disrupting the coding sequences of the genes APC and PTEN, in colon and endometrial cancer respectively.
Quantification of L1 copy number by qPCR, or of L1 methylation levels by bisulfite sequencing, is used as a diagnostic biomarker in some types of cancer. L1 hypomethylation in colon tumor samples is correlated with cancer stage progression. Furthermore, less invasive blood assays for L1 copy number or methylation levels are indicative of breast or bladder cancer progression and may serve as methods for early detection.
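As a rough illustration of the two readouts mentioned above, the sketch below computes a relative copy number from qPCR cycle thresholds and a methylation fraction from bisulfite read counts. The text does not say which calculation the cited assays use; the 2^-ΔΔCt method shown here is one common approach and is an assumption, as are the function names and every numeric value, which are invented for the example.

```python
def relative_copy_number(ct_target_sample, ct_ref_sample,
                         ct_target_control, ct_ref_control):
    """Relative copy number by the 2^-ddCt method.

    Assumes roughly 100% amplification efficiency for both the target
    (e.g. an L1 amplicon) and the single-copy reference amplicon.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


def methylation_fraction(reads_c, reads_t):
    """Fraction of methylated reads at one CpG site after bisulfite conversion.

    Reads showing C were protected from conversion (methylated);
    reads showing T were converted (unmethylated).
    """
    total = reads_c + reads_t
    if total == 0:
        raise ValueError("no reads cover this site")
    return reads_c / total


# Invented values: the sample's target amplifies one cycle earlier than the
# control's after normalization, which corresponds to a twofold difference.
print(relative_copy_number(18.0, 24.0, 19.0, 24.0))  # 2.0

# Invented read counts: 30 of 100 reads retain C at the site.
print(methylation_fraction(30, 70))  # 0.3
```

A lower methylation fraction across L1 5' UTR CpG sites would correspond to the hypomethylation described above; real pipelines additionally correct for amplification efficiency and incomplete bisulfite conversion, which this sketch omits.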

Neuropsychiatric disorders

Higher L1 copy numbers have been observed in the human brain than in other organs. Studies of animal models and human cell lines have shown that L1s become active in neural progenitor cells (NPCs), and that experimental deregulation or overexpression of L1 increases somatic mosaicism. This phenomenon is negatively regulated by Sox2, which is downregulated in NPCs, and by MeCP2 and methylation of the L1 5' UTR. Human cell lines that model the neurological disorder Rett syndrome, and that carry MeCP2 mutations, exhibit increased L1 transposition, suggesting a link between L1 activity and neurological disorders. Current studies aim to investigate the potential roles of L1 activity in various neuropsychiatric disorders, including schizophrenia, autism spectrum disorders, epilepsy, bipolar disorder, Tourette syndrome, and drug addiction.

Retinal disease

Increased RNA levels of Alu, which requires L1 proteins, are associated with a form of age-related macular degeneration, a neurological disorder of the eyes.
The naturally occurring mouse retinal degeneration model rd7 is caused by an L1 insertion in the Nr2e3 gene.