Read (biology)


In DNA sequencing, a read is an inferred sequence of base pairs corresponding to all or part of a single DNA fragment. A typical sequencing experiment involves fragmentation of the genome into millions of molecules, which are size-selected and ligated to adapters. The set of fragments is referred to as a sequencing library, which is sequenced to produce a set of reads.

Read length

Sequencing technologies vary in the length of the reads they produce. Reads of 20-40 base pairs (bp) are referred to as ultra-short. Typical short-read sequencers produce reads in the range of 100-500 bp, whereas early Pacific Biosciences platforms produced reads of approximately 1,500 bp and later long-read platforms produce reads of many kilobases. Read length can affect the results of biological studies. For example, longer reads improve the resolution of de novo genome assembly and the detection of structural variants. It has been estimated that read lengths greater than 100 kilobases will be required for routine de novo human genome assembly. Bioinformatic pipelines for analyzing sequencing data usually take read length into account.
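
As an illustration of how a pipeline might take read length into account, the following minimal Python sketch summarizes the read lengths in a small data set. It assumes the reads are stored as simple four-line FASTQ records (header, sequence, separator, quality) and that at least one read is present. The function names read_lengths and summarize, the example reads, and the use of the 20-40 bp range as an "ultra-short" filter are illustrative choices, not part of any particular sequencing toolkit.

```python
from io import StringIO
from statistics import mean


def read_lengths(fastq_handle):
    """Yield the sequence length of each record in a four-line FASTQ stream."""
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of input
        sequence = fastq_handle.readline().strip()
        fastq_handle.readline()  # '+' separator line (ignored)
        fastq_handle.readline()  # quality string (ignored)
        if not header.startswith("@"):
            raise ValueError(f"malformed FASTQ header: {header!r}")
        yield len(sequence)


def summarize(lengths, ultra_short_range=(20, 40)):
    """Return basic read-length statistics; the range bounds are inclusive."""
    lengths = list(lengths)
    low, high = ultra_short_range
    return {
        "count": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": mean(lengths),
        "ultra_short": sum(low <= n <= high for n in lengths),
    }


# Two made-up reads: one of 24 bp and one of 120 bp.
example = StringIO(
    "@read1\n" + "ACGT" * 6 + "\n+\n" + "I" * 24 + "\n"
    "@read2\n" + "ACGT" * 30 + "\n+\n" + "I" * 120 + "\n"
)
summary = summarize(read_lengths(example))
print(summary)
```

In practice, a pipeline could use such a summary to choose an aligner or assembler suited to the observed read lengths. Real FASTQ files may be compressed or wrap sequences across several lines, which this sketch does not handle.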