Next-generation sequencing (NGS) has revolutionized both biology and medical diagnosis [1, 2]. Large-scale parallel sequencing techniques permit genome-wide analysis of disease development, prognosis, and drug response [3, 4, 5]. NGS analysis relies on preparation of a representative, non-biased library (a pool of DNA or RNA molecules) evenly distributed across the entire genome (or region of interest). However, biases in current methods of NGS library preparation can produce uneven coverage, compromising the quality of NGS analysis [6, 7]. For example, some genomic sequences are over-represented, whereas other regions have little or no coverage. Investigating the technical and methodological sources of this GC-content-associated bias is critical to developing solutions that improve library quality and data analysis [8].
Various sample treatment steps, including library construction, amplification, and the sequencing chemistry itself, can introduce GC bias [9]. It is widely accepted that library amplification via polymerase chain reaction (PCR) introduces bias in sequencing coverage because DNA polymerases amplify sequences of different GC content unevenly [6, 10]. Interestingly, depletion of high-GC regions, but not of high-AT regions, can largely be prevented by optimizing PCR conditions [6]. In addition, under-representation of AT-rich regions is only slightly improved by avoiding library amplification [6, 7]. Thus, bias against AT-rich regions could be introduced prior to the amplification step or by the sequencing chemistry. It is therefore of interest to investigate whether a systematic bias against AT-rich sequences is introduced during library preparation.
A typical protocol for amplification-free library preparation for the Illumina platform comprises fragmentation, end repair (blunting and 5′ phosphorylation), 3′ A-tailing, and adaptor ligation [11]. In streamlined protocols, once the sample DNA has been randomly sheared, the fragment ends are repaired by blunting and 5′ phosphorylation with a mixture of enzymes, such as T4 polynucleotide kinase (PNK) and T4 DNA polymerase (T4 DNA pol). This end repair step is followed by 3′ A-tailing, either at 37 °C using a mesophilic polymerase such as Klenow Fragment (3′-5′ exonuclease minus) [11], or at elevated temperatures using a thermophilic polymerase such as Taq DNA polymerase (Taq DNA pol) [12, 13]. Identifying the key factors responsible for generating sequence bias in these enzymatic steps is challenging, as multiple DNA-modifying enzymes and a broad range of substrates are involved. In this report, we assessed the enzymatic reactions used in the construction of amplification-free human DNA libraries for the Illumina sequencing platform (Supplementary Fig.).
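To make the notion of GC-content-associated coverage bias concrete, the following Python snippet is a minimal sketch of one common way to summarize it: group genomic windows by GC fraction and compare the mean read count in each group with the overall mean. It is not the analysis used in this report. The function names (`gc_fraction`, `normalized_coverage_by_gc`), the bin count, and the toy sequences and read counts are all assumptions made for illustration.

```python
from collections import defaultdict


def gc_fraction(seq):
    """Return the fraction of G/C among the A, C, G, T bases of seq.

    Ambiguous bases such as N are ignored; an empty or all-N
    sequence yields 0.0.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def normalized_coverage_by_gc(windows, n_bins=5):
    """Summarize coverage as a function of GC content.

    windows: iterable of (sequence, read_count) pairs (hypothetical input).
    Returns {bin_index: mean read count in bin / overall mean read count}.
    A value below 1 means the bin is under-represented; above 1 means
    it is over-represented. An unbiased library would give values near 1.
    """
    windows = list(windows)
    if not windows:
        raise ValueError("no windows supplied")
    overall_mean = sum(count for _, count in windows) / len(windows)
    if overall_mean == 0:
        raise ValueError("total coverage is zero")

    sums = defaultdict(float)
    counts = defaultdict(int)
    for seq, count in windows:
        # Map GC fraction in [0, 1] to a bin index in [0, n_bins - 1].
        bin_index = min(int(gc_fraction(seq) * n_bins), n_bins - 1)
        sums[bin_index] += count
        counts[bin_index] += 1

    return {b: (sums[b] / counts[b]) / overall_mean for b in sorted(counts)}


if __name__ == "__main__":
    # Toy data (made up): an AT-rich, a balanced, and a GC-rich window.
    toy_windows = [("ATATATAT", 5), ("ATGCATGC", 10), ("GCGCGCGC", 15)]
    for b, ratio in normalized_coverage_by_gc(toy_windows).items():
        print(f"GC bin {b}: normalized coverage {ratio:.2f}")
```

In this toy input the AT-rich window falls in the lowest GC bin with a normalized coverage below 1, which is the pattern the text describes as under-representation of AT-rich regions. Real analyses would use aligned reads and reference windows rather than hand-written counts.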