Viral Insertion Site Discovery Using Next Generation Sequencing of Formalin Fixed Tissue
EJ Duncavage, JR Armstrong, VJ Magrini, N Becker, R Demeter, ER Mardis, JD Pfeifer. University of Utah, Salt Lake City, UT; Washington University, Saint Louis, MO
Background: Many techniques exist for mapping viral integration sites. However, these methods require large intact stretches of DNA and none is ideally suited for DNA extracted from formalin-fixed paraffin-embedded (FFPE) tissue. We present data mapping Merkel cell polyomavirus (MCPyV) genome integration sites from FFPE tissue using a novel method that combines hybrid capture, Illumina sequencing, and bioinformatics.
Design: First, viral capture probes covering the entire 5.3 kb MCPyV genome were constructed by designing 23 overlapping 275 bp PCR products; biotin-labeled dCTP was incorporated into the amplicons during PCR amplification (the PCR products were sequence verified and biotin incorporation was confirmed). Second, we identified four cases of Merkel cell carcinoma (MCC) that harbored MCPyV, as confirmed by PCR, and extracted genomic DNA from cores of FFPE tumor tissue; the DNA was assessed for quality and quantity and prepared for Illumina sequencing. Third, the genomic DNA was hybridized with the biotinylated capture probes at 71°C for 48 hours in the presence of Cot-1 DNA; streptavidin-labeled paramagnetic beads were then added to the hybridization mixture, and the hybridized DNA was pulled down with a magnet. Fourth, the captured tumor DNA was melted away from the bead-bound amplicons and sequenced on an Illumina GAII analyzer using either 50 bp reads or 75 bp paired-end reads. The resulting data were aligned to the MCPyV genome, and chimeric viral/human sequences representing viral integration sites were identified.
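The final alignment step can be illustrated with a minimal sketch. It assumes each read pair has already been aligned against a combined viral/human reference and reduced to the reference name and position of each mate. The `Mate` record, the reference name `MCPyV`, the 1 kb bin size, and the function names are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

VIRAL_REF = "MCPyV"  # hypothetical name of the viral reference sequence


@dataclass
class Mate:
    ref: Optional[str]  # reference the mate aligned to; None if unaligned
    pos: int = 0        # leftmost alignment position on that reference


def classify_pair(mate1: Mate, mate2: Mate) -> str:
    """Label a read pair as viral, human, chimeric, or unresolved."""
    refs = {mate1.ref, mate2.ref}
    if None in refs:
        return "unresolved"   # at least one mate did not align
    if refs == {VIRAL_REF}:
        return "viral"        # both mates lie within the viral genome
    if VIRAL_REF in refs:
        return "chimeric"     # one mate viral, one mate human
    return "human"


def candidate_sites(pairs, bin_size=1000):
    """Count chimeric pairs per (human chromosome, position bin)."""
    counts = Counter()
    for m1, m2 in pairs:
        if classify_pair(m1, m2) != "chimeric":
            continue
        human = m1 if m1.ref != VIRAL_REF else m2
        counts[(human.ref, human.pos // bin_size * bin_size)] += 1
    return counts
```

Bins supported by many chimeric pairs would be candidate integration sites. A fuller analysis would also inspect individual reads that cross the viral/human junction.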
Results: Viral integration sites were correctly identified in 3 of 3 cases sequenced with 75 bp reads. A definitive insertion site could not be identified in cases sequenced with shorter non-paired reads. Coverage of the viral genome ranged from 4,000x to 36,000x, with a minimum viral sequence enrichment of 30,000-fold.
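The abstract does not state how coverage and enrichment were calculated. The sketch below shows one common way to define them: mean coverage as total aligned bases divided by genome length, and fold enrichment as the viral read fraction after capture divided by a baseline viral fraction expected without capture. The function names and the input values in the usage example are hypothetical and are not study data.

```python
def mean_coverage(n_reads: int, read_len: int, genome_len: int) -> float:
    """Average depth: total aligned bases divided by genome length."""
    return n_reads * read_len / genome_len


def fold_enrichment(viral_reads: int, total_reads: int,
                    baseline_fraction: float) -> float:
    """Viral read fraction after capture relative to the baseline fraction."""
    return (viral_reads / total_reads) / baseline_fraction


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    genome_len = 5300                 # 5.3 kb viral genome, as stated above
    baseline = genome_len / 3.1e9     # assumes one viral copy per ~3.1 Gb human genome
    print(mean_coverage(1_000_000, 75, genome_len))
    print(fold_enrichment(1_000_000, 20_000_000, baseline))
```

The baseline is the main assumption here. Tumors carrying several viral copies per cell would have a higher starting fraction and therefore a lower computed enrichment.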
Conclusions: Viral integration sites can be identified from FFPE tissue even if the viral insertion sequence is unknown. However, reads of 75 bp or longer are required to produce chimeric sequences informative enough to identify integration sites. This methodology could readily be applied to identify other chimeric DNA sequences in FFPE tissue, such as translocations in which one partner gene is unknown.
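One way to see why read length matters is to count the ways a single read can cross a viral/human junction while keeping enough sequence on each side to be aligned. The sketch below assumes a minimum anchor of 20 bp per side. That threshold and the function name are illustrative assumptions, not values from the study. Because the shorter reads in this study were also unpaired, read length is unlikely to be the only factor.

```python
def informative_offsets(read_len: int, min_anchor: int = 20) -> int:
    """Number of junction placements within a read that leave at least
    min_anchor bases on both the viral and the human side."""
    return max(0, read_len - 2 * min_anchor + 1)


if __name__ == "__main__":
    for read_len in (50, 75):
        print(read_len, informative_offsets(read_len))
```

Under this assumption a 75 bp read has 36 usable junction placements and a 50 bp read has 11, so longer reads are more likely to yield an alignable chimeric sequence at a given depth.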
Category: Pan-genomic/Pan-proteomic Approaches to Diseases
Tuesday, March 23, 2010 8:15 AM
Platform Session: Section H 1, Tuesday Morning