Identification of Pathogens in Archival Tissues Using a High-Throughput Sequencing Approach, 3SEQ
Robert T Sweeney, Alayne L Brunner, Kelli D Montgomery, Shirley X Zhu, Christina Kong, Quynh Le, Robert B West. Stanford University School of Medicine, Stanford, CA
Background: The extent to which viruses and bacteria contribute to chronic disease and neoplasia remains uncertain. Next-generation sequencing (NGS) offers a promising tool for identifying RNA and DNA from viruses and bacteria in human tissue. 3SEQ, a type of RNA-seq that requires only the 3' ends of transcripts, was recently described as an NGS method for gene expression profiling of archival pathology tissues (Beck AH et al, 2010).
Design: We performed 3SEQ on 193 formalin-fixed, paraffin-embedded (FFPE) samples from the pathology archives and screened the resulting reads for candidate non-human genetic sequences. The tissue samples included a wide range of neoplastic, non-neoplastic disease, and normal specimens. Sequences that did not map to the human genome and that passed a filtering step to remove low-complexity reads were compared with 3752 viral genomes (virome) and 1016 bacterial genomes (bacteriome). Following alignment of the candidate non-human reads to the virome and bacteriome, peak calling was performed to identify regions enriched for sequence reads, which likely represent microbial transcripts within the human tissue specimen.
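The abstract does not specify which low-complexity filter or peak-calling algorithm was used. As a minimal sketch of the two steps, assuming a Shannon-entropy cutoff for the filter and simple gap-based clustering of alignment start positions for the peak caller (both are stand-ins, and every function name and threshold below is hypothetical), the idea could be illustrated as follows:

```python
import math
from collections import Counter


def shannon_entropy(read: str) -> float:
    """Base composition entropy in bits (0 for a homopolymer, 2 for equal A/C/G/T)."""
    n = len(read)
    counts = Counter(read)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def is_low_complexity(read: str, min_entropy: float = 1.5) -> bool:
    """Flag reads below an entropy cutoff. The cutoff is an assumed value."""
    return shannon_entropy(read) < min_entropy


def call_peaks(positions, max_gap: int = 50, min_reads: int = 5):
    """Cluster alignment start positions on one reference genome.

    Reads whose start positions lie within max_gap bp of the previous read
    are merged into one cluster; clusters with at least min_reads reads are
    reported as (first_start, last_start, read_count). Both thresholds are
    assumptions for illustration, not values from the study.
    """
    peaks = []
    cluster = []
    for pos in sorted(positions):
        if cluster and pos - cluster[-1] > max_gap:
            if len(cluster) >= min_reads:
                peaks.append((cluster[0], cluster[-1], len(cluster)))
            cluster = []
        cluster.append(pos)
    if len(cluster) >= min_reads:
        peaks.append((cluster[0], cluster[-1], len(cluster)))
    return peaks


# Toy input: 36-bp reads and made-up alignment positions, for illustration only.
reads = ["A" * 36, "ACGT" * 9]
kept = [r for r in reads if not is_low_complexity(r)]
toy_positions = [100, 105, 110, 112, 120, 500, 900, 905]
toy_peaks = call_peaks(toy_positions)
```

In this toy input the homopolymer read is discarded, and only the five reads starting between positions 100 and 120 form a cluster large enough to be reported as a peak. A production pipeline would more likely rely on an established aligner and a statistical peak caller with a background model.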
Results: From the 193 FFPE samples, 2.9 billion 36-bp sequence reads were obtained using 3SEQ. Of these, 222 million candidate non-human reads were identified and compared to the virome and bacteriome. This analysis allowed us not only to identify viral and bacterial sequences in FFPE tissue samples, but also to characterize expressed transcripts from those genomes. For example, we observed the expression of three Epstein-Barr viral genes in 8 of the 9 nasopharyngeal carcinoma (NPC) samples and were able to quantify their expression across samples. PCR validation and Sanger sequencing were used to confirm the presence of the transcript with the most robust 3SEQ peak. Additional candidate viral and bacterial peaks from various diagnoses are now under investigation.
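To put the read counts in proportion, the short calculation below uses only the totals reported above. The per-sample figures are simple averages and are approximate, since the reported totals are rounded and the actual distribution of reads across samples is not given here.

```python
# Totals as reported in the Results (rounded values).
total_reads = 2_900_000_000      # 36-bp reads from all samples
non_human_reads = 222_000_000    # candidate non-human reads
n_samples = 193                  # FFPE samples

# Share of all reads that were candidate non-human reads.
non_human_fraction = non_human_reads / total_reads

# Averages only; per-sample counts are not reported in the abstract.
mean_reads_per_sample = total_reads / n_samples
mean_non_human_per_sample = non_human_reads / n_samples

print(f"Candidate non-human fraction: {non_human_fraction:.1%}")
print(f"Mean reads per sample: {mean_reads_per_sample / 1e6:.1f} million")
print(f"Mean non-human reads per sample: {mean_non_human_per_sample / 1e6:.2f} million")
```

On these figures, roughly 1 read in 13 was a candidate non-human read, with an average of about 15 million reads per sample.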
Conclusions: 3SEQ is a useful tool for exploring pathogen gene expression in a wide variety of human diseases. In archival human pathology tissue, the 3SEQ method combined with the peak-calling algorithm increases the sensitivity and scope of identifying pathogen transcript termini within a landscape of incompletely annotated viral and bacterial genomes.
Monday, March 19, 2012 11:15 AM
Platform Session: Section G2, Monday Morning