[1906] FusionSeq: A Modular Framework for Finding Gene Fusions by Analyzing Paired-End RNA-Sequencing Data

A Sboner, L Habegger, D Pflueger, S Terry, DZ Chen, AK Tewari, N Kitabayashi, BJ Moss, MS Chee, F Demichelis, MA Rubin, MB Gerstein. Yale University, New Haven; Weill Cornell Medical College (WCMC), New York; WCMC, New York, NY; Prognosys Biosciences, Inc., La Jolla; WCMC, New York; The Broad Institute of MIT and Harvard, Cambridge

Background: Next-generation sequencing can interrogate genomes and transcriptomes to elucidate driving molecular events, such as gene fusions. One of the newest sequencing technologies, Paired-End (PE) RNA-Sequencing, can detect novel transcript fusions missed by standard techniques.
Design: We developed FusionSeq (http://rnaseq.gersteinlab.org/fusionseq), a platform-independent framework consisting of: 1) a mapping module, which finds candidate fusions from PE reads joining two genes; 2) a PE-read filtering module, which discards candidates showing an aberrant insert size compared to the transcriptome norm, as well as other artifacts; and 3) a junction-sequence module, which identifies the specific sequence at the breakpoints. Candidates are ranked by several statistics: SPER (Supportive-PE-reads-per-million mapped Reads), DASPER (Difference between the observed and Analytically calculated expected SPER) and RESPER (Ratio of Empirically computed SPERs). These statistics account for the number of PE reads supporting the fusion and estimate how high this number is relative to expectation.
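The ranking statistics can be illustrated with a minimal Python sketch. It is not the FusionSeq implementation: the function and class names are hypothetical, the expected SPER used by DASPER is passed in by the caller because its analytical formula is not given here, and RESPER is assumed to be a plain ratio of a candidate's SPER to a caller-supplied empirical reference SPER. The candidate names and read counts are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A fusion candidate (hypothetical structure, for illustration only)."""
    name: str
    supporting_pe_reads: int  # PE reads whose two ends map to the two genes


def sper(supporting_pe_reads: int, mapped_reads: int) -> float:
    """Supportive PE reads per million mapped reads."""
    if mapped_reads <= 0:
        raise ValueError("mapped_reads must be positive")
    return supporting_pe_reads * 1_000_000 / mapped_reads


def dasper(observed_sper: float, expected_sper: float) -> float:
    """Observed SPER minus an expected SPER supplied by the caller.

    The analytical calculation of the expectation is not reproduced here.
    """
    return observed_sper - expected_sper


def resper(observed_sper: float, reference_sper: float) -> float:
    """Ratio of a candidate's SPER to an empirical reference SPER.

    Assumption: the choice of reference is left to the caller.
    """
    if reference_sper <= 0:
        raise ValueError("reference_sper must be positive")
    return observed_sper / reference_sper


def rank_by_sper(candidates, mapped_reads):
    """Return (name, SPER) pairs sorted from highest to lowest SPER."""
    scored = [(c.name, sper(c.supporting_pe_reads, mapped_reads)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up example: two candidates in a sample with 20 million mapped reads.
example = [Candidate("GENE_A-GENE_B", 40), Candidate("GENE_C-GENE_D", 160)]
ranking = rank_by_sper(example, mapped_reads=20_000_000)
```

Normalizing by the number of mapped reads makes scores comparable across samples sequenced to different depths, while the difference and ratio statistics express how far a candidate's support stands above an expected or background level.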
Results: The top candidate fusions from 6 prostate cancers are presented in the table.

Table: Fusion Candidates Based on Type and Ranked Score.
Sample  Type               Fusion candidate  SPER  DASPER  RESPER
99      inter-chromosomal  NDRG1-ERG         8.0   7.8     1.9

Conclusions: FusionSeq identifies known ERG rearrangements as well as their isoforms, and provides statistical scores for ranking fusion candidates.
Category: Pan-genomic/Pan-proteomic Approaches to Diseases

Monday, March 22, 2010 1:00 PM

Poster Session II # 221, Monday Afternoon

