FusionSeq: A Modular Framework for Finding Gene Fusions by Analyzing Paired-End RNA-Sequencing Data
A Sboner, L Habegger, D Pflueger, S Terry, DZ Chen, AK Tewari, N Kitabayashi, BJ Moss, MS Chee, F Demichelis, MA Rubin, MB Gerstein. Yale University, New Haven; Weill Cornell Medical College (WCMC), New York; WCMC, New York, NY; Prognosys Biosciences, Inc., La Jolla; WCMC, New York; The Broad Institute of MIT and Harvard, Cambridge
Background: Next-generation sequencing can interrogate genomes and transcriptomes to elucidate driving molecular events, such as gene fusions. One of the newest sequencing technologies, Paired-End (PE) RNA-Sequencing, can detect novel transcript fusions missed by standard techniques.
Design: We developed FusionSeq (http://rnaseq.gersteinlab.org/fusionseq), a platform-independent framework consisting of: 1) a mapping module, which finds candidate fusions from PE reads joining two genes; 2) a PE-read filtering module, which discards candidates arising from artifacts, including those whose insert size is aberrant compared to the transcriptome norm; and 3) a junction-sequence module, which identifies the specific sequence at the breakpoints. Candidates are ranked by several statistics: SPER (Supportive-PE-reads-per-million mapped Reads), DASPER (Difference between the observed and Analytically calculated expected SPER) and RESPER (Ratio of Empirically computed SPERs). These statistics account for the number of PE reads supporting the fusion and provide a corresponding estimate of how “high” this number is.
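The ranking statistics can be illustrated with a minimal sketch. Only the SPER normalization (supporting PE reads per million mapped reads) follows directly from the definition above. The abstract does not specify how the expected or reference SPER values are obtained, so they are passed in as plain inputs here. All names (`Candidate`, `sper`, `dasper`, `resper`, `rank_by_sper`) are hypothetical and are not part of the FusionSeq code base.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate fusion; field names are illustrative assumptions."""
    gene_a: str
    gene_b: str
    supporting_pe_reads: int  # PE reads joining gene_a and gene_b


def sper(supporting_pe_reads: int, mapped_reads: int) -> float:
    """Supportive PE reads per million mapped reads."""
    if mapped_reads <= 0:
        raise ValueError("mapped_reads must be positive")
    return supporting_pe_reads / (mapped_reads / 1_000_000)


def dasper(observed_sper: float, expected_sper: float) -> float:
    """Difference between observed and expected SPER.

    Assumption: expected_sper is computed elsewhere; the analytical
    model behind it is not described in the abstract.
    """
    return observed_sper - expected_sper


def resper(observed_sper: float, reference_sper: float) -> float:
    """Ratio of the observed SPER to an empirically computed reference SPER.

    Assumption: reference_sper is supplied by the caller.
    """
    if reference_sper <= 0:
        raise ValueError("reference_sper must be positive")
    return observed_sper / reference_sper


def rank_by_sper(candidates: list[Candidate], mapped_reads: int) -> list[Candidate]:
    """Sort candidates from highest to lowest SPER."""
    return sorted(
        candidates,
        key=lambda c: sper(c.supporting_pe_reads, mapped_reads),
        reverse=True,
    )
```

For example, a candidate supported by 50 PE reads in a sample with 20 million mapped reads has a SPER of 2.5; normalizing by sequencing depth in this way lets candidates be compared across samples sequenced to different depths.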
Results: The top candidate fusions from 6 prostate cancers are presented in the table.