Background

The input for the FLUX CAPACITOR is the annotation of a reference transcriptome and reads from RNAseq technologies aligned to the genome. From the reference annotation, splicing graphs are produced, and reads are mapped to the corresponding edges in these graphs according to the position where they align in the genomic sequence. The resulting graph, with edges labelled by the number of reads, can be interpreted as a flow network in which each transcript represents a transportation path from its start to its end, and each edge consequently represents a possibly shared segment of transportation along which a certain number of reads per nucleotide (i.e., a flux) is observed. Given a density function of reads along a transcript, the expected participation of each transcript in an edge under consideration can be estimated. The basic idea is to work back from these expected participations and the observed number of reads, allowing for a certain amount of noise, to the original transcript abundances. To do so, a linear constraint is formalized for each edge, and an optimal solution for the complete set of constraints is found by a standard linear program solver.
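The per-edge constraints can be illustrated with a toy locus. The following is only a minimal sketch under stated assumptions, not the FLUX CAPACITOR implementation: the edge values, the uniform read density (each transcript contributes fully to every edge it covers), and the choice of objective (sum of absolute deviations between expected and observed flux) are all hypothetical. Instead of a standard linear program solver, it enumerates the candidate vertices of this two-variable problem by brute force, which is enough to find the optimum of such a small linear program.

```python
from itertools import combinations

# Hypothetical toy locus with two spliceforms, "A" and "B".
# Each row is one splicing-graph edge: (share of A, share of B, observed flux).
# A share of 1.0 assumes a uniform read density along the transcript.
EDGES = [
    (1.0, 1.0, 30.5),  # exon shared by A and B
    (1.0, 0.0, 20.0),  # exon used only by A
    (0.0, 1.0, 10.0),  # exon used only by B
    (1.0, 1.0, 29.0),  # second shared exon, with slightly different noise
]


def l1_cost(x_a, x_b, edges):
    """Sum over all edges of |expected flux - observed flux|."""
    return sum(abs(a * x_a + b * x_b - f) for a, b, f in edges)


def solve_2x2(r1, r2):
    """Solve two equations a*x + b*y = f by Cramer's rule; None if singular."""
    (a1, b1, f1), (a2, b2, f2) = r1, r2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    return ((f1 * b2 - f2 * b1) / det, (a1 * f2 - a2 * f1) / det)


def deconvolve(edges):
    """Minimise the L1 cost subject to x_a, x_b >= 0 by vertex enumeration."""
    # Candidate lines: every edge constraint met exactly, plus the bounds x = 0.
    lines = list(edges) + [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    best = None
    for r1, r2 in combinations(lines, 2):
        point = solve_2x2(r1, r2)
        if point is None or min(point) < -1e-9:
            continue  # parallel lines, or a negative abundance
        cost = l1_cost(point[0], point[1], edges)
        if best is None or cost < best[0]:
            best = (cost, point)
    return best


if __name__ == "__main__":
    cost, (x_a, x_b) = deconvolve(EDGES)
    print(f"abundance A = {x_a}, abundance B = {x_b}, residual noise = {cost}")
```

In this toy case the two exons that are private to one spliceform pin down the abundances, and the disagreement on the shared exons is absorbed as noise. For loci with more than two transcripts, brute-force enumeration is no longer practical and a real linear program solver would be used.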

<object width="480" height="400"><param name="movie" value="http://www.scivee.tv/flash/embedCast.swf" /><param name="allowfullscreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="flashvars" value="id=10013&type=4" /><param name="wmode" value="transparent" /><embed src="http://www.scivee.tv/flash/embedCast.swf" allowfullscreen="true" wmode="transparent" allowscriptaccess="always" width="480" height="400" flashvars="id=10013&type=4&start=10"></embed></object>

The basic problem addressed by the FLUX CAPACITOR.

The exonic structure of two spliceforms (labeled "SF A" and "SF B") is shown, with aligned reads from RNAseq methods (top). Those reads, mapped to the edges of a splicing graph (bottom), represent a signal measured as the FLUX: the relative coverage along an exonic stretch. Where transcripts overlap in exons, their respective fluxes are combined. Given the information from all edges in a locus, signal separation is achieved by decomposition across a flow network.
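As a rough illustration of how a flux value could be obtained for each exonic stretch, the sketch below counts reads per segment and divides by the segment length. The coordinates, segment names, and read positions are hypothetical, and reads are reduced to a single start position; a real splicing graph also has to handle reads spanning splice junctions, which this sketch ignores.

```python
# Hypothetical exonic segments as half-open genomic intervals [start, end).
SEGMENTS = {
    "exon1": (100, 200),  # shared by SF A and SF B
    "exon2": (300, 350),  # used only by SF A
    "exon3": (400, 500),  # used only by SF B
}

# Hypothetical read alignments, reduced to their genomic start positions.
READ_STARTS = [105, 120, 150, 199, 310, 340, 401, 450, 460, 600]


def flux_per_segment(segments, read_starts):
    """Return reads per nucleotide for each segment."""
    counts = {name: 0 for name in segments}
    for pos in read_starts:
        for name, (start, end) in segments.items():
            if start <= pos < end:
                counts[name] += 1
                break  # segments do not overlap, so stop at the first hit
    # Reads falling outside every segment (here: position 600) are ignored.
    return {
        name: counts[name] / (end - start)
        for name, (start, end) in segments.items()
    }


if __name__ == "__main__":
    for name, flux in flux_per_segment(SEGMENTS, READ_STARTS).items():
        print(name, flux)
```

The flux on the shared segment reflects the combined contribution of both spliceforms, while the flux on the other two segments reflects one spliceform each.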

The video on the left shows an early report on our deconvolution strategy, presented at the Genome Informatics conference 2009.

Citation

Transcriptome genetics using second generation sequencing in a Caucasian population. Montgomery SB, Sammeth M, Gutierrez-Arcelus M, Lach RP, Ingle C, Nisbett J, Guigo R, Dermitzakis ET. Nature. 2010 Apr 1;464(7289):773-7. Epub 2010 Mar 10. PMID: 20220756