
Similar to the concept of splicing graphs [Heber 2002], we employ a graph structure G=(V,E) to represent the reference transcriptome being quantified as a non-redundant data structure. Each edge e=(tail,head,mode,T) represents a segment of an annotated pre-mRNA molecule, described by the genomic coordinates of its 3'-tail and 5'-head positions, by its mode (exonic or intronic), and by the set T of supporting transcripts (Lemma 1).
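The edge tuple described above can be sketched as a small record type. This is only an illustration under assumptions: the class name Edge, the field name transcripts (standing in for T), the string labels for the mode, and the toy coordinates and transcript names are all hypothetical and not taken from the original implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Edge:
    """One edge e = (tail, head, mode, T) of the segment graph (hypothetical layout)."""
    tail: int                     # genomic coordinate where the segment starts
    head: int                     # genomic coordinate where the segment ends
    mode: str                     # "exonic" or "intronic" (labels are an assumption)
    transcripts: FrozenSet[str]   # the set T of supporting transcripts


# Toy locus (made-up coordinates): tx2 retains the region that tx1 splices out.
edges = [
    Edge(100, 200, "exonic", frozenset({"tx1", "tx2"})),
    Edge(200, 300, "intronic", frozenset({"tx1"})),
    Edge(200, 300, "exonic", frozenset({"tx2"})),
    Edge(300, 400, "exonic", frozenset({"tx1", "tx2"})),
]

# The vertex set V is taken here to be the set of sites touched by any edge.
vertices = sorted({e.tail for e in edges} | {e.head for e in edges})
```

Storing T per edge, rather than one edge per transcript, is what keeps the structure non-redundant: a segment shared by several transcripts appears once.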

...

 

Lemma 1 (Segment Graph Properties): any two adjacent edges e and f in G satisfy the following properties:

(i) they share the same intermediary splice site s (adjacency)

$\mathrm{head}_e = \mathrm{tail}_f = s$

...

$\bigcup T_{\mathrm{inedges}(s)} = \bigcup T_{\mathrm{outedges}(s)}$

(iii) they differ either in mode or in supporting transcripts (discrimination)

$(\mathrm{mode}_e \neq \mathrm{mode}_f) \lor (T_e \neq T_f)$

...
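The properties of Lemma 1 can be checked mechanically on a toy graph. The following is a minimal sketch, not the original implementation: the Edge type, the function names, and the example coordinates are hypothetical, and the second check reads the union equation literally (the transcripts entering a site equal the transcripts leaving it), which is an assumption because the wording of that property is elided here.

```python
from typing import FrozenSet, List, NamedTuple


class Edge(NamedTuple):
    tail: int
    head: int
    mode: str                    # "exonic" or "intronic" (labels are an assumption)
    transcripts: FrozenSet[str]  # the set T of supporting transcripts


def is_adjacent(e: Edge, f: Edge) -> bool:
    """Property (i): e ends at the site s where f starts."""
    return e.head == f.tail


def is_balanced(edges: List[Edge], site: int) -> bool:
    """Union equation: transcripts entering `site` equal those leaving it."""
    t_in = set().union(*(e.transcripts for e in edges if e.head == site))
    t_out = set().union(*(e.transcripts for e in edges if e.tail == site))
    return t_in == t_out


def is_discriminated(e: Edge, f: Edge) -> bool:
    """Property (iii): e and f differ in mode or in supporting transcripts."""
    return e.mode != f.mode or e.transcripts != f.transcripts


# Toy locus with made-up coordinates and transcript names.
edges = [
    Edge(100, 200, "exonic", frozenset({"tx1", "tx2"})),
    Edge(200, 300, "intronic", frozenset({"tx1"})),
    Edge(200, 300, "exonic", frozenset({"tx2"})),
    Edge(300, 400, "exonic", frozenset({"tx1", "tx2"})),
]

# Every adjacent pair should also satisfy the discrimination property.
adjacent_pairs = [(e, f) for e in edges for f in edges if is_adjacent(e, f)]
all_discriminated = all(is_discriminated(e, f) for e, f in adjacent_pairs)
inner_sites_balanced = all(is_balanced(edges, s) for s in (200, 300))
```

Sites 100 and 400 are left out of the balance check because, in this toy example, no source or sink links are attached to them.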

 

Figure 1: Segment graph inferred on an alternatively spliced locus. (A) The exon-intron structure of a locus with two alternative transcripts. (B) Segment graph elements, with links by exonic edges shown as solid arrows, links by intronic edges as dashed arrows, and source/sink links as dotted arrows. (C) Expansion of the segment graph by super-edges coalesced from adjacent exon segments or from splice junctions. (D) Super-edges formed by paired-end mappings within the bounds of the three windows marked, to keep (super-)edge combinations within graphical resolution bounds.


To ensure that the properties of G also hold at the transcript ends, all transcription initiation sites are connected to an artificial source node, and all cleavage sites are connected to an artificial sink node [Sammeth 2008]. Once the segment graph G has been constructed for a locus, the edge set E describes the backbone of exonic segments and introns from the 3'-most transcription start to the 5'-most cleavage site, with additional introns and source and sink links that allow alternative transcripts to be traversed (Fig. 1, panels A and B).
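The source and sink links can be illustrated with a short sketch. It assumes a forward-strand locus whose transcripts are given as position-sorted exon lists. The names SOURCE, SINK, and terminal_links, the tuple layout, the "source"/"sink" mode labels, and the example coordinates are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple, Union

Node = Union[int, str]
Link = Tuple[Node, Node, str, FrozenSet[str]]

SOURCE, SINK = "source", "sink"  # artificial nodes (names are an assumption)


def terminal_links(transcripts: Dict[str, List[Tuple[int, int]]]) -> List[Link]:
    """Link every initiation site to SOURCE and every cleavage site to SINK.

    `transcripts` maps a transcript name to its exons as (start, end) pairs,
    sorted by position on the forward strand.
    """
    starts = defaultdict(set)  # initiation site -> transcripts starting there
    ends = defaultdict(set)    # cleavage site -> transcripts ending there
    for name, exons in transcripts.items():
        starts[exons[0][0]].add(name)
        ends[exons[-1][1]].add(name)
    links: List[Link] = [
        (SOURCE, site, "source", frozenset(names))
        for site, names in sorted(starts.items())
    ]
    links += [
        (site, SINK, "sink", frozenset(names))
        for site, names in sorted(ends.items())
    ]
    return links


# Two made-up transcripts with alternative initiation and cleavage sites.
links = terminal_links({
    "tx1": [(100, 200), (300, 400)],
    "tx2": [(150, 200), (300, 450)],
})
```

With these links in place, every transcript corresponds to a path from the source node to the sink node, so alternative starts and ends are handled like any other branching point.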

 
