Parameters

| Parameter Name | Default Value | Parameter Range | Description |
|----------------|---------------|-----------------|-------------|
| RTRANSCRIPTION | true | {true,false} | switch reverse transcription on/off |
| RT_PRIMER | RH | {PDT,RH} | chooses random (RH) or poly-dT primers for first strand synthesis |
| RT_MOTIF | | | sequence motif that |
| RT_MIN | 500 | >0 | the minimum stretch that is polymerized by the reverse transcriptase enzyme |
| RT_MAX | 5,500 | >0 | the maximum stretch that is polymerized by the reverse transcriptase enzyme (template-fidelity) |
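
The ranges in the table can be checked before a run. The following is a minimal sketch, not part of the Flux Simulator itself: the class `RTParams` and its `validate` method are hypothetical names, and the requirement that RT_MIN must not exceed RT_MAX is an assumption that the table does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class RTParams:
    """Hypothetical container for the reverse transcription parameters."""
    rtranscription: bool = True   # RTRANSCRIPTION, default true
    rt_primer: str = "RH"         # RT_PRIMER, one of {PDT, RH}
    rt_min: int = 500             # RT_MIN, must be > 0
    rt_max: int = 5500            # RT_MAX, must be > 0

    def validate(self):
        """Return a list of human-readable problems; empty if all is well."""
        errors = []
        if self.rt_primer not in {"PDT", "RH"}:
            errors.append("RT_PRIMER must be PDT or RH")
        if self.rt_min <= 0:
            errors.append("RT_MIN must be > 0")
        if self.rt_max <= 0:
            errors.append("RT_MAX must be > 0")
        # Assumption: a minimum above the maximum is treated as an error.
        if self.rt_min > self.rt_max:
            errors.append("RT_MIN must not exceed RT_MAX")
        return errors


defaults_ok = RTParams().validate()
problems = RTParams(rt_primer="XX", rt_min=0).validate()
```

With the defaults from the table, `validate()` returns an empty list; the second call reports the invalid primer type and the non-positive minimum.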

Algorithm

Input: RNA polymers annotated by their start and end coordinates on the transcript sequence they originate from (LIB_FILE), and parameters for reverse transcription.

Current sequencing technologies have to transform RNA into double-stranded DNA molecules before sequencing, either before or after fragmentation. For the first strand synthesis the Flux Simulator provides random priming

This step simulates the reverse transcription (RT) and an optional fragmentation or filtering process, which can substantially influence the outcome of the run. You may choose either a random priming strategy or poly-dT primers (RT_PRIMER), but you have to provide the minimum and maximum length of the expected RT products (RT_MIN and RT_MAX). These lengths depend on the nature of the primers and on the template fidelity of the reverse transcriptase enzyme, which are in turn a function of the specific RT protocol (enzymes and timing) applied. If the RT protocol is poly-dT primed, a single successful priming event on each RNA molecule is assumed; this probably does not reflect reality, but it prevents a statistically even spread of molecule loss. In the case of random priming, the number of successful priming events (i.e., primers that recruit a reverse transcriptase) for a certain RNA molecule $RNA_x$ is drawn from a Poisson distribution with mean

\[ \mu = \frac{n \cdot \mathrm{length}(RNA_x)}{\sum_{i=1}^{n} \mathrm{length}(RNA_i)} \tag{2} \]

Eq. (2) compares the length of the considered molecule $RNA_x$ to the average length of the RNA molecules in the reaction. To prevent the loss of molecules, especially of shorter transcripts, that may occur because molecule numbers are lower than in real experiments, at least one successful priming event is enforced on every RNA molecule. The length of the generated RT product is then determined by a uniformly distributed random variable $U \in [0, 1)$:

\[ RT_{min} + U \cdot (RT_{max} - RT_{min}) \tag{3} \]
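
The two sampling steps for random priming can be sketched as follows. This is an illustrative sketch and not the Flux Simulator's own code: all function names are hypothetical, the RNA lengths are made-up values, and the Poisson sampler uses Knuth's multiplication method as a stand-in because the Python standard library offers no Poisson generator.

```python
import math
import random


def poisson_mean(lengths, x):
    """Eq. (2): length of molecule x relative to the average RNA length."""
    return len(lengths) * lengths[x] / sum(lengths)


def sample_poisson(mu, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit = math.exp(-mu)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def priming_events(lengths, x, rng):
    """Number of successful priming events, with at least one enforced."""
    return max(1, sample_poisson(poisson_mean(lengths, x), rng))


def extension_length(rt_min, rt_max, rng):
    """Eq. (3): RT_min + U * (RT_max - RT_min), with U drawn from [0, 1)."""
    return rt_min + rng.random() * (rt_max - rt_min)


rng = random.Random(42)
lengths = [500, 1500, 4000]       # hypothetical RNA lengths in nucleotides
mu = poisson_mean(lengths, 2)     # 3 * 4000 / 6000 = 2.0
events = priming_events(lengths, 2, rng)
product_len = extension_length(500, 5500, rng)
```

A molecule twice as long as the average thus receives two priming events on average, while `max(1, ...)` implements the enforced minimum of one event per molecule.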

When multiple priming events are extended along the same RNA molecule, upstream-primed first strand cDNA synthesis can displace primers bound further downstream. The chance for a DNA-RNA hybrid to be displaced is most likely a function of its distance $dist$ to the closest upstream priming event. The Flux Simulator currently decides on the displacement of overlapping RT extensions in a Bernoulli trial. Specifically, a downstream primer gets displaced if, for a uniformly drawn random variable $U \in [0, 1)$, the following holds:

\[ U > dist \cdot RT_{min}^{-1} \tag{4} \]
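
The displacement trial can be sketched as follows, assuming that Eq. (4) reads U > dist / RT_min, so that hybrids close to an upstream priming event are displaced more often and hybrids at a distance of RT_MIN or more are never displaced. The function names are hypothetical and the distances are example values.

```python
import random


def is_displaced(dist, rt_min, rng):
    """Bernoulli trial under the assumed reading of Eq. (4): U > dist / RT_min."""
    return rng.random() > dist / rt_min


def displacement_rate(dist, rt_min, trials, rng):
    """Empirical fraction of trials in which the downstream primer is displaced."""
    hits = sum(is_displaced(dist, rt_min, rng) for _ in range(trials))
    return hits / trials


rng = random.Random(7)
rt_min = 500                                        # default RT_MIN
near = displacement_rate(50, rt_min, 10_000, rng)   # probability 0.9 per trial
mid = displacement_rate(250, rt_min, 10_000, rng)   # probability 0.5 per trial
far = displacement_rate(600, rt_min, 10_000, rng)   # dist >= RT_MIN: never displaced
```

Under this reading the displacement probability falls linearly from 1 at dist = 0 to 0 at dist = RT_MIN.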

The algorithm determines the start point and the extension separately for first and for second strand synthesis.

  1. During first strand synthesis, poly-dT primers induce priming events in the poly-A tail, whereas random primers provoke successful initiation events along the entire molecule, and anchored primers trigger exactly one priming event at the 3'-end of the respective fragment. In sequencing protocols without sequence-specific biases, each priming event is assigned a random location sampled uniformly along the corresponding stretch. Optionally, start points of first strand synthesis are determined by importance sampling according to the weights of a PWM that captures sequence biases.
  2. The point where second strand synthesis initiates is determined by the length of the first DNA strand, which can be between RT_MIN and RT_MAX nucleotides, but at most the distance of the first strand priming event from the 5'-end of the RNA template. In the presence of sequence biases, the point of priming second strand synthesis is drawn from a distribution according to the PWM capturing the bias, and from a uniform distribution otherwise.
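
The two steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the three-column weight matrix `PWM` is invented for the example, start points are chosen by plain weight-proportional sampling as a stand-in for the importance sampling mentioned in step 1, and positions are counted from the 5'-end so that a start position equals the distance available for extension. All function names are hypothetical.

```python
import random

# Hypothetical 3-column weight matrix; real weights would come from a bias model.
PWM = [
    {"A": 0.1, "C": 0.4, "G": 0.4, "U": 0.1},
    {"A": 0.3, "C": 0.2, "G": 0.2, "U": 0.3},
    {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25},
]


def pwm_weights(seq, pwm):
    """Score every window of the sequence as the product of column weights."""
    width = len(pwm)
    weights = []
    for i in range(len(seq) - width + 1):
        score = 1.0
        for j, column in enumerate(pwm):
            score *= column[seq[i + j]]
        weights.append(score)
    return weights


def sample_start(seq, rng, pwm=None):
    """Uniform start point without a PWM, weight-proportional with one."""
    if pwm is None:
        return rng.randrange(len(seq))
    weights = pwm_weights(seq, pwm)
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def first_strand_length(start, rt_min, rt_max, rng):
    """Length between RT_MIN and RT_MAX, capped by the distance to the 5'-end."""
    drawn = rt_min + rng.random() * (rt_max - rt_min)
    return min(int(drawn), start)


rng = random.Random(1)
seq = "ACGUACGGCUAAGC"                                # made-up RNA fragment
biased_start = sample_start(seq, rng, PWM)
uniform_start = sample_start(seq, rng)
capped = first_strand_length(300, 500, 5500, rng)     # template shorter than RT_MIN
free = first_strand_length(10_000, 500, 5500, rng)    # long template, no cap applies
```

When the priming event lies only 300 nucleotides from the 5'-end, every draw from the range RT_MIN to RT_MAX exceeds the available template, so the cap always yields 300.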

In the case of multiple priming events with random primers, several enzymes concurrently transcribe parts of the RNA molecule, and collisions with downstream DNA-RNA hybrids are resolved by displacement according to a Bernoulli trial.

Output: Reverse-transcribed DNA molecules annotated by their start and end coordinates on the transcript sequence they originate from (LIB_FILE).