Parameters

Parameter Name | Variable | Default Value | Parameter Range | Description
FRAG_NB_M | | | |
FRAG_NB_LAMBDA | | | |
FRAG_NB_THOLD | | | |

 

Algorithm

Early reports on RNA-Seq experiments based on nebulization already observed reads accumulating at the 5’-end of transcripts and around the center, especially for shorter transcript forms (36). These observations coincide well with breakpoint distributions obtained from a theoretical model of mechanical breakage that treats molecules as rigid sticks (37), in which breakpoints recursively accumulate around the midpoint of iteratively broken fragments. According to this model, the average expected fragment size depends on the length of the nebulized DNA molecule: comparatively short molecules accumulate higher breaking probabilities during the time it takes to fragment the longer molecules in the transcript population.

In the light of these preliminary studies, we simulate nebulization as an iterative two-step process. First, a random orientation of the molecule in the shear field, i.e., the point q where the shearing stress is applied, is determined by random sampling under a Gaussian function centered at the molecule's midpoint. Subsequently, the breaking probability p_b is derived from the exponential:

 

$$p_b = 1 - \exp\left(-\left(\frac{\min(q,\ len - q) + c}{\lambda}\right)^{M}\right) \qquad (1)$$

where len is the molecule length; λ is a parameter describing the limiting size below which molecules are very unlikely to be broken by the shear field; M is a parameter describing the force of the shear field, which determines the steepness of the slope in the resulting fragment size distribution; and c is a constant that adjusts p_b to 0.5 for a size exactly between λ (p_b → 0) and 2λ (p_b → 1). In our model, a Bernoulli trial on p_b determines whether a simulated break occurs at a given position. Recursive breaking continues until thermodynamic equilibrium, which is assumed to be reached when the fraction of breaks per time unit in the library falls below a given threshold (t = 1%).
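The two-step process and the stopping rule described above can be sketched in Python. This is only an illustrative sketch, not the simulator's actual code: the function names, the `lam` argument (standing for the limiting size written l or λ in the text), the width of the Gaussian (`sigma_frac`), the default `c=0.0`, the `max_rounds` safety cap, and the choice to measure the break fraction per round relative to the number of molecules at the start of that round are all assumptions made for illustration.

```python
import math
import random


def break_probability(q, length, lam, M, c=0.0):
    """Equation (1): p_b = 1 - exp(-((min(q, len - q) + c) / lam) ** M)."""
    # Guard against a negative base if a negative c is supplied (assumption).
    d = max(min(q, length - q) + c, 0.0)
    return 1.0 - math.exp(-((d / lam) ** M))


def nebulize(lengths, lam, M, c=0.0, threshold=0.01, sigma_frac=0.25,
             max_rounds=1000, rng=None):
    """Break molecules round by round until the break fraction < threshold."""
    rng = rng or random.Random()
    fragments = list(lengths)
    if not fragments:
        return fragments
    for _ in range(max_rounds):
        next_fragments = []
        breaks = 0
        for length in fragments:
            if length < 2:
                # A single nucleotide cannot be split any further.
                next_fragments.append(length)
                continue
            # Step 1: sample the stress point q around the midpoint.
            # sigma_frac is a hypothetical width; the text does not give one.
            q = int(round(rng.gauss(length / 2.0, sigma_frac * length)))
            q = min(max(q, 1), length - 1)
            # Step 2: Bernoulli trial on p_b decides whether the break occurs.
            if rng.random() < break_probability(q, length, lam, M, c):
                next_fragments.extend([q, length - q])
                breaks += 1
            else:
                next_fragments.append(length)
        fraction = breaks / len(fragments)
        fragments = next_fragments
        if fraction < threshold:
            break
    return fragments
```

For example, `nebulize([1000] * 50, lam=200, M=5, rng=random.Random(42))` returns a list of fragment lengths whose sum equals the total input length, since breaking only splits molecules and never discards material.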

 

 

Next, an optional fragmentation process (nebulization, hydrolysis, adaptive focusing acoustics, …) may be included in the simulation. In general, the simulation distinguishes two mechanisms of RNA degradation: a mechanical/physical breaking process (PHYSICAL) and cleavage that is less dependent on physical properties (CHEMICAL). The choice of fragmentation mechanism influences the distribution of fragment lengths after the simulated fragmentation. Furthermore, depending on the adopted method, you should provide a realistic estimate of the maximum molecule length (FRAG_LAMBDA) that is not broken in the applied protocol. For instance, cDNA molecules of ~500 nt are known to be difficult to break with usual nebulization strategies. Naturally, this value marks approximately the upper limit of the resulting fragment length distribution.

Simulation of fragmentation is an iterative process, where in each round a fragment is assigned a certain breaking probability, for PHYSICAL fragmentation

$$P_b = 1 - \exp\left(-\frac{\mathrm{length(cDNA)}}{\mathrm{FRAG\_LAMBDA}}\right) \qquad (5)$$

and for CHEMICAL fragmentation

$$P_b = 1 - \left(\frac{\mathrm{length(cDNA)}}{\lambda}\right)^{-2} \qquad (6)$$

Whether a break occurs is decided in a Bernoulli trial; the location of the respective breakpoint is normally distributed around the middle of the molecule (PHYSICAL) or uniformly distributed along the molecule (CHEMICAL). You also specify whether the fragmentation step is carried out before or after the reverse transcription (RT) from RNA to DNA. Finally, in some protocols there is a step after RT and fragmentation that filters the generated cDNA fragments by size (FILTERING). If so, provide the minimum (FILT_MIN) and the maximum (FILT_MAX) length of the fragments you want to retain for sequencing.
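One round of fragmentation followed by size filtering might look like the Python sketch below. It rests on several assumptions and is not the simulator's implementation: the breaking probabilities are read as P_b = 1 - exp(-length / FRAG_LAMBDA) for PHYSICAL and P_b = 1 - (length / λ)^(-2) for CHEMICAL, with λ taken to be FRAG_LAMBDA and P_b set to 0 for molecules no longer than λ; the function names and the Gaussian width `sigma_frac` are hypothetical.

```python
import math
import random


def break_probability(length, frag_lambda, mode):
    """Breaking probability of one molecule (assumed readings of eqs. 5 and 6)."""
    if mode == "PHYSICAL":
        return 1.0 - math.exp(-length / frag_lambda)
    if mode == "CHEMICAL":
        if length <= frag_lambda:
            return 0.0  # assumption: molecules up to lambda are not cleaved
        return 1.0 - (length / frag_lambda) ** -2
    raise ValueError("mode must be 'PHYSICAL' or 'CHEMICAL'")


def sample_breakpoint(length, mode, rng, sigma_frac=0.25):
    """Normal around the midpoint (PHYSICAL) or uniform (CHEMICAL)."""
    if mode == "PHYSICAL":
        # sigma_frac is a hypothetical width; the text does not specify one.
        q = int(round(rng.gauss(length / 2.0, sigma_frac * length)))
    else:
        q = rng.randint(1, length - 1)
    # Keep both resulting fragments at least one nucleotide long.
    return min(max(q, 1), length - 1)


def fragment_round(lengths, frag_lambda, mode, rng):
    """Apply one Bernoulli breaking trial to every molecule."""
    out = []
    for length in lengths:
        p_b = break_probability(length, frag_lambda, mode)
        if length >= 2 and rng.random() < p_b:
            q = sample_breakpoint(length, mode, rng)
            out.extend([q, length - q])
        else:
            out.append(length)
    return out


def size_filter(lengths, filt_min, filt_max):
    """Retain fragments with FILT_MIN <= length <= FILT_MAX."""
    return [n for n in lengths if filt_min <= n <= filt_max]
```

Calling `fragment_round` repeatedly and then `size_filter` once mirrors the order described in the text, where filtering is applied after RT and fragmentation.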