Parameters

| Variable Name | Default Value | Parameter Range | Description |
|---|---|---|---|
| NB_MOLECULES | 5,000,000 | >0 | number of expressed RNA molecules simulated |
| EXPRESSION_K | -0.6 | | exponent of the expression power law ("Pareto coefficient") |
| EXPRESSION_X0 | 9,500 | | controls the exponential decay |
| EXPRESSION_X1 | 9,500² | | controls the exponential decay |

The Distribution of Gene Expression Levels

The cell group of the experiment is assigned a random expression profile in which not necessarily all transcripts of the reference are expressed. Expression levels y are connected with the relative expression rank x by a mixed power and exponential law, in which x denotes the rank number of a gene, y_0 the expression level of the most abundant gene, k the exponent of the underlying power law, and x_0 and x_1 control the exponential decay. The Flux Simulator randomly assigns expression ranks x to the transcripts in the reference annotation.
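One functional form consistent with this description is sketched below, with x the rank, y_0 the expression level of the most abundant gene, k the power-law exponent, and x_0 and x_1 the decay parameters. Both the exact form and the mapping of k, x_0 and x_1 to EXPRESSION_K, EXPRESSION_X0 and EXPRESSION_X1 are assumptions. The quadratic term is inferred from the default of EXPRESSION_X1 being the square of the default of EXPRESSION_X0; it is not confirmed by this page.

```latex
y(x) = y_0 \, x^{k} \exp\!\left( -\frac{x}{x_0} - \frac{x^{2}}{x_1} \right)
```

Under this assumed form and the default values, the power-law term dominates for ranks well below x_0, while the exponential term suppresses expression for ranks near and beyond x_0.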

Subsequently, these ranks are turned into numbers of virtual molecules by the modified Zipf's Law above. Usually, part of the transcripts from the reference annotation remains unexpressed.
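The rank-to-molecule step can be illustrated with a minimal Python sketch. It is not the Flux Simulator's code. The function names are hypothetical, and the functional form of the law (a power law times an exponential with a linear and a quadratic term) is an assumption, as are the normalisation by the total weight and the rounding to whole molecules.

```python
import math
import random


def expression_weight(rank, k=-0.6, x0=9500.0, x1=9500.0 ** 2):
    """Relative expression level for a 1-based rank.

    Assumed form: rank^k * exp(-rank/x0 - rank^2/x1).
    """
    return rank ** k * math.exp(-rank / x0 - rank ** 2 / x1)


def assign_molecules(transcript_ids, nb_molecules=5_000_000, seed=None,
                     k=-0.6, x0=9500.0, x1=9500.0 ** 2):
    """Return {transcript_id: (rank, relative_abundance, molecule_count)}."""
    rng = random.Random(seed)

    # Every transcript receives a distinct rank in random order.
    ranks = list(range(1, len(transcript_ids) + 1))
    rng.shuffle(ranks)

    weights = [expression_weight(r, k, x0, x1) for r in ranks]
    total = sum(weights)

    profile = {}
    for tid, rank, weight in zip(transcript_ids, ranks, weights):
        fraction = weight / total
        # Assumption: counts are rounded, so transcripts with high rank
        # numbers can end up with zero molecules (unexpressed).
        profile[tid] = (rank, fraction, round(fraction * nb_molecules))
    return profile
```

For a hypothetical annotation of 20,000 transcripts and the default parameters, transcripts with the highest rank numbers receive a count of zero in this sketch, which mirrors the remark that part of the annotation remains unexpressed.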

Transcript Modifications during Expression

After the number of RNA molecules has been determined for each transcript, in silico expressed transcripts are assigned individual variations in transcription start and in the length of the attached poly-A tail. The Flux Simulator models differences in transcription start by random variables under an exponential model with a mean of around 10 nt. During polyadenylation in the nucleus, usually 200-250 adenine residues are added to the primary transcript. Disregarding other polyadenylation mechanisms, such as cytoplasmic polyadenylation, and the exact mechanisms of degradation by exo- and endonucleases, our model describes poly-A lengths by random sampling under a Gaussian distribution with a mean of 125 nt and a shape adapted such that >99.5% of the random variables fall in the interval [0;250].
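The two random models can be sketched in Python as follows. This is an illustration under stated assumptions, not the simulator's implementation. The names are hypothetical, the standard deviation is derived from the 99.5% condition (the largest value that still satisfies it), and rounding to whole nucleotides and clamping the rare poly-A outliers to [0, 250] are choices made only for this example.

```python
import random
from statistics import NormalDist

TSS_MEAN_NT = 10.0      # mean of the exponential model for start variation
POLYA_MEAN_NT = 125.0   # mean of the Gaussian model for poly-A length
POLYA_MAX_NT = 250.0    # upper end of the interval [0;250]

# Largest sigma for which 99.5% of a Gaussian centred at 125 lies within
# [0, 250]: the half-width 125 must equal z * sigma, with z the two-sided
# 99.5% quantile of the standard normal distribution.
Z_995 = NormalDist().inv_cdf(1.0 - 0.005 / 2.0)
POLYA_SIGMA_NT = POLYA_MEAN_NT / Z_995


def sample_modifications(rng):
    """Return (start_offset_nt, polya_length_nt) for one molecule."""
    # Exponential model; expovariate takes the rate, i.e. 1 / mean.
    start_offset = round(rng.expovariate(1.0 / TSS_MEAN_NT))

    # Gaussian model; clamping the outliers is an assumption of this sketch.
    polya = rng.gauss(POLYA_MEAN_NT, POLYA_SIGMA_NT)
    polya = round(min(max(polya, 0.0), POLYA_MAX_NT))
    return start_offset, polya
```

Calling `sample_modifications(random.Random(42))` once per simulated molecule yields an individual start offset and tail length for each.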

 

Requires: PRO_FILE_NAME columns 1-4
Outputs: PRO_FILE_NAME columns 5 (relative abundance) and 6 (molecule count), both after gene expression
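A short Python sketch of reading the two output columns back from the profile file follows. It assumes the file is tab-separated, which this page does not state, and it treats columns 1-4 as opaque strings. The function name is hypothetical.

```python
import csv


def read_expression_columns(handle):
    """Yield (columns_1_to_4, relative_abundance, molecule_count) per row."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 6:
            # Row without expression columns, e.g. before this step has run.
            continue
        yield tuple(row[:4]), float(row[4]), int(row[5])
```

Usage would be along the lines of `with open(path) as fh: rows = list(read_expression_columns(fh))`, where `path` points to the file named by PRO_FILE_NAME.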
