Parameter

Name	Default Value	Parameter Range	Description
NBMOLECULES	5,000,000	>0	number of expressed RNA molecules simulated
EXPRESSION_K	-0.6		exponent of the expression power law ("Pareto coefficient")
EXPRESSION_X0	9,500		controls the exponential decay
EXPRESSION_X1	9,500²		controls the exponential decay

The Distribution of Gene Expression Levels

The cell group of the experiment is assigned a random expression profile in which not all transcripts of the reference are necessarily expressed. Expression levels $y$ are connected with the relative expression rank $x$ by a mixed power and exponential law of the general form
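
One form consistent with the symbols used in this section and with the default values in the parameter table (where EXPRESSION_X1 equals EXPRESSION_X0 squared) is given here as an assumption, not as a quotation of the original formula:

$$y(x) = y_0 \, x^{k} \, \exp\left(-\frac{x}{x_0} - \frac{x^2}{x_1}\right)$$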

where $x$ denotes the rank number of a gene, $y_0$ the expression level of the most abundant gene, $k$ the exponent of the underlying power law, and $x_0$ and $x_1$ control the exponential decay. The Flux Simulator randomly assigns expression ranks to the transcripts in the reference annotation.

Subsequently, these ranks are turned into numbers of virtual molecules by the modified Zipf's Law above. Usually, part of the transcripts from the reference annotation will remain unexpressed.
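
The rank-to-molecule step can be sketched roughly as follows. This is a minimal illustration, not the Flux Simulator's code: it assumes the functional form y0 * x^k * exp(-x/x0 - x^2/x1), which the source does not confirm, and it assumes that y0 is chosen so that the counts sum to about NBMOLECULES. The function names `expression_level` and `simulate_profile` are hypothetical.

```python
import math
import random


def expression_level(rank, y0, k=-0.6, x0=9500.0, x1=9500.0 ** 2):
    """Assumed law: y0 * rank**k * exp(-rank/x0 - rank**2/x1)."""
    return y0 * rank ** k * math.exp(-rank / x0 - rank ** 2 / x1)


def simulate_profile(transcript_ids, nb_molecules=5_000_000, seed=0, **law):
    """Assign random ranks, then scale the law so counts sum to about nb_molecules.

    The normalisation is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    ranks = list(range(1, len(transcript_ids) + 1))
    rng.shuffle(ranks)  # random rank per transcript
    weights = [expression_level(r, 1.0, **law) for r in ranks]
    total = sum(weights)
    # Rounding sends very low levels to 0, i.e. unexpressed transcripts.
    return {
        t: round(nb_molecules * w / total)
        for t, w in zip(transcript_ids, weights)
    }
```

Under these assumptions, high ranks receive a level that rounds to zero, which matches the remark that part of the annotation usually remains unexpressed.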

Transcript Modifications during Expression

After the number of RNA molecules has been determined for each transcript, in silico expressed transcripts are assigned individual variations in transcription start and in the length of the attached poly-A tail. The Flux Simulator models differences in transcription start by random variables under an exponential model with a mean of around 10 nt. During polyadenylation in the nucleus, usually 200-250 adenine residues are added to the primary transcript. Disregarding other polyadenylation mechanisms, such as cytoplasmic polyadenylation, and the exact mechanisms of degradation by exo- and endonucleases, our model describes poly-A lengths by randomly sampling under a Gaussian distribution with a mean of 125 nt and a shape adapted such that >99.5% of the random variables fall in the interval [0, 250].
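
As a rough sketch of the two sampling steps described above: the standard deviation is derived here so that the interval [0, 250] covers the central 99.5% of the Gaussian (about 44.5 nt), which is one reading of "shape adapted". The clamping of rare outliers, the rounding to whole nucleotides, and the names `POLYA_SIGMA` and `sample_modifications` are assumptions of this example, and the direction of the start-site shift is left open.

```python
import random
from statistics import NormalDist

POLYA_MEAN = 125.0
# Choose sigma so that [0, 250] spans the central 99.5% of the Gaussian:
# 125 / z(0.9975), roughly 44.5 nt.
POLYA_SIGMA = POLYA_MEAN / NormalDist().inv_cdf(0.9975)


def sample_modifications(rng, tss_mean=10.0):
    """Return (tss_offset, polya_length) in nucleotides for one molecule."""
    # Exponential model for the shift in transcription start (mean ~10 nt).
    tss_offset = round(rng.expovariate(1.0 / tss_mean))
    # Gaussian model for the poly-A tail length.
    polya = round(rng.gauss(POLYA_MEAN, POLYA_SIGMA))
    # Clamp the rare draws outside [0, 250]; this is an assumption.
    polya = min(max(polya, 0), 250)
    return tss_offset, polya
```

A call such as `sample_modifications(random.Random(42))` yields one pair of values for a single simulated molecule.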

Name	Default Value	Parameter Range	Description
REF_FILE			file from which the reference annotation (GTF format) is read
LOAD_CODING	true	{true,false}	flag to consider or disregard transcripts that have an annotated coding sequence
LOAD_NONCODING	true	{true,false}	flag to consider or disregard transcripts that are annotated as non-coding
PRO_FILE			file to which the simulated expression values are written
LIB_FILE			file to which the expressed transcript molecules are written

Overview

The "Gene Expression" step employs the annotation specified by REF_FILE_NAME and creates an artificial expression profile of the described transcripts. The flag LOAD_CODING controls whether transcripts with an annotated coding sequence are taken into account, and LOAD_NONCODING correspondingly controls those without one. Results from in silico gene expression are stored in the files specified by the parameters PRO_FILE and LIB_FILE, respectively; if no explicit values are provided for these parameters, the corresponding files are created in the folder of the parameter file.
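
The effect of the two flags can be illustrated with a small sketch. It assumes that a transcript counts as coding when the GTF contains at least one CDS feature for it, which the source does not state. The function `load_transcripts` is hypothetical and not part of the Flux Simulator.

```python
import re


def load_transcripts(gtf_lines, load_coding=True, load_noncoding=True):
    """Return the transcript IDs kept under the two flags, in file order.

    Assumption: a transcript is coding if it has at least one CDS feature.
    """
    is_coding = {}  # transcript_id -> True once a CDS feature is seen
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if not match:
            continue
        tid = match.group(1)
        is_coding.setdefault(tid, False)
        if fields[2] == "CDS":
            is_coding[tid] = True
    return [
        tid
        for tid, coding in is_coding.items()
        if (coding and load_coding) or (not coding and load_noncoding)
    ]
```

With both flags left at their default of true, every transcript in the annotation is kept. Setting one flag to false drops the corresponding class.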
