
The PAR format in the Flux Simulator is used to specify all parameters of a run. It is a simple format containing key-value pairs (one per line) with the following parameter names (i.e., keys):

File Locations

| Key | Value | Description |
| --- | --- | --- |
| REF_FILE_NAME | String | Path to the GTF reference annotation, either absolute or relative to the location of the parameter file. |
| PRO_FILE_NAME | String | Path to the profile of the run, either absolute or relative to the location of the parameter file; the default profile uses the name of the parameter file with the extension .pro. |
| LIB_FILE_NAME | String | Path to the library file of the run, either absolute or relative to the location of the parameter file; the default library file uses the name of the parameter file with the extension .lib. |
| SEQ_FILE_NAME | String | Path to the sequencing file of the run, either absolute or relative to the location of the parameter file; the default sequencing file uses the name of the parameter file with the extension .bed. |
| GEN_DIR | String | Path to the directory with the genomic sequences, i.e., one FASTA file per chromosome/scaffold/contig, with a file name corresponding to the identifiers in the first column of the GTF annotation. |
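The path rules above (absolute or relative to the parameter file, with default names derived from the parameter file) can be sketched as follows. This is a minimal illustration, not the Flux Simulator's own parser: the helper names `parse_par` and `resolve_paths` are hypothetical, and the sketch assumes that key and value are separated by whitespace, that lines starting with `#` are comments, and that GEN_DIR is resolved like the file paths. None of these details are stated on this page.

```python
from pathlib import PurePosixPath

# Default extensions named in the File Locations table.
DEFAULT_EXTENSIONS = {
    "PRO_FILE_NAME": ".pro",
    "LIB_FILE_NAME": ".lib",
    "SEQ_FILE_NAME": ".bed",
}
PATH_KEYS = ("REF_FILE_NAME", "PRO_FILE_NAME", "LIB_FILE_NAME",
             "SEQ_FILE_NAME", "GEN_DIR")


def parse_par(text):
    """Read key-value pairs, one per line.

    Assumptions (not confirmed by this page): whitespace separates key
    and value, and lines starting with '#' are comments.
    """
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        params[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return params


def resolve_paths(params, par_file):
    """Resolve path parameters against the location of the parameter file."""
    par_file = PurePosixPath(par_file)
    resolved = {}
    for key in PATH_KEYS:
        value = params.get(key)
        if value:
            path = PurePosixPath(value)
            # Relative paths are taken relative to the parameter file.
            # Assumption: GEN_DIR follows the same rule as the file paths.
            resolved[key] = path if path.is_absolute() else par_file.parent / path
        elif key in DEFAULT_EXTENSIONS:
            # Default: name of the parameter file with the listed extension.
            resolved[key] = par_file.with_suffix(DEFAULT_EXTENSIONS[key])
    return resolved
```

For a parameter file at the illustrative location `/runs/test/run1.par` that sets only `REF_FILE_NAME annotation.gtf`, the sketch resolves the annotation next to the parameter file and falls back to `run1.pro`, `run1.lib` and `run1.bed` for the other files. `PurePosixPath` is used so that the example only manipulates path strings and never touches the file system.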

 


 

 

File locations

| Key | Value | Description |
| --- | --- | --- |
| REF_FILE_NAME | String | Path to the reference annotation, either absolute or relative to the location of the parameter file. |
| PRO_FILE_NAME | String | Path to the profile of the run, either absolute or relative to the location of the parameter file. |
| LIB_FILE_NAME | [String] | Path to the library file, either absolute or relative to the location of the parameter file. |
| BED_FILE_NAME / SEQ_FILE_NAME | String | Path to the BED file with the genomic annotation of the simulated sequencing reads, either absolute or relative to the location of the parameter file. |
| GEN_DIR | String | Path to the directory with the genomic sequences of chromosomes or scaffolds used in the reference annotation. |
 
Expression

| Key | Value | Description |
| --- | --- | --- |
| NB_MOLECULES | [Integer] | Number of initial RNA molecules in the simulation. |
| LOAD_CODING | [YES\|NO] | Flag to load coding transcripts from the reference annotation. |
| LOAD_NONCODING | [YES\|NO] | Flag to load the non-coding transcripts, i.e., transcripts without CDS features, from the reference annotation. |
| EXPRESSION_K | Float | Power-law parameter k of the expression simulation; should be < 0. |
| EXPRESSION_X0 | Integer | Number of molecules of the most highly expressed transcript; depends on NB_MOLECULES. |
| EXPRESSION_X1 | Float | Parameter determining the exponential decay in the expression simulation. |
| RT_PRIMER | [RANDOM\|POLY-DT] | Flag to switch between random priming and poly-dT priming for the first-strand synthesis of the reverse transcription. |
| RT_MIN | Integer | Minimum length (in nt) of the expected reverse-transcribed cDNA molecules. |
| RT_MAX | Integer | Maximum length (in nt) of the expected reverse transcription products. |
| FRAGMENTATION | [YES\|NO] | Optional: flag that determines whether a fragmentation step is carried out. |
| FRAG_B4_RT | [YES\|NO] | Flag to schedule the fragmentation before (YES) or after (NO) the reverse transcription. Note: for fragmentation carried out before reverse transcription, only random priming strategies are reasonable. |
| FRAG_MODE | [PHYSICAL\|CHEMICAL] | Flag to switch between fragmentation according to physical or chemical attributes. |
| FRAG_LAMBDA | Integer | Upper boundary of fragment lengths (in nt) that are not expected to be fragmented by the applied technique. |
| FILTERING | [YES\|NO] | Flag to indicate whether a length filtering step is carried out on the cDNA library. |
| FILT_MIN | Integer | Minimum length that is retained during filtering. |
| FILT_MAX | Integer | Maximum length that is retained during filtering. |
| READ_NUMBER | Integer | Number of reads intended to be produced. Note: this number is an upper boundary and is adapted to the actual size of the intermediate library that is generated. |
| READ_LENGTH | Integer | Length of the generated reads; depends on the filtering settings. |
| PAIRED_END | [YES\|NO] | Flag to indicate whether read pairs are produced. |
| FASTQ | [YES\|NO] | Flag that indicates whether the read sequences and qualities are additionally output. Depends on GENOME_DIR and ERR_FNAME. |
| QTHOLD | Integer | Quality value below which base calls are considered problematic. |
| TMP_DIR | String | Path to the folder for temporary files, if different from the system standard (commonly /tmp on Unix-like systems). |
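Several descriptions above imply constraints between parameters: EXPRESSION_K should be negative, a minimum should not exceed its maximum, and fragmentation before reverse transcription only makes sense with random priming. The following sketch collects these checks for a dictionary of already parsed string values. The function `check_params` is a hypothetical helper and not part of the Flux Simulator; the pairing of RT_MIN/RT_MAX and FILT_MIN/FILT_MAX as ranges, and the case-sensitive matching of flag values, are assumptions.

```python
FLAG_KEYS = ("LOAD_CODING", "LOAD_NONCODING", "FRAGMENTATION", "FRAG_B4_RT",
             "FILTERING", "PAIRED_END", "FASTQ")
ENUM_KEYS = {
    "RT_PRIMER": ("RANDOM", "POLY-DT"),
    "FRAG_MODE": ("PHYSICAL", "CHEMICAL"),
}
# Assumption: each minimum is expected not to exceed its maximum.
RANGE_PAIRS = (("RT_MIN", "RT_MAX"), ("FILT_MIN", "FILT_MAX"))


def check_params(params):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in FLAG_KEYS:
        if key in params and params[key] not in ("YES", "NO"):
            problems.append(f"{key} must be YES or NO")
    for key, allowed in ENUM_KEYS.items():
        if key in params and params[key] not in allowed:
            problems.append(f"{key} must be one of {', '.join(allowed)}")
    if "EXPRESSION_K" in params and float(params["EXPRESSION_K"]) >= 0:
        problems.append("EXPRESSION_K should be < 0")
    for low, high in RANGE_PAIRS:
        if low in params and high in params and int(params[low]) > int(params[high]):
            problems.append(f"{low} exceeds {high}")
    # Fragmentation before reverse transcription is only reasonable
    # with random priming.
    if params.get("FRAG_B4_RT") == "YES" and params.get("RT_PRIMER") == "POLY-DT":
        problems.append("FRAG_B4_RT=YES is not reasonable with RT_PRIMER=POLY-DT")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every inconsistency in a parameter file at once. Keys that are absent are skipped, because this page does not state which parameters are mandatory.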