Input

Download

Reference Annotation

Parameter File

Reference Genome

...

Parameter

Expression
NB_MOLECULES        5,000,000    Number of RNA molecules initially in the experiment
TSS_MEAN            25           Average deviation from the annotated transcription start site (TSS)
POLYA_SCALE         300          Scale of the Weibull distribution; shifts the average length of poly-A tails
POLYA_SHAPE         2            Shape of the Weibull distribution describing poly-A tail sizes

Fragmentation
FRAG_SUBSTRATE      RNA          Specifies RNA as the substrate of fragmentation
FRAG_METHOD         UR           Uniform random fragmentation
FRAG_UR_ETA         170          Average expected fragment size after fragmentation, i.e., the number of breaks per unit length (exhaustiveness of fragmentation)
FRAG_UR_D0          1            Minimum length of fragments produced by UR fragmentation
FRAG_UR_DELTA       NaN          Geometry of molecules in the UR process depends logarithmically on molecule length

Reverse Transcription
RTRANSCRIPTION      YES          Switches on reverse transcription
RT_PRIMER           RH           Random hexamer primers are used for first-strand synthesis
RT_MOTIF            default      A default PWM of the current Illumina protocol is used
RT_LOSSLESS         YES          Forces every molecule to be reverse transcribed
RT_MIN              500          Minimum length observed after reverse transcription of full-length transcripts
RT_MAX              5,500        Maximum length observed after reverse transcription of full-length transcripts

Amplification and Size Segregation
PCR_DISTRIBUTION    default      Default PCR distribution with 15 rounds and 20 bins
GC_MEAN             0.5          Mean of the Gaussian distribution that models the GC-bias amplification probability
GC_SD               0.1          Standard deviation of the Gaussian distribution that models the GC-bias amplification probability
FILTERING           YES          Enables size filtering of fragments
SIZE_DISTRIBUTION   null         Employs an empirical Illumina fragment size distribution
SIZE_SAMPLING       MH           The Metropolis-Hastings algorithm is used for filtering

Sequencing
READ_NUMBER         15,000,000   Produce 15 million reads
READ_LENGTH         75           Each read sequence is 75 nt long
PAIRED_END          NO           Single reads are simulated (one per fragment)
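
To give a feel for what POLYA_SCALE and POLYA_SHAPE imply, the following is a minimal Python sketch that computes the analytic mean of a Weibull distribution with these values and compares it with a sampled mean. It only illustrates the distribution itself; how the simulator draws and rounds tail lengths internally is not shown here, so the rounding to whole nucleotides and the function names are assumptions made for illustration.

```python
import math
import random

POLYA_SCALE = 300.0  # Weibull scale, value taken from the parameter table
POLYA_SHAPE = 2.0    # Weibull shape, value taken from the parameter table


def weibull_mean(scale, shape):
    """Analytic mean of a Weibull distribution: scale * Gamma(1 + 1/shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)


def sample_polya_lengths(n, scale=POLYA_SCALE, shape=POLYA_SHAPE, seed=0):
    """Draw n tail lengths; rounding to whole nucleotides is an assumption."""
    rng = random.Random(seed)
    # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
    return [round(rng.weibullvariate(scale, shape)) for _ in range(n)]


if __name__ == "__main__":
    lengths = sample_polya_lengths(100_000)
    print(f"analytic mean: {weibull_mean(POLYA_SCALE, POLYA_SHAPE):.1f} nt")
    print(f"sampled mean:  {sum(lengths) / len(lengths):.1f} nt")
```

With a shape of 2 the mean is about 0.886 times the scale, so raising POLYA_SCALE lengthens the average tail proportionally.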

Output

...

[INFO] I am collecting information on the run.
    initializing profiler  **********
[INFO] Checking GTF file
********** OK (00:00:03)
[PROFILING] I am assigning the expression profile
********** OK (00:00:05)
    Reading reference annotation ********** OK (00:00:06)
    found 28045 transcripts

[PROFILING] Parameters
    NB_MOLECULES    5000000
    EXPRESSION_K    -0.6
    EXPRESSION_X0    5.0E7
    EXPRESSION_X1    9500.0
    PRO_FILE_NAME    /Users/micha/Desktop/mm9_hydrolysis.pro

    profiling ********** OK (00:00:00)
    Updating .pro file  ********** OK (00:00:00)
    molecules    4999480
[LIBRARY] creating the cDNA libary
    Initializing Fragmentation File ********** OK (00:00:06)
    4999480 mol initialized
[LIBRARY] Fragmentation UR
[LIBRARY] Configuration
        D0: 1.0
        Delta:  Not specified, depends on sequence length
        Eta: 170.0

    Processing Fragments ********** OK (00:03:20)
        99433550 mol: in 4999480, new 94434070, out 99433550
        avg Len 154.13617, maxLen 499
    preparing transcript sequences ********** OK (00:02:04)
[INFO] Initializing PWM cache
[INFO] Done
[LIBRARY] Reverse Transcription
[LIBRARY] Configuration
        Mode: RH
        PWM: motif_1mer_0-5.pwm
        RT MIN: 500
        RT MAX: 5500

    Processing Fragments ********** OK (00:18:53)
        99436361 mol: in 99433550, new 2811, out 99436361
        avg Len 226.49129, maxLen 718
        initializing Selected Size distribution
[LIBRARY] Segregating cDNA (Acceptance)
    Processing Fragments ********** OK (00:02:32)
        99436361 mol: in 99436361, new 0, out 3935454
        avg Len 183.18074, maxLen 299
        start amplification
[INFO] Loading default PCR distribution
[INFO] Initializing PWM cache
[INFO] Done
[LIBRARY] Amplification
[LIBRARY] Configuration
        Rounds: 15 
        Mean: 0.5 
        Standard Deviation: 0.1 

    Processing Fragments ********** OK (00:00:19)
    Amplification done.
    In: 3935454 Out: 106111525
        3935454 mol: in 3935454, new 0, out 106111525
        avg Len 183.16734, maxLen 299
    Copied results to /Users/micha/Desktop/mm9_hydrolysis.lib
    Updating .pro file  ********** OK (00:00:00)

[SEQUENCING] getting the reads
    Initializing Fragment Index
    Indexing ********** OK (00:00:03)
    2112053 lines indexed (106111525 fragments, 16421 entries)
    sequencing ********** OK (00:04:11)

    106111525 fragments found (2112053 without PCR duplicates)
    15000653 reads sequenced
    2333333 reads fall in poly-A tail
    511978 truncated reads

    Moving temporary BED file

    Updating .pro file  ********** OK (00:00:00)
    Updating .pro file  ********** OK (00:00:00)
    Updating .pro file  ********** OK (00:00:00)
    Updating .pro file  ********** OK (00:00:00)

[END] I finished, took me 2291 sec.
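
The per-stage molecule counts in the log can be compared to see where molecules are gained and lost. The following is a minimal Python sketch, assuming the `N mol: in A, new B, out C` line layout seen in the log. The counts are copied from the most recent run reported above, and the helper names are hypothetical; they are not part of the simulator.

```python
import re

# Stage headers and count lines copied from the most recent run in the log.
LOG = """\
[LIBRARY] Fragmentation UR
[LIBRARY] Configuration
        99433550 mol: in 4999480, new 94434070, out 99433550
[LIBRARY] Reverse Transcription
[LIBRARY] Configuration
        99436361 mol: in 99433550, new 2811, out 99436361
[LIBRARY] Segregating cDNA (Acceptance)
        99436361 mol: in 99436361, new 0, out 3935454
[LIBRARY] Amplification
[LIBRARY] Configuration
        3935454 mol: in 3935454, new 0, out 106111525
"""

STAGE_RE = re.compile(r"^\[LIBRARY\]\s+(.+?)\s*$")
COUNTS_RE = re.compile(r"mol: in (\d+), new (\d+), out (\d+)")


def parse_stages(text):
    """Return (stage, n_in, n_new, n_out) tuples in log order."""
    stages, current = [], None
    for line in text.splitlines():
        header = STAGE_RE.match(line)
        if header:
            # "Configuration" headers belong to the stage announced before them.
            if header.group(1) != "Configuration":
                current = header.group(1)
            continue
        counts = COUNTS_RE.search(line)
        if counts and current is not None:
            n_in, n_new, n_out = map(int, counts.groups())
            stages.append((current, n_in, n_new, n_out))
            current = None  # one count line per stage
    return stages


if __name__ == "__main__":
    for stage, n_in, n_new, n_out in parse_stages(LOG):
        ratio = n_out / n_in
        print(f"{stage:32s} in={n_in:>11d} out={n_out:>11d} out/in={ratio:.4f}")
```

For this run, fragmentation and reverse transcription satisfy in + new = out, size segregation keeps roughly 4% of the molecules, and amplification multiplies the remaining fragments about 27-fold.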