In this example, we investigate a protocol that uses poly-dT primers to reverse transcribe mRNA molecules, which are later fragmented by mechanical shearing known as nebulization. Subsequently, reads are sequenced without PCR amplification or size filtering.

Input

Download

Reference Annotation

Parameter File

Parameters

| Parameter | Value | Description |
|---|---|---|
| Expression | | |
| NB_MOLECULES | 5,000,000 | Number of RNA molecules initially in the experiment |
| TSS_MEAN | 100 | Average deviation from the annotated transcription start site (TSS) |
| POLYA_SCALE | 200 | Scale of the Weibull distribution, shifts the average length of poly-A tail sizes |
| POLYA_SHAPE | 1.5 | Shape of the Weibull distribution describing poly-A tail sizes |
| Reverse Transcription | | |
| RTRANSCRIPTION | YES | Switch on reverse transcription |
| RT_PRIMER | PDT | Use poly-dT primers for first-strand synthesis |
| RT_LOSSLESS | YES | Flag to force every molecule to be reverse transcribed |
| RT_MIN | 400 | Minimum length observed after reverse transcription of full-length transcripts |
| RT_MAX | 2,600 | Maximum length observed after reverse transcription of full-length transcripts |
| Fragmentation | | |
| FRAG_SUBSTRATE | DNA | Specifies DNA as the substrate of fragmentation |
| FRAG_METHOD | NB | Nebulization as fragmentation method |
| FRAG_NB_LAMBDA | 600 | Threshold on the length of molecules that cannot be broken by the shear field of nebulization |
| FRAG_NB_M | 5 | Strength of the nebulization shear field (i.e., rotor speed) |
| Amplification and Size Segregation | | |
| PCR_DISTRIBUTION | none | Disable PCR amplification |
| GC_MEAN | NaN | Disable GC bias |
| FILTERING | NO | Disable size filtering |
| Sequencing | | |
| READ_NUMBER | 2,000,000 | Produce 2 million reads |
| READ_LENGTH | 100 | Each read sequence is 100 nt long |
| PAIRED_END | NO | Single reads are simulated, one per fragment |
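The settings above could be collected into a parameter file along the following lines. This is only a minimal sketch: the one-pair-per-line `KEY<TAB>VALUE` layout, the helper names `render_par` and `write_par_file`, and the output filename are assumptions made for illustration. Thousands separators from the table are dropped, matching the `NB_MOLECULES    5000000` line in the log output. The reference annotation entry is left out because its path depends on the local setup.

```python
from pathlib import Path

# Parameter names and values as listed in the Parameters table.
PARAMS = {
    # Expression
    "NB_MOLECULES": 5000000,
    "TSS_MEAN": 100,
    "POLYA_SCALE": 200,
    "POLYA_SHAPE": 1.5,
    # Reverse transcription
    "RTRANSCRIPTION": "YES",
    "RT_PRIMER": "PDT",
    "RT_LOSSLESS": "YES",
    "RT_MIN": 400,
    "RT_MAX": 2600,
    # Fragmentation
    "FRAG_SUBSTRATE": "DNA",
    "FRAG_METHOD": "NB",
    "FRAG_NB_LAMBDA": 600,
    "FRAG_NB_M": 5,
    # Amplification and size segregation
    "PCR_DISTRIBUTION": "none",
    "GC_MEAN": "NaN",
    "FILTERING": "NO",
    # Sequencing
    "READ_NUMBER": 2000000,
    "READ_LENGTH": 100,
    "PAIRED_END": "NO",
}


def render_par(params):
    """Render parameters as tab-separated KEY/VALUE lines (assumed layout)."""
    return "".join(f"{key}\t{value}\n" for key, value in params.items())


def write_par_file(path, params=PARAMS):
    """Write the rendered parameters to the given path."""
    Path(path).write_text(render_par(params))


# Hypothetical filename, chosen only for this illustration:
# write_par_file("t9_nebulization.par")
```

Keeping the values in one dictionary makes it easy to derive variants of the protocol, for example by switching `PAIRED_END` or `FILTERING`, without editing the file by hand.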

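To get a feeling for what `POLYA_SCALE` and `POLYA_SHAPE` imply, the mean of a two-parameter Weibull distribution is scale · Γ(1 + 1/shape). The sketch below computes that mean and draws a few sample tail lengths. It assumes a plain, untruncated Weibull distribution; whether the simulator rounds or bounds the sampled lengths is not covered here, and the function names are illustrative.

```python
import math
import random


def weibull_mean(scale, shape):
    """Mean of a two-parameter Weibull distribution."""
    return scale * math.gamma(1.0 + 1.0 / shape)


def sample_polya_lengths(n, scale=200.0, shape=1.5, seed=0):
    """Draw n poly-A tail lengths from a Weibull distribution.

    random.weibullvariate(alpha, beta) takes the scale as alpha
    and the shape as beta.
    """
    rng = random.Random(seed)
    return [rng.weibullvariate(scale, shape) for _ in range(n)]


if __name__ == "__main__":
    print(f"expected mean tail length: {weibull_mean(200.0, 1.5):.1f}")
    lengths = sample_polya_lengths(5)
    print([round(x) for x in lengths])
```

With the values from the table (scale 200, shape 1.5), the expected tail length comes out at roughly 180 nt, somewhat below the scale parameter itself.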
Output

[INFO] I am collecting information on the run.
[INFO] Checking GTF file
*[WARN] Unsorted in line 27 - chr/strand Chr1 + already read.
********* OK (371580:28662:09)
[GTF FILE] The GTF reference file given is not sorted, but we found a sorted version.
[GTF FILE] The Simulator will use /Users/micha/Desktop/TAIR9_GFF3_genes_sorted.gtf
[GTF FILE] You might want to update your parameters file
[PROFILING] I am assigning the expression profile
********** OK (371580:28662:09)
    Reading reference annotation **[WARN] merging exon (-21073927,-21073974) with exon (-21073898,-21073924) in transcript AT1G56280.1 because intervening intron has 4 or less nt.
********[WARN] skipped chromosome ChrM
 OK (00:00:03)
    found 38564 transcripts
[PROFILING] Parameters
    NB_MOLECULES    5000000
    EXPRESSION_K    -0.6
    EXPRESSION_X0    5.0E7
    EXPRESSION_X1    9500.0
    PRO_FILE_NAME    /Users/micha/Desktop/t9_nebulization.pro
    profiling ********** OK (00:00:00)
    Updating .pro file  ********** OK (00:00:00)
    molecules    4999395
[LIBRARY] creating the cDNA libary
    Initializing Fragmentation File ********** OK (00:00:04)
    4999395 mol initialized
[LIBRARY] Reverse Transcription
[LIBRARY] Configuration
        Mode: PDT
        PWM: No
        RT MIN: 400
        RT MAX: 2600
    Processing Fragments ********** OK (00:00:18)
        4999395 mol: in 4999395, new 0, out 4999395
        avg Len 1039.7405, maxLen 2600
[LIBRARY] Nebulization
[LIBRARY] Configuration
        Lambda: 600.0
        M: 5.0
        Max Length: 2600.0
        Recursions: 3
    Processing Fragments ********** OK (00:00:23)
        7498699 mol: in 4999395, new 2499304, out 7498699
        avg Len 693.1967, maxLen 2590
        start amplification
[LIBRARY] PCR disabled, skipping amplification
    Copied results to /Users/micha/Desktop/t9_nebulization.lib
    Updating .pro file  ********** OK (00:00:00)
[SEQUENCING] getting the reads
    Initializing Fragment Index
    Indexing ********** OK (00:00:09)
    7498699 lines indexed (7498699 fragments, 18849 entries)
    sequencing ***[WARN] merging exon (-21073927,-21073974) with exon (-21073898,-21073924) in transcript AT1G56280.1 because intervening intron has 4 or less nt.
*******[WARN] skipped chromosome ChrM
 OK (00:08:37)
    7498699 fragments found (7498699 without PCR duplicates)
    2001148 reads sequenced
    0 reads fall in poly-A tail
    42854 truncated reads
    Moving temporary BED file
    Updating .pro file  ********** OK (00:00:00)
    Updating .pro file  ********** OK (00:00:00)
    Updating .pro file  ********** OK (00:00:00)
    Updating .pro file  ********** OK (00:00:00)
[END] I finished, took me 579 sec.
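The fragment counts reported by each library step can be cross-checked: the `out` count should equal `in` plus `new`. The sketch below parses lines of the form `N mol: in A, new B, out C` as they appear in the output above. The regular expression and the function name are illustrative assumptions about the log layout, not part of the simulator.

```python
import re

# Matches lines such as "7498699 mol: in 4999395, new 2499304, out 7498699".
MOL_LINE = re.compile(r"(\d+) mol: in (\d+), new (\d+), out (\d+)")


def check_fragment_counts(log_text):
    """Return (total, n_in, n_new, n_out, consistent) for each 'mol:' line."""
    results = []
    for match in MOL_LINE.finditer(log_text):
        total, n_in, n_new, n_out = map(int, match.groups())
        consistent = n_in + n_new == n_out == total
        results.append((total, n_in, n_new, n_out, consistent))
    return results


# The two library steps from the run shown above.
sample = """\
        4999395 mol: in 4999395, new 0, out 4999395
        7498699 mol: in 4999395, new 2499304, out 7498699
"""

if __name__ == "__main__":
    for row in check_fragment_counts(sample):
        print(row)
```

For this run, reverse transcription leaves the molecule count unchanged (as expected with `RT_LOSSLESS` set to `YES`), while nebulization adds 2499304 fragments, which corresponds to roughly 1.5 fragments per input molecule.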
