Given a directory GEN_DIR that contains the genomic sequences split by chromosome, the Flux Simulator can additionally output the read sequences in FASTA or FASTQ format. If no error model ERR_FILE is provided, each read sequence is an exact copy of the corresponding genomic sequence. Reads that are sequenced in antisense to the cDNA molecule are reverse complemented. Parts of a read that fall into the poly-A tail are filled with a characters, or with t characters when the read is produced in antisense direction. As in the BED file, the read identifiers are unique tags composed of the locus, transcript and fragment information from which the reads have been derived.
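The strand and poly-A rules described above can be illustrated with a minimal Python sketch. It is not Flux Simulator code: the helper names `reverse_complement` and `read_sequence` are hypothetical, and the layout is simplified by assuming that every read position past the end of the transcript sequence lies in the poly-A tail.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of
# the Flux Simulator.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of seq, preserving letter case."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_sequence(transcript, start, length, antisense):
    """Cut a read of `length` bases starting at `start` (0-based).

    Assumption: positions beyond the end of `transcript` fall into the
    poly-A tail and are filled with 'a'. Antisense reads are reverse
    complemented, which turns the poly-A filler into 't'.
    """
    body = transcript[start:start + length]
    body += "a" * (length - len(body))  # poly-A filler, if any
    return reverse_complement(body) if antisense else body


print(read_sequence("ACGTTGCA", 5, 6, antisense=False))  # GCAaaa
print(read_sequence("ACGTTGCA", 5, 6, antisense=True))   # tttTGC
```

The lowercase filler characters make the poly-A part of the read easy to spot in the output, matching the a and t characters mentioned above.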

Example

A BED line

Code Block
Chr1    28795    28871    Chr1:23259-31337W:AT1G01046.1:1:207:65:258:A    0    -    .    .    0,0,0    1    76    0

translates in a FASTA file to the pair of lines

Code Block
>Chr1:23259-31337W:AT1G01046.1:1:207:65:258:A
AACAAAGAAGCGTTAATTTATCGGTTATATCATTAAATTGTTAAAGTGAAAAGAATTTCTTATAACCTGACTGTTC

 

and, in a FASTQ file, can produce the four-line record

Code Block
@Chr1:23259-31337W:AT1G01046.1:1:207:65:258:A
AACAAAGAAGCGTTAATTTATCGGTTATATCATTAAATTGTTAAAGTGAAAAGAATTTCTTATAACCTGACTGTTC
+
IIIIHIIIIIIIIIIIIIIHG<2BBIIIIIIFIIIIE<BEHIIIIBGDDIHFG<ACCCCCDD:66CFEGHIFFBHI
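The correspondence between these records can be checked with a short script. The following is a minimal Python sketch under stated assumptions, not a Flux Simulator tool: the function names `parse_bed_read` and `parse_fastq_record` are hypothetical, and the quality offset of 33 is an assumption (the quality string contains characters such as 2 and 6, whose ASCII codes lie below 64, so an offset of 64 would yield negative scores).

```python
# Illustrative sketch only; the helper names are hypothetical.
BED_LINE = ("Chr1    28795    28871    "
            "Chr1:23259-31337W:AT1G01046.1:1:207:65:258:A    "
            "0    -    .    .    0,0,0    1    76    0")

FASTQ_RECORD = [
    "@Chr1:23259-31337W:AT1G01046.1:1:207:65:258:A",
    "AACAAAGAAGCGTTAATTTATCGGTTATATCATTAAATTGTTAAAGTGAAAAGAATTTCTTATAACCTGACTGTTC",
    "+",
    "IIIIHIIIIIIIIIIIIIIHG<2BBIIIIIIFIIIIE<BEHIIIIBGDDIHFG<ACCCCCDD:66CFEGHIFFBHI",
]


def parse_bed_read(line):
    """Extract the first six BED columns of a read line."""
    chrom, start, end, name, _score, strand = line.split()[:6]
    return {"chrom": chrom, "start": int(start), "end": int(end),
            "name": name, "strand": strand}


def parse_fastq_record(lines, offset=33):
    """Split a four-line FASTQ record into (name, sequence, quality scores).

    The offset of 33 is an assumption about the quality encoding.
    """
    header, seq, plus, qual = lines
    if not header.startswith("@") or not plus.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    return header[1:], seq, [ord(c) - offset for c in qual]


read = parse_bed_read(BED_LINE)
name, seq, quals = parse_fastq_record(FASTQ_RECORD)

# The BED name column is reused as the FASTA/FASTQ identifier, and the
# BED interval length equals the number of bases in the read.
print(read["name"] == name)               # True
print(read["end"] - read["start"], len(seq))  # 76 76
```

The check relies only on the identifier and the interval length, so it holds regardless of which error model produced the quality string.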

 

 

 
