
The Flux Simulator supports quality-based error models to simulate sequencing errors. The current version of the Flux Simulator provides two models, one for reads of length 76 and one for reads of length 36. The models can also be used for shorter reads, and the included model generator allows the creation of custom models.

Applying the error model consists of two steps. First, a quality value is assigned to each position in the read. Second, a cross-talk table is used to check whether the base with the given quality should be mutated and, if so, to which base. The quality assignment is based on a Markov model that uses both the position n in the read and the quality value at position n-1.
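The two steps above can be sketched in Python as follows. This is only an illustration of the idea, not the simulator's implementation: the function names, the toy transition rule, and the toy cross-talk table (which derives the error probability from a Phred-style quality and spreads it evenly over the other three bases) are all assumptions made for the example.

```python
import random

BASES = "ACGT"
QUALITIES = (40, 30, 20, 10, 2)  # hypothetical quality levels, high to low


def weighted_choice(weights, rng):
    """Draw one key from a {value: weight} mapping."""
    values = list(weights)
    return rng.choices(values, weights=[weights[v] for v in values], k=1)[0]


def sample_qualities(read_length, start, transition, rng):
    """Step 1: assign a quality to every position with a Markov chain.

    start      -- {quality: weight} for the first position
    transition -- callable (position, previous_quality) -> {quality: weight}
    """
    qualities = [weighted_choice(start, rng)]
    for pos in range(1, read_length):
        qualities.append(weighted_choice(transition(pos, qualities[-1]), rng))
    return qualities


def apply_crosstalk(read, qualities, crosstalk, rng):
    """Step 2: for each (base, quality) pair, draw the emitted base from the
    cross-talk table; drawing the original base means no error."""
    return "".join(
        weighted_choice(crosstalk[(base, q)], rng)
        for base, q in zip(read, qualities)
    )


def toy_transition(pos, prev_q):
    """Toy rule (assumption): quality either stays or drops one level,
    and dropping becomes more likely towards the end of the read."""
    idx = QUALITIES.index(prev_q)
    if idx == len(QUALITIES) - 1:
        return {prev_q: 1.0}
    drop = min(0.05 + 0.01 * pos, 0.9)
    return {prev_q: 1.0 - drop, QUALITIES[idx + 1]: drop}


def toy_crosstalk():
    """Toy table (assumption): error probability 10^(-q/10), split evenly
    among the three other bases."""
    table = {}
    for q in QUALITIES:
        p_err = 10 ** (-q / 10)
        for base in BASES:
            row = {b: p_err / 3 for b in BASES if b != base}
            row[base] = 1.0 - p_err
            table[(base, q)] = row
    return table


rng = random.Random(42)
read = "ACGT" * 9  # a 36-base toy read
qualities = sample_qualities(len(read), {40: 1.0}, toy_transition, rng)
print(qualities)
print(apply_crosstalk(read, qualities, toy_crosstalk(), rng))
```

A real model would replace the toy transition rule and table with distributions estimated from sequencing data.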

The quality model can be configured using the simulator's parameter file. To activate quality sampling and sequencing errors, set the ERR_FILE parameter to either "76" or "35" to use one of the built-in models. Alternatively, you can specify a file name to load a custom model. In addition, set FASTA to true in the parameter file, and the simulator will output a .fastq file including qualities and sequencing errors.
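As an illustration, the relevant lines of a parameter file might look like the fragment below. The parameter names and values are taken from the description above; the whitespace-separated "NAME VALUE" layout is an assumption about the parameter file syntax and should be checked against an existing parameter file.

```
ERR_FILE    76
FASTA       true
```

To use a custom model instead, the value of ERR_FILE would be the path to the model file.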

Creating a custom model

The Flux Simulator comes with a model generator that is currently based on GEM mapping files produced by the GEM-Mapper.

Additionally, error models may be provided. Currently only position-specific errors are supported, see the original discussion on that topic. In short, positional error models are based on the simple idea that "problems" observed in a certain region of the generated sequence cluster across multiple reads of the output, because these reads were affected by a common temporal and spatial problem during sequencing. Such errors can be estimated empirically by aligning the sequences of a run to the genomic reference sequence and identifying mismatching positions. Alignment programs that do not impose strict limits on the number of mismatches are especially suited for obtaining such estimates, because they give a more complete picture. The Flux Simulator provides automatic error model estimation for alignments output by the GEM aligner. Error models are stored in the form of ERR files.
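The empirical estimation described above can be sketched as follows. This is a minimal illustration, not the model generator shipped with the simulator: it assumes the alignments are already available as ungapped (read, reference) string pairs, and it does not parse GEM mapping files or write ERR files. The function name is hypothetical.

```python
def positional_mismatch_rates(alignments, read_length):
    """Estimate the mismatch rate at each read position.

    alignments  -- iterable of (read_sequence, reference_sequence) pairs,
                   aligned without gaps (assumption for this sketch)
    read_length -- number of positions to report
    """
    mismatches = [0] * read_length
    covered = [0] * read_length
    for read, reference in alignments:
        pairs = zip(read.upper(), reference.upper())
        for pos, (read_base, ref_base) in enumerate(pairs):
            if pos >= read_length:
                break
            covered[pos] += 1
            if read_base != ref_base:
                mismatches[pos] += 1
    # Positions never covered by any read get a rate of 0.0.
    return [m / c if c else 0.0 for m, c in zip(mismatches, covered)]


toy_alignments = [
    ("ACGT", "ACGT"),
    ("ACGA", "ACGT"),
    ("ACTA", "ACGT"),
    ("ACGT", "ACGT"),
]
print(positional_mismatch_rates(toy_alignments, 4))
```

In the toy data, errors accumulate towards the end of the reads, which is the kind of positional clustering such a model is meant to capture.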