A simple answer to this question would be: 'That depends on what you want to simulate.' The basic operation of the Flux Simulator requires transcript annotations in the form of a GTF file. If read sequences, and potentially sequence biases, are to be simulated as well, a set of genomic reference sequences (one per chromosome) is also required. Finally, some parameters let you supply empirical data, i.e., insert size distributions, sequence biases, etc., deduced from experimental evidence. |
The Flux Simulator derives transcript sequences from the genomic sequence and a transcriptome annotation. Although these two inputs together appear to carry the same information as a single file of transcript sequences, the following practical advantages led us to prefer the genomic sequence over the transcribed sequence:
Therefore, transcript sequences must first be mapped to the corresponding genome before they can be used with the Flux Simulator. Several programs are available for aligning transcribed sequences to the genome they originate from; popular ones include Blat and Exonizer. |
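
To illustrate how a transcript sequence can be reconstructed from a genomic sequence plus a GTF annotation, here is a minimal Python sketch. It is not the Flux Simulator's own code; the function names (`exons_by_transcript`, `transcript_sequence`) and the in-memory `genome` dictionary are assumptions made for this example. It relies only on standard GTF conventions: tab-separated fields, 1-based inclusive coordinates, and a `transcript_id` attribute on `exon` records.

```python
import re
from collections import defaultdict

# Translation table for complementing nucleotides (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def exons_by_transcript(gtf_lines):
    """Group the exon records of a GTF file by their transcript_id.

    Hypothetical helper: returns {transcript_id: [(chrom, start, end, strand), ...]}.
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon features contribute to the transcript sequence
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        chrom, start, end, strand = fields[0], int(fields[3]), int(fields[4]), fields[6]
        exons[match.group(1)].append((chrom, start, end, strand))
    return exons


def transcript_sequence(exons, genome):
    """Splice the exons of one transcript out of the genome.

    `genome` is assumed to map chromosome names to sequence strings.
    GTF coordinates are 1-based and inclusive, hence the `start - 1` slice.
    """
    chrom, _, _, strand = exons[0]
    ordered = sorted(exons, key=lambda exon: exon[1])
    spliced = "".join(genome[chrom][start - 1:end] for _, start, end, _ in ordered)
    return reverse_complement(spliced) if strand == "-" else spliced


# Toy data: one chromosome and two two-exon transcripts, one per strand.
genome = {"chr1": "AAACCCGGGTTTACGT"}
gtf_lines = [
    "\t".join(["chr1", "demo", "exon", "1", "3", ".", "+", ".", 'gene_id "g1"; transcript_id "t1";']),
    "\t".join(["chr1", "demo", "exon", "7", "9", ".", "+", ".", 'gene_id "g1"; transcript_id "t1";']),
    "\t".join(["chr1", "demo", "exon", "4", "6", ".", "-", ".", 'gene_id "g2"; transcript_id "t2";']),
    "\t".join(["chr1", "demo", "exon", "10", "12", ".", "-", ".", 'gene_id "g2"; transcript_id "t2";']),
]

annotation = exons_by_transcript(gtf_lines)
sequences = {tid: transcript_sequence(ex, genome) for tid, ex in annotation.items()}
```

In a real setting, `genome` would be filled from the per-chromosome reference files and `gtf_lines` would be read from the annotation file; the toy data above only demonstrates the splicing and strand handling.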