Is there an option to generate reads from a user-defined .pro file?
Say I have a .pro file in the right format (for example, made by tweaking the file produced by the flux-simulator)
and want the next simulation to follow this file.
I didn't find such an option in the description. Did I miss it, or is there no such option at the moment?
Thanks
1 Comment
Micha Sammeth
Hi,
the simulation pipeline can be interrupted (and continued) at 2
intermediate points: (1) simulated expression, and (2) simulated
library. After (1), it is possible to just continue with a custom .pro
file. To create one, I recommend running the first step of the
pipeline once (-x command line parameter) and editing the last two
columns (nr. 5 and 6) of this file. To be honest, only column 6 needs
to be set, to the molecule count you would like to assign to each
expressed transcript; the relative fraction in column 5 is written out
more for informational purposes (but you cannot omit the column).
Refer to the ".PRO Transcriptome Profile" page
for a description of the .pro file format, but be aware that after
step (1) it is only filled up to column 6. Once you have a custom
profile, you can continue the pipeline by specifying the -ls command
line flag (-l for library generation, and -s for sequencing) and the
program should pick up the simulation with your custom profile.
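
As a rough illustration of the column edit described above, here is a
minimal Python sketch. It is not part of the Flux Simulator. It assumes
the .pro file is tab-separated, that the transcript identifier sits in
column 2, and that columns 5 and 6 hold the relative fraction and the
molecule count. The function name and its parameters are made up for
this example, so check the column layout against your own file first.

```python
def set_molecule_counts(src, dst, counts, id_col=1, frac_col=4, count_col=5):
    """Copy a .pro file, overriding molecule counts for selected transcripts.

    counts maps transcript IDs to new molecule counts. Column indices are
    0-based and are assumptions about the layout (column 2 = ID,
    column 5 = fraction, column 6 = count).
    Returns the total number of molecules written.
    """
    # Read every non-empty line and split it into tab-separated fields.
    with open(src) as fh:
        rows = [line.rstrip("\n").split("\t") for line in fh if line.strip()]

    # Set column 6 for the transcripts we want to change.
    for row in rows:
        if row[id_col] in counts:
            row[count_col] = str(int(counts[row[id_col]]))

    # Recompute column 5 so the fractions stay consistent with the counts.
    total = sum(int(float(row[count_col])) for row in rows)
    for row in rows:
        frac = int(float(row[count_col])) / total if total else 0.0
        row[frac_col] = repr(frac)

    with open(dst, "w") as fh:
        for row in rows:
            fh.write("\t".join(row) + "\n")
    return total
```

For example, set_molecule_counts("run.pro", "custom.pro", {"T1": 30})
would set transcript T1 to 30 molecules and leave every other count
unchanged (file names and the ID are placeholders). Recomputing column 5
is optional according to the above, since only column 6 is actually
used.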
Continuing after step (2) is a bit more tricky: in addition to a
custom .pro file (now filled up to column 8), you would also need a
.lib file that describes the fragments obtained from the expressed RNA
molecules. That is IMHO not easy to set up by a custom procedure, but
if provided, the program would also accept it and you could continue
by providing just the -s flag for sequencing. However, the
breakpoint after library generation is really intended for
producing different simulated sequencing runs from the same
library.
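
To summarize the entry points mentioned so far (-x, then -ls, or just
-s), here is a small Python helper that only builds the command line
and does not run anything. The stage names and the function are
invented for this sketch, and passing the parameter file with -p is an
assumption that is not stated above, so verify it against your
installation.

```python
def flux_command(par_file, stages, executable="flux-simulator"):
    """Build an argument list for the requested pipeline stages.

    stages is any subset of {"expression", "library", "sequencing"}.
    The -x, -l and -s flags come from the description above; the -p
    flag for the parameter file is an assumption.
    """
    flags = {"expression": "x", "library": "l", "sequencing": "s"}
    order = ["expression", "library", "sequencing"]

    requested = set(stages)
    unknown = requested - set(flags)
    if unknown or not requested:
        raise ValueError("stages must be a non-empty subset of %s" % order)

    # Combine the flags in pipeline order, e.g. "-ls".
    letters = "".join(flags[s] for s in order if s in requested)
    return [executable, "-p", par_file, "-" + letters]


# Step (1) only, then continue from a custom .pro file:
first = flux_command("run.par", ["expression"])
resume = flux_command("run.par", ["library", "sequencing"])
# Re-sequence an existing library (needs .pro and .lib files):
reseq = flux_command("run.par", ["sequencing"])
```

The resulting lists can be handed to subprocess.run if you want to
drive the runs from a script.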
Finally, during sequencing the last columns of the .pro file are
filled in. However, it does not make much sense to fill them with
custom values because there is no subsequent step in the pipeline that
would take them as input.
Let me know if you run into trouble.
Best,
Micha