Ho ho ho, I would like to simulate a very bad experiment. So far, the correlation between the expression level and the FPKM calculated from the FLUX-simulator output has been too good to be true, with R-squared around 0.8 to 0.9. I would like to generate an experiment full of biases (GC bias) that has a poor correlation between the gene expression level and the observed read count. Any suggestions on which parameters to tweak? Thanks.

1 Comment

  1. Hi,

    I would suggest you play around with the amplification parameters:

    Key | Type | Default Value | Description
    PCR_DISTRIBUTION | String | default | PCR distribution file, 'default' to use a distribution with 15 rounds and 20 bins, 'none' to disable amplification.
    PCR_PROBABILITY | Float | 0.1 | PCR duplication probability when GC filtering is disabled by setting GC_MEAN to NaN.
    GC_MEAN | Float | 0.5 | Mean value of a gaussian distribution that reflects GC bias amplification probability, set this to 'NaN' to disable GC biases.
    GC_SD | Float | 0.1 | Standard deviation of a gaussian distribution that reflects GC bias amplification probability, inactive if GC_MEAN is set to NaN.
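
    To get a feel for how GC_MEAN and GC_SD might interact, here is a rough Python sketch. It is not the simulator's actual code: it simply assumes the amplification weight of a fragment is an unnormalized gaussian of its GC fraction, and the function names gc_fraction and gc_weight are made up for this illustration. Under that assumption, a smaller GC_SD or a GC_MEAN far from the typical GC content of your transcripts penalizes more fragments, which should weaken the correlation between expression level and read count.

```python
import math


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc_weight(gc: float, gc_mean: float = 0.5, gc_sd: float = 0.1) -> float:
    """Unnormalized gaussian weight, equal to 1.0 when gc == gc_mean.

    ASSUMPTION: this form is only an illustration of a gaussian GC bias,
    not the formula used by the simulator itself.
    """
    z = (gc - gc_mean) / gc_sd
    return math.exp(-0.5 * z * z)


# Hypothetical toy fragments covering low, medium and high GC content.
fragments = {
    "AT-rich": "ATATATATTA",
    "balanced": "ATGCATGC",
    "GC-rich": "GCGCGCGGCC",
}

# Compare the default setting with a narrower and a shifted distribution.
for gc_mean, gc_sd in [(0.5, 0.1), (0.5, 0.05), (0.7, 0.1)]:
    print(f"GC_MEAN={gc_mean} GC_SD={gc_sd}")
    for name, seq in fragments.items():
        gc = gc_fraction(seq)
        print(f"  {name}: GC={gc:.2f} weight={gc_weight(gc, gc_mean, gc_sd):.3g}")
```

    In this toy model the balanced fragment keeps a weight of 1.0 at the default settings, while the AT-rich and GC-rich fragments are strongly suppressed, and more so as GC_SD shrinks.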

    Cheers!