Hello
I got a strong 5' bias. First positions in a transcript have a high( very high coverage) . The window that follows has almost no reads.
I believe the figure should speak for itself. I join also the parameters file used.
I noticed that accepting variability in TSS position reduces this bias but I want to simulate data with a fixed TSS.
Any suggestions on how to reduce this bias?
Thank you,
Bogdan
5bias.png
NB_MOLECULES 50000
TSS_MEAN NaN
POLYA_SCALE 50
POLYA_SHAPE 2
## Fragmentation
FRAG_SUBSTRATE RNA
FRAG_METHOD UR
FRAG_NB_LAMBDA 50
## RT parameters
RTRANSCRIPTION YES
RT_PRIMER RH
RT_LOSSLESS YES
RT_MIN 2000
RT_MAX 6000
## PCR / Filtering
## no [GC] bias to not observe sequence based biases
PCR_DISTRIBUTION default
PCR_PROBABILITY 0.5
GC_MEAN NaN
FILTERING YES
SIZE_SAMPLING MH
# Sequencing
FASTA true
READ_NUMBER 1000000
READ_LENGTH 50
PAIRED_END NO
1 Comment
Micha Sammeth
Hi Bogdan,
thank you for sharing with us your observation. I modified the title of your post slightly, as I think we better call your observation a 5'-peak not to be confounded with a systematic 5'-bias as observed under other circumstances, for instance reported in this post.
Indeed, transcript ends (5' as well as 3') are naturally also ends of fragments, and therefore in the absence of further variation these are overrepresented in read populations; an area corresponding to the respective insert size distribution down- respectively upstream of those natural fragmentation points is likely to have less reads than on average. Because the 3'-end usually falls within the poly-A tail, the effect can be most clearly observed at the 5'-end.
Currently, the parameter TSS_MEAN actually does model biological variation in the transcription start jointly with technical artifacts like 5'end loss during extraction and RT. The model could be refined in the future, if more about the nature of each of these biases is known to deconvolve the distributions. At the moment, if you allow no variance for the 5'-end, the simulation will produce the result as observed.
Best wishes,
Micha