Hello,
I am trying to look at some of the actual reads that mapped to a particular gene using the GEUVADIS data set.
For example, let's say I have a BED file for one individual (SAMPLE1.bed) and the GENCODE annotation file (GENCODE.v12.gtf). I would like the output to be a BED-like file laid out like this:
Mappedread_ID Mappedread_Chr Mappedread_start Mappedread_end Mappedread_sequence Gene_ID Gene_chromosome Gene_start Gene_end
HWI-ST:XXXXX 1 15000 15020 ATTTATATGATTTATATAT ENSG000001234 Chr 1 14000 16000
With my limited understanding of Flux Capacitor, it seems to me that its main purpose is to generate read counts for each gene, and that it does not return the IDs of the actual reads that mapped to a given gene.
I managed to do this in bedtools using the following command, but I wasn't sure whether bedtools and Flux Capacitor operate in the same fashion.
bedtools intersect -a SAMPLE1.bed -b GENCODE.v12.gtf -wb > output.txt
Is the equivalent of the above possible using Flux Capacitor?
Thanks,
Jin
2 Comments
Micha Sammeth
Without any further filtering, the intersection between a GTF file of the reference annotation and the reads will probably retrieve all reads (mates) that overlap exons. The annotation mapping of Flux is slightly different: it uses only reads that fall within exons and form valid pairs. As there is currently no user option to output reads and their sequences, I have filed a Request for Improvement.
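To illustrate the difference between "overlaps an exon" and "falls within an exon", here is a minimal Python sketch. It is not Flux Capacitor's or bedtools' actual implementation: the Interval type, the two helper functions, and the read names and coordinates are all hypothetical, and the "valid pair" requirement is not modeled at all.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A genomic interval using BED-style coordinates (hypothetical helper type)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlaps(read: Interval, exon: Interval) -> bool:
    """True if the read shares at least one base with the exon."""
    return (read.chrom == exon.chrom
            and read.start < exon.end
            and exon.start < read.end)


def contained(read: Interval, exon: Interval) -> bool:
    """True only if the whole read lies inside the exon."""
    return (read.chrom == exon.chrom
            and exon.start <= read.start
            and read.end <= exon.end)


# Made-up exon and reads, loosely based on the coordinates in the question.
exon = Interval("1", 14000, 16000)
reads = {
    "read_inside": Interval("1", 15000, 15020),
    "read_spanning_edge": Interval("1", 15990, 16010),
    "read_outside": Interval("1", 17000, 17020),
}

# A plain intersection keeps every read that touches the exon.
overlap_hits = [name for name, r in reads.items() if overlaps(r, exon)]

# A stricter, containment-based rule drops the read that crosses the exon boundary.
contained_hits = [name for name, r in reads.items() if contained(r, exon)]

print("overlap:  ", overlap_hits)
print("contained:", contained_hits)
```

In this toy example the read that crosses the exon boundary is kept by the overlap rule but dropped by the containment rule, which is one reason the two tools can report different read sets for the same gene.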
I hope we can soon provide a solution.
Best,
Micha
Jin
Thank you very much for your reply!
I will keep my eyes open for any updates.
Cheers,
Jin