The BED format is used by default to describe the reads produced in a Flux Simulator run by the genomic regions from which they originate. Reads that fall partially into the poly-A tail are truncated to the part that consists of genomic sequence. In contrast, reads that fall completely into the poly-A tail are reported on the special reference sequence 'poly-A'. The 12 tab-separated fields specified for the BED format are:

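    1. chrom: name of the reference sequence
    2. chromStart: start position of the feature (0-based)
    3. chromEnd: end position of the feature (first excluded position)
    4. name: name of the feature
    5. score: score of the feature
    6. strand: '+' or '-'
    7. thickStart: start of the thickly drawn region
    8. thickEnd: end of the thickly drawn region
    9. itemRgb: display color as an R,G,B triple
    10. blockCount: number of blocks
    11. blockSizes: comma-separated list of block sizes
    12. blockStarts: comma-separated list of block starts, relative to chromStart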


Example

chr1 2082 2503 chr1:4847775-4887990W:NM_001159750:1:2668:917:1137:S/2 0 - 0 0 0,0,0 2 8,28 0,393
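As a minimal sketch, the example line can be split into its 12 fields as follows. The names `Bed12` and `parse_bed12` are hypothetical helpers introduced only for illustration and are not part of the Flux Simulator; the field names follow the usual BED convention, and the line is assumed to contain no whitespace inside a field.

```python
from typing import List, NamedTuple


class Bed12(NamedTuple):
    """One BED line with 12 fields (hypothetical helper type)."""
    chrom: str
    chrom_start: int        # 0-based start
    chrom_end: int          # first excluded position
    name: str
    score: str
    strand: str
    thick_start: int
    thick_end: int
    item_rgb: str
    block_count: int
    block_sizes: List[int]
    block_starts: List[int]  # offsets relative to chrom_start


def _int_list(text: str) -> List[int]:
    # BED lists may carry a trailing comma, so empty items are skipped.
    return [int(x) for x in text.split(",") if x]


def parse_bed12(line: str) -> Bed12:
    """Parse one BED12 line; tabs are preferred, any whitespace is accepted."""
    line = line.rstrip("\n")
    f = line.split("\t") if "\t" in line else line.split()
    if len(f) != 12:
        raise ValueError(f"expected 12 fields, found {len(f)}")
    return Bed12(f[0], int(f[1]), int(f[2]), f[3], f[4], f[5],
                 int(f[6]), int(f[7]), f[8], int(f[9]),
                 _int_list(f[10]), _int_list(f[11]))


EXAMPLE = ("chr1 2082 2503 "
           "chr1:4847775-4887990W:NM_001159750:1:2668:917:1137:S/2 "
           "0 - 0 0 0,0,0 2 8,28 0,393")

record = parse_bed12(EXAMPLE)
print(record.chrom, record.chrom_start, record.chrom_end)
print(record.block_sizes, record.block_starts)
```

Keeping the block sizes and block starts as integer lists makes the coordinate arithmetic on split reads straightforward.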


In this example, the complete region of the read spans from position 2083 (note that BED start coordinates are 0-based) to position 2503 (the BED end coordinate is the first excluded position and therefore translates directly to the last included position in a 1-based coordinate system) on the reference sequence chr1. The read alignment is split into two parts, one from 2083 to 2083+8-1=2090, and the other from 2083+393=2476 to 2476+28-1=2503.
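The coordinate arithmetic for split reads can be sketched in a few lines. The function name `blocks_one_based` is a hypothetical helper, not part of the Flux Simulator; it takes the chromStart, blockSizes and blockStarts values of a BED line and returns each block as a 1-based, inclusive interval.

```python
from typing import List, Tuple


def blocks_one_based(chrom_start: int,
                     block_sizes: List[int],
                     block_starts: List[int]) -> List[Tuple[int, int]]:
    """Convert BED blocks to 1-based, inclusive (first, last) intervals."""
    intervals = []
    for size, offset in zip(block_sizes, block_starts):
        first = chrom_start + offset + 1  # 0-based start -> 1-based position
        last = first + size - 1           # last included position
        intervals.append((first, last))
    return intervals


# Values taken from the example line: chromStart 2082, sizes 8,28, starts 0,393
blocks = blocks_one_based(2082, [8, 28], [0, 393])
print(blocks)  # [(2083, 2090), (2476, 2503)]
```

The last position of the final block equals the chromEnd value of the line (2503), which is a quick consistency check for any BED12 record.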

The name field denotes that the read is the downstream mate (P2) of a read pair, derived from the 105th copy of the transcript annotated as uc009vip.1 (which has a spliced length of 2772) in the splicing locus chr1:1116-4272W. The sequenced fragment of this transcript starts at position 695 and ends at position 1003 in the spliced sequence, relative to the annotated transcription start. Within this fragment, the subregion 968-1003, again relative to the annotated transcription start, generated the read sequence.
