Hello Michael,

Here is a modified extract from my *.pro file. Given that my reference transcript is 1688bp long and the covered fraction is as shown, the covered length of the transcript comes to 588bp (am I right there?).

locus | transcript_ID | length | expressed fraction | expressed number | sequenced fraction | sequenced number | covered fraction


My read headers from the given transcript are:



My questions are:


  • Are the coordinates 1-based or 0-based? Simply put, are the first fragment coordinates [10,276] or (10,276] in the first pair of reads?

  • What does the length 588bp refer to?


Thanks for your time and help.



  1. Hi!

    Sorry for the delay in responding. Yes, in your example the covered nucleotides would add up to a total length of 588nt.

    0.348341226577759 * 1688 = 588

    However, these 588nt do not have to be, and are unlikely to be, consecutive on the transcript sequence. More formally, the covered fraction $c$ is computed as

    $$c = \frac{1}{L} \sum_{i=0}^{L-1} I(i)$$

    where $L$ is the length of the transcript and $I(i)$ is the indicator function of whether position $i$ is covered by at least one read ($I(i) = 1$) or not ($I(i) = 0$). Therefore, the distribution of the covered positions can be discontinuous, as for instance

    X X  XXX   XX X

    where each X marks a covered position.
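As a rough illustration of the covered fraction described above, here is a minimal sketch that marks every position hit by at least one read and divides the count by the transcript length. The class `CoveredFraction`, the method `coveredFraction`, and the example reads are hypothetical and not taken from the simulator's code; read coordinates are assumed to be 0-based with inclusive ends, and anything outside the transcript is simply clipped.

```java
import java.util.BitSet;

public class CoveredFraction {

    /**
     * Fraction of transcript positions covered by at least one read.
     * Each read is a {start, end} pair, assumed 0-based with an inclusive end.
     */
    public static double coveredFraction(int length, int[][] reads) {
        BitSet covered = new BitSet(length);
        for (int[] read : reads) {
            // Clip to the transcript so out-of-range coordinates are ignored (assumption).
            int start = Math.max(0, read[0]);
            int end = Math.min(length - 1, read[1]);
            if (start <= end) {
                covered.set(start, end + 1); // the upper bound of BitSet.set is exclusive
            }
        }
        // cardinality() counts the positions whose indicator is 1
        return covered.cardinality() / (double) length;
    }

    public static void main(String[] args) {
        // Hypothetical reads on a 20 nt transcript; the first two overlap.
        int[][] reads = { {0, 4}, {3, 7}, {12, 14} };
        System.out.println(coveredFraction(20, reads));
    }
}
```

In this example positions 0-7 and 12-14 are covered, so 11 of 20 positions count, even though the three reads sum to 13 nt. Overlapping reads contribute each position only once.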


    To your other question, the coordinates in all library files and also in the read output are 0-based. This goes back to their initialization in Fragmenter.processInitial(), which currently reads

            // 0-based tx coordinates
            int start = 0;
            int end = origLen - 1;

    Note that, when using variation in transcription start or poly-A tails, you may find negative coordinates or coordinates beyond the transcript length.
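To make the 0-based convention concrete, the following sketch computes how many positions a coordinate pair spans. It assumes the end coordinate is inclusive, which is suggested by `end = origLen - 1` in the snippet above but is not stated explicitly for the read headers; `ZeroBasedCoords` and `spanLength` are hypothetical names.

```java
public class ZeroBasedCoords {

    /** Number of positions in [start, end], both 0-based and inclusive (assumption). */
    public static int spanLength(int start, int end) {
        return end - start + 1;
    }

    public static void main(String[] args) {
        int origLen = 1688;      // transcript length from the question
        int start = 0;           // same initialization as quoted above
        int end = origLen - 1;
        System.out.println(spanLength(start, end)); // whole transcript: 1688
        System.out.println(spanLength(10, 276));    // fragment [10,276]: 267
    }
}
```

Under this reading, the first fragment [10,276] starts at the eleventh nucleotide of the transcript and spans 267 positions.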



    1. Hi Micha,

      Thanks for the detailed reply. 

      I have a few more clarifications: 

      1. 0.348341226577759 * 1688 = 587.999990463, to be exact. Do the trailing decimal figures have any significance, or can these be safely rounded off?
      2. Following are the pair of reads for a transcript whose *.pro entry is listed:

      Chr3:13579593-13580782C AT3G33045.1 NC 1190 6.000702082143611E-7 3 0.0 0 6.593739903335773E-7 2 0.1260504275560379 0 NaN

      The covered length comes to 1190 * 0.1260504275560379 = 150. But the reads are 76bp each and non-overlapping, so the covered length should come to 152. I have observed this for all my reads, including the case above (the 588bp one). Could you please help me figure out what's amiss here?
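For reference, the arithmetic behind both points can be reproduced with the short sketch below. The helper `coveredLength` is hypothetical, and rounding to the nearest integer is an assumption about how to read the value, not something confirmed for the *.pro format.

```java
public class CoveredLengthCheck {

    /** Covered length in nt: transcript length times covered fraction, rounded to the nearest integer. */
    public static long coveredLength(int length, double coveredFraction) {
        return Math.round(length * coveredFraction);
    }

    public static void main(String[] args) {
        // Values copied from the two *.pro entries discussed in this thread.
        System.out.println(coveredLength(1688, 0.348341226577759));  // 588
        System.out.println(coveredLength(1190, 0.1260504275560379)); // 150
        // Expected coverage for two non-overlapping 76bp reads.
        System.out.println(2 * 76);                                   // 152
    }
}
```

The sketch only restates the discrepancy: the file gives 150 where two non-overlapping 76bp reads would give 152.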

      Thanks a lot for your help.