Intron model files describe the splice site combinations that are considered potential introns. Two attributes discriminate biological introns: (1) the distance between the donor and the acceptor, and (2) the combination of their splice site sequences. Each model block is introduced by a header line:
#MODEL minDist maxDist
where #MODEL introduces a new model, and minDist and maxDist give the lower and upper bounds on the length of valid introns described by the model. The header is followed by a list of donor/acceptor sequence pairs that may co-occur in valid introns, one pair per line:
donorSeq1 | acceptorSeq1 |
donorSeq2 | acceptorSeq2 |
… | … |
The sequences are the strings directly adjacent to the exons. A sequence may appear on more than one line, because it is the donor/acceptor combinations that are evaluated. Sequence lengths may vary, even among donors or among acceptors.
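The layout described above can be read with a few lines of code. The following Python snippet is a minimal sketch, not the tool's own parser. It rests on several assumptions that the description does not confirm: that `|` separates donor from acceptor as shown in the listing, that minDist and maxDist are inclusive and refer to the intron length, and that a donor sequence must match the start of the intron and an acceptor sequence its end. The names `IntronModel`, `parse_intron_models` and `is_valid_intron` are hypothetical, and the example distances are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class IntronModel:
    min_dist: int  # lower bound on intron length (assumed inclusive)
    max_dist: int  # upper bound on intron length (assumed inclusive)
    pairs: list = field(default_factory=list)  # (donor, acceptor) tuples


def parse_intron_models(text):
    """Parse model blocks; assumes '|' separates donor and acceptor."""
    models = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#MODEL"):
            # Header line: "#MODEL minDist maxDist"
            _, min_dist, max_dist = line.split()
            models.append(IntronModel(int(min_dist), int(max_dist)))
            continue
        if not models:
            raise ValueError("sequence pair found before any #MODEL header")
        # Drop empty cells, e.g. the one produced by a trailing '|'.
        cells = [c.strip() for c in line.split("|") if c.strip()]
        if len(cells) != 2:
            raise ValueError(f"expected donor and acceptor, got: {raw!r}")
        models[-1].pairs.append((cells[0].upper(), cells[1].upper()))
    return models


def is_valid_intron(intron_seq, models):
    """True if some model accepts both the length and the splice site pair."""
    seq = intron_seq.upper()
    for model in models:
        if not model.min_dist <= len(seq) <= model.max_dist:
            continue
        for donor, acceptor in model.pairs:
            # Assumption: donor at the intron start, acceptor at its end.
            if seq.startswith(donor) and seq.endswith(acceptor):
                return True
    return False


# Hypothetical model file content; the distances are made up.
example = """#MODEL 10 100
GT | AG |
GC | AG |
#MODEL 20 50
AT | AC |
"""
models = parse_intron_models(example)
```

Because each line is kept as a pair, a donor that occurs on several lines is simply stored several times, which mirrors the note that it is the combinations that are evaluated.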