The α-format described here has been applied to describe the splice site variants of the Geuvadis Project.

CHR	POS	ID	REF	ALT	QUAL	FILT	INFO	FORMAT	I1	I2
20	14370	-14370^20	TTGTACGTG	ttgtaGgtg,ttgtCcgtg	-10001	q-1000	MOD=ALT;ALT1=SNP1;ALT2=SNP2;VAR_SCORES=1.5277311,3.4223458;SNPS=SNP1,SNP2	GT	0|1	0|0
...
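
As a minimal sketch, the example record above can be split into named columns in Python. It assumes that records are tab-delimited as in standard VCF and that the sample columns (here I1 and I2) follow FORMAT. The function name `parse_record` is hypothetical, and reading the QUAL and FILT values of the example as `-10001` and `q-1000` is an assumption based on the FILT value described for this format.

```python
# Column names as given in the header row of the example; sample columns follow.
COLUMNS = ["CHR", "POS", "ID", "REF", "ALT", "QUAL", "FILT", "INFO", "FORMAT"]


def parse_record(line, sample_names):
    """Split one tab-delimited record into a dict keyed by column name.

    Hypothetical helper: assumes tab-separated fields as in standard VCF.
    """
    fields = line.rstrip("\n").split("\t")
    expected = len(COLUMNS) + len(sample_names)
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, found {len(fields)}")

    record = dict(zip(COLUMNS, fields))
    record["POS"] = int(record["POS"])        # splice site position
    record["QUAL"] = float(record["QUAL"])    # score of the reference sequence
    record["ALT"] = record["ALT"].split(",")  # one sequence per alternative
    # Genotype strings are kept as-is, keyed by sample name.
    record["samples"] = dict(zip(sample_names, fields[len(COLUMNS):]))
    return record


# The example record, with the QUAL/FILT split assumed as described above.
example = "\t".join([
    "20", "14370", "-14370^20", "TTGTACGTG", "ttgtaGgtg,ttgtCcgtg",
    "-10001", "q-1000",
    "MOD=ALT;ALT1=SNP1;ALT2=SNP2;VAR_SCORES=1.5277311,3.4223458;SNPS=SNP1,SNP2",
    "GT", "0|1", "0|0",
])
rec = parse_record(example, ["I1", "I2"])
print(rec["ID"], rec["ALT"], rec["samples"])
```

The INFO column is left as a raw string here, since its entries are specific to this format.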

Description of variations of and extensions to the standard VCF

Attribute: Description
POS: the position of the splice site, given as the first/last position included in the adjacent exon (cf. AStalavista default coordinates)
ID: string that uniquely identifies the splice site, composed of strand, genomic coordinate, site type symbol (cf. AStalavista conventions) and chromosome ID
REF: the reference sequence of the splice site, obtained by extracting the corresponding sequence stretch from the genome
ALT: comma-separated list of the splice site sequences obtained by applying the corresponding genetic variant(s) to the reference sequence
QUAL: the score of the reference splice site sequence (as shown in REF)
FILT: "q-1000" when the splice site sequence has not been observed in the training set (score of -Infinity, in practice represented by a value << -1000), "PASS" otherwise
INFO: semicolon-separated list of the following key=value entries:

MOD: either alternative (ALT) or constitutive (CON) splice site

ALTx: (combinations of) variants that form each alternative sequence, where x is the index of that sequence in the ALT column (same ordering as in the ALT column)

VAR_SCORES: comma-separated list of scores assigned to the corresponding variant(s); the ordering corresponds to the one used for ALT

SNPS: concatenation of all variants considered for the description of this splice site (line)
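
The ID and INFO conventions described above can be decoded with a short Python sketch. This is an illustration under stated assumptions, not a reference implementation: the function names `parse_site_id` and `parse_info` are hypothetical, the site type symbol is assumed to be a single non-digit character, and SNPS is assumed to be comma-separated as in the example record. ALTx values are kept as raw strings because the separator used for combinations of variants is not specified here.

```python
import re

# Strand, genomic coordinate, one-character site type symbol, chromosome ID.
# Assumption: the site type symbol is exactly one non-digit character.
ID_PATTERN = re.compile(r"^([+-])(\d+)(\D)(.+)$")


def parse_site_id(site_id):
    """Decompose a splice site ID such as '-14370^20' into its parts."""
    match = ID_PATTERN.match(site_id)
    if match is None:
        raise ValueError(f"unrecognized splice site ID: {site_id!r}")
    strand, coordinate, site_type, chromosome = match.groups()
    return {
        "strand": strand,
        "coordinate": int(coordinate),
        "site_type": site_type,
        "chromosome": chromosome,
    }


def parse_info(info):
    """Parse the semicolon-separated key=value entries of the INFO column."""
    entries = dict(item.split("=", 1) for item in info.split(";"))
    # ALT1, ALT2, ... sorted by index so they line up with the ALT column.
    alt_keys = sorted(
        (key for key in entries if re.fullmatch(r"ALT\d+", key)),
        key=lambda key: int(key[3:]),
    )
    return {
        "mod": entries["MOD"],                            # "ALT" or "CON"
        "alt_variants": [entries[k] for k in alt_keys],   # raw ALTx strings
        "var_scores": [float(s) for s in entries["VAR_SCORES"].split(",")],
        "snps": entries["SNPS"].split(","),               # assumed separator
    }


site = parse_site_id("-14370^20")
info = parse_info(
    "MOD=ALT;ALT1=SNP1;ALT2=SNP2;"
    "VAR_SCORES=1.5277311,3.4223458;SNPS=SNP1,SNP2"
)
# Pair each ALTx entry with its score, relying on the shared ordering.
for variants, score in zip(info["alt_variants"], info["var_scores"]):
    print(variants, score)
```

Because ALTx and VAR_SCORES share the ordering of the ALT column, zipping them with the ALT sequences of a record gives one (sequence, variants, score) triple per alternative.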