
CHR	POS	ID	REF	ALT	QUAL	FILT	INFO	FORMAT	I1	I2
20	14373,14374	SNP1,SNP2	A,C	C,G		PASS,q-1000	SSID=-14370^20;MOD=ALT;VAR1=SNP1;VAR2=SNP2;SEQ=TTGTACGTG,ttgtaGgtg,ttgtCcgtg;SCORE=-10001,1.5277311,3.4223458	GT	0|1	0|0
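As a rough illustration of how such a line could be read, the following sketch splits the comma-separated per-variant columns and the INFO field of the example above. The function names `parse_info` and `split_variants` are hypothetical, and the INFO key names `VAR1` and `VAR2` are assumed here; this is not the reference implementation of the format.

```python
def parse_info(info):
    """Turn 'K1=v1;K2=v2' into a dict; values with commas become lists."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, _, value = item.partition("=")
        fields[key] = value.split(",") if "," in value else value
    return fields


def split_variants(pos, ids, ref, alt):
    """Split the comma-separated POS, ID, REF and ALT columns per variant.

    Assumes one value per variant in every column; a length mismatch
    (e.g. a variant with several alternatives) is rejected.
    """
    columns = [column.split(",") for column in (pos, ids, ref, alt)]
    if len({len(column) for column in columns}) != 1:
        raise ValueError("columns list different numbers of variants")
    return [
        {"pos": int(p), "id": i, "ref": r, "alt": a}
        for p, i, r, a in zip(*columns)
    ]


# Values taken from the example line above.
info = parse_info(
    "SSID=-14370^20;MOD=ALT;VAR1=SNP1;VAR2=SNP2;"
    "SEQ=TTGTACGTG,ttgtaGgtg,ttgtCcgtg;SCORE=-10001,1.5277311,3.4223458"
)
variants = split_variants("14373,14374", "SNP1,SNP2", "A,C", "C,G")
```

Rejecting rows whose columns disagree in length is a deliberate choice in this sketch: it makes the row fail loudly instead of silently pairing a variant with the wrong allele.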

Description of variations of and extensions to the standard VCF alpha version

Attribute Description

POS

comma-separated list with the position(s) of each variant impacting the splice site, provided as the first/last position included in the adjacent exon (cf. AStalavista default coordinates)

ID

string identifying the splice site uniquely, composed of strand, genomic coordinate, site type symbol (cf. AStalavista conventions) and chromosome ID

REF

the reference sequence of the splice site, as obtained by extracting the corresponding sequence stretch from the genome

ID

comma-separated list with the ID(s) of each variant impacting the splice site (same ordering as in POS)

REF

comma-separated list with the reference string of each variant impacting the splice site (same ordering as in POS)
ALT

comma-separated list with the variant string of each individual variant (same ordering as in POS)

Warning

A comma-separated list of the variants (and all info derived from them) can lead to ambiguous results if one of the variants already describes multiple alternatives, e.g.

... rs6040355 A G,T ...

... microsat1 GTCT G,GTACT ...

as given in the examples on the VCF definition page.

ALT

comma-separated list of the corresponding splice site sequences after applying the corresponding genetic variant(s) to the reference sequence

QUAL

the score of the reference splice site sequence (as shown in REF)

QUAL

comma-separated list of the quality for the corresponding assertions in ALT

Warning

Ambiguous in the case of variants with multiple alternatives, as above.

FILT

comma-separated list stating whether each variant position has passed the filtering

Note

As long as there is only one value per variant/SNP, and not per alternative/ALT, then there should be no problem.

FILT

"q-1000" when the splice site sequence has not been observed in the training set (-Infinity, in practice represented by a value << -1000), "PASS" otherwise

INFO

MOD: either alternative (ALT) or constitutive (CON) splice site

VARx: (combinations of) variants that form each alternative variant (same ordering x as in the other columns POS, ALT, ...)

SEQ: splice site sequence(s) for the reference and for all applied variants described by the VARx attributes

SCORE: comma-separated list of scores; first the score of the reference site, and then of all variants in the usual ordering

VAR_SCORES: comma-separated list of scores assigned to the corresponding variant(s); the ordering corresponds to the one used for ALT

SNPS: concatenation of all variants considered for the description of this splice site (line)
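The relation between the SEQ and SCORE attributes can be sketched as follows: the first entry of each list is taken to belong to the reference site, the remaining ones to the variants. The function name `label_scores` and the constant `THRESHOLD` are hypothetical, and applying the "q-1000" rule to every sequence individually is an assumption of this sketch, not something the format prescribes.

```python
# Assumption: scores below this value stand in for -Infinity,
# i.e. the sequence was not observed in the training set.
THRESHOLD = -1000.0


def label_scores(seqs, scores):
    """Pair each splice site sequence with its score and a filter label.

    Index 0 is treated as the reference site, later entries as variants.
    """
    if len(seqs) != len(scores):
        raise ValueError("SEQ and SCORE must have the same number of entries")
    rows = []
    for index, (seq, score) in enumerate(zip(seqs, scores)):
        value = float(score)
        rows.append({
            "role": "reference" if index == 0 else f"variant {index}",
            "seq": seq,
            "score": value,
            "filter": "q-1000" if value < THRESHOLD else "PASS",
        })
    return rows


# Values taken from the example line at the top of the page.
rows = label_scores(
    ["TTGTACGTG", "ttgtaGgtg", "ttgtCcgtg"],
    ["-10001", "1.5277311", "3.4223458"],
)
```

With the example values, the reference sequence falls below the threshold while both variant sequences pass.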