Child pages
  • 3.3 - Tool SCORER (Splice Site Scoring)


  1. Gene Models (annotation in GTF format), REQUIRED: if missing, the program responds with an error like

    Code Block
    Hey, you forgot to specify a valid input file!
    This is a bit important, I cannot work without an input annotation. I want a GTF file with transcript annotations (exon features, with a mandatory optional attribute named 'transcript_id') IN THE SAME COLUMN (i.e., if the transcript identifier of the 1st line is in column #10, it has to be in all lines of the file in column #10. The rest of the file should comply with the standard as specified at http://mblab.wustl.edu/GTF2.html.
  2. Chromosome sequences (one FASTA file per chromosome), REQUIRED: if missing, the program responds with an error like

    Code Block
    [ERROR] Splice site scoring requires the genomic sequence, provide a value for parameter 'CHR_SEQ' in the parameter file, or via the command line flags -c or --chr!
    Warning

    The chromosome sequences currently have to be provided as separate files, one per chromosome. All of these files have to be in the same folder (e.g., genomes/H.sapiens/hg19), and each filename has to consist of a prefix that matches the chromosome name in column 1 of the provided GTF file and a suffix ".fa" or ".fasta"; e.g., if chromosomes are named "chr1", "chr2", etc., then the program expects files named "chr1.fa", "chr2.fa", ...

    The first line of every

  3. Genetic variants (as a pseudo VCF file)

    Modified bases can be provided in a 5-column file format that resembles the Variant Call Format (VCF): column 1 is the chromosome ID (number or letter), column 2 is the position within the chromosome (integer), column 3 is the variant identifier, column 4 is the reference nucleotide, and column 5 is the variant nucleotide.

    Code Block
    1	948921	rnaedit_1_948921	T	C
    1	982994	rnaedit_1_982994	T	C
    1	990773	rnaedit_1_990773	C	T
    1	1158631	rnaedit_1_1158631	A	G
    1	1247494	rnaedit_1_1247494	T	C
    1	1309405	rnaedit_1_1309405	T	C
    1	1336006	rnaedit_1_1336006	C	T
    1	1336626	rnaedit_1_1336626	G	A
    1	1342394	rnaedit_1_1342394	G	A
    1	1684472	rnaedit_1_1684472	C	T
    Info

    The chromosome IDs in the first column of the pseudo VCF file have to match the chromosome names used in (1) the gene annotation and (2) the chromosome file names, with the prefix "chr" removed; e.g., chromosome "chr1" is written as "1".
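
The naming rules above can be checked before running the tool. The following Python sketch is not part of the program; it only illustrates how a line of the 5-column variant file relates to the chromosome names and sequence files described in items 2 and 3. All names in it (Variant, parse_variant_line, annotation_chrom_name, find_chromosome_file) are hypothetical. It assumes that columns are separated by whitespace, which this page does not state explicitly, and that chromosomes carry the "chr" prefix as in the example above.

```python
from pathlib import Path
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    chrom: str   # column 1: chromosome ID without the "chr" prefix
    pos: int     # column 2: position within the chromosome
    var_id: str  # column 3: variant identifier
    ref: str     # column 4: reference nucleotide
    alt: str     # column 5: variant nucleotide


def parse_variant_line(line: str) -> Variant:
    """Split one line of the 5-column variant file (assumed whitespace-separated)."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns, got {len(fields)}: {line!r}")
    chrom, pos, var_id, ref, alt = fields
    return Variant(chrom, int(pos), var_id, ref, alt)


def annotation_chrom_name(variant_chrom: str) -> str:
    """Map a variant-file chromosome ID to the assumed GTF name, e.g. "1" -> "chr1"."""
    return "chr" + variant_chrom


def find_chromosome_file(seq_dir: str, chrom_name: str) -> Optional[Path]:
    """Return <seq_dir>/<chrom_name>.fa or .fasta if such a file exists, else None."""
    for suffix in (".fa", ".fasta"):
        candidate = Path(seq_dir) / (chrom_name + suffix)
        if candidate.is_file():
            return candidate
    return None
```

For the first example line above, parse_variant_line("1\t948921\trnaedit_1_948921\tT\tC") yields chromosome ID "1". annotation_chrom_name turns this into "chr1", and find_chromosome_file("genomes/H.sapiens/hg19", "chr1") then looks for "chr1.fa" or "chr1.fasta" in that folder.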