Parameters without brackets are mandatory for the respective mode; parameters in square brackets are optional. Parameters separated by a pipe ("|") are mutually exclusive alternatives.
Code Block |
---|
astalavista -t astafunk [--verbose] [--cpu <INT>] [--all | -g] [--local] [-o <INT>] --genome <GENOME_DIR> --gtf <GTF_FILE> --hmm <HMM_FILE> --reference|-r <REFERENCE_FILE> |
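As an illustration of the mandatory parameters above, a small pre-flight check can catch missing inputs before a long run. This is only a minimal sketch: `check_astafunk_inputs` is a hypothetical helper, not part of AStalavista, and it only tests that the paths exist and are readable.

```shell
# Hypothetical helper (not part of AStalavista): return 0 if the four
# mandatory inputs of the command above exist and are readable.
check_astafunk_inputs() {
    genome_dir=$1 gtf=$2 hmm=$3 reference=$4
    [ -d "$genome_dir" ] || { echo "genome directory not found: $genome_dir" >&2; return 1; }
    for f in "$gtf" "$hmm" "$reference"; do
        [ -r "$f" ] || { echo "cannot read input file: $f" >&2; return 1; }
    done
}

# Possible usage (file names are placeholders):
#   check_astafunk_inputs genome/ annotation.gtf Pfam-A.hmm reference_file &&
#       astalavista -t astafunk --genome genome/ --gtf annotation.gtf \
#           --hmm Pfam-A.hmm --reference reference_file
```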
Code Block |
---|
astalavista -t astafunk [--verbose] [--cpu <INT>] [--local] [-o <INT>] --const --genome <GENOME_DIR> --gtf <GTF_FILE> --hmm <HMM_FILE> --reference|-r <REFERENCE_FILE> |
Note: For alternatively spliced (AS) genes, the current version of this mode searches for constitutive domains only on the reference transcript (the transcript with the longest ORF).
Exhaustively searches the HMM database against the variant sequences, i.e., without a reference domain file.
Code Block |
---|
astalavista -t astafunk [--verbose] [--cpu <INT>] [--all | -g] [--local] [-o <INT>] -e|--exh --genome <GENOME_DIR> --gtf <GTF_FILE> --hmm <HMM_FILE> |
Code Block |
---|
astalavista -t astafunk [--verbose] [--cpu <INT>] [--local] [-o <INT>] --naive --genome <GENOME_DIR> --gtf <GTF_FILE> --hmm <HMM_FILE> --reference|-r <REFERENCE_FILE> |
Code Block |
---|
astalavista -t astafunk --tref --genome <GENOME_DIR> --gtf <GTF_FILE> |
Code Block |
---|
astalavista -t astafunk [--local] [-o <INT>] --test --hmm <HMM_FILE> --fa <SEQUENCE_FILE> |
<REFERENCE_FILE> is produced by hmmsearch, a program of the HMMER package, using the command line below:
Code Block |
---|
$> hmmsearch --cut_ga --domtblout <REFERENCE_FILE> <HMM_FILE> <REFERENCE_TRANSCRIPTS.fasta> |
hmmsearch is the HMMER program (hmmer.org) that searches one or more profiles (here, from the Pfam-A.hmm database) against the amino acid sequences of the reference transcripts (stored in <REFERENCE_TRANSCRIPTS.fasta>; see below for how to generate it). The --cut_ga option makes hmmsearch apply the gathering thresholds stored in the HMM profiles when reporting hits. The --domtblout option saves a parseable table of per-domain hits to <REFERENCE_FILE>. The reference transcript is the transcript with the longest ORF of a gene.
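To inspect the per-domain table before passing it to AstaFunk, the hits can be listed with awk. The sketch below assumes the standard whitespace-separated --domtblout layout, in which column 1 is the target (sequence) name, column 4 the query (profile) name, and columns 18 and 19 the alignment start and end on the sequence. The function name `list_domain_hits` is hypothetical.

```shell
# Print one tab-separated line per domain hit:
# sequence name, profile name, alignment start, alignment end.
# Lines starting with '#' are header/comment lines and are skipped.
list_domain_hits() {
    awk '!/^#/ && NF >= 22 { print $1 "\t" $4 "\t" $18 "\t" $19 }' "$1"
}

# Possible usage:
#   list_domain_hits <REFERENCE_FILE>
```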
AstaFunk includes a feature to generate a multi-fasta file with the amino acid sequences of reference transcripts for a given annotation.
First, run AstaFunk to print the amino acid sequences of the reference transcripts to standard output, redirected to the file <REFERENCE_TRANSCRIPTS.fasta>. A reference transcript is the transcript with the longest Open Reading Frame (ORF) of an alternatively spliced gene.
Obtain the reference transcript FASTA file with the command:
Code Block |
---|
$> astalavista -t astafunk --tref --genome <GENOME_DIR> --gtf <GTF_FILE.gtf> > <REFERENCE_TRANSCRIPTS.fasta> |
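Because the sequences arrive through a shell redirection, an empty or truncated output file is easy to miss. As a quick sanity check, the number of sequences in the resulting multi-FASTA file can be counted from its header lines. The function name `count_fasta_records` is hypothetical.

```shell
# Count the sequences in a FASTA file by counting header lines ('>').
count_fasta_records() {
    grep -c '^>' "$1"
}

# Possible usage:
#   count_fasta_records <REFERENCE_TRANSCRIPTS.fasta>
```

A count of 0 suggests that the --tref run produced no reference transcripts for the given annotation.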
The example below uses the gene annotated in tnnt1.gtf. According to RefSeq (NM_003283), this gene encodes a protein that is a subunit of troponin, which is a regulatory complex located on the thin filament of the sarcomere. This complex regulates striated muscle contraction in response to fluctuations in intracellular calcium concentration.
Code Block | ||
---|---|---|
| ||
$> astalavista -t astafunk --tref --gtf tnnt1.gtf --genome ~/example/genome/ > reference_tx.fasta |
Code Block | ||
---|---|---|
| ||
$> hmmsearch --cut_ga --domtblout reference_file ~/Databases/Pfam/Pfam-A.hmm reference_tx.fasta |
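A reduced profile database, such as the database.hmm passed to --hmm further down, could be built from the families found in reference_file. The sketch below is an assumption about one possible way to do this, not necessarily the procedure intended here: it collects the unique profile names from the table and fetches only those profiles with HMMER's hmmfetch. The helper `domain_keys` and the file name keys.txt are hypothetical, and the hmmfetch calls are shown as comments because they need an installed HMMER and a local Pfam-A.hmm.

```shell
# Print the unique profile names (column 4, "query name") found in a
# hmmsearch --domtblout table, skipping '#' comment lines.
domain_keys() {
    awk '!/^#/ && NF >= 22 { print $4 }' "$1" | sort -u
}

# Possible usage (assumes HMMER is installed; paths are illustrative):
#   domain_keys reference_file > keys.txt
#   hmmfetch --index ~/Databases/Pfam/Pfam-A.hmm   # optional, speeds up retrieval
#   hmmfetch -f ~/Databases/Pfam/Pfam-A.hmm keys.txt > database.hmm
```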
Tip | ||
---|---|---|
| ||
Alternatively, skip these commands and pass the whole Pfam-A.hmm database directly to the --hmm option. |
Code Block | ||
---|---|---|
| ||
astalavista -t astafunk --genome ~/example/genome/ --gtf tnnt1.gtf --reference reference_file --hmm database.hmm |
Code Block | ||
---|---|---|
| ||
astalavista -t astafunk --const --genome ~/example/genome/ --gtf tnnt1.gtf --reference reference_file --hmm database.hmm |
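The steps of this example can be chained in a single shell function. This is a sketch under assumptions: the function name `astafunk_pipeline`, the DRY_RUN switch, and the output file names are hypothetical conveniences, and the full profile database is passed to both hmmsearch and --hmm. Only options that appear in the commands above are used.

```shell
# Sketch: reference transcripts -> hmmsearch -> AstaFunk default mode.
# Arguments: genome directory, GTF file, HMM database, output directory.
# With DRY_RUN=1 the commands are printed instead of executed.
astafunk_pipeline() {
    genome_dir=$1 gtf=$2 hmm=$3 outdir=${4:-.}
    ref_fasta=$outdir/reference_tx.fasta
    ref_file=$outdir/reference_file

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "astalavista -t astafunk --tref --genome $genome_dir --gtf $gtf > $ref_fasta"
        echo "hmmsearch --cut_ga --domtblout $ref_file $hmm $ref_fasta"
        echo "astalavista -t astafunk --genome $genome_dir --gtf $gtf --reference $ref_file --hmm $hmm"
        return 0
    fi

    mkdir -p "$outdir" || return 1
    # Step 1: amino acid sequences of the reference transcripts.
    astalavista -t astafunk --tref --genome "$genome_dir" --gtf "$gtf" > "$ref_fasta" || return 1
    # Step 2: per-domain hit table; the alignment report on stdout is discarded.
    hmmsearch --cut_ga --domtblout "$ref_file" "$hmm" "$ref_fasta" > /dev/null || return 1
    # Step 3: AstaFunk run with the reference domains.
    astalavista -t astafunk --genome "$genome_dir" --gtf "$gtf" --reference "$ref_file" --hmm "$hmm"
}

# Possible usage:
#   DRY_RUN=1 astafunk_pipeline ~/example/genome/ tnnt1.gtf ~/Databases/Pfam/Pfam-A.hmm work
```

Stopping at the first failing step avoids running AstaFunk against an empty or partial reference file.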