Intro
This module loads a FASTA-formatted file and submits each sequence as a BLAST query.
BLAST parameters may be added as a space-separated list after the sequence name. All queries are
recorded in a log table. You can either keep the program running while it waits for the results
using the -C option, or quit and check later whether the results are ready using -W -t <queryTable.tsv>.
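The "submit, then keep checking" behaviour of -C can be pictured with the minimal sketch below. It is not taken from blasto's source: `poll_until_done`, `is_ready`, and `collect` are hypothetical names, and the two callables stand in for whatever status check and result download the tool actually performs. Only the idea of sleeping between checks (see --sleepTime, default 60) comes from the help text.

```python
import time


def poll_until_done(query_ids, is_ready, collect, sleep_time=60.0, sleep=time.sleep):
    """Check every pending query, collect finished ones, sleep, and repeat.

    is_ready(query_id) -> bool and collect(query_id) -> result are
    hypothetical callables supplied by the caller (assumptions, not blasto API).
    """
    pending = list(query_ids)
    results = {}
    while pending:
        still_pending = []
        for query_id in pending:
            if is_ready(query_id):
                results[query_id] = collect(query_id)
            else:
                still_pending.append(query_id)
        pending = still_pending
        if pending:
            # Wait before asking again, as --sleepTime does for -C.
            sleep(sleep_time)
    return results
```

Running with -W instead corresponds to a single pass of the inner loop: finished queries are collected and the rest stay in the table for a later check.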
Examples
$cat input.fa
>sequence1
GCGAAGCCCAAGAGGATGAAGCCAGAGATGGTGTTGGAGTTGCTGGGGCTGCTGAGGGTATTGATCTGTCTGTGACCTGCGATAGCATCAGAAGTTGTTTCACATTCTAGTTATAGCTGAGGGAGGTTATGTTTTGAGCAAGCAGGAAAC
>Sequence2
AGCTCCTGAGAAACTTGGGGGGCGCGACACAGATAGGGTGAAAGCAGAGTGATAGACCTGGGATGGTTACGGGACCAAGGGAAGACCAGGCTGGTTGGCATACACCGGTGAACGGATGGGAGTCCTAGGGAAAGATGATGCGCCTAACAG
>sequence2_filtered database='nt' filter="T" nucl_penalty=-5 gapcosts='1,11'
AGCTCCTGAGAAACTTGGGGGGCGCGACACAGATAGGGTGAAAGCAGAGTGATAGACCTGGGATGGTTACGGGACCAAGGGAAGACCAGGCTGGTTGGCATACACCGGTGAACGGATGGGAGTCCTAGGGAAAGATGATGCGCCTAACAG
>sequence3
TCGTTTGATTCTGCAAGCAGCACCTACTGTGGGGTATTGATAAGATCTCTGATGGCGTCTGAAATTCTTCTGAGATTAGAGGAAGATCAGGTGTGTTTTAATGTCGAGCAGGTGTTTCCCCAAGATTAGTGGGGGGATTCGGTTTTTCCT
$blasto -S -f /usr/home/JDoe/project1/input.fa -o /usr/home/JDoe/project1/run1
$blasto -W -t /usr/home/JDoe/project1/run1.queryTable.tsv -o /usr/home/JDoe/project1/run1
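The header of sequence2_filtered above carries its BLAST parameters as space-separated key=value pairs. A minimal sketch of how such a header could be split is shown here; `parse_header` is a hypothetical helper, not blasto's own parser, and it assumes that values never contain unquoted spaces.

```python
import shlex


def parse_header(header):
    """Split a FASTA header into (sequence name, {parameter: value}).

    Hypothetical helper: shlex handles both quote styles used in the example.
    """
    tokens = shlex.split(header.lstrip(">").strip())
    if not tokens:
        raise ValueError("empty FASTA header")
    name, params = tokens[0], {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {token!r}")
        params[key] = value  # values stay strings, e.g. "-5" or "1,11"
    return name, params


header = """>sequence2_filtered database='nt' filter="T" nucl_penalty=-5 gapcosts='1,11'"""
print(parse_header(header))
```

All values are kept as strings in this sketch; how blasto itself converts or validates them is not documented here.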
Help
usage: blasto [-h] [-S] [-C] [-W] [-f INPUTFASTA] [-t INPUTTSV]
[-o OUTPUTPREFIX] [--format_type FORMAT_TYPE]
[--sleepTime SLEEPTIME] [--description]
This module will load a fasta formatted file and query each fasta sequence for
blast. The user may add blast parameters as space separated list after the
sequence name. All queries are listed into a log table. The user can either
keep the program running while waiting for the results using the -C option, or
quit and check if the results are ready later using -W -t <queryTable.tsv>
optional arguments:
-h, --help show this help message and exit
-S, --submitFromFasta
Read in fasta file and submit blast queries. Write out
submitted query IDs. (default: False)
-C, --continueThrough
Read from fasta file, submit and continue checking.
Write results when they are ready and exit after all
results are finished. (default: False)
-W, --checkAndWriteResults
Read query IDs from tsv and check status. If results
are ready, collect and save. (default: False)
-f INPUTFASTA, --inputFasta INPUTFASTA
Fasta formatted input file containing one or more
input sequences. The sequence name may contain
additional blast parameters. (default: )
-t INPUTTSV, --inputTsv INPUTTSV
Tab separated input file containing sequence IDs,
output prefix, query IDs, query arguments. (default: )
-o OUTPUTPREFIX, --outputPrefix OUTPUTPREFIX
Output prefix. All files will start with this prefix,
blast output files will be written to
<prefix>_<sequenceID>.<format_type> (default: )
--format_type FORMAT_TYPE
format of the blast output (default: Tabular)
--sleepTime SLEEPTIME
time to wait before checking again if your jobs are
done, only active if -C is on (default: 60)
--description Get a description of what this script does. (default:
False)
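The naming scheme given for -o, <prefix>_<sequenceID>.<format_type>, can be sketched as follows. `output_path` is a hypothetical helper, and the sketch assumes the format type is used literally as the file extension, which the help text suggests but does not state outright.

```python
def output_path(prefix, sequence_id, format_type="Tabular"):
    """Build <prefix>_<sequenceID>.<format_type> as described for -o.

    Assumption: format_type is appended unchanged as the extension.
    """
    return f"{prefix}_{sequence_id}.{format_type}"


print(output_path("/usr/home/JDoe/project1/run1", "sequence1"))
```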