find_alignments
find_alignments takes the output of a blast search (as formatted by blastresults_to_csv.py), checks each identified location for the presence of a gene, and annotates the gene if one is found.
Please refer to Annotating the rhesus macaque IGH locus for example usage of this and the other ‘individual’ commands.
Find valid genes in a contig given blast matches
usage: find_alignments [-h] [-species SPECIES] [-motif_dir MOTIF_DIR] [-ref REF] [-align ALIGN] [-locus LOCUS] [-sense SENSE] [-debug]
germline_file assembly_file blast_file output_file
Positional Arguments
- germline_file
reference set used to produce the blast matches
- assembly_file
assembly or contig provided to blast
- blast_file
results from blast in the format provided by blastresults_to_csv (may contain wildcards to match multiple files; the pattern is expanded with glob)
- output_file
output file (csv)
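
The wildcard behaviour of blast_file can be pictured with Python's standard glob module. This is a minimal sketch of how such a pattern might expand, not the tool's own code: the helper name expand_blast_files and the file names are hypothetical. On a shell command line the pattern would most likely need quoting, so that it reaches the program as a single argument instead of being expanded by the shell.

```python
import glob
import os
import tempfile


def expand_blast_files(pattern):
    """Return the files matching a glob pattern, in sorted order.

    Hypothetical helper for illustration only.
    """
    return sorted(glob.glob(pattern))


# Demonstrate on throwaway files; the names are made up for this example.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("contig1_blast.csv", "contig2_blast.csv", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()

    matched = expand_blast_files(os.path.join(tmp, "*_blast.csv"))
    matched_names = [os.path.basename(path) for path in matched]
    print(matched_names)
```

Only the two files ending in _blast.csv match the pattern; notes.txt is ignored.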
Named Arguments
- -species
use motifs for the specified species provided with the package
- -motif_dir
use motif probability files present in the specified directory
- -ref
ungapped reference to compare to: name and reference file separated by a comma, e.g. mouse,mouse.fasta (may be repeated)
- -align
gapped reference file to use for V gene alignments (should contain V genes only), otherwise de novo alignment will be attempted
- -locus
locus (default is IGH)
- -sense
sense in which to read the assembly (forward or reverse) (selected automatically if not specified)
- -debug
produce parsing_errors file with debug information
Default: False
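
As a rough illustration of how the arguments above fit together, the following sketch assembles a find_alignments command line as a Python argument list and prints it in shell-quoted form. It only builds the command and does not run it. The helper build_command and all file names (germline.fasta, assembly.fasta, blast_*.csv, annotations.csv) are hypothetical; the option names and the name,file form of -ref are taken from the usage text above.

```python
import shlex


def build_command(germline, assembly, blast, output,
                  refs=(), locus=None, sense=None, debug=False):
    """Assemble an argument list for find_alignments (hypothetical helper)."""
    cmd = ["find_alignments"]
    # -ref may be repeated; each value is "name,file" joined by a comma.
    for name, path in refs:
        cmd += ["-ref", f"{name},{path}"]
    if locus is not None:
        cmd += ["-locus", locus]
    if sense is not None:
        cmd += ["-sense", sense]
    if debug:
        cmd.append("-debug")
    # The four positional arguments come last, in the documented order.
    cmd += [germline, assembly, blast, output]
    return cmd


cmd = build_command(
    "germline.fasta", "assembly.fasta", "blast_*.csv", "annotations.csv",
    refs=[("mouse", "mouse.fasta"), ("human", "human.fasta")],
    locus="IGH",
    sense="forward",
    debug=True,
)
# shlex.join quotes the wildcard so a shell would pass it through unexpanded.
print(shlex.join(cmd))
```

Passing the list form to subprocess.run would avoid shell quoting altogether, since each element is delivered to the program as a separate argument.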