find_alignments

find_alignments takes the output of a blast search (as formatted by blastresults_to_csv.py), checks each identified location for the presence of a gene, and annotates the gene if one is found. Please refer to Annotating the rhesus macaque IGH locus for example usage of this and the other ‘individual’ commands.

Find valid genes in a contig given blast matches

usage: find_alignments [-h] [-species SPECIES] [-motif_dir MOTIF_DIR] [-ref REF] [-align ALIGN] [-locus LOCUS] [-sense SENSE] [-debug]
                       germline_file assembly_file blast_file output_file
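As a rough illustration of how the usage line maps onto a concrete call, the sketch below assembles an argument list in Python and prints the equivalent command line. All file and directory names (motifs, germline.fasta, assembly.fasta, blast_results.csv, annotation.csv) are hypothetical placeholders, and the choice of options is only an example; the command is printed, not executed.

```python
import shlex

# Hypothetical inputs: none of these file or directory names come from the package.
cmd = [
    "find_alignments",
    "-motif_dir", "motifs",      # directory holding motif probability files
    "-locus", "IGH",             # IGH is the documented default
    "-sense", "forward",         # or "reverse"; omit to let the tool choose
    "-debug",                    # also write the parsing_errors file
    "germline.fasta",            # germline_file
    "assembly.fasta",            # assembly_file
    "blast_results.csv",         # blast_file
    "annotation.csv",            # output_file
]

# Render the list as a single shell-safe string for inspection.
command_line = shlex.join(cmd)
print(command_line)
```

The four positional arguments come last and in the order given in the usage line; the same list could be passed to subprocess.run once the real paths are known.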

Positional Arguments

germline_file

reference set used to produce the blast matches

assembly_file

assembly or contig provided to blast

blast_file

results from blast in the format provided by blastresults_to_csv (may contain wildcards if there are multiple files; these are matched by glob)
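To make the wildcard behaviour concrete, here is a minimal sketch of glob-style matching using Python's standard glob module. The helper name expand_blast_files and the file names are assumptions made for illustration; the way find_alignments itself calls glob may differ.

```python
import glob
import os
import tempfile

def expand_blast_files(pattern):
    """Return the sorted list of paths matching a (possibly wildcarded) pattern."""
    return sorted(glob.glob(pattern))

# Demonstration with hypothetical file names in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("contig1_blast.csv", "contig2_blast.csv", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()

    pattern = os.path.join(tmp, "*_blast.csv")
    matched = [os.path.basename(p) for p in expand_blast_files(pattern)]
    print(matched)
```

Only the two files ending in _blast.csv are matched. When passing such a pattern on a command line, it would presumably need to be quoted so that the shell hands it to the program unexpanded.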

output_file

output file (csv)

Named Arguments

-species

use motifs for the specified species provided with the package

-motif_dir

use motif probability files present in the specified directory

-ref

ungapped reference to compare to: name and reference file separated by a comma, e.g. mouse,mouse.fasta (may be repeated multiple times)
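The name,file format and the repeatable option can be pictured with the following sketch, which uses argparse with action="append" and a small parsing helper. This is an assumption about how such an option could be handled, not the package's actual code; parse_ref, ref_demo and human.fasta are hypothetical names.

```python
import argparse

def parse_ref(value):
    """Split a 'name,file' pair into a (name, path) tuple."""
    name, sep, path = value.partition(",")
    if not sep or not name or not path:
        raise argparse.ArgumentTypeError(f"expected name,file but got {value!r}")
    return name, path

# A stand-alone demo parser; it is not the find_alignments parser.
parser = argparse.ArgumentParser(prog="ref_demo")
parser.add_argument("-ref", action="append", type=parse_ref, default=[])

# Each occurrence of -ref adds one (name, path) pair to the list.
args = parser.parse_args(["-ref", "mouse,mouse.fasta", "-ref", "human,human.fasta"])
refs = args.ref
print(refs)
```

Splitting on the first comma only means that a file path containing a comma would still be kept whole.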

-align

gapped reference file to use for V gene alignments (should contain V genes only); if not provided, de novo alignment will be attempted

-locus

locus (default is IGH)

-sense

sense in which to read the assembly (forward or reverse); selected automatically if not specified

-debug

produce parsing_errors file with debug information

Default: False