parse_imgt_annotations

parse_imgt_annotations downloads an IMGT annotation file, or uses one that has already been downloaded, and parses it to provide a list of annotated features. Optionally, it can also save the downloaded file and create a FASTA file containing the annotated assembly. Please refer to Annotating the human IGH locus for example usage.

At the time of writing, IMGT has implemented a ‘bot checker’ mechanism that blocks automated downloads of annotation files. Unless this changes in the future, you will need to download the annotation file manually from IMGT and provide it to this tool using the -input_file argument. To do this:

  1. Browse to the URL https://www.imgt.org/ligmdb/view.action?format=IMGT&id=<id> where <id> is the IMGT identifier of the sequence you wish to download annotations for. For example, for the rhesus macaque IGH sequence with IMGT ID IMGT000064, the URL is https://www.imgt.org/ligmdb/view.action?format=IMGT&id=IMGT000064.

  2. Check that you are on the IMGT view (see the list of views near the top of the page, and click IMGT if necessary).

  3. Press the Download button under the list of views (if you do not see the download button, check step 2).
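
If you need to look up several sequences, the URL in step 1 can be assembled from the IMGT identifier. The following is a minimal Python sketch of that pattern only; the function name imgt_view_url is hypothetical and is not part of this tool. It builds the address to open in a browser and does not attempt the download itself.

```python
from urllib.parse import urlencode

# Base address taken from the URL pattern given in step 1.
IMGT_VIEW_BASE = "https://www.imgt.org/ligmdb/view.action"


def imgt_view_url(imgt_id: str) -> str:
    """Return the IMGT view URL for an identifier such as 'IMGT000064'.

    Hypothetical helper: it only formats the URL for manual browsing.
    """
    query = urlencode({"format": "IMGT", "id": imgt_id.strip()})
    return f"{IMGT_VIEW_BASE}?{query}"


if __name__ == "__main__":
    # Rhesus macaque IGH, the example used in step 1.
    print(imgt_view_url("IMGT000064"))
```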

Now proceed to run the parse_imgt_annotations tool with the -input_file argument set to the path of the file you just downloaded.
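
Before running the tool, it may be worth confirming that the manual download produced an annotation file rather than a saved web page. The sketch below is a heuristic only. It assumes that an IMGT flat file is EMBL-like and begins with an ID line, which is an assumption about the file format rather than something this tool documents. The function name looks_like_imgt_flat_file is hypothetical.

```python
def looks_like_imgt_flat_file(path) -> bool:
    """Heuristic check on a manually downloaded annotation file.

    Assumption: a flat file starts with an 'ID' line, whereas a page
    saved from a blocked or wrong view would start with HTML markup.
    """
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped:
                continue  # skip leading blank lines
            # Decide on the first non-blank line only.
            return stripped.startswith("ID ")
    return False  # empty file
```

A result of False suggests repeating steps 2 and 3 before passing the file to the tool.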

Given a set of IMGT annotations, build a CSV file containing gene names and co-ordinates

usage: parse_imgt_annotations [-h] [--save_download SAVE_DOWNLOAD] [--save_sequence SAVE_SEQUENCE] [--save_imgt_annots SAVE_IMGT_ANNOTS] imgt_url outfile locus

Positional Arguments

imgt_url

URL of IMGT annotation, e.g. http://www.imgt.org/ligmdb/view?format=IMGT&id=IMGT000064, or name of text file containing its contents

outfile

Output file (CSV)

locus

one of IGH, IGK, IGL, TRA, TRB, TRD, TRG

Named Arguments

--save_download

Save contents of annotation to specified file

--save_sequence

Save sequence to specified file

--save_imgt_annots

Save IMGT annotations to specified file
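
The usage line above can be turned into a command programmatically, for example when annotating several loci in a script. The sketch below follows that usage line, in which the annotation source is the first positional argument; the helper build_command and all file names are hypothetical and chosen for illustration. It only assembles and prints the command.

```python
import shlex

# Locus values listed under the 'locus' positional argument.
VALID_LOCI = ("IGH", "IGK", "IGL", "TRA", "TRB", "TRD", "TRG")


def build_command(source, outfile, locus, save_download=None,
                  save_sequence=None, save_imgt_annots=None):
    """Assemble an argument list matching the usage line (hypothetical helper)."""
    if locus not in VALID_LOCI:
        raise ValueError(f"locus must be one of {', '.join(VALID_LOCI)}")
    cmd = ["parse_imgt_annotations"]
    optional = (
        ("--save_download", save_download),
        ("--save_sequence", save_sequence),
        ("--save_imgt_annots", save_imgt_annots),
    )
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    # Positional arguments in the order given by the usage line.
    cmd += [str(source), str(outfile), locus]
    return cmd


if __name__ == "__main__":
    # File names here are placeholders, not outputs defined by the tool.
    cmd = build_command("IMGT000064.txt", "IMGT000064_genes.csv", "IGH",
                        save_sequence="IMGT000064.fasta")
    print(shlex.join(cmd))
```

To execute the command rather than print it, the list can be passed to subprocess.run(cmd, check=True), assuming parse_imgt_annotations is on your PATH.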