multiPhATE: bioinformatics pipeline for functional annotation of phage isolates

  • Carol L Ecale Zhou
    Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, CA, USA
  • Stephanie Malfatti
    Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, CA, USA
  • Jeffrey Kimbrel
    Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, CA, USA
  • Casandra Philipson
    Biological Defense Research Directorate, Naval Medical Research Center, Fort Detrick, MD, USA
  • Katelyn McNair
    Computational Sciences Research Center, San Diego State University, CA, USA
  • Theron Hamilton
    Biological Defense Research Directorate, Naval Medical Research Center, Fort Detrick, MD, USA
  • Robert Edwards
    Computational Sciences Research Center, San Diego State University, CA, USA
  • Brian Souza
    Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, CA, USA

Abstract

Summary

To address the need for improved phage annotation tools that scale, we created an automated throughput annotation pipeline: multiple-genome Phage Annotation Toolkit and Evaluator (multiPhATE). multiPhATE is a throughput pipeline driver that invokes an annotation pipeline (PhATE) across a user-specified set of phage genomes. This tool incorporates a de novo phage gene calling algorithm and assigns putative functions to gene calls using protein-, virus- and phage-centric databases. multiPhATE’s modular construction allows the user to implement all or any portion of the analyses by acquiring local instances of the desired databases and specifying the desired analyses in a configuration file. We demonstrate multiPhATE by annotating two newly sequenced Yersinia pestis phage genomes. Within multiPhATE, the PhATE processing pipeline can be readily implemented across multiple processors, making it adaptable for throughput sequencing projects. Software documentation assists the user in configuring the system.

Availability and implementation

multiPhATE was implemented in Python 3.7, and runs as a command-line code under Linux or Unix. multiPhATE is freely available under an open-source BSD3 license from https://github.com/carolzhou/multiPhATE. Instructions for acquiring the databases and third-party codes used by multiPhATE are included in the distribution README file. Users may report bugs by submitting to the GitHub issues page associated with the multiPhATE distribution.

Supplementary information

Supplementary data are available at Bioinformatics online.
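
The driver pattern described in the Summary (one annotation run per genome, with the analyses chosen in a configuration and the runs spread over several workers) can be illustrated with a minimal Python sketch. Everything named here is hypothetical: `CONFIG`, its keys, `annotate_genome` and `run_batch` are invented for illustration and are not multiPhATE's configuration format or API. The stand-in function only reports which steps would run; it performs no gene calling or database search. Threads are used to keep the sketch self-contained; passing `concurrent.futures.ProcessPoolExecutor` as `executor_cls` would distribute the runs over separate processes instead.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical configuration; key names are illustrative only,
# not the format used by multiPhATE's configuration file.
CONFIG = {
    "genomes": ["phage_A.fasta", "phage_B.fasta"],
    "analyses": {
        "gene_calling": True,
        "protein_db_search": True,
        "phage_db_search": False,
    },
}


def annotate_genome(fasta_path, analyses):
    """Stand-in for one per-genome pipeline run (assumption, not PhATE)."""
    # Keep only the analyses switched on in the configuration.
    enabled = [name for name, on in analyses.items() if on]
    return {"genome": Path(fasta_path).stem, "steps": enabled}


def run_batch(config, workers=2, executor_cls=ThreadPoolExecutor):
    """Run annotate_genome once per genome; results keep input order."""
    genomes = config["genomes"]
    # Every genome receives the same analysis selection.
    selections = [config["analyses"]] * len(genomes)
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(annotate_genome, genomes, selections))


if __name__ == "__main__":
    for result in run_batch(CONFIG):
        print(result["genome"], result["steps"])
```

Keeping the per-genome function free of shared state is what makes the batch trivially parallel: each genome's run depends only on its input file and the shared, read-only configuration.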

Published in

  • Bioinformatics

    Bioinformatics 35 (21), 4402-4404, 2019-05-14

    Oxford University Press (OUP)
