SonicParanoid2: fast, accurate, and comprehensive orthology inference with machine learning and language models
Bibliographic information
- Publication date
- 2024-07-25
- Resource type
- journal article
- Rights
- https://creativecommons.org/licenses/by/4.0
- DOI
- 10.1186/s13059-024-03298-4
- Publisher
- Springer Science and Business Media LLC
Description
Abstract
Accurate inference of orthologous genes constitutes a prerequisite for comparative and evolutionary genomics. SonicParanoid is one of the fastest tools for orthology inference; however, its scalability and accuracy have been hampered by time-consuming all-versus-all alignments and the existence of proteins with complex domain architectures. Here, we present a substantial update of SonicParanoid, where a gradient boosting predictor halves the execution time and a language model doubles the recall. Application to empirical large-scale and standardized benchmark datasets shows that SonicParanoid2 is much faster than comparable methods and also the most accurate. SonicParanoid2 is available at https://gitlab.com/salvo981/sonicparanoid2 and https://zenodo.org/doi/10.5281/zenodo.11371108.
Published in
- Genome Biology
Genome Biology 25 (1), 2024-07-25
Springer Science and Business Media LLC
Details
- CRID
- 1360588380136480768
- ISSN
- 1474-760X
- Material type
- journal article
- Data source type
- Crossref
- KAKEN
- OpenAIRE