Convergent evolution and predictability of gene copy numbers associated with diets in mammals
- Creator: Kitano, Jun
Metadata
- Publication date: 2023-12-06
- DOI: 10.5061/dryad.q2bvq83r2
- Publisher: Dryad
- Data creator (e-Rad): Kitano, Jun
Description
This README file was generated on 2023-11-28 by Jun Kitano.

GENERAL INFORMATION

1. Title of Dataset: Analysis of convergent evolution and predictability of gene copy numbers associated with diets in mammals
2. Author Information
   Principal Investigator Contact Information
   Name: Jun Kitano
   Institution: National Institute of Genetics
   Address: Mishima, Shizuoka, Japan
   Email: [jkitano@nig.ac.jp](mailto:jkitano@nig.ac.jp)
3. Date of data collection: 2022-2023
4. Geographic location of data collection: Mishima, Japan
5. Information about funding sources that supported the collection of the data: NIG Summer Internship Program, JSPS Kakenhi (23KJ0483, 22H04925, and 22H04983), JST CREST (JPMJCR19S2 and JPMJCR20S2), and MEXT (JPMXD1521474594).

SHARING/ACCESS INFORMATION

1. Licenses/restrictions placed on the data: CC0 1.0 Universal (CC0 1.0) Public Domain
2. Links to publications that cite or use the data: Wilhoit, K., Yamanouchi, S., Chen, B.Y., Yamasaki, Y.Y., Ishikawa, A., Inoue, J., Iwasaki, W., and Kitano, J. (2024). Convergent evolution and predictability of gene copy numbers associated with diets in mammals.
3. Links to other publicly accessible locations of the data: None
4. Links/relationships to ancillary data sets: None
5. Was data derived from another source? Yes
   A. If yes, list source(s): Dryad database (doi:10.5061/dryad.qd450 and doi.org/10.5061/dryad.tb03d03) and an original paper (doi: 10.1098/rspb.2014.2103).
6. Recommended citation for this dataset:
   Gainsbury AM, Tallowin OJS, Meiri S. 2018. An updated global data set for diet preferences in terrestrial mammals: testing the validity of extrapolation. Mamm. Rev. 48:160–167. doi:10.5061/dryad.qd450.
   Upham NS, Esselstyn JA, Jetz W. 2019. Inferring the mammal tree: Species-level sets of phylogenies for questions in ecology, evolution, and conservation. PLoS Biol. 17:e3000494. doi.org/10.5061/dryad.tb03d03.
   Tucker MA, Rogers TL. 2014. Examining predator-prey body size, trophic level and body mass across marine and terrestrial mammals. Proc. Biol. Sci. 281. doi: 10.1098/rspb.2014.2103.

DATA & FILE OVERVIEW

1. File List:
   A) herbivore.tsv
   B) carnivore.tsv
   C) omnivore.tsv
   D) Data_Preparation_Script_Median.R
   E) Mammal_Data.xlsx
   F) Phylogenetic_Tree_Script_Median.R
   G) RAxML_bipartitions.result_FIN4_raw_rooted_wBoots_4098mam1out_OK.newick
   H) PhyloTree.newick
   I) Phylogenetic_Correction_Script_Median.R
   J) DFA.R
   K) Corrected DFA Table.csv
2. Relationship between files, if important: None
3. Additional related data collected that was not included in the current data package: None
4. Are there multiple versions of the dataset? Yes
   A. If yes, name of file(s) that was updated:
      PFDA.R
      PhyloTree_ultra.newick
      phylo_dfa_selected.csv
      go_random_select.R
      herbivore.btaurus.csv
      omnivore.hsapiens.csv
      i. Why was the file updated? For the revision of the paper, we conducted additional analyses.
      ii. When was the file updated? 2024 Oct 10th

#########################################################################

DATA-SPECIFIC INFORMATION FOR: herbivore.tsv, carnivore.tsv, and omnivore.tsv

1. Number of variables: 348
2. Number of cases/rows: 77 (herbivore.tsv), 19 (carnivore.tsv), and 61 (omnivore.tsv)
3. Variable List:
   group_id: orthologous gene ID
   pMCMC: significance test after phylogenetic correction
   positive.median: median copy number of the target category (e.g. herbivore)
   negative.median: median copy number of the non-target category (e.g. non-herbivore)
   ratio: median copy number of the target category divided by that of the non-target category
   *.tag: protein sequence identification number
   *.len: translated amino acid length
   *.prod: gene product
   *.gcn: gene copy number

## Abbreviations of species names

Acijub*: Acinonyx jubatus
Ailmel*: Ailuropoda melanoleuca
Balacusca*: Balaenoptera acutorostrata
Bisbisbis*: Bison bison
Bosmut*: Bos mutus
Bostar*: Bos taurus
Bubbub*: Bubalus bubalis
Caljac*: Callithrix jacchus
Calurs*: Callorhinus ursinus
Cambac*: Camelus bactrianus
Camdro*: Camelus dromedarius
Canlupfam*: Canis lupus
Caphir*: Capra hircus
Carsyr*: Carlito syrichta
Cersimsim*: Ceratotherium simum
Chlsab*: Chlorocebus sabaeus
Chrasi*: Chrysochloris asiatica
Colangpal*: Colobus angolensis palliatus
Concri*: Condylura cristata
Dasnov*: Dasypus novemcinctus
Delleu*: Delphinapterus leucas
Desrot*: Desmodus rotundus
Echtel*: Echinops telfairi
Eleedw*: Elephantulus edwardii
Enhlutken*: Enhydra lutris
Eptfus*: Eptesicus fuscus
Equasi*: Equus asinus asinus
Equcab*: Equus caballus
Equprz*: Equus przewalskii
Erieur*: Erinaceus europaeus
Eumjub*: Eumetopias jubatus
Felcat*: Felis catus
Glomel*: Globicephala melas
Gorgorgor*: Gorilla gorilla
Hiparm*: Hipposideros armiger
Homsap*: Homo sapiens
Lagobl*: Lagenorhynchus obliquidens
Lepwed*: Leptonychotes weddellii
Lipvex*: Lipotes vexillifer
Loxafr*: Loxodonta africana
Lyncan*: Lynx canadensis
Macmul*: Macaca mulatta
Manjav*: Manis javanica
Minnat*: Miniopteru ...
Convergent evolution, the evolution of the same or similar phenotypes in phylogenetically independent lineages, is a widespread phenomenon in nature. If the genetic basis for convergent evolution is predictable to some extent, it may be possible to infer organismic phenotypes and adaptability based on genome sequence data. While repeated amino acid changes have been studied in association with convergent evolution, relatively little is known about the potential contribution of repeated gene copy number changes. In this study, we explore whether certain gene copy number changes are linked to diet shifts in mammals and assess if trophic ecology can be inferred from the copy numbers of a specific set of genes. Using 86 mammalian genome sequences, we identified several genes with higher copy numbers in herbivores, carnivores, and omnivores, even after phylogenetic corrections. We were able to confirm previous findings on genes such as amylase, olfactory receptor, and xenobiotic metabolism genes, and identify novel genes whose copy numbers correlate with dietary patterns. For example, omnivores exhibited higher copy numbers of genes encoding gene expression regulators. We also established a discriminant function based on the copy numbers of 13 genes that can help predict trophic ecology based on genome sequence data. These findings highlight a possible association between convergent evolution and repeated copy number changes in specific genes, suggesting the potential to develop a method for predicting animal ecology and adaptability from genome sequence data.
This study used publicly available data. The scripts used to process those data are uploaded here.
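The README's variable list defines `ratio` as the median copy number of the target category divided by that of the non-target category. As a minimal sketch of how that column could be recomputed from a tab-separated table, the Python below reads the `group_id`, `positive.median`, and `negative.median` columns and derives the ratio. It is not one of the authors' scripts (those are in R). The sample rows and the `OG_example_*` IDs are invented for illustration, and returning `None` when the non-target median is zero is an assumption, since the README does not say how that case is handled in the real files.

```python
import csv
import io

# Hypothetical rows for illustration only; they are NOT taken from the dataset.
# The real files are described as having 348 variables; only three are used here.
SAMPLE_TSV = (
    "group_id\tpositive.median\tnegative.median\n"
    "OG_example_1\t4\t2\n"
    "OG_example_2\t3\t1\n"
    "OG_example_3\t1\t0\n"
)


def median_ratio(positive_median, negative_median):
    """Target-category median divided by the non-target median.

    Returns None when the non-target median is zero (an assumption of
    this sketch, not documented behaviour of the dataset).
    """
    if negative_median == 0:
        return None
    return positive_median / negative_median


def read_ratios(handle):
    """Return {group_id: recomputed ratio} from a tab-separated table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["group_id"]: median_ratio(
            float(row["positive.median"]), float(row["negative.median"])
        )
        for row in reader
    }


ratios = read_ratios(io.StringIO(SAMPLE_TSV))
for group_id, ratio in ratios.items():
    print(group_id, ratio)
```

To try this on one of the listed files, `read_ratios(open("herbivore.tsv", newline=""))` should work, provided the file's header names match the variable list in the README.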