High-Quality <i>Arabidopsis thaliana</i> Genome Assembly with Nanopore and HiFi Long Reads

  • Bo Wang
    MOE Key Laboratory for Intelligent Networks & Network Security, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
  • Xiaofei Yang
    School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
  • Yanyan Jia
    MOE Key Laboratory for Intelligent Networks & Network Security, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
  • Yu Xu
    School of Life Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
  • Peng Jia
    MOE Key Laboratory for Intelligent Networks & Network Security, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
  • Ningxin Dang
    MOE Key Laboratory for Intelligent Networks & Network Security, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
  • Songbo Wang
    MOE Key Laboratory for Intelligent Networks & Network Security, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
  • Tun Xu
    MOE Key Laboratory for Intelligent Networks & Network Security, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
  • Xixi Zhao
    Genome Institute, the First Affiliated Hospital of Xi’an Jiaotong University, Xi’an 710061, China
  • Shenghan Gao
    MOE Key Laboratory for Intelligent Networks & Network Security, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
  • Quanbin Dong
    Genome Institute, the First Affiliated Hospital of Xi’an Jiaotong University, Xi’an 710061, China
  • Kai Ye
    MOE Key Laboratory for Intelligent Networks & Network Security, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China

Abstract

Arabidopsis thaliana is an important and long-established model species for plant molecular biology, genetics, epigenetics, and genomics. However, the latest version of the reference genome still contains a significant number of missing segments. Here, we report a high-quality and almost complete Col-0 genome assembly, named Col-XJTU, that contains only two gaps. It was built by combining Oxford Nanopore Technologies ultra-long reads, Pacific Biosciences high-fidelity long reads, and Hi-C data. The total genome assembly size is 133,725,193 bp, introducing 14.6 Mb of novel sequences compared to the TAIR10.1 reference genome. All five chromosomes of the Col-XJTU assembly are highly accurate, with consensus quality (QV) scores > 60 (ranging from 62 to 68), which are higher than those of the TAIR10.1 reference (ranging from 45 to 52). We completely resolved chromosome (Chr) 3 and Chr5 in a telomere-to-telomere manner. Chr4 was completely resolved except for the nucleolar organizing regions, which comprise long repetitive DNA fragments. The Chr1 centromere (CEN1), reportedly around 9 Mb in length, is particularly challenging to assemble due to the presence of tens of thousands of CEN180 satellite repeats. Using cutting-edge sequencing data and novel computational approaches, we assembled a 3.8-Mb-long CEN1 and a 3.5-Mb-long CEN2. We also investigated the structure and epigenetics of centromeres. Four clusters of CEN180 monomers were detected, and the centromere-specific histone H3-like protein (CENH3) exhibited a strong preference for CEN180 Cluster 3. Moreover, we observed hypomethylation patterns in CENH3-enriched regions. We believe that this high-quality genome assembly, Col-XJTU, will serve as a valuable reference to better understand the global pattern of centromeric polymorphisms, as well as the genetic and epigenetic features in plants.
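
To put the reported QV ranges in perspective, the sketch below converts consensus quality values to per-base error rates. It assumes QV follows the standard Phred scale, QV = -10 · log10(error rate), which the abstract does not state explicitly; the function names are hypothetical and are not taken from the authors' pipeline. Under that assumption, QV 60 corresponds to one expected error per million bases.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled QV to a per-base error probability (assumed scale)."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Inverse conversion: per-base error probability to Phred-scaled QV."""
    return -10 * math.log10(error_rate)


def expected_errors_per_mb(qv: float) -> float:
    """Expected number of consensus errors in one megabase at the given QV."""
    return qv_to_error_rate(qv) * 1_000_000


if __name__ == "__main__":
    # QV ranges quoted in the abstract for each assembly.
    reported = {
        "TAIR10.1": (45, 52),
        "Col-XJTU": (62, 68),
    }
    for name, (low, high) in reported.items():
        print(
            f"{name}: QV {low}-{high} -> "
            f"{expected_errors_per_mb(high):.3f} to "
            f"{expected_errors_per_mb(low):.3f} expected errors per Mb"
        )
```

Because the scale is logarithmic, each 10-point gain in QV means a tenfold drop in the expected error rate, so the gap between the two assemblies is larger than the raw scores suggest.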
