DOBLAST: The Cultivated Alfalfa at the Diploid Level (CADL) Genome Blast Server   


Readme file (CADL)

12 September 2017

Description of HM342 Medicago sativa Cultivated Alfalfa at the Diploid Level* (CADL) v1.0 genome data

Funding: Medicago HapMap project (NSF Project IOS-1237993)

Sequencing: National Center for Genome Resources (NCGR)

Assembly and Analysis: NCGR, Noble Foundation, J. Craig Venter Institute, University of Minnesota

Participants:

Joann Mudge, Nicholas P. Devitt, Diego A. Fajardo, Thiru Ramaraj, Andrew D. Farmer, Xinbin Dai, Zhaohong Zhuang, Peng Zhou, Joseph Guhlin, Christopher D. Town, Maria J. Monteros, Patrick X. Zhao, Jason R. Miller, Kevin A. T. Silverstein, Nevin D. Young


LIST OF FILES

medsa.CADL_HM342.v1.0.fasta.gz
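
A minimal sketch of how the gzip-compressed FASTA file listed above could be read without decompressing it to disk first. It uses only the Python standard library; the function name fasta_lengths is hypothetical and is not part of the data release.

```python
import gzip


def fasta_lengths(path):
    """Yield (sequence_id, length) for each record in a gzip-compressed FASTA file.

    The sequence id is taken as the first whitespace-delimited token
    of the header line; the file is streamed line by line.
    """
    name, length = None, 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if name is not None:
                    yield name, length
                fields = line[1:].split()
                name, length = (fields[0] if fields else ""), 0
            elif line and name is not None:
                length += len(line)
    # Emit the final record.
    if name is not None:
        yield name, length


# Example call (assumes the file has been downloaded to the working directory):
# for seq_id, seq_len in fasta_lengths("medsa.CADL_HM342.v1.0.fasta.gz"):
#     print(seq_id, seq_len)
```

Streaming the file keeps memory use low, which matters for an assembly of more than 1,200 Mb.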

DISCLAIMER

This assembly is provided as is for the community, with no claims on the quality or completeness of the sequence or gene coverage. Please be aware that Medicago sativa is a highly heterozygous organism with an expected haploid genome size of 800 Mb. The sequence similarity of the two haplotypes in the diploid CADL varies, and the haplotypes often diverge enough from each other that they are assembled separately. This has resulted in an assembly size of ~1200 Mb rather than the expected 800 Mb, suggesting that at least half of the genome is represented in the assembly by two distinct haplotypes. In regions of greatest divergence, presence/absence differences in gene content can be seen between the haplotypes. This implies that any attempt to remove redundancies in the assembly to retain only one haplotype would result in gene loss. It also implies that the current assembly contains a significant proportion of allelic copies of genes, a supposition that is confirmed both by alignment to the related Medicago truncatula genome and by analysis of genes that are typically found in single copies in plant genomes.
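
The size argument in the disclaimer can be written out as simple arithmetic. The sketch below assumes, as a simplification not stated in this README, that every region of the genome appears in the assembly either once (haplotypes collapsed) or twice (haplotypes assembled separately), so that assembly size = haploid size x (1 + fraction assembled twice). The function name is hypothetical.

```python
def two_haplotype_fraction(assembly_mb, haploid_mb):
    """Fraction of the haploid genome assembled as two separate haplotypes.

    Simplifying assumption: each region is present once or twice in the
    assembly, so assembly_mb = haploid_mb * (1 + fraction).
    """
    if haploid_mb <= 0:
        raise ValueError("haploid genome size must be positive")
    return assembly_mb / haploid_mb - 1.0


# Approximate sizes quoted in the disclaimer: ~1200 Mb assembled, 800 Mb expected.
print(two_haplotype_fraction(1200, 800))  # 0.5
```

Under this assumption, any sequence missing from the assembly would push the true fraction higher, which is consistent with the "at least half" wording.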

RESTRICTIONS ON USE

The CADL assemblies available here, including the previous and current versions, are made available to the research community by the Medicago HapMap consortium under the Toronto Agreement [http://www.nature.com/nature/journal/v461/n7261/full/461168a.html]. As producers of these data, we reserve the right to be the first to publish a genome-wide analysis of the data.

The pre-publication data released here are embargoed for publication, except for analyses of single gene loci or small (< 10 kb) genome regions. Researchers are encouraged to contact us with any queries about referencing or publishing analyses based on the pre-publication data obtained via this website. Researchers are also invited to consider collaborating with the Medicago HapMap consortium on larger studies, or if the limitations here restrict further work.


CADL SOURCE MATERIAL (Renamed HM342 as part of the HapMap project)

A single plant was clonally propagated at the University of Minnesota. DNA was isolated by Amplicon Express in February, 2015.

CADL ASSEMBLY VERSION v1.0

This assembly was generated with ~100X PacBio reads (coverage based on the haploid genome size of 800 Mb) and Dovetail HiRise scaffolding. Note that Dovetail does not size gaps but inserts a fixed string of 100 Ns at each gap. The read statistics are given in the following table:

HM Number                    HM342
Name                         CADL
Chemistry                    P6-C4
Mean subread length (bp)     8,034
Subread N50 (bp)             13,070
Subread total length (bp)    81,270,428,779
Number of subreads           9,013,313
Max length (bp)              52,329
Coverage                     101.59X
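
The coverage figure in the table follows from dividing the total subread length by the expected haploid genome size of 800 Mb. A short check, with hypothetical constant and function names:

```python
TOTAL_SUBREAD_BASES = 81_270_428_779  # "Subread total length" from the table
HAPLOID_GENOME_BP = 800_000_000       # expected haploid genome size (800 Mb)


def coverage(total_bases, genome_bp):
    """Sequencing depth: total sequenced bases divided by genome size."""
    return total_bases / genome_bp


print(f"{coverage(TOTAL_SUBREAD_BASES, HAPLOID_GENOME_BP):.2f}X")  # 101.59X
```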


Falcon v. 0.4 was used for correction and assembly followed by Quiver polishing. An additional round of Quiver polishing was performed after integrating Dovetail HiRise scaffolding. The assembly statistics are in the table below:

CADL Assembly
Assembler                Falcon-Quiver-SSPACE
Contigs                  6,921
Max Contig               2,901,187
Mean Contig              180,749
Contig N50               694,594
Total Contig Length      1,250,961,487
Scaffolds                5,753
Max Scaffold             6,073,685
Mean Scaffold            217,463
Scaffold N50             1,271,357
Total Scaffold Length    1,251,062,122
Captured Gaps            1,168
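
For readers who want to recompute N50 values or recover contigs from scaffolds, here is a minimal sketch. It assumes that scaffold gaps appear as runs of at least 100 Ns, following the note on Dovetail gaps above, and it is not the pipeline used to produce the table. The function names n50 and split_at_gaps are hypothetical.

```python
import re


def n50(lengths):
    """Largest length L such that sequences of length >= L
    together contain at least half of the total bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def split_at_gaps(scaffold, min_gap=100):
    """Split a scaffold sequence into contigs at runs of >= min_gap Ns."""
    gap = re.compile(r"[Nn]{%d,}" % min_gap)
    return [contig for contig in gap.split(scaffold) if contig]


# Toy example: two contigs joined by one 100-N gap.
toy_scaffold = "ACGT" * 5 + "N" * 100 + "GG"
print([len(c) for c in split_at_gaps(toy_scaffold)])  # [20, 2]
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))                  # 8
```

The counts in the table are consistent with each captured gap joining two contigs: 6,921 contigs - 5,753 scaffolds = 1,168 captured gaps.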


References:

  1. Bingham ET and McCoy TJ (1979) Cultivated Alfalfa at the Diploid Level: Origin, Reproductive Stability, and Yield of Seed and Forage. Crop Science 19: 97-100.
  2. Chin C, et al. 2013. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nature Methods. 10:563-569.
  3. Chin J. 2015. FALCON: experimental PacBio diploid assembler. https://github.com/PacificBiosciences/FALCON.
  4. Myers G. 2014. The Daligner Overlap Library. https://github.com/thegenemyers/DALIGNER.
  5. PacBio® variant consensus caller (Quiver algorithm). https://github.com/PacificBiosciences/GenomicConsensus.
  6. Putnam NH, O'Connell BL, Stites JC, Rice BJ, Blanchette M, Calef R, Troll CJ, Fields A, Hartley PD, Sugnet CW et al: Chromosome-scale shotgun assembly using an in vitro method for long-range linkage. Genome Res 2016.

Copyright © 2017, Noble Research Institute, LLC.