Zea mays (maize)



About the genome:


Overview

This release of Phytozome includes the 4a.53 assembly and annotation of the maize "B73" genome produced by the Maize Genome Project, and the initial alignment of "Mo17" 454 reads to this reference sequence.

The browser displays:

  1. Gene structure predictions from the 4a.53 release of maizesequence.org (see below)
  2. Alignments of maize ESTs from GenBank
  3. Alignment of grass transcript assemblies (i.e., assemblies of ESTs) from PlantTA
  4. Alignment of 454 shotgun reads from the Mo17 project
  5. Alignment of an initial Newbler assembly of low-repeat-content 454 reads from the Mo17 project

For more information on the Mo17 project, please see the Maize Page.

Important note on data sources

The B73 assembly "pseudomolecules" used in the 4a.53 release represent the first chromosome-scale assembly of the Maize Genome Project. They were produced by the Arizona Genomics Institute based on a curated tiling path and over 16,000 BACs sequenced at the WashU Genome Sequencing Center and partially finished by the Maize Genome Project collaboration. This is the fourth freeze of the Maize Genome Project. More information can be found at http://www2.genome.arizona.edu/genomes/maize and maizesequence.org.

The predicted gene set is the "working gene set" release 4a.53 (27 June 2009), produced by Cold Spring Harbor and made available at maizesequence.org (http://ftp.maizesequence.org/current/). Please consult www.maizesequence.org for more details about this release and future plans.

For more information on the B73 Maize Genome Project, please see www.maizesequence.org/overiew.html.

Please note that the B73 assembly and annotation used here are subject to the "Data Use Policy" of the Maize Genome Project, reproduced here:

Our goal is to provide high quality sequence information to the research community in a timely manner. Accordingly, individual sequence read traces are submitted to the NCBI Trace Archive as soon as they have exited our quality control pipeline. Whole genome sequence assemblies are released as soon as possible following appropriate quality analysis. Our archive site contains draft versions of the genome sequence assemblies, and we ask that you understand that these represent preliminary data, subject to omissions and errors. In addition, whole genome assemblies are likely to change upon the availability of new data, and our website will document new assembly versions as they are released.

In recognition of the extensive effort that underlies these genome sequencing projects, we ask that you appropriately acknowledge the use of any preliminary data. We offer the following example for acknowledgement: "These data were produced by the Genome Sequencing Center at Washington University School of Medicine in St. Louis and can be obtained from ftp://genome.wustl.edu/pub/xxx", where xxx refers to the appropriate ftp directory from which the data has been obtained. Our official web address may also be used. This recommendation is in accordance with the adopted guidelines by the genome sequencing community in a statement of principles for the distribution and use of large-scale sequencing data: Community Resource Projects and the resulting NHGRI policy statement. If you have any questions regarding the use of this data, please contact us at web address: webmaster@genome.wustl.edu . We request that you contact the Director of the Sequencing Center, Richard Wilson, before publishing analyses of the sequence on a chromosome or genome scale. We welcome collaborative interaction to provide the community with improved whole genome analyses and annotations.

The maize genome has been released in pre-publication status by the Maize Sequence Consortium. The data are freely available to anyone, but the consortium has requested that other groups respect the scientific ethics of publishing on pre-publication data, as outlined in detail in the Fort Lauderdale agreement. In brief, small-scale analysis (e.g., the analysis of a single locus) is an expected use of the data and can be published without any expectation of coordination. In contrast, large-scale, genome-wide analysis is expected either to be coordinated with the Maize Sequence Consortium in some manner or to be published after the initial paper. The reasoning behind this policy is given in the Fort Lauderdale document.

Other tracks on the Phytozome maize B73 browser are:

  1. "Repeats" is a de novo masking of the genome based on 16-bp sequences that are overrepresented in the genome.
  2. 1,801,510 Zea mays ESTs from GenBank, aligned using the Program to Assemble Spliced Alignments (PASA, Haas et al.)
  3. Zea mays "transcript assemblies" (i.e., assembled ESTs) obtained from the PlantTA project (plantta.jcvi.org/)
  4. Alignment of peptides from other plants in the Phytozome database (including sorghum, rice, and possibly Brachypodium), using BLATX (Kent)
  5. Alignment of 454 shotgun reads from Mo17. The "selected" reads are the 16M reads with less than 1/3 of their length masked; these are aligned to the unmasked genome. The "65M" set is the full ~10x Mo17 shotgun dataset, aligned to the hard-masked genome. Note that only the longest aligning stretch is shown; these alignments terminate either at the end of a read or at a masked position.
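
The masking and read-selection ideas behind the "Repeats" and Mo17 tracks can be illustrated with a short Python sketch. It is not the pipeline used to build these tracks: the count threshold, the use of lowercase letters to mark masked bases, and all function names are assumptions made for illustration. Only the 16-bp word length and the 1/3 masked-fraction cutoff come from the track descriptions above.

```python
from collections import Counter

K = 16  # word length taken from the "Repeats" track description


def count_kmers(seq, k=K):
    """Count every overlapping k-mer in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def soft_mask(seq, counts, threshold, k=K):
    """Lowercase each base covered by a k-mer seen more than `threshold` times.

    `threshold` is a hypothetical cutoff; the real criterion for
    "overrepresented" is not stated in the track description.
    """
    seq = seq.upper()
    masked = [False] * len(seq)
    for i in range(len(seq) - k + 1):
        if counts[seq[i:i + k]] > threshold:
            for j in range(i, i + k):
                masked[j] = True
    return "".join(c.lower() if m else c for c, m in zip(seq, masked))


def masked_fraction(read):
    """Fraction of bases in the read that are masked (lowercase)."""
    return sum(c.islower() for c in read) / len(read) if read else 0.0


def is_selected(read, max_fraction=1 / 3):
    """True if less than 1/3 of the read is masked, as for the "selected" set."""
    return masked_fraction(read) < max_fraction


def longest_unmasked_stretch(read):
    """Return (start, end) of the longest run of unmasked (uppercase) bases.

    This mirrors the idea that a displayed alignment ends at the end of
    a read or at a masked position.
    """
    best = (0, 0)
    start = None
    for i, c in enumerate(read + "n"):  # lowercase sentinel closes the last run
        if c.isupper():
            if start is None:
                start = i
        else:
            if start is not None and i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best
```

In practice the k-mer counts would be taken over the whole genome rather than a single sequence, and a hard-masked genome would typically replace masked bases with N instead of lowercasing them. The sketch keeps everything in memory for clarity.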
©2010 University of California Regents. All rights reserved