Ensembl data is released on an approximately two-month cycle (occasionally longer if a lot of development work is being undertaken). Whatever its length, the cycle works as follows:
This stage varies in length for each species: the more complex the
genome(s) involved, the longer it takes.
Individual species are updated on an irregular schedule,
depending on the availability of new assemblies and evidence. New
species are added frequently from a number of sequencing projects
around the world, and all species databases may receive minor
updates. These can include patches to correct erroneous data and
updates to data that changes regularly (such as cDNAs for human and
mouse).
The genebuild team members take evidence for genes and transcripts, such as proteins and mRNAs, and combine it with manual annotation data in the analysis pipeline to create an Ensembl core database and, optionally, otherfeatures and cdna databases. Once these are complete, they are handed over to the other Ensembl data teams for further processing (see below).
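As an illustration of how the three database types mentioned above can be told apart programmatically, here is a minimal Python sketch. It assumes the commonly seen naming pattern `<species>_<type>_<release>_<assembly>` (for example `homo_sapiens_core_110_38`); that pattern, the example names, and the helper `parse_db_name` are assumptions made for this illustration and are not taken from the text above.

```python
import re
from typing import NamedTuple, Optional


class EnsemblDbName(NamedTuple):
    """Parts of a species database name (hypothetical helper type)."""
    species: str
    db_type: str
    release: int
    assembly: str


# Only the database types named in the paragraph above are recognised here.
_DB_TYPES = ("core", "otherfeatures", "cdna")

# Assumed pattern: <species>_<type>_<release>_<assembly>
_PATTERN = re.compile(
    r"^(?P<species>[a-z0-9_]+?)"
    r"_(?P<db_type>" + "|".join(_DB_TYPES) + r")"
    r"_(?P<release>\d+)"
    r"_(?P<assembly>\w+)$"
)


def parse_db_name(name: str) -> Optional[EnsemblDbName]:
    """Split a database name into its parts, or return None if it does not fit."""
    match = _PATTERN.match(name)
    if match is None:
        return None
    return EnsemblDbName(
        species=match["species"],
        db_type=match["db_type"],
        release=int(match["release"]),
        assembly=match["assembly"],
    )


if __name__ == "__main__":
    for name in ("homo_sapiens_core_110_38", "mus_musculus_cdna_110_39"):
        print(name, "->", parse_db_name(name))
```

Names that do not follow the assumed pattern, such as a web-specific database, simply yield `None` rather than raising an error.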
Whilst the genomic data is being prepared, the web team works on new displays and new website features. They then bring together all the finished databases and make the content available online in a number of ways:
The web team also populates an additional database, ensembl_website, which contains help, news, and other web-specific information. If there are new displays, or if existing ones have changed substantially, the outreach team updates the help content.
This is necessarily a simplified account of a process that takes around 50 people several months to complete!