The Ensembl Variation database stores two types of variation data, depending on whether it comes from an external source or is calculated on site.
The tables of the database are described in the Variation Tables Description page.
The database schema is described in the following PDF.
The variation databases can be loaded from the dump files available on the Ensembl FTP site.
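As a rough illustration of the loading step, the sketch below builds a `mysqlimport` command line for one dumped table without running it. It assumes that each table is dumped as a tab-delimited file named `<table>.txt` (already decompressed) and that the schema has been created beforehand. The database name `my_variation_db`, the user name and the helper `mysqlimport_command` are hypothetical, and authentication options are left out.

```python
from pathlib import Path
from typing import List


def mysqlimport_command(data_file: str, database: str, user: str,
                        host: str = "localhost") -> List[str]:
    """Build (but do not run) a mysqlimport call for one dumped table.

    Assumption: the dump is an uncompressed, tab-delimited file named
    <table>.txt. mysqlimport takes the table name from the file name,
    and tab-delimited input is its default format.
    """
    path = Path(data_file)
    if path.suffix != ".txt":
        raise ValueError(f"expected an uncompressed .txt dump, got {path.name}")
    return [
        "mysqlimport",
        f"--host={host}",
        f"--user={user}",
        "--local",      # read the data file from the client machine
        database,
        str(path),
    ]


if __name__ == "__main__":
    # Hypothetical database and user names, for illustration only.
    for name in ("variation.txt", "variation_feature.txt"):
        print(" ".join(mysqlimport_command(name, "my_variation_db", "me")))
```

Returning an argument list rather than a shell string makes it straightforward to pass the command to `subprocess.run` once the paths and credentials have been checked.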
For Ensembl 61, loading the larger Human tables on our servers takes from a few minutes to half an hour each, e.g.:
Table | Loading time |
---|---|
variation | 24 minutes |
variation_feature | 13 minutes |
flanking_sequence | 3 minutes |
compressed_genotype_single_bp | 13 minutes |
population_genotype | 30 minutes |
Loading the largest table in the Human variation database (the allele table) takes almost 3 hours.
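To get a feel for the overall time budget, the short sketch below adds up the loading times quoted above. It covers only the tables listed here, assumes they are loaded one after another, and treats "almost 3 hours" for the allele table as an upper bound of 180 minutes, so the result is an estimate rather than a measured figure.

```python
# Loading times in minutes, copied from the table above (Ensembl 61, Human).
load_minutes = {
    "variation": 24,
    "variation_feature": 13,
    "flanking_sequence": 3,
    "compressed_genotype_single_bp": 13,
    "population_genotype": 30,
}

# Assumption: "almost 3 hours" is treated as an upper bound of 180 minutes.
ALLELE_UPPER_BOUND_MINUTES = 180

listed_total = sum(load_minutes.values())
slowest_listed = max(load_minutes, key=load_minutes.get)

# Sequential upper bound for the listed tables plus the allele table.
hours, minutes = divmod(listed_total + ALLELE_UPPER_BOUND_MINUTES, 60)

print(f"Listed tables: {listed_total} minutes (slowest: {slowest_listed})")
print(f"Including allele: at most about {hours} h {minutes} min")
```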
Some of the MyISAM settings on our server are listed below:
Variable | Value |
---|---|
myisam_data_pointer_size | 6 |
myisam_max_sort_file_size | 9223372036853727232 |
myisam_recover_options | OFF |
myisam_repair_threads | 1 |
myisam_sort_buffer_size | 67108864 |
myisam_stats_method | nulls_unequal |
myisam_use_mmap | OFF |
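To compare another server against these values, one option is to run `SHOW VARIABLES LIKE 'myisam%'` and check the result against the table. The sketch below does this for tab-separated `name<TAB>value` text, such as the output of the `mysql` client with `--batch --skip-column-names`. The helper names `parse_show_variables` and `diff_settings` are hypothetical, and the reference values are simply copied from the table above, not recommended settings.

```python
from typing import Dict, Optional, Tuple

# Reference values copied from the table above (kept as strings,
# since that is how the mysql client prints them).
REFERENCE: Dict[str, str] = {
    "myisam_data_pointer_size": "6",
    "myisam_max_sort_file_size": "9223372036853727232",
    "myisam_recover_options": "OFF",
    "myisam_repair_threads": "1",
    "myisam_sort_buffer_size": "67108864",
    "myisam_stats_method": "nulls_unequal",
    "myisam_use_mmap": "OFF",
}


def parse_show_variables(text: str) -> Dict[str, str]:
    """Parse 'name<TAB>value' lines into a dictionary."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition("\t")
        settings[name.strip()] = value.strip()
    return settings


def diff_settings(
    actual: Dict[str, str],
    reference: Dict[str, str] = REFERENCE,
) -> Dict[str, Tuple[Optional[str], str]]:
    """Return {name: (actual_value, reference_value)} for every mismatch.

    A variable missing from `actual` is reported with None as its value.
    """
    return {
        name: (actual.get(name), expected)
        for name, expected in reference.items()
        if actual.get(name) != expected
    }


if __name__ == "__main__":
    sample = "myisam_data_pointer_size\t6\nmyisam_use_mmap\tON\n"
    for name, (got, want) in diff_settings(parse_show_variables(sample)).items():
        print(f"{name}: server has {got!r}, reference is {want!r}")
```

For scale, the `myisam_sort_buffer_size` value of 67108864 bytes corresponds to 64 MiB.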
Ensembl is an open project and we encourage correspondence and discussion on any aspect of Ensembl. Please see the Ensembl Contacts page for ways to get in touch with us.