Ensembl Schema Documentation
Introduction
This document describes the tables that make up the Ensembl 'Funcgen' schema. Tables are grouped logically by their function, and the purpose of each table is explained. This document refers to version 62 of the Ensembl Funcgen schema.
A simplified entity relationship diagram of the schema is available here.
List of the tables:
Main feature tables
Set tables
Array design tables
Experiment tables
Ancillary tables
Core tables
Core like tables
Main feature tables
These define the various genomic features and their associated tables.
regulatory_feature |
The table contains the features resulting from the regulatory build process.
Column | Type | Default value | Description | Index |
regulatory_feature_id | int(10) | | Internal ID | primary key |
seq_region_id | int(10) | | seq_region table ID | key: seq_region_idx |
seq_region_start | int(10) | | Start position of this feature | key: seq_region_idx |
seq_region_end | int(10) | | End position of this feature | |
seq_region_strand | tinyint(1) | | Strand orientation of this feature | |
display_label | varchar(80) | NULL | Text display label | |
feature_type_id | int(10) | NULL | feature_type table ID | key: feature_type_idx |
feature_set_id | int(10) | NULL | feature_set table ID | key: feature_set_idx |
stable_id | mediumint(8) | NULL | Integer stable ID without ENSR prefix | key: stable_id_idx |
bound_seq_region_start | int(10) | | Bound start position of this feature | |
bound_seq_region_end | int(10) | | Bound end position of this feature | |
binary_string | varchar(255) | NULL | Binary representation for the underlying feature sets/types | |
projected | boolean | FALSE | Boolean, defines whether the regulatory feature structure has been projected to this cell type | |
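The stable_id column stores only the integer part of the identifier. A minimal sketch of how a display form could be rebuilt, assuming the common Ensembl convention of a prefix followed by an 11-digit zero-padded number. The function name and the padding width are assumptions; only the ENSR prefix is stated in the column description.

```python
def format_regulatory_stable_id(stable_id, prefix="ENSR", width=11):
    """Rebuild a display stable ID from the integer stored in stable_id.

    The prefix comes from the column description; the 11-digit zero
    padding is an assumption based on common Ensembl identifiers.
    """
    if stable_id is None:  # the column is nullable
        return None
    return f"{prefix}{int(stable_id):0{width}d}"


print(format_regulatory_stable_id(1234))
```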
regulatory_attribute |
Denormalised table defining links between a regulatory_feature and its constituent 'attribute' features.
Column | Type | Default value | Description | Index |
regulatory_feature_id | int(10) | | Internal ID | primary key |
attribute_feature_id | int(10) | | Table ID of attribute feature | primary key |
attribute_feature_table | enum('annotated', 'external', 'motif') | NULL | Table name of attribute feature | primary key |
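Because the link is denormalised, the attribute_feature_table value decides which feature table attribute_feature_id points into. The sketch below shows one possible way to resolve that. The mapping from enum value to "<value>_feature" is inferred from the table names in this section, and the helper name and the `?` placeholder style are assumptions (placeholder syntax depends on the database driver).

```python
# Values of the attribute_feature_table enum, mapped to the feature tables
# documented in this section. The "<value>_feature" naming is an assumption
# inferred from the table names annotated_feature, external_feature and
# motif_feature.
ATTRIBUTE_TABLES = {
    "annotated": "annotated_feature",
    "external": "external_feature",
    "motif": "motif_feature",
}


def attribute_lookup_sql(attribute_feature_table):
    """Return a parameterised query that fetches one attribute feature."""
    table = ATTRIBUTE_TABLES[attribute_feature_table]  # KeyError on bad enum
    return f"SELECT * FROM {table} WHERE {table}_id = ?"


print(attribute_lookup_sql("motif"))
```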
annotated_feature |
Represents a genomic feature as the result of an analysis e.g. a ChIP or DNase1 peak call.
Column | Type | Default value | Description | Index |
annotated_feature_id | int(10) | | Internal ID | primary key |
seq_region_id | int(10) | | seq_region table ID | unique key: seq_region_feature_set_idx |
seq_region_start | int(10) | | Start position of this feature | unique key: seq_region_feature_set_idx |
seq_region_end | int(10) | | End position of this feature | |
seq_region_strand | tinyint(1) | | Strand orientation of this feature | |
display_label | varchar(60) | NULL | Text display label | |
score | double | NULL | Score derived from software | |
feature_set_id | int(10) | | feature_set table ID | unique key: seq_region_feature_set_idx key: feature_set_idx |
summit | int(10) | NULL | Represents peak summit for those analyses which provide it (e.g. Swembl) | |
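A typical use of the seq_region_id, seq_region_start and seq_region_end columns is an overlap query. The sketch below uses an in-memory SQLite database as a stand-in for the real server, with made-up rows and only a subset of the columns. It assumes start and end are inclusive coordinates; the function name is hypothetical.

```python
import sqlite3

# In-memory stand-in; rows are illustration data, not real peaks.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE annotated_feature (
    annotated_feature_id INTEGER PRIMARY KEY,
    seq_region_id INTEGER NOT NULL,
    seq_region_start INTEGER NOT NULL,
    seq_region_end INTEGER NOT NULL,
    feature_set_id INTEGER NOT NULL,
    score REAL,
    summit INTEGER
);
INSERT INTO annotated_feature VALUES
    (1, 1, 100, 200, 1, 12.5, 150),
    (2, 1, 500, 650, 1, 8.0, NULL),
    (3, 2, 120, 180, 1, 5.5, 140);
""")


def peaks_overlapping(conn, seq_region_id, start, end):
    """Return IDs of peaks that overlap the inclusive interval [start, end]."""
    rows = conn.execute(
        """
        SELECT annotated_feature_id FROM annotated_feature
        WHERE seq_region_id = ?
          AND seq_region_start <= ?
          AND seq_region_end >= ?
        ORDER BY seq_region_start
        """,
        (seq_region_id, end, start),
    ).fetchall()
    return [feature_id for (feature_id,) in rows]


print(peaks_overlapping(conn, 1, 180, 520))
```

Two intervals overlap when each one starts before the other ends, which is why the query compares the feature start with the region end and the feature end with the region start.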
motif_feature |
The table contains genomic alignments of binding_matrix PWMs.
Column | Type | Default value | Description | Index |
motif_feature_id | int(10) | | Primary key, internal ID | primary key |
binding_matrix_id | INT(10) | | Foreign key to binding_matrix table | key: binding_matrix_idx |
seq_region_id | int(10) | | Foreign key to seq_region table | key: seq_region_idx |
seq_region_start | int(10) | | Start position of this feature | key: seq_region_idx |
seq_region_end | int(10) | | End position of this feature | |
seq_region_strand | tinyint(1) | | Strand orientation of this feature | |
display_label | varchar(60) | NULL | Text display label | |
score | double | NULL | Score derived from alignment software (e.g. MOODS) | |
interdb_stable_id | mediumint(8) | NULL | Unique key, provides linkability between DBs | unique key: interdb_stable_id_idx |
associated_motif_feature |
The table provides links between motif_features and annotated_features representing peaks of the relevant transcription factor.
Column | Type | Default value | Description | Index |
annotated_feature_id | int(10) | | annotated_feature table ID | primary key |
motif_feature_id | int(10) | | motif_feature table ID | primary key key: motif_feature_idx |
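The link table can be followed with a plain join. A minimal sketch, again using an in-memory SQLite stand-in with made-up rows and a reduced set of columns; the function name is hypothetical.

```python
import sqlite3

# In-memory stand-in for the funcgen database; only the columns needed for
# the join are created, and the rows are made-up illustration data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE annotated_feature (annotated_feature_id INTEGER PRIMARY KEY, display_label TEXT);
CREATE TABLE motif_feature (motif_feature_id INTEGER PRIMARY KEY, display_label TEXT, score REAL);
CREATE TABLE associated_motif_feature (
    annotated_feature_id INTEGER NOT NULL,
    motif_feature_id INTEGER NOT NULL,
    PRIMARY KEY (annotated_feature_id, motif_feature_id)
);
INSERT INTO annotated_feature VALUES (1, 'peak A'), (2, 'peak B');
INSERT INTO motif_feature VALUES (10, 'motif X', 0.9), (11, 'motif Y', 0.7);
INSERT INTO associated_motif_feature VALUES (1, 10), (1, 11), (2, 11);
""")


def motifs_for_peak(conn, annotated_feature_id):
    """Return (motif_feature_id, display_label) pairs linked to one peak."""
    return conn.execute(
        """
        SELECT mf.motif_feature_id, mf.display_label
        FROM associated_motif_feature amf
        JOIN motif_feature mf ON mf.motif_feature_id = amf.motif_feature_id
        WHERE amf.annotated_feature_id = ?
        ORDER BY mf.motif_feature_id
        """,
        (annotated_feature_id,),
    ).fetchall()


print(motifs_for_peak(conn, 1))
```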
binding_matrix |
Contains information defining a specific binding matrix (PWM) as defined by the linked analysis e.g. Jaspar.
Column | Type | Default value | Description | Index |
binding_matrix_id | INT(10) | | Internal table ID | primary key |
name | VARCHAR(45) | | Name of PWM | key: name_analysis_idx |
feature_type_id | int(10) | | feature_type table ID. | key: feature_type_idx |
frequencies | VARCHAR(1000) | | Matrix defining frequencies for each base at each position | |
description | VARCHAR(255) | | Text description | |
analysis_id | int(10) | | analysis table ID | key: name_analysis_idx |
external_feature |
The table contains imports from externally curated resources e.g. cisRED, miRanda, VISTA, redFLY etc.
Column | Type | Default value | Description | Index |
external_feature_id | int(10) | | Internal ID | primary key |
seq_region_id | int(10) | | seq_region table ID | key: seq_region_idx |
seq_region_start | int(10) | | Start position of this feature | key: seq_region_idx |
seq_region_end | int(10) | | End position of this feature | |
seq_region_strand | tinyint(1) | | Strand orientation of this feature | |
display_label | varchar(60) | NULL | Text display label | |
feature_type_id | int(10) | NULL | feature_type table ID | key: feature_type_idx |
feature_set_id | int(10) | | feature_set table ID | key: feature_set_idx |
result_feature |
Represents the mapping of a raw/normalised signal. This is optimised for the web display in two ways:
1. Data compression by collection into different sized windows or bins.
2. For array data it also provides an optimised view of a probe_feature and associated result.
Column | Type | Default value | Description | Index |
result_feature_id | int(10) | | Internal ID | key: result_feature_idx |
result_set_id | int(10) | | result_set table ID | key: set_window_seq_region_idx |
seq_region_id | int(10) | | Foreign key to seq_region table | key: set_window_seq_region_idx |
seq_region_start | int(10) | | Start position of this feature | key: set_window_seq_region_idx |
seq_region_end | int(10) | | End position of this feature | |
seq_region_strand | tinyint(4) | | Strand orientation of this feature | |
window_size | smallint(5) | | Size of window in base pairs | key: set_window_seq_region_idx |
scores | longblob | | BLOB of window scores for this region | |
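The idea of collecting a signal into windows of window_size base pairs can be illustrated as follows. This is only a sketch: averaging within each window is an assumption, and the way the scores BLOB is actually packed is not described in this document, so the example works on a plain list of per-base values.

```python
def bin_scores(scores, window_size):
    """Collapse per-base scores into fixed-size windows by averaging.

    scores is a list of per-base values starting at the region start;
    the last window may be shorter than window_size.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    binned = []
    for start in range(0, len(scores), window_size):
        window = scores[start:start + window_size]
        binned.append(sum(window) / len(window))
    return binned


print(bin_scores([1, 3, 5, 7, 9], 2))
```

Larger windows give fewer values per region, which is the compression referred to in the table description.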
probe_feature |
The table contains genomic alignments of probe entries.
Column | Type | Default value | Description | Index |
probe_feature_id | int(10) | | Internal ID | primary key key: seq_region_probe_probe_feature_idx |
seq_region_id | int(10) | | Foreign key to seq_region table | key: seq_region_probe_probe_feature_idx |
seq_region_start | int(10) | | Start position of this feature | key: seq_region_probe_probe_feature_idx |
seq_region_end | int(10) | | End position of this feature | key: seq_region_probe_probe_feature_idx |
seq_region_strand | tinyint(4) | | Strand orientation of this feature | |
probe_id | int(10) | | probe table ID | key: probe_idx key: seq_region_probe_probe_feature_idx |
analysis_id | smallint(5) | | analysis table ID | |
mismatches | tinyint(4) | | Integer, the number of bp mismatches for this alignment | |
cigar_line | varchar(50) | NULL | Extended cigar line format representation of the alignment as defined here: http://samtools.sourceforge.net/SAM-1.3.pdf. In summary: '=' Seq/Alignment match; 'M' Alignment match/Seq mismatch; 'X' Seq/Alignment mismatch; 'D' Deletion; 'S' Soft clipping, used for overhanging cDNA alignments where genomic seq is unknown | |
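A cigar_line value can be split into its runs with a small parser. The sketch below handles only the five operators listed in the cigar_line description and assumes each run is written as a count followed by its operator, as in the SAM format; the function name and the example string are illustrative.

```python
import re

# Operators listed in the cigar_line description above.
CIGAR_OPS = {
    "=": "seq/alignment match",
    "M": "alignment match/seq mismatch",
    "X": "seq/alignment mismatch",
    "D": "deletion",
    "S": "soft clipping",
}

_CIGAR_TOKEN = re.compile(r"(\d+)([=MXDS])")


def parse_cigar(cigar_line):
    """Split a cigar line such as '20=1X4=' into (length, operator) pairs."""
    tokens = _CIGAR_TOKEN.findall(cigar_line)
    # Reject strings containing anything other than <count><operator> runs.
    if "".join(n + op for n, op in tokens) != cigar_line:
        raise ValueError(f"unsupported cigar line: {cigar_line!r}")
    return [(int(n), op) for n, op in tokens]


print(parse_cigar("20=1X4="))
```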
Set tables
Sets are containers for distinct sets of raw and/or processed data.
data_set |
Defines the highest-level data container for associating the result of an analysis with the input data to that analysis, e.g. sequence alignments (Input/ResultSet) and peak calls (FeatureSet).
Column | Type | Default value | Description | Index |
data_set_id | int(10) | | Internal ID | primary key |
feature_set_id | int(10) | '0' | Product feature_set table ID | primary key |
name | varchar(100) | NULL | Name of data set | unique key: name_idx |
supporting_set |
Defines association between data_set and underlying/supporting data.
Column | Type | Default value | Description | Index |
data_set_id | int(10) | | Internal ID | primary key |
supporting_set_id | int(10) | | Table ID of supporting set | primary key |
type | enum('result','feature','input') | NULL | Type of supporting set e.g. result, feature or input set. | key: type_idx |
feature_set |
Container for genomic features defined by the result of an analysis e.g. peak calls or regulatory features.
Column | Type | Default value | Description | Index |
feature_set_id | int(10) | | Internal ID | primary key |
feature_type_id | int(10) | | Table ID for feature_type | key: feature_type_idx |
analysis_id | smallint(5) | | Table ID for analysis | |
cell_type_id | int(10) | NULL | Table ID for cell_type | |
name | varchar(100) | NULL | Name for this feature set | unique key: name_idx |
type | enum('annotated', 'regulatory', 'external') | NULL | Type of features contained e.g. annotated, external or regulatory | |
description | varchar(80) | NULL | Text description | |
display_label | varchar(80) | NULL | Shorter more readable version of name | |
result_set |
Container for raw/signal data, used as input to an analysis or for visualisation of the raw signal i.e. a wiggle track.
Column | Type | Default value | Description | Index |
result_set_id | int(10) | | Internal ID | primary key |
analysis_id | smallint(5) | | Table ID for analysis | unique key: unique_idx |
name | varchar(100) | NULL | Name for this result set | unique key: unique_idx |
feature_type_id | int(10) | NULL | Table ID for feature_type | unique key: unique_idx |
cell_type_id | int(10) | NULL | Table ID for cell_type | unique key: unique_idx |
result_set_input |
Link table between result_set and its constituents, which can vary between an array experiment (experimental_chip/channel) and a sequencing experiment (input_set). Note the joint primary key, as inputs can be re-used between result sets.
dbfile_registry |
This generic table contains a simple registry of paths to support flat file (DBFile) access. This should be left joined from the relevant adaptor e.g. ResultSetAdaptor.
Column | Type | Default value | Description | Index |
table_id | int(10) | | Primary key of linked dbfile entity e.g. result_set or analysis | primary key |
table_name | varchar(32) | | Name of linked table | primary key |
path | varchar(255) | | Either a full filepath or a directory which the API will use to build the filepath | |
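The left join mentioned in the table description can be sketched as below, using an in-memory SQLite stand-in. The rows and the path value are made up, and only the columns needed for the join are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE result_set (result_set_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dbfile_registry (
    table_id INTEGER NOT NULL,
    table_name TEXT NOT NULL,
    path TEXT NOT NULL,
    PRIMARY KEY (table_id, table_name)
);
INSERT INTO result_set VALUES (1, 'set_with_file'), (2, 'set_without_file');
INSERT INTO dbfile_registry VALUES (1, 'result_set', '/hypothetical/path/result_feature/');
""")

# LEFT JOIN keeps result sets that have no registered file; path is then NULL.
rows = conn.execute("""
    SELECT rs.result_set_id, rs.name, dr.path
    FROM result_set rs
    LEFT JOIN dbfile_registry dr
           ON dr.table_id = rs.result_set_id
          AND dr.table_name = 'result_set'
    ORDER BY rs.result_set_id
""").fetchall()

for row in rows:
    print(row)
```

The table_name condition belongs in the ON clause rather than the WHERE clause; otherwise rows without a registry entry would be filtered out and the join would behave like an inner join.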
input_set |
Defines a distinct set of input data which is not imported into the DB, but used for some analysis e.g. a BAM file.
input_subset |
Defines a file from an input_set, required for import tracking and recovery.
Array design tables
array |
Contains information defining an array or array set.
Column | Type | Default value | Description | Index |
array_id | int(10) | | Internal ID | primary key |
name | varchar(40) | NULL | Name of array | unique key: vendor_name_idx unique key: class_name_idx |
format | varchar(20) | NULL | Format of array e.g. EXPRESSION, TILED | |
vendor | varchar(40) | NULL | Name of array vendor e.g. AFFY | unique key: vendor_name_idx |
description | varchar(255) | NULL | Text description | |
type | varchar(20) | NULL | Array type e.g. OLIGO, PCR | |
class | varchar(20) | NULL | Array class e.g. AFFY_ST, ILLUMINA_INFINIUM | unique key: class_name_idx |
array_chip |
Represents the individual array chip design as part of an array or array set.
Column | Type | Default value | Description | Index |
array_chip_id | int(10) | | Internal ID | primary key |
design_id | varchar(20) | NULL | ID/Accession defined by vendor | unique key: array_design_idx |
array_id | int(10) | | Table ID from array | unique key: array_design_idx |
name | varchar(40) | NULL | Name of array_chip | |
probe_set |
The table contains information about probe sets.
Column | Type | Default value | Description | Index |
probe_set_id | int(10) | | Internal ID | primary key |
name | varchar(100) | | Name of the probe set | key: name |
size | smallint(6) | | Integer size of the probe set i.e. how many probes it contains | |
family | varchar(20) | NULL | Generic descriptor for probe_set e.g. ENCODE_REGIONS, RANDOM etc. Currently not used | |
probe |
Defines individual probe designs across one or more array_chips. Note: The probe sequence is not stored.
Column | Type | Default value | Description | Index |
probe_id | int(10) | | Internal ID | primary key |
probe_set_id | int(10) | NULL | probe_set table_id | key: probe_set_idx |
name | varchar(100) | | Name of the probe | primary key key: name_idx |
length | smallint(6) | | Integer bp length of the probe | |
array_chip_id | int(10) | | array_chip table_id | primary key key: array_chip_idx |
class | varchar(20) | NULL | Class of the probe e.g. CONTROL, EXPERIMENTAL etc. | |
description | varchar(255) | NULL | Text description | |
probe_design |
Stores data from array design analyses.
Column | Type | Default value | Description | Index |
probe_id | int(10) | | Internal ID | primary key |
analysis_id | smallint(5) | | analysis table ID | primary key |
coord_system_id | int(10) | | coord_system table ID | primary key |
score | double | NULL | Double analysis score value | |
Experiment tables
These define the experimental metadata and raw data.
experiment |
Stores high-level metadata about individual experiments.
Column | Type | Default value | Description | Index |
experiment_id | int(10) | | Internal ID | primary key |
name | varchar(100) | NULL | Name of experiment | unique key: name_idx |
experimental_group_id | smallint(6) | NULL | experimental_group table ID | key: experimental_group_idx |
date | date | '0000-00-00' | Date of experiment | |
primary_design_type | varchar(30) | NULL | e.g. binding_site_identification, preferably EFO term | key: design_idx |
description | varchar(255) | NULL | Text description | |
mage_xml_id | int(10) | NULL | mage_xml table_id for array experiments | |
experimental_group |
Contains experimental group info i.e. who produced data sets.
Column | Type | Default value | Description | Index |
experimental_group_id | smallint(6) | | Internal ID | primary key |
name | varchar(40) | | Name of group | unique key: name_idx |
location | varchar(120) | NULL | Geographic location of group | |
contact | varchar(40) | NULL | Contact details e.g. email | |
description | varchar(255) | NULL | Text description | |
mage_xml |
Contains MAGE-XML for array based experiments.
Column | Type | Default value | Description | Index |
mage_xml_id | int(10) | | Internal table ID | primary key |
xml | text | | XML text field | |
experimental_chip |
Represents the physical instance of an array_chip used in an experiment.
Column | Type | Default value | Description | Index |
experimental_chip_id | int(10) | | Internal ID | primary key |
unique_id | varchar(20) | | Unique ID assigned by vendor | key: unique_id_idx |
experiment_id | int(10) | NULL | experiment table ID | key: experiment_idx |
array_chip_id | int(10) | NULL | array_chip table ID | |
feature_type_id | int(10) | NULL | feature_type table ID | key: feature_type_idx |
cell_type_id | int(10) | NULL | cell_type table ID | |
biological_replicate | varchar(100) | NULL | Name of biological replicate | |
technical_replicate | varchar(100) | NULL | Name of technical replicate | |
channel |
Represents an individual channel from an experimental_chip.
Column | Type | Default value | Description | Index |
channel_id | int(10) | | Internal ID | primary key |
experimental_chip_id | int(10) | NULL | experimental_chip table ID | key: experimental_chip_idx |
sample_id | varchar(20) | NULL | Sample ID | |
dye | varchar(20) | NULL | Name of dye used for this channel e.g. Cy3, Cy5 | |
type | varchar(20) | NULL | Type of channel i.e. EXPERIMENTAL or TOTAL (input) | |
result |
Contains a score or intensity value for an associated probe location on a particular experimental_chip.
Column | Type | Default value | Description | Index |
result_id | int(10) | | Internal ID | primary key |
probe_id | int(10) | NULL | probe table ID | key: probe_idx |
score | double | NULL | Intensity value (raw or normalised) | |
result_set_input_id | int(10) | | result_set_input table ID | key: result_set_input_idx |
X | smallint(4) | NULL | X coordinate of probe location on experimental_chip | |
Y | smallint(4) | NULL | Y coordinate of probe location on experimental_chip | |
Ancillary tables
These contain data types which are used across many of the above tables. They are quite often denormalised to store generic associations to several tables, which avoids the need for multiple sets of similar tables.
feature_type |
Contains information about different types/classes of feature e.g. Brno nomenclature, Transcription Factor names etc.
Column | Type | Default value | Description | Index |
feature_type_id | int(10) | | Primary key, internal ID | primary key |
name | varchar(40) | | Name of feature_type | unique key: name_class_idx |
class | enum('Insulator', 'DNA', 'Regulatory Feature', 'Histone', 'RNA', 'Polymerase', 'Transcription Factor', 'Transcription Factor Complex', 'Regulatory Motif', 'Enhancer', 'Expression', 'Pseudo', 'Open Chromatin', 'Search Region', 'Association Locus') | NULL | Class of feature_type | unique key: name_class_idx |
description | varchar(255) | NULL | Text description | |
so_accession | varchar(64) | NULL | Sequence ontology accession | key: so_accession_idx |
so_name | varchar(255) | NULL | Sequence ontology name | |
associated_feature_type |
Link table providing many-to-many mapping for feature_type entries.
Column | Type | Default value | Description | Index |
table_id | int(10) | | Internal table_id of linked table | primary key |
table_name | enum('annotated_feature', 'external_feature', 'regulatory_feature', 'feature_type') | NULL | Name of linked table | primary key |
feature_type_id | int(10) | | Internal table_id of linked feature_type | primary key key: feature_type_index |
cell_type |
Contains information about cell/tissue types.
Column | Type | Default value | Description | Index |
cell_type_id | int(10) | | Internal ID | primary key |
name | varchar(120) | | Name of cell/tissue | unique key: name_idx |
display_label | varchar(20) | NULL | Short display label | |
description | varchar(80) | NULL | Text description | |
gender | enum('male', 'female') | NULL | Gender i.e. male or female | |
experimental_design |
Denormalised link table to allow many to many design_type associations.
Column | Type | Default value | Description | Index |
design_type_id | int(10) | | design_type table ID | primary key |
table_name | varchar(40) | NULL | Name of linked table | primary key |
table_id | int(10) | NULL | Table ID of linked record | primary key |
design_type |
Contains extra information about experimental designs, preferably ontology terms.
Column | Type | Default value | Description | Index |
design_type_id | int(10) | | Internal ID | primary key |
name | varchar(255) | NULL | Name of design type | key: design_name_idx |
status |
Denormalised table associating funcgen records with a status.
Column | Type | Default value | Description | Index |
table_id | int(10) | NULL | Table ID of associated record | primary key |
table_name | varchar(32) | NULL | Table name of associated record | primary key |
status_name_id | int(10) | | status_name table ID | primary key |
status_name |
Simple table to predefine name of status.
Column | Type | Default value | Description | Index |
status_name_id | int(10) | | Internal ID | primary key |
name | varchar(20) | NULL | Name of status e.g. IMPORTED, DISPLAYABLE etc. | unique key: status_name_idx |
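Checking whether a record carries a given status means joining the denormalised status table to status_name. A minimal sketch with an in-memory SQLite stand-in; the rows and the function name are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE status_name (status_name_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE status (
    table_id INTEGER,
    table_name TEXT,
    status_name_id INTEGER NOT NULL,
    PRIMARY KEY (table_id, table_name, status_name_id)
);
INSERT INTO status_name VALUES (1, 'IMPORTED'), (2, 'DISPLAYABLE');
INSERT INTO status VALUES (7, 'feature_set', 1), (7, 'feature_set', 2), (8, 'feature_set', 1);
""")


def has_status(conn, table_name, table_id, status):
    """True if the record (table_name, table_id) carries the named status."""
    row = conn.execute(
        """
        SELECT 1
        FROM status s
        JOIN status_name sn ON sn.status_name_id = s.status_name_id
        WHERE s.table_name = ? AND s.table_id = ? AND sn.name = ?
        """,
        (table_name, table_id, status),
    ).fetchone()
    return row is not None


print(has_status(conn, "feature_set", 7, "DISPLAYABLE"))
print(has_status(conn, "feature_set", 8, "DISPLAYABLE"))
```

Because table_name is part of the key, the same table_id can appear for records from different tables without ambiguity.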
Core tables
These are exact clones of the corresponding core schema tables. See core schema docs for more details.
analysis |
Usually describes a program and some database that together are used to create a feature on a piece of sequence. Each feature is marked with an analysis_id. The most important column is logic_name, which is used by the webteam to render a feature correctly on contigview (or even retrieve the right feature). Logic_name is also used in the pipeline to identify the analysis which has to run in a given status of the pipeline. The module column tells the pipeline which Perl module does the whole analysis, typically a RunnableDB module.
Column | Type | Default value | Description | Index |
analysis_id | smallint(5) | | Internal ID | primary key |
created | datetime | '0000-00-00 | Date to distinguish newer and older versions of the same analysis. | |
logic_name | varchar(100) | | String to identify the analysis. Used mainly inside pipeline. | unique key: logic_name key: logic_name_idx |
db | varchar(120) | NULL | Database name. | |
db_version | varchar(40) | NULL | Database version. | |
db_file | varchar(120) | NULL | File system location of the database. | |
program | varchar(80) | NULL | The binary used to create a feature. | |
program_version | varchar(40) | NULL | The binary version. | |
program_file | varchar(80) | NULL | File system location of the binary. | |
parameters | text | | A parameter string which is processed by the perl module. | |
module | varchar(80) | NULL | Perl module names (RunnableDBS usually) executing this analysis. | |
module_version | varchar(40) | NULL | Perl module version. | |
gff_source | varchar(40) | NULL | How to make a gff dump from features with this analysis. | |
gff_feature | varchar(40) | NULL | How to make a gff dump from features with this analysis. | |
analysis_description |
Allows the storage of a textual description of the analysis, as well as a "display label", primarily for the EnsEMBL web site.
Column | Type | Default value | Description | Index |
analysis_id | smallint(5) | | Foreign key references to the analysis table. | unique key: analysis_idx |
description | text | | Textual description of the analysis. | |
display_label | varchar(255) | | Display label for the EnsEMBL web site. | |
displayable | BOOLEAN | '1' | Flag indicating if the analysis description is to be displayed on the EnsEMBL web site. | |
web_data | text | | Other data used by the EnsEMBL web site. | |
meta |
Stores data about the data in the current schema. Unlike other tables, data in the meta table is stored as key-value pairs. These data include details about the database, RegulatoryBuild and patches. The species_id field of the meta table is used in multi-species databases and makes it possible to have species-specific meta key-value pairs. The species-specific meta key-value pairs need to be repeated for each species_id. Entries in the meta table that are not specific to any one species, such as the schema.version key and any other schema-related information, must have their species_id field set to NULL. The default species_id, and the only species_id value allowed in single-species databases, is 1.
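The species_id rule can be sketched with a lookup that matches either the requested species or the schema-wide NULL rows. The column names meta_id, meta_key and meta_value are assumed to follow the core schema meta table, since the columns are not listed in this document. The key 'example.key' and the function name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE meta (
    meta_id INTEGER PRIMARY KEY,
    species_id INTEGER DEFAULT 1,
    meta_key TEXT NOT NULL,
    meta_value TEXT NOT NULL
);
INSERT INTO meta (species_id, meta_key, meta_value) VALUES
    (NULL, 'schema.version', '62'),
    (1, 'example.key', 'value for species 1'),
    (2, 'example.key', 'value for species 2');
""")


def meta_values(conn, meta_key, species_id=1):
    """Return values for a key, matching the species or schema-wide (NULL) rows."""
    rows = conn.execute(
        """
        SELECT meta_value FROM meta
        WHERE meta_key = ? AND (species_id = ? OR species_id IS NULL)
        ORDER BY meta_id
        """,
        (meta_key, species_id),
    ).fetchall()
    return [value for (value,) in rows]


print(meta_values(conn, "schema.version", species_id=2))
print(meta_values(conn, "example.key", species_id=2))
```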
meta_coord |
Describes which co-ordinate systems the different feature tables use.
identity_xref |
Describes how well a particular xref object matches the EnsEMBL object.
Column | Type | Default value | Description | Index |
object_xref_id | INT(10) | | Foreign key references to the object_xref table. | primary key |
xref_identity | INT(5) | | Percentage identity. | |
ensembl_identity | INT(5) | | Percentage identity. | |
xref_start | INT | | Xref sequence start. | |
xref_end | INT | | Xref sequence end. | |
ensembl_start | INT | | Ensembl sequence start. | |
ensembl_end | INT | | Ensembl sequence end. | |
cigar_line | TEXT | | Used to encode gapped alignments. | |
score | DOUBLE | | Match score. | |
evalue | DOUBLE | | Match evalue. | |
external_synonym |
Some xref objects can be referred to by more than one name. This table relates names to xref IDs.
Column | Type | Default value | Description | Index |
xref_id | INT(10) | | Foreign key references xref table | primary key |
synonym | VARCHAR(100) | | Synonym | primary key key: name_index |
external_db |
Stores data about the external databases in which the objects described in the xref table are stored.
Column | Type | Default value | Description | Index |
external_db_id | SMALLINT(5) | | Internal identifier. | primary key |
db_name | VARCHAR(100) | | Database name. | unique key: db_name_release_idx |
db_release | VARCHAR(255) | | Database release. | unique key: db_name_release_idx |
status | ENUM('KNOWNXREF','KNOWN','XREF','PRED','ORTH', 'PSEUDO') | | Status, e.g. 'KNOWNXREF','KNOWN','XREF','PRED','ORTH','PSEUDO'. | |
dbprimary_acc_linkable | BOOLEAN | 1 | Indicates if the primary accession can be linked to from the EnsEMBL web site. | |
priority | INT | | Determines which one of the xrefs will be used as the gene name. | |
db_display_name | VARCHAR(255) | | Database display name. | |
type | ENUM('ARRAY', 'ALT_TRANS', 'MISC', 'LIT', 'PRIMARY_DB_SYNONYM', 'ENSEMBL') | NULL | Type, e.g. 'ARRAY', 'ALT_TRANS', 'ALT_GENE', 'MISC', 'LIT', 'PRIMARY_DB_SYNONYM', 'ENSEMBL'. | |
secondary_db_name | VARCHAR(255) | NULL | Secondary database name. | |
secondary_db_table | VARCHAR(255) | NULL | Secondary database table. | |
description | TEXT | | Description. | |
ontology_xref |
This table associates ontology terms/accessions to Ensembl objects (primarily EFO/SO). NOTE: Currently not in use
Column | Type | Default value | Description | Index |
object_xref_id | INT(10) | '0' | Foreign key references to the object_xref table. | key: unique |
source_xref_id | INT(10) | NULL | Foreign key references to the xref table. | key: unique |
linkage_type | ENUM('IC', 'IDA', 'IEA', 'IEP', 'IGI', 'IMP', 'IPI', 'ISS', 'NAS', 'ND', 'TAS', 'NR', 'RCA') | | Defines type of linkage | unique |
unmapped_reason |
Describes the reason why a mapping failed.
Column | Type | Default value | Description | Index |
unmapped_reason_id | smallint(5) | | Internal identifier. | primary key |
summary_description | varchar(255) | NULL | Summarised description. | |
full_description | varchar(255) | NULL | Full description. | |
Core like tables
These are almost exact clones of the corresponding core schema tables. Some contain extra fields or different enum values to support the funcgen schema.
xref |
Holds data about objects which are external to EnsEMBL, but need to be associated with EnsEMBL objects. Information about the database that the external object is stored in is held in the external_db table entry referred to by the external_db_id column.
Column | Type | Default value | Description | Index |
xref_id | INT(10) | | Internal identifier. | primary key |
external_db_id | SMALLINT | | Foreign key references to the external_db table. | unique key: id_index |
dbprimary_acc | VARCHAR(40) | | Primary accession number. | unique key: id_index |
display_label | VARCHAR(128) | | Display label for the EnsEMBL web site. | key: display_index |
version | VARCHAR(10) | '0' | Object version. | |
description | VARCHAR(255) | | Object description. | |
info_type | ENUM('PROJECTION', 'MISC', 'DEPENDENT', 'DIRECT', 'SEQUENCE_MATCH', 'INFERRED_PAIR', 'PROBE', 'UNMAPPED', 'CODING', 'TARGET') | | Class of the xref information e.g. CODING | unique key: id_index key: info_type_idx |
info_text | VARCHAR(255) | | Text | unique key: id_index |
object_xref |
Describes links between Ensembl objects and objects held in external databases. The Ensembl object can be one of several types; the type is held in the ensembl_object_type column. The ID of the particular Ensembl object is given in the ensembl_id column. The xref_id points to the entry in the xref table that holds data about the external object. Each Ensembl object can be associated with zero or more xrefs. An xref object can be associated with one or more Ensembl objects.
Column | Type | Default value | Description | Index |
object_xref_id | INT(10) | | Internal identifier. | primary key |
ensembl_id | INT(10) | | Foreign key references to the ensembl_object_type table e.g. probe_set | unique key: xref_idx key: ensembl_idx |
ensembl_object_type | ENUM('RegulatoryFeature', 'ExternalFeature', 'AnnotatedFeature', 'FeatureType', 'ProbeSet', 'Probe', 'ProbeFeature') | | Ensembl object type e.g. ProbeSet | unique key: xref_idx key: ensembl_idx |
xref_id | INT | | Foreign key references to the xref table. | unique key: xref_idx |
linkage_annotation | VARCHAR(255) | NULL | Additional annotation on the linkage. | |
analysis_id | SMALLINT(5) | | Foreign key references to the analysis table. | unique key: xref_idx key: analysis_idx |
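Resolving the external references of one Ensembl object means filtering object_xref on both ensembl_object_type and ensembl_id, then joining to xref and external_db. The sketch below uses an in-memory SQLite stand-in with a reduced set of columns; the database name, accessions and function name are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE external_db (external_db_id INTEGER PRIMARY KEY, db_name TEXT);
CREATE TABLE xref (
    xref_id INTEGER PRIMARY KEY,
    external_db_id INTEGER,
    dbprimary_acc TEXT,
    display_label TEXT
);
CREATE TABLE object_xref (
    object_xref_id INTEGER PRIMARY KEY,
    ensembl_id INTEGER,
    ensembl_object_type TEXT,
    xref_id INTEGER
);
INSERT INTO external_db VALUES (1, 'hypothetical_db');
INSERT INTO xref VALUES (100, 1, 'ACC0001', 'label one'), (101, 1, 'ACC0002', 'label two');
INSERT INTO object_xref VALUES
    (1, 5, 'ProbeSet', 100),
    (2, 5, 'ProbeSet', 101),
    (3, 5, 'Probe', 100);
""")


def xrefs_for(conn, ensembl_object_type, ensembl_id):
    """Return (db_name, dbprimary_acc) pairs linked to one Ensembl object."""
    return conn.execute(
        """
        SELECT edb.db_name, x.dbprimary_acc
        FROM object_xref ox
        JOIN xref x ON x.xref_id = ox.xref_id
        JOIN external_db edb ON edb.external_db_id = x.external_db_id
        WHERE ox.ensembl_object_type = ? AND ox.ensembl_id = ?
        ORDER BY x.xref_id
        """,
        (ensembl_object_type, ensembl_id),
    ).fetchall()


print(xrefs_for(conn, "ProbeSet", 5))
```

The object type is needed in the filter because ensembl_id values from different tables can collide, as the Probe row with the same ID shows.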
unmapped_object |
Describes why a particular external entity was not mapped to an Ensembl one.
Column | Type | Default value | Description | Index |
unmapped_object_id | int(10) | | Internal identifier. | primary key |
type | enum('xref', 'probe2transcript', 'array_mapping') | | UnmappedObject type e.g. probe2transcript | |
analysis_id | smallint(5) | | Foreign key references to the analysis table. | key: anal_idx key: anal_exdb_idx |
external_db_id | smallint(5) | NULL | Foreign key references to the external_db table. | key: anal_exdb_idx |
identifier | varchar(255) | | External database identifier. | key: id_idx |
unmapped_reason_id | smallint(5) | | Foreign key references to the unmapped_reason table. | |
query_score | double | NULL | Actual mapping query score. | |
target_score | double | NULL | Target mapping query score. | |
ensembl_id | int(10) | '0' | Foreign key references the table_id of the Ensembl object table e.g. probe_set | key: object_type_idx |
ensembl_object_type | enum('RegulatoryFeature','ExternalFeature','AnnotatedFeature','FeatureType', 'Probe', 'ProbeSet', 'ProbeFeature') | | Ensembl object type e.g. ProbeSet | key: object_type_idx |
parent | varchar(255) | NULL | Foreign key references to the dependent_xref table, in case the unmapped object is dependent on a primary external reference which wasn't mapped to an ensembl one. Not currently used for efg. | |
coord_system |
Stores information about the available co-ordinate systems for the species identified through the species_id field. For each species, there must be one co-ordinate system that has the attribute "top_level" and one that has the attribute "sequence_level". NOTE: This has been extended from the core implementation to support multiple assemblies by referencing multiple core DBs.
Column | Type | Default value | Description | Index |
coord_system_id | int(10) | | Internal identifier | key: coord_system_id_idx |
name | varchar(40) | | Co-ordinate system name, e.g. 'chromosome', 'contig', 'scaffold' etc. | primary key key: name_version_idx |
version | varchar(255) | '' | Assembly | primary key key: name_version_idx |
rank | int(11) | | Co-ordinate system rank | |
attrib | set('default_version','sequence_level') | _version' | Co-ordinate system attrib (e.g. "top_level", "sequence_level") | |
schema_build | varchar(10) | '' | Identifies the schema_build version for the source core DB | primary key |
core_coord_system_id | int(10) | | Table ID of the coord_system in the source core DB | |
species_id | int(10) | '1' | Identifies the species for multi-species databases | primary key key: coord_species_idx |
is_current | boolean | True | This flags which coord_system entries are current with respect to mart(/website) | |
seq_region |
Stores information about sequence regions from various core DBs.
Column | Type | Default value | Description | Index |
seq_region_id | int(10) | | Internal identifier. | key: seq_region_id_idx |
name | varchar(40) | | Sequence region name. | primary key |
coord_system_id | int(10) | | Foreign key references to the coord_system table. | primary key key: coord_system_id |
core_seq_region_id | int(10) | | Table ID of the seq_region in the source core DB | |
schema_build | varchar(10) | '' | Identifies the schema_build version for the source core DB | primary key |