A GFF (General Feature Format) file consists of one line per feature, each containing nine columns of data, plus optional track definition lines. The following documentation is based on the version 2 specification.
The GTF (General Transfer Format) is identical to GFF version 2.
Fields must be tab-separated. In addition, every field except the final one in each feature line must contain a value; "empty" columns should be denoted with a '.'.
Sample GFF output from Ensembl export:
X Ensembl Repeat 2419108 2419128 42 . . hid=trf; hstart=1; hend=21
X Ensembl Repeat 2419108 2419410 2502 - . hid=AluSx; hstart=1; hend=303
X Ensembl Repeat 2419108 2419128 0 . . hid=dust; hstart=2419108; hend=2419128
X Ensembl Pred.trans. 2416676 2418760 450.19 - 2 genscan=GENSCAN00000019335
X Ensembl Variation 2413425 2413425 . + .
X Ensembl Variation 2413805 2413805 . + .
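Feature lines like those in the sample above could be read with a small parser such as the following Python sketch. It is illustrative only: the function name parse_gff_line is hypothetical, the column names are taken from the GFF version 2 layout rather than from this page, and the input is assumed to be tab-separated as required above (the sample is displayed with spaces). It also assumes that a line whose first word is 'track' is a track definition line and not a feature.

```python
from typing import Dict, Optional

# Column names follow the GFF version 2 layout (assumption: not listed on this page).
GFF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attribute"]


def parse_gff_line(line: str) -> Optional[Dict[str, object]]:
    """Parse one tab-separated GFF2 feature line into a dict.

    Returns None for blank lines, '#' comment lines and track lines.
    Columns holding '.' (or an omitted final column) are returned as None.
    """
    line = line.rstrip("\r\n")
    if not line.strip() or line.startswith("#"):
        return None
    # Assumption: a first word of 'track' marks a track definition line.
    if line.split()[0] == "track":
        return None

    fields = line.split("\t")
    # The final (attribute) column may be left out, so accept 8 or 9 fields.
    if len(fields) not in (8, 9):
        raise ValueError(
            f"expected 8 or 9 tab-separated fields, got {len(fields)}")
    if len(fields) == 8:
        fields.append("")

    record: Dict[str, object] = {}
    for name, value in zip(GFF_COLUMNS, fields):
        record[name] = None if value in (".", "") else value

    # Coordinates are integers; the score, when present, is numeric.
    record["start"] = int(fields[3])
    record["end"] = int(fields[4])
    if record["score"] is not None:
        record["score"] = float(fields[5])
    return record
```

For example, passing the AluSx repeat line from the sample (with tabs between columns) yields a record whose start is 2419108, whose strand is '-' and whose frame is None.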
Although not part of the formal GFF specification, Ensembl will use track lines to further configure sets of features.
The track line consists of the word 'track' followed by space-separated key=value pairs - see the example below. Valid parameters used by Ensembl are:
For more information about this file format, see the documentation on the Sanger Institute website.
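As an illustration of the track line syntax described above, the following Python sketch splits a track line into its key=value pairs. The function name parse_track_line is hypothetical, and the parameter names in the usage note (name, priority) are placeholders, not a statement of which parameters Ensembl accepts. The sketch also assumes that values containing spaces are wrapped in double quotes, which this page does not state.

```python
import shlex
from typing import Dict


def parse_track_line(line: str) -> Dict[str, str]:
    """Split a 'track key=value ...' line into a dict of parameters.

    Assumption: values containing spaces are enclosed in quotes, so
    shlex.split is used to keep each quoted value in one token.
    """
    tokens = shlex.split(line)
    if not tokens or tokens[0] != "track":
        raise ValueError("not a track line")

    params: Dict[str, str] = {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed key=value pair: {token!r}")
        params[key] = value
    return params
```

Calling parse_track_line('track name="My repeats" priority=1') returns a dict mapping name to 'My repeats' and priority to '1'; all values are left as strings.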