MySQL :: MySQL 5.4 Reference Manual :: 7.4.2 Column Indexes

MySQL 5.4 Reference Manual :: 7 Optimization :: 7.4 Optimizing Database Structure :: 7.4.2 Column Indexes

« 7.4.1 Make Your Data as Small as Possible

Section Navigation [Toggle]

7.4.2. Column Indexes

All MySQL data types can be indexed. Use of indexes on the relevant columns is the best way to improve the performance of SELECT operations.

The maximum number of indexes per table and the maximum index length is defined per storage engine. See Chapter 13, Storage Engines. All storage engines support at least 16 indexes per table and a total index length of at least 256 bytes. Most storage engines have higher limits.

With col_name(N) syntax in an index specification, you can create an index that uses only the first N characters of a string column. Indexing only a prefix of column values in this way can make the index file much smaller. When you index a BLOB or TEXT column, you must specify a prefix length for the index. For example:

CREATE TABLE test (blob_col BLOB, INDEX(blob_col(10)));

Prefixes can be up to 1000 bytes long (767 bytes for InnoDB tables). Note that prefix limits are measured in bytes, whereas the prefix length in CREATE TABLE statements is interpreted as number of characters. Be sure to take this into account when specifying a prefix length for a column that uses a multi-byte character set.

You can also create FULLTEXT indexes. These are used for full-text searches. Only the MyISAM storage engine supports FULLTEXT indexes and only for CHAR, VARCHAR, and TEXT columns. Indexing always takes place over the entire column and column prefix indexing is not supported. For details, see Section 11.8, “Full-Text Search Functions”.

You can also create indexes on spatial data types. Currently, only MyISAM supports R-tree indexes on spatial types. Other storage engines use B-trees for indexing spatial types (except for ARCHIVE, which does not support spatial type indexing).

The MEMORY storage engine uses HASH indexes by default, but also supports BTREE indexes.

Previous / Next / Up / Table of Contents

User Comments

Posted by Neil Davis on January 18 2008 7:30pm

[Delete] [Edit]

We've seen a huge increase in speed by limiting the index to the first few characters in the situation where the tablescan would be more efficient than the index, such as indexing a field like a software key with unique string data:

aaaa-bbbb-cccc-dddd

The best solution is to make the index more efficient by reducing cardinality. Index the first 3-4 characters instead of the whole field. Not only does the index do it's job better, it gets reduced in size a great deal so the storage requirements go down as mentioned.

This gave us a gigantic boost to overall application speed when applied over similarly unique fields in the rest of the tables. In example, a particular customer with products and contacts joined together had over 300 licenses. Each product is joined in a one to many join to 1+n contacts (which is used to control and store access levels to the licensing), and since he has 50 employees, the join is huge(50x300). As well a last downloaded join is calculated by joining on the software_key to the download database with an order by date, limit 1.

His product list screen took over 14 seconds to load. After optimizing the index implementation this worst case has a 2 second load time.

I didn't design this, and the designer didn't see this issue because there weren't 26m records in the db. As soon as I looked at the indexes, I began to understand what was happening with a little help from this page. I thought I'd share my experience with this situation to illustrate how important this is.

ex:
select software_key from product where software_key = 'aaaa-bbbb-cccc-dddd'

|software_key|
|aaaa-bbbb-cccc-dddd|

ALTER TABLE `product` DROP INDEX `idx_software_key` ,
ADD INDEX `idx_software_key` ( `software_key` ( 3 ) );

Will create an index that is appropriate, as long as the first 3 characters change from code to code. If it's static, this won't work. In that case you can insert the codes using REVERSE(), using them reversed (internally to the database) for joins, indexes etc, and read it back out with REVERSE() in your queries so they display properly.

This has the potential to significantly reduce query times in a lot of cases, and actually does in practice. The results were jaw dropping with complex queries that have more than one join.

This is probably the most important MySQL subject to grok (from a developer's standpoint) if you want your database application to run optimally. Using indexes is a good thing to do and very important, using them efficiently will ensure that your application runs smoothly and the mem/disk io is kept to acceptable levels.

Sometimes less really is more.

-Neil

Add your own comment.

Top / Previous / Next / Up / Table of Contents