A string is a sequence of bytes or characters, enclosed within
either single quote (“'
”) or
double quote (“"
”) characters.
Examples:
'a string' "another string"
Quoted strings placed next to each other are concatenated to a single string. The following lines are equivalent:
'a string' 'a' ' ' 'string'
If the ANSI_QUOTES
SQL mode is
enabled, string literals can be quoted only within single quotes
because a string quoted within double quotes is interpreted as
an identifier.
As of MySQL 4.1.1, a binary string is a string of bytes that has no character set or collation. A nonbinary string is a string of characters that has a character set and collation. For both types of strings, comparisons are based on the numeric values of the string unit. For binary strings, the unit is the byte. For nonbinary strings the unit is the character and some character sets allow multi-byte characters. Character value ordering is a function of the string collation.
Also as of MySQL 4.1.1, string literals may have an optional
character set introducer and COLLATE
clause:
[_charset_name
]'string
' [COLLATEcollation_name
]
Examples:
SELECT _latin1'string
'; SELECT _latin1'string
' COLLATE latin1_danish_ci;
You can use
N'
(or
literal
'n'
) to
create a string in the national character set. These statements
are equivalent:
literal
'
SELECT N'some text'; SELECT n'some text'; SELECT _utf8'some text';
For more information about these forms of string syntax, see Section 9.1.3.5, “Character String Literal Character Set and Collation”, and Section 9.1.3.6, “National Character Set”.
Within a string, certain sequences have special meaning. Each of
these sequences begins with a backslash
(“\
”), known as the
escape character. MySQL recognizes the
following escape sequences.
For all other escape sequences, backslash is ignored. That is,
the escaped character is interpreted as if it was not escaped.
For example, “\x
” is just
“x
”.
These sequences are case sensitive. For example,
“\b
” is interpreted as a
backspace, but “\B
” is
interpreted as “B
”.
The ASCII 26 character can be encoded as
“\Z
” to enable you to work
around the problem that ASCII 26 stands for END-OF-FILE on
Windows. ASCII 26 within a file causes problems if you try to
use mysql
.
db_name
<
file_name
Escape processing is done according to the character set
indicated by the
character_set_connection
system
variable. This is true even for strings that are preceded by an
introducer that indicates a different character set, as
discussed in Section 9.1.3.5, “Character String Literal Character Set and Collation”.
The “\%
” and
“\_
” sequences are used to
search for literal instances of
“%
” and
“_
” in pattern-matching contexts
where they would otherwise be interpreted as wildcard
characters. See the description of the
LIKE
operator in
Section 11.4.1, “String Comparison Functions”. If you use
“\%
” or
“\_
” outside of pattern-matching
contexts, they evaluate to the strings
“\%
” and
“\_
”, not to
“%
” and
“_
”.
There are several ways to include quote characters within a string:
A “'
” inside a string quoted
with “'
” may be written as
“''
”.
A “"
” inside a string quoted
with “"
” may be written as
“""
”.
Precede the quote character by an escape character
(“\
”).
A “'
” inside a string quoted
with “"
” needs no special
treatment and need not be doubled or escaped. In the same
way, “"
” inside a string
quoted with “'
” needs no
special treatment.
The following SELECT
statements
demonstrate how quoting and escaping work:
mysql>SELECT 'hello', '"hello"', '""hello""', 'hel''lo', '\'hello';
+-------+---------+-----------+--------+--------+ | hello | "hello" | ""hello"" | hel'lo | 'hello | +-------+---------+-----------+--------+--------+ mysql>SELECT "hello", "'hello'", "''hello''", "hel""lo", "\"hello";
+-------+---------+-----------+--------+--------+ | hello | 'hello' | ''hello'' | hel"lo | "hello | +-------+---------+-----------+--------+--------+ mysql>SELECT 'This\nIs\nFour\nLines';
+--------------------+ | This Is Four Lines | +--------------------+ mysql>SELECT 'disappearing\ backslash';
+------------------------+ | disappearing backslash | +------------------------+
If you want to insert binary data into a string column (such as
a BLOB
column), the following
characters must be represented by escape sequences.
NUL |
NUL byte (0x00 ). Represent this
character by “\0 ” (a
backslash followed by an ASCII
“0 ” character). |
\ |
Backslash (ASCII 92). Represent this character by
“\\ ”. |
' |
Single quote (ASCII 39). Represent this character by
“\' ”. |
" |
Double quote (ASCII 34). Represent this character by
“\" ”. |
When writing application programs, any string that might contain any of these special characters must be properly escaped before the string is used as a data value in an SQL statement that is sent to the MySQL server. You can do this in two ways:
Process the string with a function that escapes the special
characters. In a C program, you can use the
mysql_real_escape_string()
C
API function to escape characters. See
Section 17.6.3.51, “mysql_real_escape_string()
”. The Perl DBI
interface provides a quote
method to
convert special characters to the proper escape sequences.
See Section 17.8, “MySQL Perl API”. Other language interfaces
may provide a similar capability.
As an alternative to explicitly escaping special characters, many MySQL APIs provide a placeholder capability that enables you to insert special markers into a statement string, and then bind data values to them when you issue the statement. In this case, the API takes care of escaping special characters in the values for you.
User Comments
Whenever the server expects a binary string, e.g. when a binary column is assigned in an INSERT statement, it simply ignores the charset of the non-binary input string and uses the value as-is.
In particular, this has the effect that you can't send a binary value UTF-8 encoded with a _utf8 introducer and expect it to be decoded before being assigned to a binary column.
You also can't send a binary value as-is without any introducer if you have character_set_connection set to utf8. That will cause the value to be UTF-8 encoded when it arrives to the server (see http://dev.mysql.com/doc/en/charset-connection.html) and then be assigned to the binary column.
Therefore, to send a binary value over a UTF-8 connection (i.e. a connection that has been set up with "SET NAMES 'utf8'"), do not encode it but specify an introducer. The introducer will keep it from being interpreted as UTF-8 (through the character_set_client setting) and being encoded as UTF-8 (through character_set_connection), so it goes through untouched to the binary assignment (that ignores the charset specified by the introducer).
Since no encoding or decoding takes place, the introducer can be anything from that perspective. The charset specified by the introducer does however affect the default collation, so a good choice is _binary since it gives plain byte-by-byte comparison if the string is used in a comparison expression.
Add your own comment.