MySQL | Binary and Non-binary strings
Binary String
A binary string is a sequence of bytes. It has the binary character set and collation, and comparison and sorting are based on the numeric values of its bytes.
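A minimal sketch of what byte-based comparison means in practice, assuming a MySQL 8.0 session. CAST(... AS BINARY) is used here only to force the literals to be binary strings; the results in the comments follow from the ASCII codes (A = 0x41, B = 0x42, a = 0x61).

```sql
-- 'a' (0x61) and 'A' (0x41) are different bytes, so they are not equal.
SELECT CAST('a' AS BINARY) = CAST('A' AS BINARY);  -- returns 0

-- 'B' (0x42) sorts before 'a' (0x61) because 0x42 < 0x61.
SELECT CAST('B' AS BINARY) < CAST('a' AS BINARY);  -- returns 1
```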
-- Setting the character set to binary
SET NAMES 'binary';

-- Checking the character set and collation
SELECT CHARSET('apple'), COLLATION('apple');

+------------------+--------------------+
| CHARSET('apple') | COLLATION('apple') |
+------------------+--------------------+
| 0x62696E617279   | 0x62696E617279     |
+------------------+--------------------+
At first, 0x62696E617279 may look like gibberish. It is a binary string written in hexadecimal notation (the prefix 0x indicates that the value is written in hex):
Hexadecimal | Decimal value | ASCII character |
---|---|---|
62 | 98 | b |
69 | 105 | i |
6E | 110 | n |
61 | 97 | a |
72 | 114 | r |
79 | 121 | y |
Hence, read as ASCII text, the binary string 0x62696E617279 spells 'binary'.
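The mapping in the table can be checked in both directions with built-in functions. This is a sketch assuming a MySQL 8.0 session; the SET NAMES statement is included because, while the connection character set is binary, some clients print every result as hex.

```sql
-- Switch back to a text character set so results are shown as text.
SET NAMES utf8mb4;

-- Text to hex: each ASCII character becomes two hex digits.
SELECT HEX('binary');  -- 62696E617279

-- Hex to text: interpret the six bytes as utf8mb4 characters.
SELECT CONVERT(0x62696E617279 USING utf8mb4);  -- binary
```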
Here every character is ASCII and fits in 1 byte, so each letter corresponds to one pair of hexadecimal digits (e.g. 'n' corresponds to hex '6E'). This does not hold for multibyte characters: for example, the Japanese character あ is encoded in UTF-8 as the three bytes 'E38182'.
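The difference between bytes and characters can be made visible with LENGTH (which counts bytes) and CHAR_LENGTH (which counts characters). The sketch below assumes the client sends the literal as UTF-8 and that the connection character set is utf8mb4.

```sql
SET NAMES utf8mb4;

-- One character, stored as three bytes in utf8mb4.
SELECT HEX('あ'), LENGTH('あ'), CHAR_LENGTH('あ');
-- HEX: E38182, LENGTH: 3, CHAR_LENGTH: 1

-- As a binary string the same value is just three bytes,
-- so CHAR_LENGTH counts bytes instead of characters.
SELECT CHAR_LENGTH(CAST('あ' AS BINARY));  -- 3
```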
Binary strings are therefore not convenient when the stored information is meant to be read as text. They are usually used to hold non-text data such as pictures and voice recordings.
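As an illustration, a table might keep its human-readable fields in non-binary columns and its raw data in binary columns. The table and column names below are hypothetical; BINARY and MEDIUMBLOB are MySQL's binary counterparts of CHAR and MEDIUMTEXT.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE recordings (
    id       INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    title    VARCHAR(100) NOT NULL,  -- non-binary: text with a character set and collation
    checksum BINARY(16),             -- binary: fixed length of 16 raw bytes
    audio    MEDIUMBLOB              -- binary: raw audio bytes (up to 16 MB)
);
```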
Non-Binary String
A non-binary string is a sequence of characters. It is associated with a character set and collation.
As of MySQL 8.0, the default character set is utf8mb4 and the default collation is utf8mb4_0900_ai_ci:
SELECT CHARSET('apple'), COLLATION('apple');
+------------------+--------------------+
| CHARSET('apple') | COLLATION('apple') |
+------------------+--------------------+
| utf8mb4          | utf8mb4_0900_ai_ci |
+------------------+--------------------+
We can see that non-binary strings are much more suited for text information than binary strings.
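Because a non-binary string carries a collation, comparisons follow that collation's rules rather than raw byte values. A minimal sketch, assuming MySQL 8.0 defaults, where SET NAMES utf8mb4 selects the utf8mb4_0900_ai_ci collation for the connection:

```sql
SET NAMES utf8mb4;

-- utf8mb4_0900_ai_ci is case-insensitive (the "ci" suffix).
SELECT 'a' = 'A';  -- returns 1

-- A binary collation of the same character set compares code points instead.
SELECT 'a' = 'A' COLLATE utf8mb4_bin;  -- returns 0
```

The same pair of letters compares as equal or different depending only on the collation, which is what makes non-binary strings suitable for text searches and sorting.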