Return the number of bytes in the input string's UTF-8 encoding.
OCTET_LENGTH(str)
## Overview Returns the number of bytes (octets) occupied by the UTF-8 encoding of the input string. Every ASCII character contributes 1 byte, Latin accented characters contribute 2 bytes, most CJK characters contribute 3 bytes, and characters outside the Basic Multilingual Plane (including most emoji) contribute 4 bytes. OCTET_LENGTH is the correct tool for estimating on-disk or on-wire size. Because CHAR_LENGTH counts code points rather than bytes, OCTET_LENGTH is always greater than or equal to CHAR_LENGTH. Equality holds only for pure ASCII strings. ## Behavior - Returns NULL when the input is NULL. - Returns 0 for an empty string. - Always equals BIT_LENGTH(str) / 8. - Identical to CHAR_LENGTH for ASCII-only strings; strictly greater for any input containing non-ASCII characters. - Applies to the UTF-8 canonical form of the string as stored by the engine. Alternative normalisation forms (NFC vs NFD) produce different byte counts. - Accepts VARBINARY and BYTEA types, in which case it simply returns the byte length of the binary value. ## Compatibility - Conforms to the SQL standard OCTET_LENGTH definition. - Assumes UTF-8 as the storage encoding. Other encodings are not supported as input strings.
| Name | Type | Description |
|---|---|---|
str | Specifies the input string whose UTF-8 byte length is computed. |
-- ASCII: 1 byte per character
SELECT OCTET_LENGTH('hello'); -- 5
-- Empty string
SELECT OCTET_LENGTH(''); -- 0
-- Single 2-byte UTF-8 character
SELECT OCTET_LENGTH('é'); -- 2
-- Mixed ASCII and multi-byte characters
SELECT OCTET_LENGTH('café'); -- 5
-- 4-byte emoji
SELECT OCTET_LENGTH('😀'); -- 4
-- Estimate storage in bytes for a text column
SELECT SUM(OCTET_LENGTH(description)) AS bytes_total
FROM retail.products.catalog;