Collation Support
Rules for string comparison, sorting, and equality that account for case, accent marks, locale, and Unicode normalization.
collation COLLATE case-insensitive accent-insensitive ICU citext locale character set
collation string comparison locale case-insensitive Unicode sorting
| Database | Status | Syntax Family | Min Version |
|---|---|---|---|
| PostgreSQL | full | collate | - |
| MySQL | full | character-set-collate | - |
| MariaDB | full | character-set-collate | - |
| Oracle | full | nls-collate | - |
| SQL Server | full | collation-name | - |
| SQLite | partial | collate | - |
| Redshift | partial | collate | - |
| DB2 | full | - | - |
| Snowflake | partial | collate | - |
| BigQuery | full | collate | - |
Compare Databases
Database Details
PostgreSQL
full
-- Column with ICU collation (case-insensitive, accent-insensitive)
CREATE TABLE users (
username TEXT COLLATE "und-x-icu"
);
-- Case-insensitive comparison using nondeterministic ICU collation
CREATE COLLATION case_insensitive (
provider = icu,
locale = 'und-u-ks-level2',
deterministic = false
);
CREATE TABLE users2 (username TEXT COLLATE case_insensitive);
SELECT * FROM users2 WHERE username = 'Alice'; -- matches 'alice', 'ALICE'
-- Per-expression collation
SELECT * FROM users ORDER BY username COLLATE "C";
-- citext extension for case-insensitive text type
CREATE EXTENSION citext;
CREATE TABLE users3 (username CITEXT);
SELECT * FROM users3 WHERE username = 'alice'; -- matches 'Alice', 'ALICE'- Standard C collation performs byte-by-byte comparison (fastest, locale-unaware).
- ICU collations (v10+) support locale-aware sorting and nondeterministic comparison.
- Nondeterministic collations allow case/accent-insensitive equality but cannot be used in unique indexes (v12+).
- citext extension provides a case-insensitive text type as a simpler alternative.
- COLLATE clause can be applied per-column, per-expression, or per-index.
MySQL
full
-- Column with case-insensitive, accent-insensitive collation
CREATE TABLE users (
username VARCHAR(100)
CHARACTER SET utf8mb4
COLLATE utf8mb4_0900_ai_ci
);
-- Case-sensitive column
CREATE TABLE codes (
code VARCHAR(20) COLLATE utf8mb4_0900_as_cs
);
-- Per-expression COLLATE
SELECT * FROM users
WHERE username COLLATE utf8mb4_0900_as_cs = 'Alice';
-- Show available collations
SHOW COLLATION WHERE Charset = 'utf8mb4';- Collation names follow the pattern: charset_locale_variant_sensitivity.
- _ci = case-insensitive, _cs = case-sensitive, _ai = accent-insensitive, _as = accent-sensitive, _bin = binary.
- utf8mb4_0900_ai_ci is the modern default for Unicode 9.0 support.
- utf8mb4_unicode_ci is widely used for older Unicode compatibility.
- Character set and collation can be set at server, database, table, column, or expression level.
MariaDB
full
CREATE TABLE users (
username VARCHAR(100)
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci
);
SELECT * FROM users WHERE username = 'alice'; -- matches 'Alice', 'ALICE'- Same per-column CHARACTER SET and COLLATE syntax as MySQL.
- Supports utf8mb4_unicode_ci, utf8mb4_general_ci, and many locale-specific collations.
- uca1400 collation variants added in MariaDB 10.10 for Unicode 14.0 support.
Oracle
full
-- Session-level case-insensitive comparison
ALTER SESSION SET NLS_COMP = LINGUISTIC;
ALTER SESSION SET NLS_SORT = BINARY_CI; -- case-insensitive
SELECT * FROM users WHERE username = 'alice'; -- matches 'Alice', 'ALICE'
-- Per-expression COLLATE (Oracle 12c+)
SELECT * FROM users
WHERE username COLLATE BINARY_CI = 'alice';
-- Column-level collation
CREATE TABLE users (
username VARCHAR2(100) COLLATE BINARY_CI
);- NLS_COMP=LINGUISTIC + NLS_SORT= controls session-wide comparison behavior.
- BINARY_CI = case-insensitive binary; BINARY_AI = accent-insensitive binary.
- Per-column COLLATE clause available since Oracle 12c.
- NLSSORT() function computes sort keys for linguistic ordering.
- Oracle's NLS (National Language Support) system is very flexible and mature.
SQL Server
full
-- Column with case-insensitive, accent-sensitive collation
CREATE TABLE users (
username NVARCHAR(100) COLLATE Latin1_General_CI_AS
);
-- Case-sensitive column
CREATE TABLE codes (
code NVARCHAR(20) COLLATE Latin1_General_CS_AS
);
-- Per-expression COLLATE
SELECT * FROM users
WHERE username COLLATE Latin1_General_CS_AS = N'Alice';
-- List available collations
SELECT name, description FROM sys.fn_helpcollations()
WHERE name LIKE 'Latin1_General%';- Collation names encode: charset, CI/CS (case), AI/AS (accent), KI/KS (kana-type), WI/WS (width), UTF8.
- Latin1_General_CI_AS is a very common default: case-insensitive, accent-sensitive.
- Windows collations (Latin1_General_*) are locale-aware; SQL collations (SQL_*) are legacy.
- Database-level collation affects system catalog comparisons.
- Column-level collation overrides database default.
SQLite
partial
-- NOCASE collation (ASCII case-insensitive only)
CREATE TABLE users (
username TEXT COLLATE NOCASE
);
SELECT * FROM users WHERE username = 'alice'; -- matches 'Alice' (ASCII only)
-- Per-expression COLLATE
SELECT * FROM users ORDER BY username COLLATE NOCASE;
-- BINARY (default) and RTRIM collations
CREATE INDEX idx_user_rtrim ON users(username COLLATE RTRIM);- Built-in collations: BINARY (default, byte-order), NOCASE (ASCII case-insensitive), RTRIM (ignores trailing spaces).
- NOCASE only folds ASCII A-Z/a-z; does not handle Unicode case folding (e.g., 'ü' != 'Ü').
- Full Unicode collation requires a custom collation registered via sqlite3_create_collation() C API.
- The ICU extension provides Unicode-aware collations when compiled in.
Docs: Collating Sequences
Redshift
partial
-- Case-insensitive column (Redshift 2022+)
CREATE TABLE users (
username VARCHAR(100) COLLATE case_insensitive
);
SELECT * FROM users WHERE username = 'alice'; -- matches 'Alice'- Basic COLLATE clause support added in 2022.
- Supports case_insensitive and case_sensitive collation options.
- Limited options compared to PostgreSQL's ICU collation support.
- Collation affects ORDER BY, comparisons, and GROUP BY behavior.
Docs: Collation Sequences
DB2
full
-- Database created with Unicode collation
-- CREATE DATABASE mydb COLLATE UCA500R1_LEN_S1;
-- Per-expression COLLATE (DB2 11.1+)
SELECT * FROM users
WHERE username = 'alice' COLLATE UNICODE_CI_AI;
-- Current collation
SELECT CURRENT COLLATION FROM SYSIBM.SYSDUMMY1;- Collation is set at database creation time with the COLLATE option.
- SYSTEM_819 provides byte-order (binary) collation.
- UCA500R1_LEN_S1 provides Unicode Collation Algorithm support.
- Per-column and per-expression collation overrides available in newer versions.
- Changing collation after database creation is not straightforward - requires migration.
Docs: String Collations
Snowflake
partial
-- Column with case-insensitive collation
CREATE TABLE users (
username STRING COLLATE 'en-ci'
);
SELECT * FROM users WHERE username = 'alice'; -- matches 'Alice'
-- Per-expression COLLATE
SELECT * FROM users
WHERE COLLATE(username, 'en-ci') = 'alice';
-- Account-level default collation
ALTER ACCOUNT SET DEFAULT_DDL_COLLATION = 'en-ci';- Collation support expanded significantly in 2022+.
- Supports 'en-ci' (English, case-insensitive) and locale-specific collations.
- Account, database, schema, table, and column-level collation defaults.
- COLLATE() function for per-expression overrides.
- Not all collation locales available in all Snowflake editions.
Docs: Collation Support
BigQuery
full
-- Column with case-insensitive collation
CREATE TABLE users (
username STRING COLLATE 'und:ci'
);
SELECT * FROM users WHERE username = 'alice'; -- matches 'Alice', 'ALICE'
-- Per-expression COLLATE
SELECT * FROM users
WHERE COLLATE(username, 'und:ci') = 'alice';
-- ORDER BY with collation
SELECT username FROM users ORDER BY username COLLATE 'und:ci';- Collation support added in 2022.
- 'und:ci' is case-insensitive Unicode; 'und' (binary) is the default.
- Collation propagates through comparisons, ORDER BY, GROUP BY, and DISTINCT.
- Table-level DEFAULT COLLATE sets the default for all STRING columns.
- Per-expression COLLATE() overrides column-level collation.
Docs: Collation