TABLESAMPLE
Efficiently sample a random subset of rows from a table without scanning or sorting all rows.
TABLESAMPLE SAMPLE BERNOULLI SYSTEM random sample
sampling random performance statistics approximation
| Database | Status | Syntax Family | Min Version |
|---|---|---|---|
| PostgreSQL | full | tablesample | 9.5 |
| MySQL | none | - | - |
| MariaDB | none | - | - |
| Oracle | full | sample | - |
| SQL Server | full | tablesample | 2005 |
| SQLite | none | - | - |
| Redshift | none | - | - |
| DB2 | full | tablesample | - |
| Snowflake | full | sample | - |
| BigQuery | partial | tablesample | - |
Compare Databases
Database Details
PostgreSQL
full
-- Sample ~10% of rows using Bernoulli method (row-level probability)
SELECT * FROM orders TABLESAMPLE BERNOULLI(10);
-- Sample ~10% of rows using System method (block-level, faster)
SELECT * FROM orders TABLESAMPLE SYSTEM(10);
-- Row-count based sampling via extension
-- CREATE EXTENSION tsm_system_rows;
SELECT * FROM orders TABLESAMPLE SYSTEM_ROWS(1000);- Two extensions broaden the built-in methods: tsm_system_rows for exact row-count sampling, tsm_system_time for time-bounded sampling.
- Built-in BERNOULLI and SYSTEM do not take a seed; the SYSTEM_ROWS method does.
MySQL
none
-- No TABLESAMPLE support. Workaround: random fraction via WHERE
SELECT * FROM orders WHERE RAND() < 0.10 LIMIT 10000;
-- Workaround: ORDER BY RAND() (slow on large tables)
SELECT * FROM orders ORDER BY RAND() LIMIT 1000;- No TABLESAMPLE. WHERE RAND() < p is cheap but the row count varies; ORDER BY RAND() LIMIT n is exact but full-scans and sorts.
- On large tables, sampling primary keys first and joining back is cheaper than either.
Docs: SELECT Syntax
MariaDB
none
-- No TABLESAMPLE support. Workaround: random fraction via WHERE
SELECT * FROM orders WHERE RAND() < 0.10;
-- Workaround: ORDER BY RAND() (slow on large tables)
SELECT * FROM orders ORDER BY RAND() LIMIT 1000;- No TABLESAMPLE. Same WHERE RAND() / ORDER BY RAND() trade-offs as MySQL; sample primary keys first on large tables.
Docs: SELECT
Oracle
full
-- Row-level sample: approximately 10% of rows
SELECT * FROM orders SAMPLE(10);
-- Block-level sample: approximately 10% of data blocks
SELECT * FROM orders SAMPLE BLOCK(10);
-- Reproducible sample with a seed value
SELECT * FROM orders SAMPLE(10) SEED(42);- SAMPLE(p) is per-row (≈ BERNOULLI); SAMPLE BLOCK(p) is per-block (≈ SYSTEM); SEED(n) makes results reproducible on the same data.
- p must be between 0.000001 and 99.999999.
Docs: SELECT - SAMPLE Clause
SQL Server
full
-- Sample approximately 10% of rows
SELECT * FROM orders TABLESAMPLE (10 PERCENT);
-- Sample approximately 1000 rows (approximate)
SELECT * FROM orders TABLESAMPLE (1000 ROWS);
-- Reproducible sample with a seed
SELECT * FROM orders TABLESAMPLE (10 PERCENT) REPEATABLE(42);- Page-level sampling only (no BERNOULLI). Both (n PERCENT) and (n ROWS) are approximate.
- REPEATABLE(seed) makes results reproducible on the same data.
- Not allowed on views, linked-server tables, or table-valued-function output.
SQLite
none
-- No TABLESAMPLE support. Workaround: approximate fraction using random()
SELECT * FROM orders WHERE abs(random()) % 10 = 0;
-- Workaround: approximate percentage using random range
SELECT * FROM orders WHERE random() > 8223372036854775807 * 0.9;- No TABLESAMPLE. WHERE random() % n = 0 gives an approximate 1/n sample at the cost of a full scan. No block-level or seeded sampling.
Docs: SQLite SELECT
Redshift
none
-- No TABLESAMPLE support. Workaround: random fraction via WHERE
SELECT * FROM orders WHERE random() < 0.10;
-- Workaround: CTAS-based sampling
CREATE TABLE orders_sample AS
SELECT * FROM orders WHERE random() < 0.10;- No TABLESAMPLE. Use WHERE random() < p for an approximate fraction.
- For reproducible sampling, filter on a hash of a stable key (e.g. WHERE STRTOL(MD5(key), 16) % 10 = 0).
Docs: SELECT
DB2
full
-- Row-level Bernoulli sampling (~10%)
SELECT * FROM orders TABLESAMPLE BERNOULLI(10);
-- Block-level system sampling (~10%)
SELECT * FROM orders TABLESAMPLE SYSTEM(10);- Available in DB2 for LUW; syntax and availability differ on z/OS and older releases.
- Some releases support REPEATABLE(seed) for reproducible samples.
Docs: TABLESAMPLE clause
Snowflake
full
-- Row-fraction Bernoulli sample: ~10% of rows
SELECT * FROM orders SAMPLE (10);
-- Block-level sample: ~10% of blocks
SELECT * FROM orders SAMPLE BLOCK (10);
-- Exact row-count sample
SELECT * FROM orders SAMPLE ROW (1000 ROWS);
-- Reproducible sample with a seed
SELECT * FROM orders SAMPLE (10) SEED(42);
-- TABLESAMPLE is a synonym for SAMPLE
SELECT * FROM orders TABLESAMPLE BERNOULLI (10);- SAMPLE and TABLESAMPLE are synonyms. Default is BERNOULLI; BLOCK is the per-micropartition equivalent of SYSTEM.
- ROW (n ROWS) takes an exact row count via BERNOULLI; SEED(n) makes the sample reproducible.
Docs: SAMPLE / TABLESAMPLE
BigQuery
partial
-- System sampling: ~10% of data at partition/shard level
SELECT * FROM `project.dataset.orders` TABLESAMPLE SYSTEM (10 PERCENT);- SYSTEM only; no BERNOULLI, no REPEATABLE. Results are non-deterministic between runs.
- On partitioned tables, sampling operates per partition. Primary benefit is reducing scanned bytes (and cost).
Docs: TABLESAMPLE operator