Foreign Data Wrappers / Federated Queries

Database	Status	Syntax Family	Min Version
PostgreSQL	full	fdw	9.1
MySQL	partial	federated-engine	-
MariaDB	full	connect-spider	-
Oracle	full	database-link	-
SQL Server	full	linked-server	-
SQLite	none	-	-
Redshift	full	spectrum-federated	-
DB2	full	nicknames	-
Snowflake	partial	external-stages	-
BigQuery	full	federated-queries	-

Compare Databases

Database Details

PostgreSQL full

-- Install the postgres_fdw extension
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Create a foreign server
CREATE SERVER remote_analytics
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host 'db2.example.com', port '5432', dbname 'analytics');

-- Create a user mapping
CREATE USER MAPPING FOR current_user
  SERVER remote_analytics
  OPTIONS (user 'remote_user', password 'secret');

-- Create a foreign table
CREATE FOREIGN TABLE remote_orders (
  id         BIGINT,
  customer_id INT,
  total      NUMERIC
) SERVER remote_analytics
  OPTIONS (schema_name 'public', table_name 'orders');

-- Auto-import all tables from a remote schema
IMPORT FOREIGN SCHEMA public
  FROM SERVER remote_analytics
  INTO local_schema;

-- Query the foreign table like a local table
SELECT * FROM remote_orders WHERE customer_id = 42;

SQL/MED (Management of External Data) is the ISO SQL standard that FDWs implement.
postgres_fdw supports predicate and column pushdown. WHERE clauses and projections are sent to the remote server.
Community FDW wrappers exist for MySQL, Oracle, Redis, MongoDB, Parquet, CSV, HTTP/REST, and many more.
file_fdw (built-in extension) treats CSV/text files as foreign tables.
IMPORT FOREIGN SCHEMA auto-creates foreign table definitions from the remote schema catalog.

Docs: Foreign Data Wrappers, postgres_fdw, CREATE FOREIGN DATA WRAPPER

MySQL partial

-- Enable the FEDERATED storage engine (requires server restart or plugin load)
-- In my.cnf: federated

-- Create a local table that maps to a remote table
CREATE TABLE remote_orders (
  id          INT NOT NULL,
  customer_id INT,
  total       DECIMAL(10,2),
  PRIMARY KEY (id)
) ENGINE=FEDERATED
  CONNECTION='mysql://remote_user:secret@db2.example.com:3306/analytics/orders';

-- Query it like a local table
SELECT * FROM remote_orders WHERE customer_id = 42;

The FEDERATED storage engine maps a single local table definition to a single remote MySQL table.
FEDERATED is disabled by default and must be enabled in the server configuration.
Very limited compared to PostgreSQL FDWs: single table only, no JOIN pushdown to the remote server, no schema import.
MySQL 8+ documentation recommends application-level routing as an alternative for distributed data.
No support for non-MySQL remote data sources via FEDERATED.

Docs: FEDERATED Storage Engine

MariaDB full

-- CONNECT engine: query an external ODBC source as a table
CREATE TABLE ext_crm (
  id   INT,
  name VARCHAR(200)
) ENGINE=CONNECT
  TABLE_TYPE=ODBC
  DSN='MyCRM'
  TABNAME='contacts';

-- CONNECT engine: read a CSV file as a table
CREATE TABLE csv_data (
  col1 VARCHAR(100),
  col2 INT
) ENGINE=CONNECT
  TABLE_TYPE=CSV
  FILE_NAME='/data/import.csv'
  HEADER=1
  SEP_CHAR=',';

-- Spider engine: shard data across multiple MariaDB nodes
CREATE TABLE orders (
  id     INT NOT NULL,
  region VARCHAR(20),
  total  DECIMAL(10,2),
  PRIMARY KEY (id)
) ENGINE=SPIDER
  COMMENT='wrapper "mysql", host "node1", port "3306", user "app", password "secret", database "db", table "orders"';

SELECT * FROM ext_crm WHERE id = 100;

The CONNECT storage engine provides access to external data in many formats: ODBC, MySQL, Oracle (via ODBC), CSV, XML, JSON, MongoDB, REST/HTTP, and more.
The Spider storage engine enables horizontal sharding and federation across multiple MySQL/MariaDB servers with transparent JOIN support.
CONNECT is part of MariaDB's default distribution; Spider is included but may need to be installed explicitly.
CONNECT supports both read and write access to many external source types.
Spider supports distributed JOINs and aggregations across nodes.

Docs: CONNECT Storage Engine, Spider Storage Engine

Oracle full

-- Create a database link to another Oracle instance
CREATE DATABASE LINK remote_hr
  CONNECT TO hr_user IDENTIFIED BY secret
  USING 'HR_DB';

-- Query a remote table through the link
SELECT * FROM employees@remote_hr WHERE department_id = 10;

-- Join local and remote data
SELECT l.order_id, r.employee_name
FROM   local_orders l
JOIN   employees@remote_hr r ON l.emp_id = r.employee_id;

-- External Table: query a flat file as a table
CREATE TABLE ext_sales (
  sale_date DATE,
  amount    NUMBER
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY data_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    (sale_date CHAR DATE_FORMAT DATE MASK 'YYYY-MM-DD', amount)
  )
  LOCATION ('sales.csv')
);

SELECT * FROM ext_sales;

-- Drop the database link
DROP DATABASE LINK remote_hr;

Database Links (DBLINK) connect to other Oracle databases via Oracle Net Services (TNS).
Remote table references use the @link_name suffix syntax: table@link_name.
Heterogeneous Services (HS) enables connecting to non-Oracle databases via ODBC or OLE DB through a gateway.
External Tables (ORGANIZATION EXTERNAL) allow querying flat files (CSV, fixed-width) as read-only database tables without loading data.
Private database links are per-user; public links (CREATE PUBLIC DATABASE LINK) are available to all users.

Docs: Managing Distributed Databases, CREATE DATABASE LINK, External Tables

SQL Server full

-- Create a linked server (SQL Server to SQL Server)
EXEC sp_addlinkedserver
  @server     = N'REMOTE_SRV',
  @srvproduct = N'',
  @provider   = N'SQLNCLI',
  @datasrc    = N'remote.example.com\SQLINSTANCE';

EXEC sp_addlinkedsrvlogin
  @rmtsrvname  = N'REMOTE_SRV',
  @useself     = N'False',
  @locallogin  = NULL,
  @rmtuser     = N'sa',
  @rmtpassword = N'secret';

-- Four-part name query: [server].[database].[schema].[table]
SELECT * FROM [REMOTE_SRV].[analytics].[dbo].[orders];

-- OPENQUERY for sending native SQL to the remote server
SELECT * FROM OPENQUERY(REMOTE_SRV, 'SELECT id, total FROM dbo.orders WHERE id > 100');

-- PolyBase: query S3 or Hadoop as an external table
CREATE EXTERNAL DATA SOURCE s3_source WITH (
  LOCATION = 's3://mybucket/data/'
);

CREATE EXTERNAL TABLE ext_orders (
  id    INT,
  total DECIMAL(10,2)
) WITH (
  DATA_SOURCE = s3_source,
  LOCATION = 'orders/',
  FILE_FORMAT = CsvFileFormat
);

SELECT * FROM ext_orders;

Linked Servers support many OLE DB/ODBC providers: SQL Server, Oracle, DB2, MySQL, Excel, Access, and generic ODBC sources.
Four-part naming [server].[db].[schema].[table] enables transparent cross-server queries in T-SQL.
OPENQUERY sends the query string to the remote server for execution (pushdown) and is generally more efficient than four-part names for complex queries.
PolyBase (SQL Server 2016+) extends federation to Hadoop, Azure Blob, S3, and other big data sources.
Linked server performance can be poor for large result sets. Always push filtering to the remote side via OPENQUERY.

Docs: Linked Servers, sp_addlinkedserver, PolyBase

SQLite none

-- SQLite has no federated query support.
-- ATTACH DATABASE links another SQLite file as a second schema:
ATTACH DATABASE '/path/to/other.db' AS other_db;

-- Query the attached database
SELECT * FROM other_db.orders;

-- Join across attached databases
SELECT l.id, r.name
FROM   main.orders l
JOIN   other_db.customers r ON l.customer_id = r.id;

-- Detach when done
DETACH DATABASE other_db;

SQLite has no support for querying external non-SQLite databases.
ATTACH DATABASE allows attaching additional SQLite database files as extra schemas within the same connection.
ATTACH is limited to other SQLite files, with no ODBC, network, or non-SQLite source support.
For multi-source federation, queries must be handled at the application layer.

Docs: ATTACH DATABASE

Redshift full

-- Redshift Spectrum: query S3 data via AWS Glue Data Catalog
CREATE EXTERNAL SCHEMA spectrum_schema
  FROM DATA CATALOG
  DATABASE 'my_glue_db'
  IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftSpectrumRole'
  CREATE EXTERNAL DATABASE IF NOT EXISTS;

-- Query external S3 table
SELECT * FROM spectrum_schema.ext_orders
WHERE order_date >= '2024-01-01';

-- Federated query to Aurora PostgreSQL
CREATE EXTERNAL SCHEMA aurora_schema
  FROM POSTGRES
  DATABASE 'app_db'
  URI 'aurora-cluster.cluster-xyz.us-east-1.rds.amazonaws.com'
  PORT 5432
  SECRET_ARN 'arn:aws:secretsmanager:us-east-1:123456789012:secret:aurora-creds'
  IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftRole';

SELECT * FROM aurora_schema.customers WHERE active = TRUE;

-- Join Redshift local table with S3 external data
SELECT r.order_id, s.product_name
FROM   local_orders r
JOIN   spectrum_schema.products s ON r.product_id = s.id;

Redshift Spectrum queries data stored in Amazon S3 using the AWS Glue Data Catalog or Apache Hive Metastore as a schema registry.
Federated query supports Amazon RDS and Aurora (PostgreSQL and MySQL) as external sources for live querying.
External tables are read-only; Spectrum cannot write to S3.
Columnar formats (Parquet, ORC) are significantly more efficient for Spectrum queries.
Redshift can join local columnar data with S3 external data in a single query.
IAM roles are used for authentication to S3 and Secrets Manager for RDS credentials.

Docs: Redshift Spectrum, Federated Query, CREATE EXTERNAL SCHEMA

DB2 full

-- Create a DRDA wrapper for another DB2 instance
CREATE WRAPPER drda;

CREATE SERVER remote_db2
  TYPE DB2/UDB
  VERSION '11.5'
  WRAPPER drda
  OPTIONS (NODE 'TCPIP4', HOSTNAME 'remote.example.com', PORT '50000', DBNAME 'analytics');

CREATE USER MAPPING FOR current_user
  SERVER remote_db2
  OPTIONS (REMOTE_AUTHID 'remote_user', REMOTE_PASSWORD 'secret');

-- Create a nickname (transparent alias for the remote table)
CREATE NICKNAME remote_schema.orders
  FOR remote_db2.app_schema.orders;

-- Query the nickname like a local table
SELECT * FROM remote_schema.orders WHERE customer_id = 42;

-- Join local and remote data
SELECT l.id, r.customer_name
FROM   local_sales l
JOIN   remote_schema.orders r ON l.order_id = r.id;

DB2 Federation uses a Federated Database architecture with wrappers, servers, user mappings, and nicknames.
Nicknames are local object aliases for remote tables; SQL queries reference nicknames transparently.
The DRDA wrapper connects to DB2 instances; JDBC wrapper connects to non-DB2 relational databases.
Wrappers exist for Oracle, Sybase, Informix, ODBC sources, flat files, and web services.
DB2 Federation Server (formerly DataJoiner) is the dedicated product for complex federated scenarios.
Query pushdown (sending predicates to the remote source) is automatic when the wrapper supports it.

Docs: DB2 Federated Systems, CREATE NICKNAME

Snowflake partial

-- Create an external stage pointing to S3
CREATE STAGE my_s3_stage
  URL = 's3://mybucket/data/'
  CREDENTIALS = (AWS_KEY_ID = 'xxx' AWS_SECRET_KEY = 'yyy')
  FILE_FORMAT = (TYPE = PARQUET);

-- Create an external table over the stage
CREATE EXTERNAL TABLE ext_orders (
  id     NUMBER AS (value:id::NUMBER),
  total  FLOAT  AS (value:total::FLOAT)
)
WITH LOCATION = @my_s3_stage
FILE_FORMAT = (TYPE = PARQUET);

-- Query external table
SELECT id, total FROM ext_orders WHERE total > 1000;

-- Data Sharing: share data with another Snowflake account
CREATE SHARE analytics_share;
GRANT USAGE ON DATABASE analytics TO SHARE analytics_share;
GRANT SELECT ON TABLE analytics.public.orders TO SHARE analytics_share;
ALTER SHARE analytics_share ADD ACCOUNTS = partner_account;

External tables read data directly from cloud object storage (S3, Azure Blob, GCS) without loading it into Snowflake.
External stages define the storage location and file format; external tables are built on top of stages.
Snowflake Data Sharing provides zero-copy, live sharing of Snowflake data between accounts, with no ETL required.
Snowflake Marketplace extends data sharing to commercial and public data sets.
External tables are read-only and cannot be written to from Snowflake.
Snowflake also supports querying Apache Iceberg tables on object storage as native tables.
This is not a general SQL/MED-style framework for live queries against arbitrary remote databases, so it is narrower than PostgreSQL FDWs, SQL Server linked servers, or BigQuery federated connections.

Docs: External Tables, Data Sharing, CREATE STAGE

BigQuery full

-- External table: Cloud Storage (CSV)
CREATE OR REPLACE EXTERNAL TABLE mydataset.ext_orders
  OPTIONS (
    format = 'CSV',
    uris = ['gs://mybucket/orders/*.csv'],
    skip_leading_rows = 1
  );

-- External table: Cloud SQL (PostgreSQL) via connection
CREATE OR REPLACE EXTERNAL TABLE mydataset.pg_customers
  WITH CONNECTION `myproject.us.my-cloudsql-conn`
  OPTIONS (
    format = 'CLOUD_SQL',
    database = 'app_db',
    query = 'SELECT id, name, email FROM customers'
  );

-- Federated query to Cloud Spanner
SELECT c.customer_id, o.total
FROM   EXTERNAL_QUERY(
         'myproject.us.my-spanner-conn',
         'SELECT customer_id, total FROM orders'
       ) AS o
JOIN   mydataset.customers c ON o.customer_id = c.id;

-- Query S3 data via BigQuery Omni
SELECT * FROM `myproject.aws-us-east-1.ext_dataset.s3_orders`;

BigQuery supports external tables over Cloud Storage (CSV, JSON, Avro, Parquet, ORC), Google Drive, and Google Sheets.
EXTERNAL_QUERY() function executes SQL on Cloud SQL (PostgreSQL, MySQL), Spanner, or other connected sources and returns results to BigQuery.
BigQuery Omni extends federation to data in AWS S3 and Azure Blob Storage without moving data.
Connections to external sources use Cloud resource connections with IAM-based authentication.
External table queries run on the external engine; BigQuery handles result integration and further processing.
Bigtable can be queried as an external table for key-based lookups.

Docs: Introduction to External Data Sources, Federated Queries, BigQuery Omni