pg_ducklake

DuckLake lakehouse extension for PostgreSQL, backed by DuckDB and Parquet

Overview

PackageVersionCategoryLicenseLanguage
pg_ducklake1.0.0OLAPMITC++
IDExtensionBinLibLoadCreateTrustRelocSchema
2490pg_ducklakeNoYesYesYesNoNoducklake
Relatedpg_duckdb duckdb_fdw pg_mooncake pg_analytics pg_parquet columnar citus_columnar

Version

TypeRepoVersionPG VerPackageDeps
EXTPIGSTY1.0.01817161514pg_ducklake-
RPMPIGSTY1.0.01817161514pg_ducklake_$v-
DEBPIGSTY1.0.01817161514postgresql-$v-pg-ducklake-
OS / PGPG18PG17PG16PG15PG14
el8.x86_64PIGSTY MISSPIGSTY MISSPIGSTY MISSPIGSTY MISSPIGSTY MISS
el8.aarch64PIGSTY MISSPIGSTY MISSPIGSTY MISSPIGSTY MISSPIGSTY MISS
el9.x86_64
el9.aarch64
el10.x86_64
el10.aarch64
d12.x86_64
d12.aarch64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
d13.x86_64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
d13.aarch64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
u22.x86_64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
u22.aarch64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
u24.x86_64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
u24.aarch64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
u26.x86_64
u26.aarch64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0

Build

You can build the RPM / DEB packages for pg_ducklake using pig build:

pig build pkg pg_ducklake         # build RPM / DEB packages

Install

You can install pg_ducklake directly. First, make sure the PGDG and PIGSTY repositories are added and enabled:

pig repo add pgsql -u          # Add repo and update cache

Install the extension using pig or apt/yum/dnf:

pig install pg_ducklake;          # Install for current active PG version
pig ext install -y pg_ducklake -v 18  # PG 18
pig ext install -y pg_ducklake -v 17  # PG 17
pig ext install -y pg_ducklake -v 16  # PG 16
pig ext install -y pg_ducklake -v 15  # PG 15
pig ext install -y pg_ducklake -v 14  # PG 14
dnf install -y pg_ducklake_18       # PG 18
dnf install -y pg_ducklake_17       # PG 17
dnf install -y pg_ducklake_16       # PG 16
dnf install -y pg_ducklake_15       # PG 15
dnf install -y pg_ducklake_14       # PG 14
apt install -y postgresql-18-pg-ducklake   # PG 18
apt install -y postgresql-17-pg-ducklake   # PG 17
apt install -y postgresql-16-pg-ducklake   # PG 16
apt install -y postgresql-15-pg-ducklake   # PG 15
apt install -y postgresql-14-pg-ducklake   # PG 14

Preload:

shared_preload_libraries = 'pg_ducklake';

Create Extension:

CREATE EXTENSION pg_ducklake;

Usage

Sources: README, v1.0.0 release, project docs

pg_ducklake adds DuckLake tables to PostgreSQL. DuckLake metadata lives in PostgreSQL while table data is stored in Parquet and queried through DuckDB, giving PostgreSQL SQL clients access to lakehouse features such as snapshots, time travel, partitioning, sort keys, and external object storage.

Create A DuckLake Table

CREATE EXTENSION pg_ducklake;

CREATE TABLE events (
  id int,
  kind text,
  ts timestamptz
) USING ducklake;

INSERT INTO events VALUES
  (1, 'login', now()),
  (2, 'click', now());

SELECT * FROM events ORDER BY id;

Set a table path explicitly when data should live outside the default path:

CREATE TABLE lake_events (
  id int,
  payload jsonb
) WITH (
  ducklake.table_path = 's3://my-bucket/prefix/'
) USING ducklake;

Time Travel

Each commit creates a snapshot. Capture a snapshot id before a change, then query the older state:

SELECT max(snapshot_id) AS before_delete
FROM ducklake.ducklake_snapshot \gset

DELETE FROM events WHERE id = 1;

SELECT * FROM ducklake.time_travel('events'::regclass, :before_delete);

Convert And Load Data

Create DuckLake tables from existing PostgreSQL heap tables or external data readers:

CREATE TABLE row_store AS
SELECT i AS id, 'hello pg_ducklake' AS msg
FROM generate_series(1, 10000) AS i;

CREATE TABLE col_store USING ducklake AS
SELECT * FROM row_store;

CREATE TABLE titanic USING ducklake AS
SELECT * FROM ducklake.read_csv('https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv');

Inlining, Partitioning, And Maintenance

Small writes are inlined in metadata by default to avoid creating many tiny Parquet files. Tune the row limit or flush explicitly:

CALL ducklake.set_option('data_inlining_row_limit', 100);
SELECT * FROM ducklake.flush_inlined_data('events'::regclass);

Partition and sort tables for pruning and analytics:

CALL ducklake.set_partition('events'::regclass, 'bucket(4, id)', 'month(ts)');
CREATE INDEX ON events USING ducklake_sorted (id, ts);

Run maintenance on demand when automatic background maintenance is not enough:

SELECT * FROM ducklake.merge_adjacent_files('events'::regclass);
CALL ducklake.set_option('expire_older_than', '7 days');
SELECT * FROM ducklake.expire_snapshots();
SELECT * FROM ducklake.cleanup_old_files();

External DuckDB Access

DuckDB clients can attach the same DuckLake metadata:

INSTALL ducklake;
LOAD ducklake;
ATTACH 'ducklake:postgres:dbname=postgres host=localhost' AS my_ducklake
  (METADATA_SCHEMA 'ducklake');

SELECT * FROM my_ducklake.public.events;

Caveats

  • Version 1.0.0 supports PostgreSQL 14-18.
  • The README lists Ubuntu 22.04-24.04 and macOS as source-build targets.
  • Cloud credentials are stored through a ducklake_secret foreign server and per-user mappings; protect those catalog objects like other database secrets.
  • For incremental heap-to-DuckLake conversion, upstream points to the separate pg_duckpipe project.

Last Modified 2026-07-02: extension update 2026-07-02 (f9f0d13)