Ecosystem & Integrations — DuckDB Course

Official Tools & Extensions

DuckDB ships with a set of core extensions that are either bundled or auto-installable via INSTALL / LOAD. These cover the most common data formats, external data sources, and specialized functionality.

Core Extensions

httpfs Data

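Read files over HTTP(S), and read or write S3-compatible object storage, including GCS through its S3-compatible API. Enables SELECT * FROM read_parquet('s3://...') directly. Azure Blob Storage is handled by the separate azure extension.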

json Data

Native JSON reading with automatic schema inference. Handles JSONL, nested JSON, and mixed-type arrays.

parquet Storage

Advanced Parquet support including predicate pushdown, column pruning, and partitioned writes.

postgres_scanner Connector

Attach and query PostgreSQL databases directly from DuckDB. Supports filter pushdown to the remote server. Installed as postgres; postgres_scanner is the older name and remains an alias.

mysql_scanner Connector

Attach and query MySQL databases directly, with filter pushdown similar to the Postgres scanner. Installed as mysql.

sqlite_scanner Connector

Attach SQLite databases and query them via DuckDB's analytical engine -- ideal for migrating OLTP data to analytics. Installed as sqlite.

spatial Geo

Geospatial functions (ST_Distance, ST_Contains, etc.) with the built-in GEOMETRY type (v1.5.0+).

iceberg Storage

Read Apache Iceberg tables, enabling DuckDB as a lightweight lakehouse query engine.

delta Storage

Read Delta Lake tables with full support for time travel and schema evolution.

fts Search

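Full-text search with BM25 scoring. An index is built with PRAGMA create_fts_index, after which rows are ranked in SQL through the generated match_bm25 macro (for example fts_main_documents.match_bm25(id, 'query')).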

excel Data

Read and write Excel files (.xlsx) -- bridging the gap between spreadsheets and analytical SQL.

Official Tools

💻 DuckDB CLI (v1.5.0+)

The command-line client received a major overhaul with syntax highlighting, dynamic prompts showing current database/schema, a built-in pager, the _ operator to reference previous results, and .tables / DESCRIBE for schema exploration.

🌐 DuckDB WASM

The full DuckDB engine compiled to WebAssembly for browser environments. Powers interactive data applications, notebook environments, and serverless analytics.

Community Ecosystem

With 127+ community extensions, DuckDB's ecosystem is growing rapidly. Notable community-built extensions:

mssql Connector

Native TDS protocol communication with SQL Server -- zero dependencies, TLS/SSL, connection pooling.

snowflake Connector

Query Snowflake tables directly from DuckDB via ADBC.

mongo Connector

SQL queries against MongoDB with automatic schema inference and filter pushdown.

infera ML

Run ONNX ML models inside SQL queries for in-database inference.

onager Graph

Graph analytics (centrality, community detection) implemented in Rust.

dns Utility

DNS lookup and reverse DNS as SQL functions.

gaggle Data

Query Kaggle datasets directly via SQL.

💡

127+ community extensions and growing. The DuckDB Community Extensions repository lets anyone publish and distribute extensions, which users then install with a single INSTALL ... FROM community command. Additionally, DuckDB.ExtensionKit now enables building extensions in C# using .NET Native AOT compilation, broadening the extension developer community beyond C/C++.

Common Integration Patterns

DuckDB's embeddable design, compressed columnar storage (including FSST for strings), and zero-copy Arrow interop make it a natural fit for a wide range of data tools and workflows.

🏗

DuckDB + dbt

dbt-duckdb dbt Core

Run dbt transformations locally with DuckDB. Ideal for development and testing of data models before deploying to a production warehouse. Used by the NSW Department of Education for their data portal.

🐍

DuckDB + Pandas/Polars

pandas polars Apache Arrow

DuckDB queries pandas DataFrames and Polars DataFrames or LazyFrames in place, scanning their in-memory buffers (Arrow, in the case of Polars) instead of importing them first. Use DuckDB for complex SQL, then hand results back to Python for visualization or ML.

☁

DuckDB + MotherDuck

MotherDuck hybrid execution

MotherDuck extends DuckDB to the cloud with hybrid query processing -- queries execute partly on client, partly in cloud, with the optimizer choosing the most efficient split. Share databases via URLs for seamless collaboration.

🏠

DuckDB + Lakehouse

Iceberg Delta Lake S3/GCS

Use DuckDB as a lightweight query engine for data lakehouses. Read Iceberg or Delta tables from S3/GCS, benefiting from vectorized execution without spinning up Spark clusters. Ideal for ad-hoc exploration.

📊

DuckDB + BI Tools

Evidence Rill Hex

Multiple BI tools embed DuckDB as their analytical engine. Evidence uses it as a universal SQL backend, Rill powers interactive dashboards, and Hex accelerates notebook analytics -- all local-first with optional cloud scaling via MotherDuck.

✈

DuckDB + Arrow Flight

Arrow Flight gRPC

Share DuckDB query results across processes or languages using Apache Arrow Flight. Because DuckDB produces Arrow data natively, results go onto the wire in Arrow format with no row-by-row conversion, and any Arrow-compatible client can consume them.

Example: DuckDB + Pandas

DuckDB can query a pandas DataFrame directly by name -- no import or copy step needed:

Python

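import duckdb
import pandas as pd

# Load the CSV into a pandas DataFrame held in memory.
df = pd.read_csv('large_file.csv')

# DuckDB resolves the name `df` in the SQL text to the DataFrame above.
result = duckdb.sql("SELECT category, AVG(value) AS avg_value FROM df GROUP BY category").fetchdf()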

💡

Zero-copy interop: When DuckDB queries a pandas DataFrame, it reads the underlying NumPy arrays or Arrow buffers directly, so numeric columns need no serialization and no copy; object-dtype columns such as Python strings are converted as they are scanned. This makes DuckDB an excellent SQL layer for Python data science workflows, especially for aggregations and joins that would be slow in pure pandas.