Embedded Analytical Database

DuckDB

DuckDB gives you the analytical power of a columnar data warehouse in a zero-dependency embeddable library that runs anywhere — from a Jupyter notebook to a web browser.

License: MIT

Language: C++

Latest Version: 1.5.1

GitHub: duckdb/duckdb

💡 Core Concepts (Beginner): Key abstractions, including columnar storage, vectorized execution, DataChunks, and pipelines.
🏗 Architecture (Intermediate): System design, covering the parser, binder, optimizer, executor, storage engine, and how they connect.
⚙️ How It Works (Intermediate): Internal mechanisms, including vectorized processing, morsel-driven parallelism, compression, and query optimization.
💻 Implementation Details (Advanced): Hands-on material, including getting started, configuration, code patterns, and a source code walkthrough.
📊 Use Cases (Beginner to Intermediate): When to use DuckDB, when not to, and real-world production deployments.
🔌 Ecosystem & Integrations (Intermediate): Extensions, integrations, MotherDuck, and common tool pairings.
❓ FAQ (All Levels): Senior-level questions about memory, concurrency, storage, debugging, and scaling.
⚖️ Trade-offs & Limitations (Intermediate): Honest strengths, limitations, and comparison with SQLite, Polars, and Spark.

Quick Start

Install DuckDB (for Python: pip install duckdb), then query a CSV file directly, with no loading step required.

python

import duckdb

# Query a CSV file directly — no loading step
result = duckdb.sql("SELECT region, SUM(amount) FROM 'sales.csv' GROUP BY region")
result.show()

Dive into Core Concepts to understand columnar storage and vectorized execution, or jump to Implementation Details for more code patterns.