
How to Get Started with MotherDuck

Arkzero Research · Apr 23, 2026 · 7 min read

Last updated Apr 23, 2026

MotherDuck is a serverless cloud data warehouse built on DuckDB. You connect with one line of Python, get 10 GB free with no credit card, and the query planner routes each operation between your laptop and the cloud based on where data lives. Teams use it to share live databases without exporting files and to query S3 directly in SQL.

MotherDuck extends DuckDB into the cloud. If you already use DuckDB to query CSV or Parquet files locally, MotherDuck adds three things: cloud persistence (databases live in MotherDuck rather than only on your machine), sharing (teammates can connect to the same database), and cloud compute for queries too large for a laptop. The underlying engine is still DuckDB, so any SQL you already write works without modification.

What Is Dual Execution

The core technical feature of MotherDuck is dual execution. When you attach to MotherDuck, the query planner inspects where your data lives and routes each stage of the query to the best location. Data on your laptop is processed locally with zero cloud compute cost. Data in MotherDuck or on S3, GCS, or Azure Blob is processed in MotherDuck's cloud. You can join a local file with a cloud table in a single SQL statement.

Use EXPLAIN in front of any query to see how it will be routed:

EXPLAIN SELECT COUNT(*) FROM my_db.sales;

The output labels each operation as local or remote. In practice, most users never need to think about routing. It happens automatically.
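A minimal sketch of running the same check from Python, assuming con is any open DuckDB connection (local or md:). The helper name explain_plan is hypothetical, and the code assumes each EXPLAIN row carries the plan text in its last column, which is how the DuckDB Python client reports it at the time of writing.

```python
def explain_plan(con, sql):
    """Return the plan text DuckDB reports for `sql` without running the query."""
    # Drop a trailing semicolon so the statement can be prefixed cleanly.
    statement = "EXPLAIN " + sql.strip().rstrip(";")
    rows = con.execute(statement).fetchall()
    # Assumption: each row looks like (plan_kind, plan_text).
    return "\n".join(str(row[-1]) for row in rows)
```

Calling print(explain_plan(con, "SELECT COUNT(*) FROM my_db.sales")) before a large scan shows the routing without paying for the scan itself.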

Create a MotherDuck Account

Go to app.motherduck.com and sign up with GitHub, Google, or email. No credit card is required for the free tier, which includes 10 GB of storage and a monthly compute allowance. After signing in, copy your service token from Settings, then Tokens.

Connect from Python

Install DuckDB 0.10 or later:

pip install duckdb

MotherDuck support is built into recent DuckDB releases. Connect using your token:

import duckdb

con = duckdb.connect('md:?motherduck_token=your_token_here')

A cleaner approach stores the token as an environment variable so it never appears in code:

export motherduck_token=your_token_here

Then connect from Python without passing a token:

import duckdb

con = duckdb.connect('md:')
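A script that depends on the environment variable can fail early with a clear message when the variable is missing. The sketch below is one way to do that; the function names motherduck_path and connect are hypothetical, and the md:database_name form for opening a specific database is an assumption to check against the MotherDuck documentation.

```python
import os

def motherduck_path(database="", env=None):
    """Build an 'md:' connection string, failing early if no token is set."""
    env = os.environ if env is None else env
    if not env.get("motherduck_token"):
        raise RuntimeError("Set the motherduck_token environment variable first.")
    # The token stays in the environment; only the database name is appended.
    return "md:" + database

def connect(database=""):
    """Open a MotherDuck connection using the token from the environment."""
    import duckdb  # imported lazily so the check above works without it
    return duckdb.connect(motherduck_path(database))
```

With the variable exported, con = connect() behaves like duckdb.connect('md:'), and the token never appears in the script.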

Run SHOW DATABASES; after connecting to confirm the connection. You will see your MotherDuck databases alongside any local databases you have attached.

Connect from the DuckDB CLI

If you prefer the command line, the DuckDB CLI works the same way:

duckdb md:

Set motherduck_token in your shell environment first. The CLI will prompt you to authenticate if the token is missing.

Create Your First Cloud Database

con.execute("CREATE DATABASE IF NOT EXISTS analytics;")
con.execute("USE analytics;")

Load a local CSV file into the cloud database:

con.execute("""
    CREATE TABLE sales AS
    SELECT * FROM read_csv_auto('/path/to/sales.csv')
""")

The table is now stored in MotherDuck. Close Python, reopen it, reconnect to md:, and the table is still there. This is the key difference from local DuckDB: data persists without keeping a local file open.
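Because the table persists, rerunning a load script with a plain CREATE TABLE fails once the table exists. A minimal sketch of a rerunnable load, assuming you want each run to overwrite the previous copy: the helper name ctas_from_csv is hypothetical, and it swaps in CREATE OR REPLACE TABLE in place of the CREATE TABLE shown above.

```python
def ctas_from_csv(table, csv_path):
    """Return a statement that (re)creates `table` from the CSV at `csv_path`."""
    # Double embedded quotes so unusual names or paths cannot break the SQL.
    ident = '"' + table.replace('"', '""') + '"'
    literal = "'" + csv_path.replace("'", "''") + "'"
    return (
        f"CREATE OR REPLACE TABLE {ident} AS "
        f"SELECT * FROM read_csv_auto({literal})"
    )
```

Pass the result to con.execute(...), for example con.execute(ctas_from_csv("sales", "/path/to/sales.csv")).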

Load a Parquet file from a URL:

con.execute("""
    CREATE TABLE orders AS
    SELECT * FROM 'https://example.com/orders.parquet'
""")

Query S3 Without Loading Data First

MotherDuck can run SQL directly against files on Amazon S3, Google Cloud Storage, and Azure Blob Storage. For S3, create a secret with your AWS credentials:

con.execute("""
    CREATE SECRET IF NOT EXISTS aws_creds IN MOTHERDUCK (
        TYPE S3,
        KEY_ID 'your_access_key',
        SECRET 'your_secret_key',
        REGION 'us-east-1'
    )
""")

Secrets created with IN MOTHERDUCK persist across sessions, so you set them once. Now query S3 directly:

df = con.execute("""
    SELECT region, SUM(revenue) AS total
    FROM 's3://your-bucket/sales/*.parquet'
    GROUP BY region
    ORDER BY total DESC
""").df()

If you query the same S3 data repeatedly, load it into MotherDuck once with CREATE TABLE AS SELECT. Storage in MotherDuck costs roughly $0.08 per GB-month, and subsequent queries read from MotherDuck's own storage instead of re-scanning S3.
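To judge whether keeping a copy in MotherDuck is worth it, the storage arithmetic is a single multiplication. The sketch below uses the approximate $0.08 per GB-month figure quoted above; the function name and the example sizes are illustrative, and the rate should be checked against current pricing.

```python
STORAGE_USD_PER_GB_MONTH = 0.08  # approximate rate quoted in this guide

def monthly_storage_cost(size_gb, rate=STORAGE_USD_PER_GB_MONTH):
    """Estimate the monthly storage cost in USD for `size_gb` of table data."""
    return round(size_gb * rate, 2)

print(monthly_storage_cost(50))   # 50 GB  * 0.08 = 4.0
print(monthly_storage_cost(250))  # 250 GB * 0.08 = 20.0
```

This covers storage only; compute for the queries themselves is billed separately on paid plans.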

Share a Database

MotherDuck databases can be shared with teammates or via a public link. In the web UI at app.motherduck.com, navigate to your database, click Share, and choose between sharing with specific MotherDuck users or generating a read-only access link.

A teammate connects to a shared database:

con.execute("ATTACH 'md:_share/the_share_token' AS shared_db;")
results = con.execute("SELECT * FROM shared_db.sales LIMIT 100;").df()

Sharing removes the CSV email workflow entirely. The data lives in one place; everyone queries the live version.
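When share URLs arrive by chat or email, it can help to validate them before attaching. A minimal sketch, assuming share URLs begin with md:_share/ as in the example above; the helper name attach_share_sql and the default alias are hypothetical.

```python
def attach_share_sql(share_url, alias="shared_db"):
    """Build an ATTACH statement for a MotherDuck share URL."""
    if not share_url.startswith("md:_share/"):
        raise ValueError("expected a share URL beginning with 'md:_share/'")
    if not alias.isidentifier():
        raise ValueError("alias must be a plain identifier")
    url = share_url.replace("'", "''")  # escape quotes inside the SQL literal
    return f"ATTACH '{url}' AS {alias};"
```

A teammate would then run con.execute(attach_share_sql(url)) and query shared_db as usual.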

Use the Web SQL Editor

The web UI at app.motherduck.com includes a SQL editor, database browser, schema explorer, and query history. For analysts who do not use Python, the web editor is the fastest path to getting started. Upload a local file through the UI, create a table, and start querying with no local setup.

The editor also shows per-query compute time. Since MotherDuck bills per second of compute on paid plans, the UI helps identify expensive queries before they accumulate charges.

Pricing in 2026

MotherDuck restructured its pricing in early 2026. The previous $25/month Lite plan was retired. The current tiers are:

  • Free tier: 10 GB storage and a monthly compute allowance, no credit card required
  • Business plan: $250/month (raised from $100/month), designed for teams
  • Compute: billed per second of DuckDB instance time on paid plans, with no charge during idle periods

For a solo analyst or small team with data under 10 GB, the free tier covers most workflows. According to community discussion following the pricing change, the jump from free to Business at $250/month is the main friction point for teams that had relied on the discontinued $25 plan.

If you want to analyze smaller uploaded datasets in plain English without managing a warehouse at all, VSLZ AI lets you upload a file and ask questions directly, skipping the connection and configuration step.

When to Use MotherDuck vs. Local DuckDB

Local DuckDB is faster for data that lives on your machine. MotherDuck makes sense when:

  • You need queries and tables to persist without keeping a local file open
  • Multiple people need to query the same data
  • Your data lives in S3 or another cloud store and you want direct SQL access
  • You want to share a live dataset with a link instead of exporting files

The dual execution model means you do not have to choose one or the other. Connect to MotherDuck and query local files at local speed. The query planner handles the rest.
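A hybrid query of this kind might look like the sketch below, which joins a CSV on your machine with a table stored in MotherDuck. The helper name, the aliases local_rows and cloud_rows, and the example file and column names are all hypothetical; the table name and join key are interpolated as-is, so they must come from trusted code rather than user input.

```python
def local_cloud_join_sql(local_csv, cloud_table, key):
    """Build a query joining a local CSV with a MotherDuck table on `key`."""
    # Escape quotes in the path; table and key are trusted identifiers.
    path = "'" + local_csv.replace("'", "''") + "'"
    return (
        f"SELECT * FROM read_csv_auto({path}) AS local_rows "
        f"JOIN {cloud_table} AS cloud_rows USING ({key})"
    )
```

For example, con.execute(local_cloud_join_sql("/path/to/targets.csv", "analytics.sales", "region")).df() would return the joined rows as a DataFrame, assuming both sides have a region column.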

Practical Tips for First-Time Users

  • Store your token in an environment variable, not in scripts or notebooks.
  • Use EXPLAIN before running large S3 scans to confirm routing.
  • If you are joining a local file with a cloud table, load the local file first using read_csv_auto or read_parquet and alias it rather than attaching a local database.
  • For teams on the free tier approaching the 10 GB limit, use CREATE TABLE AS SELECT to consolidate redundant tables rather than storing raw and aggregated versions separately.
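The advice about keeping the token out of scripts applies equally to the AWS keys used for the S3 secret earlier in this guide. A minimal sketch, assuming the keys are exported under the standard AWS variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY; the helper name s3_secret_sql is hypothetical, and the statement it builds mirrors the CREATE SECRET example above.

```python
import os

def _quote(value):
    """Wrap a value as a SQL string literal, escaping embedded quotes."""
    return "'" + value.replace("'", "''") + "'"

def s3_secret_sql(env=None, region="us-east-1"):
    """Build the CREATE SECRET statement from AWS variables in the environment."""
    env = os.environ if env is None else env
    key_id = env["AWS_ACCESS_KEY_ID"]        # raises KeyError if unset
    secret = env["AWS_SECRET_ACCESS_KEY"]    # raises KeyError if unset
    return (
        "CREATE SECRET IF NOT EXISTS aws_creds IN MOTHERDUCK ("
        f"TYPE S3, KEY_ID {_quote(key_id)}, SECRET {_quote(secret)}, "
        f"REGION {_quote(region)})"
    )
```

Running con.execute(s3_secret_sql()) once creates the secret without the keys ever being written into a notebook or committed to version control.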

What to Do Next

After your first database is running, look into the dbt integration. MotherDuck maintains support for the dbt-duckdb adapter with a MotherDuck configuration target, which lets you run dbt models against cloud databases. The full setup is documented at motherduck.com/docs/integrations/dbt. For teams already using dbt Core with a local DuckDB target, switching the target to MotherDuck takes roughly five minutes.

FAQ

Is MotherDuck free to use?

Yes. MotherDuck offers a free tier with 10 GB of storage and a monthly compute allowance. No credit card is required to sign up. The free tier is suitable for solo analysts and small projects. Paid plans start at $250/month for teams.

How is MotherDuck different from local DuckDB?

Local DuckDB stores databases as files on your machine and processes queries locally. MotherDuck stores databases in the cloud so they persist across sessions, allows multiple users to query the same database, and can run compute in the cloud for large queries. Both use the same SQL dialect and DuckDB engine.

Can MotherDuck query files on Amazon S3?

Yes. You can create an S3 secret in MotherDuck once, and then query Parquet, CSV, or JSON files on S3 directly with SQL without loading the data first. MotherDuck routes S3 queries to its cloud compute engine automatically.

Does MotherDuck work with Python?

Yes. MotherDuck works through the standard DuckDB Python library (duckdb package, version 0.10 or later). You connect with duckdb.connect('md:') after setting your token as an environment variable. All DuckDB Python APIs work as normal once connected.

What is dual execution in MotherDuck?

Dual execution is MotherDuck's query routing system. The query planner automatically decides whether each part of a query runs on your local DuckDB instance or on MotherDuck's cloud compute. Queries on local data run locally at zero compute cost. Queries on cloud databases or S3 data run in MotherDuck's cloud. You can join local and cloud data in a single query.
