How to Set Up Apache Superset with Docker

Arkzero Research · Apr 23, 2026 · 8 min read

Last updated Apr 23, 2026

Apache Superset is a free, open-source business intelligence platform that lets teams build interactive dashboards and explore data without writing application code. The fastest way to get it running is with Docker Compose, which sets up the full stack including a metadata database, cache layer, and background workers in a single command. This guide covers the complete setup process, how to connect a real data source, and the first steps to building a working dashboard.

Apache Superset is one of the most widely deployed open-source BI tools in production. Originally built at Airbnb in 2015 and donated to the Apache Software Foundation, it now powers analytics for companies ranging from early-stage startups to large enterprises. As of April 2026, the project has more than 62,000 GitHub stars and an active contributor base releasing updates monthly.

Most setup guides either walk through a local pip install, which involves managing Python virtual environments and several manual configuration steps, or assume you already have a production server ready. This guide takes the middle path: a Docker Compose setup that runs the full Superset stack on your machine in under ten minutes, then covers how to connect it to a real database and build a chart you can share with your team.

What You Need Before Starting

You need Docker and the Docker Compose plugin installed and running. On macOS, Docker Desktop 4.x or later works without additional configuration. On Linux, install Docker Engine and the Docker Compose plugin. On Windows, use Docker Desktop with WSL 2 enabled.

You also need Git. Run git --version to confirm. If it is not installed, download it from git-scm.com.

No Python installation is required. The Docker images handle the runtime environment entirely.
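
A quick preflight check can confirm both prerequisites before you continue. The sketch below is one way to do it in a POSIX shell; missing_tools is a hypothetical helper name written for this guide, not something Docker or Superset provides.

```shell
# Print the names of any listed tools that are not on PATH.
# missing_tools is a hypothetical helper written for this guide.
missing_tools() {
  missing=""
  for tool in "$@"; do
    # command -v exits non-zero when the tool cannot be found
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Strip the leading space before printing
  printf '%s\n' "${missing# }"
}

result="$(missing_tools git docker)"
if [ -n "$result" ]; then
  echo "Missing tools: $result"
else
  echo "git and docker are available"
fi
```

If both tools are found, running docker compose version confirms that the Compose plugin is installed as well.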

Step 1: Clone the Superset Repository

Open your terminal and run:

git clone --depth=1 https://github.com/apache/superset.git
cd superset

The --depth=1 flag downloads only the latest commit rather than the full history, which cuts the download size significantly.

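Next, check out a stable release tag. This guide uses 4.1.2 as its example; substitute a newer stable tag if the list shows one. A shallow clone does not download release tags, so list them from the remote instead:

git ls-remote --tags --refs --sort=v:refname origin | tail -5

Then fetch that tag and check it out:

git fetch --depth=1 origin tag 4.1.2
git checkout tags/4.1.2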

Pinning to a release tag rather than running from the main branch means you get a tested, stable build rather than work-in-progress code.
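
Tag lists often include release candidates next to stable releases, so it can help to filter before picking one. The following is a minimal sketch, assuming stable tags are plain X.Y.Z numbers; latest_stable_tag is a hypothetical helper and the tag values in the example are illustrative only.

```shell
# Read tag names on stdin and print the highest plain X.Y.Z version.
# latest_stable_tag is a hypothetical helper, not a git or Superset command.
latest_stable_tag() {
  # Keep only tags that are exactly three dot-separated numbers,
  # which drops release candidates such as 4.1.0rc1.
  grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' | sort -V | tail -n 1
}

# Example with a hard-coded list of tag names (illustrative values only)
printf '%s\n' 4.0.2 4.1.0rc1 4.1.2 4.1.10 | latest_stable_tag
# prints 4.1.10
```

To feed it real tag names from the remote, one option is: git ls-remote --tags --refs origin | sed 's#.*refs/tags/##' | latest_stable_tag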

Step 2: Start the Stack with Docker Compose

Superset ships with a Docker Compose file designed specifically for local development and evaluation. Run:

docker compose -f docker-compose-image-tag.yml up

This command pulls the official Superset image along with PostgreSQL for metadata storage, Redis for caching, and Celery workers for asynchronous query execution. The first run takes several minutes while images download. Subsequent starts are fast because Docker caches the images locally.

When the output settles and you see lines indicating the web server is listening, Superset is ready. The default address is http://localhost:8088.
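
Rather than watching the logs, you can poll the web server until it responds. The sketch below assumes curl is installed and uses Superset's /health endpoint on the default port; wait_for_url and its default attempt count and delay are hypothetical choices made for this example.

```shell
# Poll a URL until it answers successfully or the attempts run out.
# wait_for_url is a hypothetical helper.
# Arguments: URL, max attempts (default 30), delay in seconds (default 5).
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors, -s silences progress output
    if curl -fs -o /dev/null --max-time 5 "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up after $attempts attempt(s)" >&2
  return 1
}

# Usage, assuming the stack is starting on the default port:
#   wait_for_url http://localhost:8088/health
```

The function returns a non-zero status on timeout, so it can gate later steps in a script.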

What the compose file sets up:

  • superset_app: the main web server on port 8088
  • superset_worker: Celery worker for async queries and chart caching
  • superset_worker_beat: scheduler for periodic jobs
  • db: PostgreSQL 15 storing dashboards, users, and chart definitions
  • redis: Redis 7 handling the task queue and query cache

All five services are configured to talk to each other automatically. You do not need to touch any network or connection string settings to get a working local environment.

Step 3: Log In and Explore the Interface

Navigate to http://localhost:8088. The default credentials are:

  • Username: admin
  • Password: admin

The first thing to do after logging in is change the admin password. Go to Settings in the top right corner, select User Info, and update the password. The default credentials are well-known and should not stay in place even on a local instance if it is accessible beyond your own machine.

Superset's interface has four main sections visible in the top navigation. Dashboards shows your published collections of charts. Charts is where individual visualizations live. Datasets is where you define which tables Superset can query. SQL Lab is a full SQL editor with autocomplete and result export.

A set of example charts and dashboards is loaded automatically on first run. These use a bundled SQLite database with sample data, so you can explore the interface immediately without connecting to an external source.

Step 4: Connect Your Own Database

Superset can connect to more than 40 database types through SQLAlchemy drivers. The Docker image includes drivers for PostgreSQL and MySQL out of the box. For other databases such as Snowflake, BigQuery, Redshift, or DuckDB, you need to install the driver inside the running container.

To add a database connection, go to Settings in the top navigation and select Database Connections. Click the plus icon and choose your database type from the dropdown. Superset will show you the required connection string format for that database.

For a PostgreSQL database, the connection string follows this pattern:

postgresql://username:password@hostname:5432/database_name

For a local PostgreSQL instance running on your machine (not inside Docker), use host.docker.internal as the hostname on macOS and Windows, or your machine's network IP on Linux:

postgresql://myuser:mypassword@host.docker.internal:5432/mydb

After entering the connection string, click Test Connection to verify it works before saving.
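
Characters such as @, : and / in a username or password must be percent-encoded, or the connection string will be parsed incorrectly. The sketch below builds an encoded PostgreSQL URI in a POSIX shell. It handles ASCII input only, and urlencode and pg_uri are hypothetical helper names, not part of Superset or SQLAlchemy.

```shell
# Percent-encode a string for use inside a URI (ASCII input only).
# urlencode is a hypothetical helper written for this guide.
urlencode() {
  rest="$1"
  encoded=""
  while [ -n "$rest" ]; do
    tail_part="${rest#?}"            # everything after the first character
    char="${rest%"$tail_part"}"      # the first character itself
    rest="$tail_part"
    case "$char" in
      [A-Za-z0-9._~-]) encoded="$encoded$char" ;;
      # printf with a leading quote yields the character's numeric code
      *) encoded="$encoded$(printf '%%%02X' "'$char")" ;;
    esac
  done
  printf '%s\n' "$encoded"
}

# Build a PostgreSQL URI. Arguments: user, password, host, port, database.
pg_uri() {
  printf 'postgresql://%s:%s@%s:%s/%s\n' \
    "$(urlencode "$1")" "$(urlencode "$2")" "$3" "$4" "$5"
}

pg_uri myuser 'p@ss:w/rd' host.docker.internal 5432 mydb
# prints postgresql://myuser:p%40ss%3Aw%2Frd@host.docker.internal:5432/mydb
```

Paste the printed string into the connection form and use Test Connection as usual.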

Installing additional drivers: If you need a driver that is not included in the base image, exec into the running container and install it:

docker exec -it superset_app bash
pip install snowflake-sqlalchemy

This change does not survive the container being recreated, for example after docker compose down or an image update. For a durable setup, create a custom Dockerfile that extends the official Superset image and adds your required packages.
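
A custom image along those lines might look like the sketch below, which writes a small Dockerfile from the shell. It assumes the apache/superset:4.1.2 image tag matches the release you run and that the image provides an unprivileged superset user; the superset-custom directory and image name are arbitrary choices for this example.

```shell
# Write a Dockerfile that extends the official image with one extra driver.
# Assumptions: apache/superset:4.1.2 matches the release you run, and the
# directory name superset-custom is arbitrary.
mkdir -p superset-custom
cat > superset-custom/Dockerfile <<'EOF'
FROM apache/superset:4.1.2
# Switch to root to install packages, then drop back to the unprivileged user
USER root
RUN pip install --no-cache-dir snowflake-sqlalchemy
USER superset
EOF

# Build it yourself when ready (not run by this snippet):
#   docker build -t superset-custom:4.1.2 superset-custom
```

To use the result, the image reference in the compose file would need to point at the image you built instead of the official one.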

Step 5: Create a Dataset and Build a Chart

Once a database connection is saved, you can register a table as a dataset. Go to Datasets, click the plus icon, select your database and schema, then choose the table. Superset creates a dataset entry that stores column metadata and makes the table available in the chart builder.

With a dataset registered, go to Charts, click the plus icon, and select your dataset. Choose a visualization type from the gallery. Superset offers more than 40 chart types including bar charts, line charts, scatter plots, maps, pivot tables, and time-series forecasts.

The chart builder interface has three panels: the visualization type selector on the left, configuration controls in the center, and a live preview on the right. Set your metric (which column to aggregate and how), your dimension (how to group results), and any filters, then click Create Chart. Save the chart and add it to a dashboard to make it accessible to other users with a shared URL.

Common Issues and How to Fix Them

Port 8088 already in use. Another service is listening on that port. Either stop the conflicting service or change Superset's port by editing the ports entry in the compose file from "8088:8088" to "8090:8088" and restarting.

Database connection refused. When connecting to a database running on your host machine from inside Docker, localhost refers to the container, not your machine. Use host.docker.internal on macOS and Windows. On Linux, add --add-host=host.docker.internal:host-gateway to the Docker run command or set it in the compose file.
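
On Linux, one way to set the host mapping in Compose is a small extra file merged with the main one. This is a sketch under two assumptions: that the web service is named superset in the Superset compose file, and that docker-compose-linux-host.yml is an acceptable name for the extra file (it is arbitrary).

```shell
# Write a small Compose file that maps host.docker.internal to the host gateway.
# Assumptions: the web service is named "superset" in the Superset compose file,
# and docker-compose-linux-host.yml is an arbitrary file name chosen here.
cat > docker-compose-linux-host.yml <<'EOF'
services:
  superset:
    extra_hosts:
      - "host.docker.internal:host-gateway"
EOF

# Start the stack with both files so Compose merges them (not run by this snippet):
#   docker compose -f docker-compose-image-tag.yml -f docker-compose-linux-host.yml up
```

If you run queries asynchronously, the worker service likely needs the same extra_hosts entry, since it opens its own database connections.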

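Changes not appearing after restart. The compose file mounts a local config directory. If you edited configuration files, run docker compose -f docker-compose-image-tag.yml restart superset rather than a full down-and-up cycle to apply changes faster. The restart command takes the Compose service name (superset), not the container name (superset_app).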

Image is out of date. Run docker compose -f docker-compose-image-tag.yml pull before bringing the stack up to ensure you have the latest version of each image matching the tag in the compose file.

What Superset Does Not Handle Out of the Box

Superset is powerful for teams that already have structured data in a database or warehouse. If your data lives in spreadsheets, CSV exports, or ad-hoc files, you need to load it into a database first before Superset can query it. For exploratory work with raw files, a tool like VSLZ lets you upload a file directly and get charts and analysis without any database setup or connection strings.

For production deployments, the Docker Compose setup used in this guide is not recommended. Apache Superset's documentation covers Helm chart deployment for Kubernetes environments, which adds proper secret management, horizontal scaling, and persistent storage configuration for production use.

Summary

Getting Apache Superset running locally takes three commands: clone the repo, check out a stable tag, and run Docker Compose. From there, connecting a real database and building your first chart takes another ten to fifteen minutes. The result is a fully functional, self-hosted BI platform with no per-seat licensing cost and no data leaving your infrastructure.

FAQ

Is Apache Superset free to use?

Yes. Apache Superset is released under the Apache 2.0 license, which means it is free to use, modify, and distribute for any purpose including commercial use. There is no paid tier or feature gating in the open-source version. Preset.io offers a managed hosted version with paid plans for teams that want cloud hosting and enterprise support without managing the infrastructure themselves.

Can Apache Superset connect to Google Sheets or Excel files?

Superset connects to databases and data warehouses through SQLAlchemy drivers. It does not have a built-in connector for Google Sheets or Excel files directly. To use spreadsheet data, you would first need to load it into a supported database such as PostgreSQL, SQLite, or a cloud warehouse. The Google Sheets API can be accessed through a third-party SQLAlchemy driver, but this requires additional configuration.

What databases does Apache Superset support?

Superset supports more than 40 databases through SQLAlchemy, including PostgreSQL, MySQL, SQLite, Snowflake, BigQuery, Redshift, Databricks, DuckDB, ClickHouse, Presto, Trino, and many others. Support for each database requires the corresponding Python driver to be installed in the Superset environment. The official documentation lists every supported database and the required pip package for each one.

How do I update Apache Superset to a newer version?

For a Docker Compose setup, update by pulling the newer image tag. Check the Superset releases page on GitHub for the latest stable tag, update the image tag references in your compose file or environment variable, run `docker compose pull` to download the new images, then run `docker compose up` to restart with the updated version. Always check the release notes for migration steps before upgrading, as some versions require database schema migrations.
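
The upgrade steps can be scripted roughly as follows. This sketch assumes the compose file reads the image tag from a TAG environment variable; run and upgrade_superset are hypothetical helpers, and DRY_RUN defaults to 1 so the commands are printed rather than executed until you opt in.

```shell
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"
COMPOSE_FILE_NAME="docker-compose-image-tag.yml"

# Run a command, or just print it when DRY_RUN is 1.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Pull and start the stack for a given image tag.
# Assumption: the compose file reads the image tag from the TAG variable.
upgrade_superset() {
  TAG="$1"
  export TAG
  run docker compose -f "$COMPOSE_FILE_NAME" pull
  run docker compose -f "$COMPOSE_FILE_NAME" up
}

upgrade_superset 4.1.2
```

Set DRY_RUN=0 in the environment once the printed commands look right for your setup.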

What is the difference between Superset and Metabase?

Both are open-source BI tools that connect to databases and let users build dashboards without writing application code. Superset is more technically oriented, with a full SQL editor, support for custom SQL metrics, and a wider range of chart types. Metabase is designed for non-technical users with a simpler question-building interface and easier initial setup. Superset requires more configuration to get running but gives data teams more control over how data is defined and queried.
