
How to Set Up Apache Superset for Analytics

Arkzero Research · Mar 27, 2026 · 8 min read

Last updated Mar 27, 2026

Apache Superset is a free, open-source business intelligence platform that lets teams connect databases, build charts, and publish shared dashboards without a Tableau or Power BI subscription. It supports over 40 chart types and connects to more than 40 databases. Setup using Docker Compose takes under 15 minutes. This guide covers installation, connecting a database, building a chart, and assembling a dashboard, with fixes for the most common problems first-time users encounter.

What Apache Superset Does

Apache Superset started at Airbnb in 2015 as an internal tool for fast data exploration. Airbnb donated it to the Apache Software Foundation in 2017. Today it is one of the most widely deployed open-source BI platforms available. According to Enlyft data, approximately 3,900 organizations use Superset, with the strongest adoption at companies with 50 to 200 employees.

The platform has two primary interfaces. Explore is a point-and-click chart builder where users select a dataset, choose a chart type, and configure axes and filters through dropdown menus without writing SQL. SQL Lab is an in-browser SQL editor available for users who want to write custom queries or build reusable datasets from complex joins. Business users stay in Explore. Analysts use SQL Lab when they need computations not available as pre-built columns.

Supported chart types include bar charts, line charts, area charts, scatter plots, pie charts, world maps, heat maps, time series charts, and pivot tables. Superset connects to PostgreSQL, MySQL, SQLite, Amazon Redshift, Google BigQuery, Snowflake, Microsoft SQL Server, and others, more than 40 databases in total, through SQLAlchemy drivers.

Before You Start

The setup in this guide requires Docker Desktop. Docker manages all dependencies including Python, Redis, and the Superset web server, so no separate Python installation is required.

Confirm Docker is installed by opening a terminal and running:

docker --version

If it is not installed, download Docker Desktop from docker.com. The free personal tier covers everything needed. Windows users should enable WSL 2 (Windows Subsystem for Linux) during the Docker Desktop installation, which the installer prompts for automatically.

You also need Git:

git --version

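On macOS, Git is included with the Xcode Command Line Tools; if they are missing, running the command above prompts you to install them. Windows users can download Git from git-scm.com.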

System requirements: 8 GB RAM is recommended. Superset runs five Docker containers concurrently including the web server, a worker, a scheduler, Redis, and a PostgreSQL metadata store. Machines with fewer than 6 GB available RAM will experience slow startups or random container restarts.
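The two prerequisite checks above can also be scripted. The following is a minimal Python sketch, assuming Python 3 happens to be available on the host (Superset itself does not need it). The helper name missing_tools is hypothetical, and the check only confirms that the executables are on PATH, not that the Docker daemon is running.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(["docker", "git"])
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("docker and git are both on PATH")
```

The which parameter exists only so the lookup can be swapped out in tests; in normal use the default shutil.which is enough.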

Step 1: Clone and Start Superset

Open a terminal, navigate to the folder where you want to store Superset, and run these commands in order:

git clone https://github.com/apache/superset.git
cd superset
docker compose -f docker-compose-image-tag.yml up

The first run pulls Docker images and takes 3 to 8 minutes on a typical internet connection. Watch the terminal output. When log lines stop scrolling and a message indicates the web server has started on port 8088, open a browser and go to:

http://localhost:8088
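Instead of watching the logs, startup can be polled from a script. This is a rough sketch, assuming the web server answers HTTP 200 at the /health path once it is ready; verify that path against your Superset version. The function names, the 60 attempts, and the 5-second delay are illustrative choices, not values from the Superset documentation.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5):
    """Return the HTTP status code for `url`, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 503 while booting)
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, reset, or timed out: the server is not up yet
        return None


def wait_for_superset(
    url="http://localhost:8088/health",  # assumed health endpoint
    attempts=60,
    delay=5.0,
    probe=http_status,
    sleep=time.sleep,
):
    """Poll `url` until it returns 200. True on success, False if attempts run out."""
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling wait_for_superset() in a second terminal after starting the containers returns True once the login page should be reachable.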

Log in with the default credentials:

  • Username: admin
  • Password: admin

Change the admin password immediately after first login: go to Settings > Security > List Users, select the admin user, and click Edit.

Step 2: Connect a Database

Superset reads from databases, not directly from files on disk. The fastest option for getting started is SQLite, which requires no external database server. For production use, connect your existing PostgreSQL, MySQL, or cloud data warehouse.

Click the plus icon in the top navigation and select Data > Connect database. A dialog appears with icons for common databases.

For PostgreSQL, click the PostgreSQL icon and enter the connection string:

postgresql://username:password@host:5432/database_name

Click Test Connection before saving. A green checkmark appears if the connection succeeds.
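Passwords that contain characters such as @, :, or / break the connection string unless they are percent-encoded. A small Python sketch of assembling the string safely follows; the function name postgres_uri and the sample values are hypothetical.

```python
from urllib.parse import quote


def postgres_uri(username, password, host, database, port=5432):
    """Build a postgresql:// URI with the user name and password percent-encoded."""
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"postgresql://{user}:{secret}@{host}:{port}/{database}"


# Hypothetical credentials, for illustration only
print(postgres_uri("analyst", "p@ss:word/1", "db.example.com", "sales"))
```

Paste the resulting string into the connection dialog in place of the hand-written one.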

For SQLite, select SQLite and provide a local file path:

/app/superset_home/example.db

Superset creates the file if it does not exist. This is sufficient for loading small datasets during initial evaluation.

For cloud databases, Redshift, BigQuery, and Snowflake each appear as separate icons in the dialog. BigQuery requires a service account JSON key. Snowflake requires an account identifier, which is the part of your Snowflake URL that comes before .snowflakecomputing.com.
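As an illustration of where the Snowflake account identifier sits, the sketch below pulls it out of a URL by taking everything in the hostname before the .snowflakecomputing.com suffix. The helper name snowflake_account and the example URL are hypothetical.

```python
from urllib.parse import urlparse

SNOWFLAKE_SUFFIX = ".snowflakecomputing.com"


def snowflake_account(url):
    """Return the hostname part before '.snowflakecomputing.com', or None."""
    # urlparse only fills in the hostname when a scheme is present
    parsed = urlparse(url if "://" in url else "https://" + url)
    host = (parsed.hostname or "").lower()
    if not host.endswith(SNOWFLAKE_SUFFIX):
        return None
    return host[: -len(SNOWFLAKE_SUFFIX)] or None


# Hypothetical URL, for illustration only
print(snowflake_account("https://xy12345.us-east-1.snowflakecomputing.com/console"))
```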

Step 3: Register a Dataset or Upload a CSV

With a database connected, register the table you want to chart. Go to Data > Datasets > Add a new dataset. Select your database, schema, and table from the dropdown menus and click Add. Superset inspects the table and registers it as a dataset available in Explore.

For CSV files, Superset includes a built-in upload feature. Enable it by opening the file docker/pythonpath_dev/superset_config.py in the cloned repository and adding:

ALLOW_CSV_UPLOAD = True
CSV_EXTENSIONS = {"csv"}

Then restart the containers:

docker compose -f docker-compose-image-tag.yml restart

After restart, go to Data > Upload a CSV. Select your file, enter a table name, and choose the SQLite database created in Step 2. Superset imports the data into a new table and registers it as a dataset automatically.
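Malformed files are a common reason an import produces an empty or partial table, so it can help to sanity-check a CSV before uploading it. The following is a minimal sketch using only the Python standard library; the function name csv_problems and the specific checks are illustrative and are not how Superset itself validates uploads.

```python
import csv
import io


def csv_problems(text):
    """Return a list of problems found in CSV `text` (an empty list means none)."""
    records = list(csv.reader(io.StringIO(text)))
    if not records:
        return ["file is empty"]
    header = [name.strip() for name in records[0]]
    problems = []
    if any(name == "" for name in header):
        problems.append("header has a blank column name")
    if len(set(header)) != len(header):
        problems.append("header has duplicate column names")
    data_records = 0
    for number, record in enumerate(records[1:], start=2):
        if not record:
            continue  # ignore completely blank lines
        data_records += 1
        if len(record) != len(header):
            problems.append(
                f"record {number} has {len(record)} fields, expected {len(header)}"
            )
    if data_records == 0:
        problems.append("no data rows")
    return problems
```

To check a file on disk, read it first, for example csv_problems(open("sales.csv", newline="").read()), where sales.csv is a placeholder name.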

Step 4: Build a Chart

With a dataset registered, go to Charts > Create chart. Select your dataset and a chart type. The Explore view opens.

To build a bar chart showing sales by category:

  1. Set Chart Type to Bar Chart
  2. Under X Axis, select your category column
  3. Under Metrics, click Add metric and choose SUM of your sales column
  4. Click Update Chart to preview

Superset re-renders the chart each time you click Update Chart. You can change the grouping, adjust the time range, add filters, or switch chart types by clicking the Chart Type selector at the top without losing your other settings. When satisfied, click Save and give the chart a name.
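Conceptually, the sales-by-category bar chart is a grouped aggregation: one bar per category, with the bar height equal to the sum of sales. The sketch below reproduces that aggregation against an in-memory SQLite table. The table name orders, the column names, and the sample rows are made up for illustration, and the SQL that Superset actually generates will differ in its details.

```python
import sqlite3


def sales_by_category(rows):
    """Sum sales per category, largest total first."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (category TEXT, sales REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT category, SUM(sales) AS total_sales "
            "FROM orders GROUP BY category ORDER BY total_sales DESC"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical sample data
print(sales_by_category([("books", 10.0), ("toys", 5.0), ("books", 2.5)]))
```

In Explore terms, category is the X Axis column and SUM(sales) is the metric.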

Charts can also originate in SQL Lab. Write and run a query there, then click Explore to open the Explore view with the query result as the data source. This is useful when the calculation you need does not exist as a pre-aggregated column in the source table.

Step 5: Build a Dashboard

Dashboards group multiple charts into a single shareable view. Go to Dashboards > New Dashboard and enter a name.

Open the dashboard and click Edit Dashboard. A right sidebar shows your saved charts. Drag charts onto the canvas. Resize tiles by dragging the lower-right corner. Add text blocks, headers, and dividers from the component panel to organize the layout.

Set dashboard-level filters by clicking Filters in the top bar. A filter widget can control multiple charts simultaneously. Adding a date range filter lets viewers switch between last week, last month, and last quarter without opening each chart individually.

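When the layout is ready, click Save. The dashboard is accessible by URL to any user with a Superset login. To share it with team members, create accounts for them via Settings > Security > List Users > Add User. Assign the built-in Gamma role to people who should see dashboards but not create or edit charts; Superset's default roles are Admin, Alpha, and Gamma, and there is no role named Viewer out of the box.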

Common Problems and Fixes

Containers take more than 10 minutes to start. Open Docker Desktop > Settings > Resources and increase RAM to at least 6 GB. Reduce other running containers if needed.

Port 8088 is already in use. Edit docker-compose-image-tag.yml and change the port mapping from "8088:8088" to "8089:8088", then access Superset at http://localhost:8089.
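To confirm whether something is already listening on the port before editing the compose file, a short check like the one below can help. It is a sketch with a hypothetical function name, and it only tests whether a TCP connection to the local port succeeds.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for candidate in (8088, 8089):
        state = "in use" if port_in_use(candidate) else "free"
        print(f"port {candidate}: {state}")
```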

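CSV upload option is missing from the menu. The config flag is not set. Add ALLOW_CSV_UPLOAD = True and CSV_EXTENSIONS = {"csv"} to docker/pythonpath_dev/superset_config.py, then run docker compose -f docker-compose-image-tag.yml restart. If the option is still missing, edit the target database connection and check that file uploads are allowed for it in the Advanced > Security settings.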

Charts load but show no data. Open SQL Lab and run SELECT COUNT(*) FROM your_table_name. If the count is zero, the CSV import did not complete. Check the upload log in Data > Upload a CSV for error messages.

When Superset Fits Your Team

Superset works best for teams that already have data in a SQL database and want to give business users a self-service exploration interface without paying for Tableau or Power BI. It is a strong fit when Docker is accessible, someone can handle the initial setup, and the team's data lives in a connected database.

It is less suited for teams whose data lives primarily in spreadsheets or CSV files without a central database. Setting up Docker and a SQLite database to analyze a single file adds overhead that exceeds the task. For those scenarios, VSLZ AI accepts a CSV upload and runs analysis from a plain-English prompt without any infrastructure configuration.

For production deployments beyond a local machine, the official Superset path moves to Kubernetes. Preset, a managed cloud service built on Superset, offers hosted plans that eliminate self-hosting requirements for teams that prefer a fully managed option.

FAQ

Is Apache Superset free to use?

Yes. Apache Superset is licensed under the Apache 2.0 open-source license and is free to use for both commercial and personal purposes with no licensing fees. Infrastructure costs apply if you self-host. A managed hosted version called Preset is available at preset.io for teams that prefer not to run their own server.

What databases does Apache Superset connect to?

Superset connects to more than 40 databases including PostgreSQL, MySQL, SQLite, Amazon Redshift, Google BigQuery, Snowflake, Microsoft SQL Server, MariaDB, Presto, Trino, and others. Connections use SQLAlchemy drivers. The Docker Compose setup pre-installs drivers for the most common databases. Less common databases require manually installing the Python driver inside the container before the connection will succeed.

Does Apache Superset require SQL knowledge?

No. The Explore interface lets users build charts by selecting columns from dropdown menus without writing SQL. Filters, aggregations, and groupings are all point-and-click. SQL Lab, the built-in SQL editor, is optional and available for users who want to write custom queries or build datasets from complex joins. Basic chart and dashboard creation requires no coding.

Can Apache Superset read CSV files directly?

Yes, after enabling one configuration setting. Superset includes a CSV upload feature that imports a file into a SQLite table and registers it as a dataset. The feature is disabled by default. Enable it by adding ALLOW_CSV_UPLOAD = True and CSV_EXTENSIONS = {"csv"} to the superset_config.py file, then restarting the containers. Once enabled, CSV files can be uploaded directly through the Superset interface under Data > Upload a CSV.

How does Apache Superset compare to Metabase?

Both are open-source BI tools with no licensing fees. Superset offers more chart types and greater configurability but has a more complex initial setup and administration overhead. Metabase installs faster, has a simpler interface, and requires less technical knowledge to maintain, making it better suited for small teams without dedicated technical resources. Superset is preferred when more visualization variety, tighter database governance, or enterprise-scale deployment is needed. Both require a database connection and do not support direct file analysis without a setup step.
