How to Get Started with Apache Superset

Arkzero Research · Apr 25, 2026 · 6 min read

Last updated Apr 25, 2026

Apache Superset is an open-source business intelligence platform that lets analysts and operations managers build interactive dashboards and explore data without writing code. The current major version, 6.0, was released in December 2025, and it connects to dozens of databases including PostgreSQL, MySQL, BigQuery, and Snowflake. You can install it locally with Docker Compose, connect a data source, and start building charts in under 30 minutes. Over 1,000 organizations run Superset in production, from startups to large enterprises.

Apache Superset is a free, open-source alternative to Tableau and Power BI. You connect it to your database, upload a CSV, or link a cloud warehouse, then build charts and dashboards through a point-and-click interface. Basic use requires no SQL expertise. Organizations running Superset in production include Airbnb, Twitter, and Nielsen. The tool ships with over 40 chart types and a SQL editor for analysts who want to go deeper.

What You Need Before Starting

Superset version 6.0 (released December 18, 2025) requires:

  • Docker Desktop (version 4.x or later) with Docker Compose v2
  • At least 6 GB of RAM allocated to Docker
  • Git installed locally

You do not need Python or any analytics library installed directly. The Docker image bundles everything.

Install Superset with Docker Compose

Clone the official repository and check out the latest stable tag:

git clone https://github.com/apache/superset.git
cd superset
git checkout tags/6.0.0

Start the full stack with Docker Compose:

docker compose -f docker-compose-image-tag.yml up

This pulls the prebuilt image from Docker Hub, so you skip a local build. On a standard broadband connection the download takes roughly 3 to 5 minutes. Once containers stabilize, open your browser at http://localhost:8088.
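If you would rather not refresh the browser while the containers start, a small polling script can tell you when the web server answers. The sketch below is one way to do that with the Python standard library. It assumes the instance answers on a /health route at the default port; the function names, retry count, and delay are illustrative choices, not part of Superset.

```python
import time
import urllib.error
import urllib.request


def http_probe(url: str) -> bool:
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(url, probe=http_probe, attempts=60, delay=5.0, sleep=time.sleep):
    """Poll `url` until `probe` succeeds.

    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        sleep(delay)
    return None


# Example call (assumed health route on the default port):
#   wait_until_ready("http://localhost:8088/health")
```

The probe and sleep functions are passed in as parameters so the loop can be exercised without a running server.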

Default credentials are admin / admin. Change the password immediately in Settings > Security if you plan to expose the instance on a shared network.

If you want to skip the server setup entirely, tools like VSLZ let you upload a CSV or connect a database and start exploring data from a browser tab with no Docker installation required.

Connect a Database

Go to Settings > Database Connections > + Database. Superset 6.0 supports over 40 database engines through SQLAlchemy connection strings. Common choices:

PostgreSQL

postgresql://username:password@host:5432/dbname

MySQL

mysql://username:password@host:3306/dbname

BigQuery requires uploading your service account JSON file through the BigQuery-specific connection panel rather than a raw connection string.

SQLite works with a file path and is useful for local testing:

sqlite:////path/to/your/file.db

Test the connection before saving. Superset runs a SELECT 1 against the target to confirm access. If the connection fails, check that Docker can reach the host (use host.docker.internal instead of localhost on macOS and Windows).
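Two details commonly break a connection string: special characters in the password, and localhost pointing at the container instead of your machine. The helper below is a minimal sketch that handles both. The function name build_uri, its in_docker flag, and the sample credentials are hypothetical and used only for illustration.

```python
from urllib.parse import quote


def build_uri(dialect, username, password, host, port, dbname, in_docker=True):
    """Assemble a SQLAlchemy-style URI, escaping credentials and fixing up localhost."""
    # Inside a container, "localhost" refers to the container itself.
    if in_docker and host in ("localhost", "127.0.0.1"):
        host = "host.docker.internal"
    # Percent-encode characters such as "@" and "/" that would confuse URI parsing.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{dialect}://{user}:{secret}@{host}:{port}/{dbname}"


print(build_uri("postgresql", "analyst", "p@ss/word", "localhost", 5432, "sales"))
# prints postgresql://analyst:p%40ss%2Fword@host.docker.internal:5432/sales
```

Paste the resulting string into the connection form and run the connection test as described above.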

Load a Dataset

After connecting a database, go to Datasets > + Dataset. Select the database, schema, and table. Superset treats each table or view as a dataset. You configure the timestamp column here, which drives time-filter behavior across all charts built on that dataset.

For a CSV, go to Datasets > Upload a CSV. Superset creates a table from the file in the database you select for the upload, which must have file uploads enabled, and registers it as a dataset automatically. Column types are inferred on import. If a date column comes in as text, click Edit Dataset and set the column type manually.
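One way to reduce surprises with date columns is to check the file before uploading it. The sketch below lists rows whose date value is not an ISO 8601 date. It assumes a non-ISO format is what causes the column to land as text in your case, which is a guess about your file rather than a statement about Superset's inference rules. The column name and sample data are made up.

```python
import csv
import io
from datetime import date


def non_iso_dates(csv_text, column):
    """Return (row_number, value) pairs whose `column` does not parse as an ISO date."""
    bad = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        try:
            date.fromisoformat(row[column])
        except ValueError:
            bad.append((row_number, row[column]))
    return bad


sample = "order_date,revenue\n2026-01-05,120\n05/01/2026,80\n2026-01-07,95\n"
print(non_iso_dates(sample, "order_date"))
# prints [(3, '05/01/2026')]
```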

Calculated columns are another useful feature. In the dataset editor, click + Add calculated column to write SQL expressions like:

revenue - cost

Saved under the column name margin, this expression becomes a derived field that appears in the chart builder without modifying your source table.
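To see why the source table stays untouched, it helps to think of a calculated column as an expression evaluated at query time. The sketch below reproduces that idea with an in-memory SQLite database. The orders table and its values are invented, and the query is an illustration of the concept, not the SQL Superset generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (product TEXT, revenue REAL, cost REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("widget", 120.0, 70.0), ("gadget", 80.0, 95.0)],
)

# The expression is computed in the SELECT; nothing is written back to the table.
rows = conn.execute(
    "SELECT product, revenue - cost AS margin FROM orders ORDER BY product"
).fetchall()
print(rows)
# prints [('gadget', -15.0), ('widget', 50.0)]

# The stored columns are unchanged: there is no physical "margin" column.
columns = [info[1] for info in conn.execute("PRAGMA table_info(orders)")]
print(columns)
# prints ['product', 'revenue', 'cost']
```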

Build Your First Chart

Open Charts > + Chart, select your dataset, and pick a chart type. The most useful starting types are:

  • Table: shows paginated rows with conditional formatting. Good for any "show me the top 50 customers" request.
  • Bar Chart: categorical comparisons. Drop a dimension into the X axis and a metric into the Y axis.
  • Line Chart: time series. Requires a timestamp dimension on X.
  • Big Number: single KPI figure with sparkline. Common on executive dashboards.

The chart builder uses a drag-and-drop interface. Dimensions go on the X axis or in the group-by panel. Metrics are aggregations: SUM, COUNT, AVG, or custom SQL. Click Update Chart to preview, then Save to add it to a dashboard.
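Conceptually, a dimension plus a metric corresponds to a GROUP BY query with an aggregate. The following sketch runs such a query against an in-memory SQLite table to show what a bar chart of revenue by region would be built from. The table, the column names, and the exact query shape are assumptions for illustration; the SQL Superset actually emits depends on the database and chart settings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 100.0), ("west", 40.0), ("east", 60.0), ("west", 10.0), ("north", 25.0)],
)

# region is the dimension; SUM, COUNT, and AVG are the metrics.
query = """
SELECT region, SUM(revenue) AS total, COUNT(*) AS n, AVG(revenue) AS average
FROM orders
GROUP BY region
ORDER BY total DESC
"""
bars = conn.execute(query).fetchall()
for bar in bars:
    print(bar)
# prints
# ('east', 160.0, 2, 80.0)
# ('west', 50.0, 2, 25.0)
# ('north', 25.0, 1, 25.0)
```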

Assemble a Dashboard

Go to Dashboards > + Dashboard. Type a title, then click Edit Dashboard. Drag charts from the right panel onto the canvas. Resize by dragging chart corners. Add markdown text blocks for headers and explanations using the Text component.

Filters are configured in the filter bar (click the filter icon in the top-left of the dashboard). A single date filter applied at the dashboard level controls all time-series charts simultaneously, which is a common requirement for operations dashboards.

Publish the dashboard with the Draft > Published toggle. Shared links respect role-based access control, so you can give read-only access to a stakeholder without granting chart edit rights.

What Changed in Superset 6.0

The December 2025 release brought three changes that matter for day-to-day use:

Streaming CSV exports. Datasets with more than 100,000 rows previously timed out on export. Version 6.0 streams the export progressively, so a 2 million-row dataset exports cleanly.

MCP integration. Superset can now expose dashboard data to AI agents via the Model Context Protocol. An AI assistant connected to your Superset instance can query datasets, retrieve chart data, and embed results in responses without requiring direct database access.

Drag-and-drop dashboard tabs. Reordering tabs in multi-tab dashboards previously required editing JSON. You now drag tabs to reorder them directly in the editor.

Version 6.1 is in release candidate as of March 2026. Its main additions are expanded Cloudflare D1 support and improved TypeScript coverage across the frontend.

Practical Next Steps

Once your first dashboard is live, two things are worth configuring immediately. First, set up role-based access control in Settings > Security > List Roles to separate admin, analyst, and viewer permissions. Second, enable caching in the Superset config to avoid re-running expensive warehouse queries on every dashboard load. The default Docker Compose setup includes a Redis container for this purpose; you enable it by setting CACHE_CONFIG in superset_config.py.
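A caching block in superset_config.py might look like the fragment below. Treat it as a sketch under assumptions: the Redis hostname redis, the port, the database number, the key prefixes, and the timeout are illustrative values to adapt to your deployment, and your Docker Compose checkout may already ship a similar block.

```python
# superset_config.py (fragment)
# Assumed values: Redis reachable at host "redis", port 6379, logical database 1.
CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",        # Flask-Caching backend name
    "CACHE_DEFAULT_TIMEOUT": 300,      # seconds before an entry expires (illustrative)
    "CACHE_KEY_PREFIX": "superset_",
    "CACHE_REDIS_HOST": "redis",
    "CACHE_REDIS_PORT": 6379,
    "CACHE_REDIS_DB": 1,
}

# Chart query results are governed by a separate setting; here it reuses the
# same Redis backend with a different key prefix.
DATA_CACHE_CONFIG = {**CACHE_CONFIG, "CACHE_KEY_PREFIX": "superset_data_"}
```

Restart the Superset containers after editing the file so the new settings are loaded.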

Superset's GitHub repository has 63,000 stars and an active contributor community. The official Slack workspace (linked from superset.apache.org) is the fastest path to support for setup issues specific to your database or infrastructure.

FAQ

Is Apache Superset free to use?

Yes. Apache Superset is fully open-source under the Apache 2.0 license. The project itself has no paid tier, no usage limits, and no licensing fees. You host and manage it yourself. If you prefer not to run your own infrastructure, Preset (preset.io) offers a managed cloud version of Superset with a paid tier.

What databases does Apache Superset support?

Superset supports over 40 database engines via SQLAlchemy drivers. This includes PostgreSQL, MySQL, SQLite, BigQuery, Snowflake, Redshift, Databricks, Trino, Presto, ClickHouse, DuckDB, and Microsoft SQL Server. Each requires the appropriate Python driver installed in the Superset environment. The Docker Compose quickstart includes drivers for the most common databases.

How is Apache Superset different from Metabase?

Both are open-source BI tools, but they target different audiences. Metabase is optimized for non-technical users: it hides SQL by default and makes simple dashboards fast to build. Superset has a steeper setup curve (the quickstart in this guide relies on Docker Compose) but offers more chart types, a full SQL editor, a semantic layer for calculated metrics, and better support for large datasets. Teams with an analyst or data engineer on staff typically get more from Superset.

Can I upload a CSV file to Apache Superset?

Yes. Go to Datasets > Upload a CSV. Superset creates a table from the file in the database you select for the upload and registers it as a dataset. Column types are inferred automatically, but you can edit them in the dataset configuration. CSV uploads are best for files under 50 MB; for larger files, loading data directly into a database and connecting Superset to that database is more reliable.

What are the system requirements for Apache Superset?

For the Docker Compose installation, you need Docker Desktop with at least 6 GB of RAM allocated to Docker and Docker Compose v2. The host machine should have at least 4 CPU cores for a smooth experience. Superset 6.0 requires Python 3.10 or higher for source installations. A production deployment typically runs on a VM or Kubernetes cluster with at least 8 GB RAM and 4 vCPUs.
