How to Get Started with Kestra

Arkzero Research · Apr 24, 2026 · 6 min read

Last updated Apr 24, 2026

Kestra is an open-source, YAML-based workflow orchestrator that lets ops managers and analysts automate data pipelines without writing Python code. You install it with a single Docker command, define tasks in a plain-text YAML file, and trigger them on a schedule or from an event like a file upload. Most teams have their first automated pipeline running within an hour of installation.

What Is Kestra and Why Teams Are Switching in 2026

Kestra is a declarative workflow orchestration platform. Where traditional tools like Apache Airflow require you to write Python DAGs and manage a complex Python environment, Kestra lets you define workflows in YAML — a format that operations managers, analysts, and even product teams can read and edit without engineering support.

In March 2026, Kestra published data showing that over 80% of organizations using AI-driven data platforms were moving toward event-driven, YAML-first pipeline tools. The shift reflects a broader trend: data work is no longer the exclusive domain of data engineers. Analysts and ops teams now own pipelines that were previously delegated.

Kestra has over 1,200 built-in plugins covering databases, cloud storage, REST APIs, dbt, Airbyte, and more — which means most standard data pipeline tasks require no custom code at all.

Step 1: Install Kestra with Docker

You need Docker installed. If you don't have it, download Docker Desktop from docker.com and follow the standard installation for your operating system.

Once Docker is running, open a terminal and run:

docker run --pull=always --rm -it -p 8080:8080 --user=root \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /tmp:/tmp kestra/kestra:latest server local

This pulls the latest Kestra image and starts the server locally. After about 30 seconds, open a browser and navigate to http://localhost:8080. You will see the Kestra UI with the flow editor, execution history, and plugin library.

For production deployments, Kestra offers a Docker Compose configuration and Kubernetes Helm chart. The local single-container setup is sufficient to build and test any workflow before moving to production.
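If you would rather keep the local setup in a file than retype a long command, the same container can be described in Compose syntax. This is a minimal sketch that only mirrors the docker run command above (same image, port, user, and volume mounts). It is not Kestra's production Compose configuration, and the service name kestra is an arbitrary choice.

```yaml
# docker-compose.yml: local, single-container setup (sketch, not for production)
services:
  kestra:
    image: kestra/kestra:latest
    pull_policy: always            # same effect as --pull=always
    user: "root"                   # same as --user=root
    command: server local
    ports:
      - "8080:8080"                # UI at http://localhost:8080
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /tmp:/tmp
```

Start it with docker compose up from the directory containing the file, and stop it with Ctrl+C.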

Step 2: Create Your First Flow

In the Kestra UI, click Flows in the left sidebar, then + Create. You'll see a YAML editor. A flow has three required parts: a namespace, an id, and at least one task.

Here is a minimal working example that fetches data from a public API and logs the result:

id: fetch-daily-report
namespace: ops.data

tasks:
  - id: get_data
    type: io.kestra.plugin.core.http.Request
    uri: https://api.example.com/daily-summary
    method: GET

  - id: log_result
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.get_data.body }}"

Click Save and then Execute to run it immediately. The execution panel shows real-time logs for each task, including inputs, outputs, duration, and any errors.

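You can reference the output of one task in a later task using the {{ outputs.task_id.field }} syntax. Tasks in the tasks list run sequentially in the order they are written, so get_data has finished and its output is available by the time log_result runs.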
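As a small illustration of the outputs syntax, the sketch below reads two fields from the same upstream task. It assumes the HTTP Request task exposes a code output (the HTTP status) alongside body. The flow id check-api-status is a made-up name, and api.example.com is the same placeholder endpoint used earlier.

```yaml
id: check-api-status
namespace: ops.data

tasks:
  - id: get_data
    type: io.kestra.plugin.core.http.Request
    uri: https://api.example.com/daily-summary
    method: GET

  # Runs after get_data and reads two of its outputs.
  - id: log_status
    type: io.kestra.plugin.core.log.Log
    message: "Status {{ outputs.get_data.code }}, body: {{ outputs.get_data.body }}"
```

The Outputs tab of an execution lists the fields each task actually produced, which is the quickest way to confirm a field name before referencing it.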

Step 3: Connect to a Database

Most pipelines read from or write to a database. Kestra has native plugins for PostgreSQL, MySQL, BigQuery, Snowflake, Redshift, and more.

To query a PostgreSQL database:

id: daily-sales-extract
namespace: ops.data

tasks:
  - id: query_sales
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://your-host:5432/your-database
    username: "{{ secret('DB_USER') }}"
    password: "{{ secret('DB_PASS') }}"
    sql: |
      SELECT date, SUM(revenue) as total_revenue
      FROM sales
      WHERE date = CURRENT_DATE - 1
      GROUP BY date
    store: true

Secrets like database passwords are stored in Kestra's secrets manager (or an external vault), not hardcoded in the YAML. The store: true flag saves the query output as an internal storage file that downstream tasks can reference.
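To show how a downstream task picks up the stored result, here is a hedged sketch of one extra task for the flow above. It assumes that with store: true the query task exposes the stored file's location as a uri output. The task id log_file_location is illustrative.

```yaml
tasks:
  # ... query_sales as defined above ...

  # Logs where the stored query result lives in internal storage.
  - id: log_file_location
    type: io.kestra.plugin.core.log.Log
    message: "Query result stored at {{ outputs.query_sales.uri }}"
```

Any task that accepts an internal storage file as input can take the same {{ outputs.query_sales.uri }} expression instead of logging it.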

Step 4: Add a Schedule

A flow without a trigger runs only when you execute it manually. To run on a schedule, add a triggers block:

triggers:
  - id: daily_at_8am
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 8 * * *"

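This runs the flow every day at 8 AM. The cron expression is evaluated in the server's default time zone, which is UTC in a typical Docker container, unless you set the trigger's timezone property. Cron expressions follow standard five-field syntax. You can also trigger on S3 file uploads, webhook calls, another flow completing, or Kafka messages.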
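A flow can carry more than one trigger. The sketch below pairs a schedule pinned to an explicit time zone with a webhook trigger. It assumes the Schedule trigger's timezone property and the core Webhook trigger with its key property. The trigger ids and the key value are placeholders to replace.

```yaml
triggers:
  # Evaluates the cron expression in New York time rather than the server default.
  - id: daily_at_8am_new_york
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 8 * * *"
    timezone: America/New_York

  # Starts the flow when an HTTP request hits the flow's webhook URL.
  - id: on_webhook
    type: io.kestra.plugin.core.trigger.Webhook
    key: replace-with-a-long-random-key
```

The webhook URL is built from the namespace, the flow id, and the key. Its exact path depends on your Kestra version, so check the trigger's documentation in the UI. Treat the key like a password, since anyone who knows the full URL can start the flow.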

Kestra stores the execution history for every triggered run, so you can go back and inspect what ran, what the inputs were, and where it failed — without digging through log files.

Step 5: Handle Errors and Retries

Production pipelines fail. A database goes down, an API rate-limits you, a file arrives late. Kestra handles this with task-level retry configuration:

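tasks:
  - id: fetch_from_api
    type: io.kestra.plugin.core.http.Request
    uri: https://api.example.com/data
    retry:
      type: exponential
      maxAttempts: 3        # give up after three attempts
      interval: PT30S       # delay before the first retry
      maxInterval: PT5M     # upper bound on the delay between retries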

This retries up to 3 times with exponential backoff, starting at 30 seconds and capping at 5 minutes. You can also define errors tasks that run only when a pipeline fails — useful for sending a Slack alert or writing a failure record to a database.
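A failure handler can be sketched with a flow-level errors block. To stay within tasks already used in this guide, the example logs at ERROR level rather than calling Slack. A notification task would go in the same position. The flow id fetch-with-alert and the task id log_failure are illustrative, and the example assumes the flow.id and execution.id expression variables.

```yaml
id: fetch-with-alert
namespace: ops.alerts

tasks:
  - id: fetch_from_api
    type: io.kestra.plugin.core.http.Request
    uri: https://api.example.com/data

# Tasks listed here run only if a task above fails.
errors:
  - id: log_failure
    type: io.kestra.plugin.core.log.Log
    level: ERROR
    message: "Flow {{ flow.id }} failed in execution {{ execution.id }}"
```

Because the errors block sits at the flow level, one handler covers every task in the flow.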

Step 6: Use Namespaces to Organize Workflows

As you add more flows, namespaces keep things organized. Think of them like folders. A common structure:

ops.data       — data extraction and loading flows
ops.reports    — scheduled reporting flows
ops.alerts     — monitoring and alerting flows
marketing.etl  — marketing data pipelines

Kestra lets you set namespace-level variables and secrets, so flows within the same namespace share configuration without duplicating it.

Analyzing Pipeline Output

Once your flows are running and producing data, the next step is making sense of that output. If your pipeline exports CSV files or writes to a database, tools like VSLZ let you upload the output and query it in plain English — useful for non-technical stakeholders who need to see the results without writing SQL.

What to Build First

For teams moving off manual spreadsheet workflows or replacing ad-hoc scripts, a good starting point is a daily extract-load flow: query your operational database at a set time, write the output to a data warehouse or a shared CSV, and trigger a Slack notification when it completes. That single flow typically replaces several hours of weekly manual work and gives you a foundation to build more complex pipelines on top of.
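Assembled from the pieces shown earlier, such a starter flow might look like the sketch below. It is only an outline under assumptions: the connection URL, secret names, and table are the placeholders from Step 3, a Log task stands in for the Slack notification, and the uri output is assumed to be available when store: true is set.

```yaml
id: daily-extract-load
namespace: ops.data

tasks:
  # Extract yesterday's revenue and store the result as a file.
  - id: query_sales
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://your-host:5432/your-database
    username: "{{ secret('DB_USER') }}"
    password: "{{ secret('DB_PASS') }}"
    sql: |
      SELECT date, SUM(revenue) AS total_revenue
      FROM sales
      WHERE date = CURRENT_DATE - 1
      GROUP BY date
    store: true

  # Placeholder for the completion notification.
  - id: notify_done
    type: io.kestra.plugin.core.log.Log
    message: "Daily extract finished: {{ outputs.query_sales.uri }}"

triggers:
  - id: daily_at_8am
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 8 * * *"
```

From here, the load step and a real notification task can replace the placeholder without changing the overall shape of the flow.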

Kestra's blueprint library, available in the UI under Blueprints, includes pre-built templates for dbt runs, Airbyte syncs, BigQuery loads, and API polling patterns. Start from a blueprint rather than writing from scratch.

FAQ

Does Kestra require coding experience to use?

No. Kestra workflows are written in YAML, which is a plain-text configuration format. Most data tasks — database queries, API calls, file transfers, scheduled jobs — are handled by pre-built plugins that only need configuration values, not code. For tasks that do require custom logic, Kestra supports Python and JavaScript scripts embedded directly in the flow.
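For the custom-logic case, an embedded script might look like the following sketch. It assumes the Python script plugin's Script task with an inline script property. The flow id, task id, and figures are invented for illustration.

```yaml
id: custom-logic-example
namespace: ops.data

tasks:
  - id: compute_margin
    type: io.kestra.plugin.scripts.python.Script
    script: |
      # Plain Python; anything printed shows up in the task logs.
      revenue = 1200.0
      cost = 900.0
      margin = (revenue - cost) / revenue
      print(f"Gross margin: {margin:.1%}")
```

Script tasks typically run inside a container, which is why the install command in Step 1 mounts the Docker socket.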

How does Kestra compare to Apache Airflow?

Airflow requires Python DAG files and a Python environment, which creates friction for non-engineers and introduces version compatibility issues. Kestra uses YAML for all workflow definitions, has a built-in UI for editing and monitoring, and ships with 1,200+ plugins out of the box. Kestra also supports event-driven triggers (file uploads, webhooks, Kafka messages) more naturally than Airflow's DAG-centric model.

Can I run Kestra without Docker?

Yes, although Docker is the recommended way to run Kestra locally, since it handles all dependencies automatically. Kestra is also distributed as a standalone executable that runs on a Java runtime, which suits machines where Docker is not available. For production, Kestra supports Docker Compose for multi-container deployments and Kubernetes via a Helm chart. Kestra also offers a managed cloud version that removes all infrastructure setup.

How do I handle secrets like API keys and database passwords in Kestra?

Kestra has a built-in secrets manager accessible via the UI. You store secrets there and reference them in YAML using the {{ secret('KEY_NAME') }} syntax. For production environments, Kestra integrates with AWS Secrets Manager, GCP Secret Manager, Vault, and Azure Key Vault, so secrets are never stored in the YAML files or version control.

What happens when a Kestra workflow fails mid-run?

Kestra records the failure in its execution history with the exact task that failed, the error message, and all inputs and outputs up to that point. You can configure automatic retries at the task level with exponential backoff. You can also define error-handling tasks that run on failure — for example, sending a Slack notification or writing a failure record to a database. Failed executions can be re-run manually from the point of failure without restarting the entire flow.
