How to Set Up a Microsoft Fabric Data Agent

Arkzero Research · Apr 8, 2026 · 8 min read

Last updated Apr 8, 2026

Microsoft Fabric Data Agents let business users ask plain English questions against enterprise data -- including Power BI semantic models, lakehouses, and warehouses -- and receive answers without writing SQL or DAX. The feature requires F2 or higher Fabric capacity, six tenant settings enabled by an admin, and at least one structured data source already in OneLake. The setup itself takes under an hour, though newly enabled tenant settings can take up to an hour to propagate. As of early 2026, the core feature is available in Fabric workspaces, with Copilot Studio integration in preview.

A Microsoft Fabric Data Agent connects your business users directly to structured enterprise data. They ask a question in plain English; the agent generates SQL, DAX, or KQL behind the scenes, runs it against your connected data sources, and returns an answer. No dashboard-building required, no query language needed. Setup takes under an hour once the prerequisites are in place: F2 or higher Fabric capacity, six admin tenant settings enabled, and at least one structured data source in OneLake.

Prerequisites

Three things must be in place before creating an agent.

Fabric capacity: Your organization needs F2 capacity at minimum (or Power BI Premium P1 or higher with Fabric enabled). F1 and free trial capacities do not support Fabric Data Agents. Check the Admin Portal under Capacity Settings if you are unsure which capacity your workspace uses.
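
The capacity rule above can be expressed as a small check. This is a minimal sketch, assuming SKU names arrive as plain strings such as "F2" or "P1"; the function name `supports_data_agent` is hypothetical and is not part of any Fabric API.

```python
import re


def supports_data_agent(sku: str) -> bool:
    """Return True if a capacity SKU meets the minimum stated above (F2+ or P1+)."""
    match = re.fullmatch(r"([FP])(\d+)", sku.strip().upper())
    if match is None:
        # Trial or unrecognized capacity names are treated as unsupported.
        return False
    family, size = match.group(1), int(match.group(2))
    minimum = 2 if family == "F" else 1  # F2 for Fabric SKUs, P1 for Premium SKUs
    return size >= minimum


if __name__ == "__main__":
    for sku in ("F1", "F2", "F64", "P1", "Trial"):
        print(sku, supports_data_agent(sku))
```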

Admin access: A Fabric or Power BI admin must enable six tenant settings before the feature functions. These live in the Admin Portal, accessible through the gear icon in the top-right of Fabric. Plan for up to one hour of propagation time after enabling each setting.

Structured data: You need at least one data source already available in Fabric. Supported types are Lakehouse tables, Warehouses, Power BI semantic models, KQL databases, Ontologies, and Microsoft Graph. Raw CSV files stored in a Lakehouse file store are not supported. Load them into Lakehouse tables first.

Step 1: Enable the Required Tenant Settings

Open Admin Portal > Tenant Settings and turn on the following:

  1. Users can use Copilot and other features powered by Azure OpenAI -- required for all AI capabilities.
  2. Capacities can be designated as Fabric Copilot capacities -- required to assign your capacity for AI workloads.
  3. Data sent to Azure OpenAI can be processed outside your capacity's geographic region -- required if your capacity is not in the EU data boundary or the US.
  4. Data sent to Azure OpenAI can be stored outside your capacity's geographic region -- same geographic condition as above.
  5. Conversation history stored outside your capacity's geographic region -- required for persistent conversational context. History is retained for 28 days and users can clear it at any time.
  6. Allow XMLA endpoints and Analyze in Excel with on-premises datasets -- found under Integration Settings. Required only if connecting Power BI semantic models as data sources.

Settings propagate within one hour. Once active, return to your Fabric workspace to create the agent.
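
Because some of the six settings are conditional, it can help to work out which ones apply before asking an admin to enable them. The sketch below encodes the conditions from the list above as a checklist. The setting names are copied from this guide, but the function `settings_to_enable`, its parameters, and the condition labels are illustrative assumptions; nothing here calls the Admin Portal.

```python
# Each entry pairs a tenant setting name (from the list above) with the
# condition under which this guide says it is needed. Labels are hypothetical.
TENANT_SETTINGS = [
    ("Users can use Copilot and other features powered by Azure OpenAI", "always"),
    ("Capacities can be designated as Fabric Copilot capacities", "always"),
    ("Data sent to Azure OpenAI can be processed outside your capacity's geographic region", "outside_us_eu"),
    ("Data sent to Azure OpenAI can be stored outside your capacity's geographic region", "outside_us_eu"),
    ("Conversation history stored outside your capacity's geographic region", "history"),
    ("Allow XMLA endpoints and Analyze in Excel with on-premises datasets", "semantic_models"),
]


def settings_to_enable(capacity_in_us_or_eu, uses_semantic_models, persistent_history=True):
    """Return the setting names that apply to the described deployment."""
    active = {"always"}
    if not capacity_in_us_or_eu:
        active.add("outside_us_eu")
    if persistent_history:
        active.add("history")
    if uses_semantic_models:
        active.add("semantic_models")
    return [name for name, condition in TENANT_SETTINGS if condition in active]


if __name__ == "__main__":
    for name in settings_to_enable(capacity_in_us_or_eu=False, uses_semantic_models=True):
        print("Enable:", name)
```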

Step 2: Create a Data Agent in Your Workspace

  1. Open your Fabric workspace.
  2. Select + New Item.
  3. In the All Items tab, search for Fabric data agent.
  4. Select it, give the agent a name, and confirm.

The OneLake catalog opens automatically. This is where you connect data sources.

Step 3: Add Data Sources and Select Tables

Each agent supports up to five data sources in any combination. From the OneLake catalog, select a data source and click Add. To add more after the first, use + Data source in the left Explorer pane.

After connecting a source, the Explorer pane displays its available tables. Use the checkboxes to specify which tables the agent can query. Tighter table selection tends to produce more accurate responses.

Table naming matters more than most admins expect. Microsoft's documentation notes that descriptive names like SalesRevenueByRegion produce more reliable generated queries than generic names like Table1. If your existing tables have generic names, creating views with descriptive names is a practical workaround.
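
One way to apply the view workaround is to generate the CREATE VIEW statements from a rename map and review them before running them against your Warehouse or Lakehouse SQL endpoint. This is a minimal sketch: the `dbo` schema, the T-SQL-style bracket quoting, and the helper name `create_view_statements` are assumptions, and the script only builds strings rather than executing anything.

```python
import re

# Conservative identifier pattern so names can be embedded in SQL text safely.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def create_view_statements(renames: dict[str, str], schema: str = "dbo") -> list[str]:
    """Build one CREATE VIEW statement per generic-name -> descriptive-name pair."""
    statements = []
    for generic_name, descriptive_name in renames.items():
        for identifier in (schema, generic_name, descriptive_name):
            if not _IDENTIFIER.fullmatch(identifier):
                raise ValueError(f"Unsafe identifier: {identifier!r}")
        statements.append(
            f"CREATE VIEW [{schema}].[{descriptive_name}] AS "
            f"SELECT * FROM [{schema}].[{generic_name}];"
        )
    return statements


if __name__ == "__main__":
    for statement in create_view_statements({"Table1": "SalesRevenueByRegion"}):
        print(statement)
```

After creating the views, select the views rather than the generically named tables in the Explorer pane.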

The query language the agent uses depends on the data source:

  • Lakehouses and Warehouses generate SQL
  • Power BI semantic models generate DAX
  • KQL databases generate KQL
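
The mapping above can be kept as a simple lookup when documenting which example queries to write for each source. The key names below are hypothetical labels, not Fabric identifiers. This guide does not state a query language for Ontologies or Microsoft Graph, so the sketch returns None for them.

```python
from typing import Optional

# Source type label -> query language the agent generates, per the list above.
QUERY_LANGUAGE = {
    "lakehouse": "SQL",
    "warehouse": "SQL",
    "semantic_model": "DAX",
    "kql_database": "KQL",
}


def query_language_for(source_type: str) -> Optional[str]:
    """Return the generated query language, or None if this guide does not specify one."""
    return QUERY_LANGUAGE.get(source_type.strip().lower())


if __name__ == "__main__":
    for source_type in ("Lakehouse", "semantic_model", "ontology"):
        print(source_type, "->", query_language_for(source_type))
```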

For Power BI semantic models, users only need Read permission on the semantic model. Workspace membership and Build permission are not required, which simplifies sharing with non-technical stakeholders.

Step 4: Configure Instructions and Example Queries

Two configuration options improve accuracy.

Agent instructions (up to 15,000 characters): Natural language guidance written by you. Use this to define routing logic across data sources (for example, "for financial metrics use the semantic model; for raw transaction data use the lakehouse"), clarify domain-specific terminology and abbreviations, and set response format expectations.

Example queries: Provide pairs of sample questions and their corresponding SQL or KQL queries for each data source. The agent retrieves the three most relevant examples when answering a new question. Up to 100 example pairs per data source are supported. This option is not available for Power BI semantic models or Ontologies.

Both settings are optional, but agents without configuration struggle with ambiguous questions. Writing five to ten representative question-query pairs for common request types substantially improves first-run accuracy with no additional infrastructure cost.
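
If you draft instructions and example pairs outside Fabric, for instance in a repository, a quick local check against the limits above can catch problems before you paste them in. This is a sketch under stated assumptions: the `SourceConfig` class, the source type labels, and `validate_agent_config` are hypothetical and do not mirror any Fabric file format or API.

```python
from dataclasses import dataclass, field

MAX_INSTRUCTION_CHARS = 15_000      # agent instruction limit stated above
MAX_EXAMPLES_PER_SOURCE = 100       # example pair limit per data source
NO_EXAMPLE_SOURCE_TYPES = {"semantic_model", "ontology"}


@dataclass
class SourceConfig:
    name: str
    source_type: str                               # hypothetical label, e.g. "lakehouse"
    examples: list = field(default_factory=list)   # list of (question, query) tuples


def validate_agent_config(instructions: str, sources: list) -> list:
    """Return a list of human-readable problems; an empty list means no limit is exceeded."""
    problems = []
    if len(instructions) > MAX_INSTRUCTION_CHARS:
        problems.append(
            f"Instructions are {len(instructions)} characters; limit is {MAX_INSTRUCTION_CHARS}."
        )
    for source in sources:
        if source.examples and source.source_type in NO_EXAMPLE_SOURCE_TYPES:
            problems.append(f"{source.name}: example queries are not available for {source.source_type}.")
        if len(source.examples) > MAX_EXAMPLES_PER_SOURCE:
            problems.append(
                f"{source.name}: {len(source.examples)} example pairs; limit is {MAX_EXAMPLES_PER_SOURCE}."
            )
    return problems


if __name__ == "__main__":
    sources = [
        SourceConfig("sales_lakehouse", "lakehouse",
                     [("Total revenue by region?", "SELECT Region, SUM(Revenue) FROM SalesRevenueByRegion GROUP BY Region")]),
        SourceConfig("finance_model", "semantic_model"),
    ]
    print(validate_agent_config("For financial metrics use the semantic model.", sources) or "OK")
```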

Step 5: Test, Publish, and Share

The built-in chat interface lets you test the agent before publishing. Ask questions representative of what your team would send. The agent shows its reasoning and the underlying generated query for each response, making it straightforward to identify where instructions or examples need adjustment.

When the agent is behaving as expected, click Publish. Write a clear description covering the agent's purpose, the data it covers, and the question types it handles well. This description is used by Copilot Studio and other orchestration layers to invoke the agent correctly.

Publishing creates two separate versions: a live published version and an editable draft. Share the published version with colleagues. Consumers need Read permission on the agent and Read permission on the underlying data sources. Row-Level Security and Column-Level Security configured on Power BI semantic models still apply, even when accessed through the agent.

For teams that need version control and promotion across environments, Fabric's Git integration can track agent instructions, example queries, and data source selections. Deployment pipelines support promotion from dev to test to production workspaces.

Connecting to Copilot Studio (Preview)

If your organization uses Microsoft Teams, you can surface the agent through a Copilot Studio bot. This integration is in preview as of April 2026 and requires a Microsoft 365 Copilot license for each person building or managing agents in Copilot Studio.

Setup: in Copilot Studio, create or open an agent, go to Agents > + Add, select Microsoft Fabric, and connect the published Fabric data agent. Enable generative AI orchestration under the agent's Settings panel. Publish the Copilot Studio agent to your Teams channel.

One current constraint: the combined deployment cannot be published to Microsoft 365 Copilot. Teams is supported; M365 Copilot is not, as of April 2026.

Key Limitations

Response size cap: Responses are capped at 25 rows and 25 columns. The agent summarizes or truncates beyond that. For large-dataset exploration, this is a real constraint. Starting a new chat session rather than continuing a long thread produces better results, since prior conversation history consumes context.
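
When writing example queries, you can estimate up front whether an expected result would hit the cap and should be narrowed. A minimal sketch, with hypothetical names, that only compares counts against the limits stated above:

```python
MAX_ROWS = 25      # row cap stated above
MAX_COLUMNS = 25   # column cap stated above


def exceeds_response_cap(row_count: int, column_count: int) -> bool:
    """Return True if a result of this shape would be summarized or truncated."""
    return row_count > MAX_ROWS or column_count > MAX_COLUMNS


if __name__ == "__main__":
    print(exceeds_response_cap(row_count=12, column_count=4))
    print(exceeds_response_cap(row_count=400, column_count=4))
```

If the check returns True, consider adding a filter, an aggregation, or a top-N clause to the example query.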

English only: Questions, agent instructions, and example queries must all be in English.

Structured data only: The agent cannot query raw files. PDFs, DOCX files, and CSV files not yet loaded into Lakehouse tables are outside scope.

Read-only: Only SELECT-equivalent operations are supported. The agent does not write, update, or delete data.

Single region: The agent and its data sources must be in the same geographic region. For example, queries against a lakehouse in North Europe fail if the agent's workspace capacity is in France Central.
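
A region mismatch is easier to catch in an inventory than in a failed query. The sketch below assumes you already have region names for the agent's capacity and for each source, for example from the Admin Portal. The function name and the source names are hypothetical, and the comparison only normalizes case and spacing.

```python
def cross_region_sources(agent_region: str, source_regions: dict[str, str]) -> list[str]:
    """Return the names of sources whose region differs from the agent's capacity region."""

    def normalize(region: str) -> str:
        # "France Central" and "francecentral" are treated as the same region.
        return "".join(region.lower().split())

    expected = normalize(agent_region)
    return sorted(
        name for name, region in source_regions.items() if normalize(region) != expected
    )


if __name__ == "__main__":
    mismatched = cross_region_sources(
        "France Central",
        {"sales_lakehouse": "North Europe", "finance_warehouse": "francecentral"},
    )
    print(mismatched)
```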

Five data source limit: You can connect up to five sources per agent. Organizations with many data silos may need multiple purpose-specific agents.
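
Planning purpose-specific agents amounts to grouping sources by domain and splitting any group larger than five. This is a minimal planning sketch. The domain names, source names, and the "domain-N" labelling scheme are illustrative assumptions.

```python
MAX_SOURCES_PER_AGENT = 5  # per-agent limit stated above


def plan_agents(sources_by_domain: dict[str, list[str]]) -> list[tuple[str, list[str]]]:
    """Split each domain's sources into agents of at most five sources each."""
    plan = []
    for domain, sources in sources_by_domain.items():
        for start in range(0, len(sources), MAX_SOURCES_PER_AGENT):
            chunk = sources[start:start + MAX_SOURCES_PER_AGENT]
            label = f"{domain}-{start // MAX_SOURCES_PER_AGENT + 1}"
            plan.append((label, chunk))
    return plan


if __name__ == "__main__":
    inventory = {
        "finance": [f"finance_source_{i}" for i in range(7)],
        "hr": ["people_lakehouse"],
    }
    for label, sources in plan_agents(inventory):
        print(label, sources)
```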

If your team works across multiple file types and needs answers without setting up Fabric infrastructure, tools like VSLZ handle this from a direct file upload with no workspace configuration required.

Summary

A Fabric Data Agent reduces the gap between enterprise data and the business users who need answers from it. The setup path is predictable: enable six tenant settings, create the artifact in a workspace, connect up to five data sources, write instructions and example queries, then publish and share. The Copilot Studio integration extends that access into Teams without requiring SQL or DAX knowledge on the consuming end. The main constraints to plan around are the 25-row response cap, the English-only requirement, and the five data source limit per agent.

FAQ

What license do you need for a Microsoft Fabric Data Agent?

You need at least F2 Fabric capacity or Power BI Premium P1 or higher with Microsoft Fabric enabled on that capacity. F1 capacity and free trial capacities are not supported. For Copilot Studio integration, each agent builder also needs a Microsoft 365 Copilot license. An admin must enable six specific tenant settings in the Fabric Admin Portal before the feature functions.

Can a Fabric Data Agent connect to Power BI reports and dashboards?

A Fabric Data Agent connects to Power BI semantic models, not to reports or dashboards directly. It generates DAX queries against the semantic model and returns data in text form. Users accessing the agent only need Read permission on the semantic model -- Build permission and workspace membership are not required. Row-Level Security and Column-Level Security configured on the semantic model still apply.

How many data sources can a Fabric Data Agent use?

Each Fabric Data Agent supports up to five data sources in any combination: Lakehouse tables, Warehouses, Power BI semantic models, KQL databases, Ontologies, and Microsoft Graph. Organizations with many data domains may need multiple purpose-specific agents rather than a single all-encompassing one.

Can you deploy a Fabric Data Agent to Microsoft Teams?

Yes. You can connect a published Fabric Data Agent to a Copilot Studio agent and deploy that Copilot Studio agent to Microsoft Teams. This integration is in preview as of April 2026 and requires a Microsoft 365 Copilot license for the agent builder. The combined deployment cannot currently be published to Microsoft 365 Copilot -- Teams is the supported channel.

What is the row limit for Fabric Data Agent responses?

Fabric Data Agent responses are capped at 25 rows and 25 columns. Larger result sets are summarized or truncated by the agent. For queries likely to return large datasets, refine the question to be more specific or use follow-up questions to drill down. Starting a fresh chat session also helps, as long conversation histories consume context and can reduce response completeness.
