How to Analyze Graph Data with BigQuery Graph

Arkzero Research · Apr 26, 2026 · 8 min read

Last updated Apr 26, 2026

BigQuery Graph, released in preview in April 2026, lets you model and query relationships in your existing BigQuery data using Graph Query Language (GQL) without moving or copying any data. You define nodes and edges as logical views over your tables, then write GQL pattern-matching queries directly in BigQuery Studio. This guide walks through creating a property graph, writing your first GQL query, and visualizing results for common use cases like fraud detection and supply chain tracing.

BigQuery Graph lets you run graph analysis on data already in BigQuery. You do not need a separate graph database, and your data stays exactly where it is. Google released the feature into public preview on April 14, 2026, with full GQL support and a visual modeler built into BigQuery Studio.

What BigQuery Graph actually does

A graph models data as nodes (entities) and edges (relationships). In a relational database, finding multi-hop relationships, such as accounts that received money from accounts that also received money from a flagged account, requires multiple joins and recursive CTEs that quickly become unreadable. BigQuery Graph handles this natively with GQL pattern syntax.
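To make the nodes-and-edges idea concrete, here is a minimal Python sketch of the same two-hop question asked against an in-memory adjacency map. The account names and the `transfers` list are made up for illustration, and this says nothing about how BigQuery executes a graph query internally.

```python
from collections import defaultdict

# Hypothetical transfer records: (sender_account, receiver_account).
transfers = [
    ("flagged", "a1"),
    ("flagged", "a2"),
    ("a1", "b1"),
    ("a2", "b1"),
    ("a2", "b2"),
    ("c1", "c2"),
]


def build_adjacency(edges):
    """Map each node to the set of nodes it sends money to."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
    return adjacency


def two_hop_receivers(edges, start):
    """Return accounts reachable from `start` in exactly two transfers."""
    adjacency = build_adjacency(edges)
    result = set()
    for mid in adjacency.get(start, ()):
        for dst in adjacency.get(mid, ()):
            if dst != start:  # ignore money that comes straight back
                result.add(dst)
    return result


two_hop = two_hop_receivers(transfers, "flagged")
print(sorted(two_hop))
```

Each extra hop adds another nested loop here, just as it adds another join in SQL. A path pattern expresses the same traversal in one line.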

The key design choice: a property graph in BigQuery is a logical view. When you create one, no data is moved or duplicated. The graph reads directly from your existing tables. You can create it, delete it, and recreate it without touching your source data.
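The "definition only" behavior is the same idea as an ordinary SQL view. The sketch below uses Python's built-in SQLite module purely as an analogy: SQLite has no property graphs, and the `account` table and `blocked_account` view are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, is_blocked INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 0), (2, 1), (3, 0)])

VIEW_SQL = (
    "CREATE VIEW blocked_account AS "
    "SELECT id FROM account WHERE is_blocked = 1"
)

# The view stores a query definition, not a copy of the rows.
conn.execute(VIEW_SQL)
blocked_before = conn.execute(
    "SELECT id FROM blocked_account ORDER BY id"
).fetchall()

# Dropping the view removes only the definition; the source rows remain.
conn.execute("DROP VIEW blocked_account")
rows_after_drop = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]

# Recreating the view picks up rows added to the source table since.
conn.execute("INSERT INTO account VALUES (4, 1)")
conn.execute(VIEW_SQL)
blocked_after = conn.execute(
    "SELECT id FROM blocked_account ORDER BY id"
).fetchall()

print(blocked_before, rows_after_drop, blocked_after)
```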

BigQuery Graph implements the ISO GQL standard (ISO/IEC 39075), which Google helped author. The query patterns you write for BigQuery Graph are therefore transferable skills, not a proprietary syntax that locks you in. The standard was published in 2024, and BigQuery is among the first major cloud warehouses to adopt it, alongside Spanner.

Prerequisites

  • A Google Cloud project with BigQuery API enabled
  • BigQuery Studio access (the graph modeler and visualization tools live there)
  • Source tables that have clear entity and relationship columns

The feature is in preview and available in US and EU multi-regions. To request access or report issues, email bq-graph-preview-support@google.com.

Step 1: Structure your source tables

BigQuery Graph requires two table types. Node tables represent entities. Edge tables represent relationships between entities and must contain foreign keys pointing back to node table primary keys.

A minimal financial example uses four tables:

  • Person (id, name, city)
  • Account (id, create_time, is_blocked)
  • PersonOwnAccount (id, account_id) -- who owns which account
  • AccountTransferAccount (id, to_id, amount, create_time) -- money movements

If your data already lives in BigQuery, you likely have equivalent tables. The column names do not need to match; you map them when creating the graph.

One important constraint: edge tables must have exactly one source key column and one destination key column that reference primary keys in node tables. If your relationship table has composite keys, you will need to either add a surrogate key column or restructure the edge table definition. The BigQuery Graph visual modeler in Studio highlights these mapping requirements, which is useful when working through a complex schema for the first time.
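Before defining the graph, it can help to confirm that every edge row points at a node that actually exists. The following is a minimal sketch of that check in plain Python. The helper `find_dangling_edges` and the sample rows are hypothetical, and in practice you would more likely run an equivalent anti-join in BigQuery itself.

```python
def find_dangling_edges(node_ids, edges, key_column):
    """Return edge rows whose `key_column` value is not a known node id."""
    known = set(node_ids)
    return [row for row in edges if row[key_column] not in known]


# Hypothetical sample data shaped like the Person, Account, and
# PersonOwnAccount tables described above.
person_ids = [1, 2]
account_ids = [10, 11, 12]
person_own_account = [
    {"id": 1, "account_id": 10},
    {"id": 2, "account_id": 11},
    {"id": 3, "account_id": 12},  # no Person with id 3
    {"id": 2, "account_id": 99},  # no Account with id 99
]

bad_sources = find_dangling_edges(person_ids, person_own_account, "id")
bad_destinations = find_dangling_edges(
    account_ids, person_own_account, "account_id"
)

print("rows with unknown source:", bad_sources)
print("rows with unknown destination:", bad_destinations)
```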

Step 2: Create a property graph

Run this DDL in BigQuery Studio or via the BigQuery API:

CREATE OR REPLACE PROPERTY GRAPH FinGraph
  NODE TABLES (
    Person,
    Account
  )
  EDGE TABLES (
    PersonOwnAccount
      SOURCE KEY (id) REFERENCES Person (id)
      DESTINATION KEY (account_id) REFERENCES Account (id)
      LABEL Owns,
    AccountTransferAccount
      SOURCE KEY (id) REFERENCES Account (id)
      DESTINATION KEY (to_id) REFERENCES Account (id)
      LABEL Transfers
  );

This creates the graph as a dataset resource. You can inspect it in BigQuery Studio under your dataset, where it shows node and edge tables with their label mappings. No storage cost is incurred beyond normal table storage.

Step 3: Write GQL queries

In BigQuery, you run GQL through GRAPH_TABLE, which embeds a graph pattern query inside a standard SQL statement. The pattern syntax uses parentheses for nodes and square brackets for edges.

Find all accounts owned by a person:

SELECT person_id, account_id
FROM GRAPH_TABLE(
  FinGraph
  MATCH (p:Person)-[:Owns]->(a:Account)
  RETURN p.id AS person_id, a.id AS account_id
);

Detect two-hop money flows (potential layering):

SELECT src_id, intermediate_id, dst_id, total_amount
FROM GRAPH_TABLE(
  FinGraph
  MATCH (src:Account)-[t1:Transfers]->(mid:Account)-[t2:Transfers]->(dst:Account)
  WHERE src.id <> dst.id
  RETURN src.id AS src_id,
         mid.id AS intermediate_id,
         dst.id AS dst_id,
         t1.amount + t2.amount AS total_amount
)
ORDER BY total_amount DESC;

This query finds any account that sent money to an intermediate account, which then forwarded money to a third account. In SQL this would require a self-join on the transfers table with aliasing. In GQL, the path pattern makes the intent explicit and the query readable.
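For comparison, here is roughly what the self-join version looks like. The sketch runs it against an in-memory SQLite table from Python so it is easy to try. The sample rows are invented, and the table keeps only the `id`, `to_id`, and `amount` columns from the example schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AccountTransferAccount (id INTEGER, to_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO AccountTransferAccount VALUES (?, ?, ?)",
    [(1, 2, 500.0), (2, 3, 450.0), (2, 1, 100.0), (3, 4, 50.0)],
)

# t1 is the first hop and t2 is the second hop; the join condition says
# "the receiver of t1 is the sender of t2".
two_hop_sql = """
SELECT t1.id    AS src_id,
       t1.to_id AS intermediate_id,
       t2.to_id AS dst_id,
       t1.amount + t2.amount AS total_amount
FROM AccountTransferAccount AS t1
JOIN AccountTransferAccount AS t2
  ON t2.id = t1.to_id
WHERE t1.id <> t2.to_id
ORDER BY total_amount DESC
"""

two_hop_rows = conn.execute(two_hop_sql).fetchall()
for row in two_hop_rows:
    print(row)
```

Two hops are still manageable this way. Each additional hop needs another alias and another join condition, which is where the path pattern starts to pay off.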

Supply chain tracing (find all upstream suppliers for a part):

SELECT part_id, supplier_id, hops
FROM GRAPH_TABLE(
  SupplyGraph
  MATCH (p:Part)<-[e:Supplies]-{1,5}(s:Supplier)
  RETURN p.id AS part_id, s.id AS supplier_id,
         ARRAY_LENGTH(e) AS hops
)
ORDER BY hops;

The quantifier {1,5} matches paths of one to five hops in a single query, something that would require five UNION queries or a recursive CTE in standard SQL. Because e is bound inside a quantified pattern, it holds the array of matched edges, so ARRAY_LENGTH(e) gives the hop count.
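To show what the recursive CTE alternative involves, the sketch below runs a bounded upstream traversal in SQLite from Python. The `supplies` table, its column names, and the sample rows are hypothetical. Unlike the graph query, which returns one row per matching path, this version keeps only the shortest hop count for each supplier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplies (supplier_id TEXT, consumer_id TEXT)")
conn.executemany(
    "INSERT INTO supplies VALUES (?, ?)",
    [("s1", "part"), ("s2", "s1"), ("s3", "s2"), ("s4", "s1"), ("x", "y")],
)

# Anchor: direct suppliers of the target. Recursive step: suppliers of
# anything already found, until the hop limit is reached. The limit also
# stops the recursion if the data contains a cycle.
upstream_sql = """
WITH RECURSIVE upstream(supplier_id, hops) AS (
    SELECT supplier_id, 1
    FROM supplies
    WHERE consumer_id = ?
    UNION ALL
    SELECT s.supplier_id, u.hops + 1
    FROM supplies AS s
    JOIN upstream AS u ON s.consumer_id = u.supplier_id
    WHERE u.hops < ?
)
SELECT supplier_id, MIN(hops) AS hops
FROM upstream
GROUP BY supplier_id
ORDER BY hops, supplier_id
"""

upstream_rows = conn.execute(upstream_sql, ("part", 5)).fetchall()
for row in upstream_rows:
    print(row)
```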

Step 4: Visualize in BigQuery Studio

After running a GRAPH_TABLE query that returns node and edge data, BigQuery Studio offers a graph view tab alongside the standard table view. Select it to see a force-directed layout of your result set. You can click nodes to inspect properties.

The visual modeler (also in Studio) lets you drag tables onto a canvas and draw edge relationships to generate the CREATE PROPERTY GRAPH statement automatically, which helps when your schema has many tables.
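If you prefer to script this step, the same statement can be assembled from a small schema mapping. The sketch below is a hypothetical helper that mirrors the shape of the Step 2 DDL. It is not how the Studio modeler works internally, and it interpolates names without escaping, so it should only be fed trusted identifiers.

```python
def render_edge(edge):
    """Render one edge table entry with its key mappings and label."""
    src_col, src_table, src_key = edge["source"]
    dst_col, dst_table, dst_key = edge["destination"]
    return (
        f"    {edge['table']}\n"
        f"      SOURCE KEY ({src_col}) REFERENCES {src_table} ({src_key})\n"
        f"      DESTINATION KEY ({dst_col}) REFERENCES {dst_table} ({dst_key})\n"
        f"      LABEL {edge['label']}"
    )


def property_graph_ddl(graph_name, node_tables, edge_tables):
    """Build a CREATE OR REPLACE PROPERTY GRAPH statement as a string."""
    nodes = ",\n".join(f"    {name}" for name in node_tables)
    edges = ",\n".join(render_edge(edge) for edge in edge_tables)
    return (
        f"CREATE OR REPLACE PROPERTY GRAPH {graph_name}\n"
        f"  NODE TABLES (\n{nodes}\n  )\n"
        f"  EDGE TABLES (\n{edges}\n  );"
    )


ddl = property_graph_ddl(
    "FinGraph",
    ["Person", "Account"],
    [
        {
            "table": "PersonOwnAccount",
            "source": ("id", "Person", "id"),
            "destination": ("account_id", "Account", "id"),
            "label": "Owns",
        },
        {
            "table": "AccountTransferAccount",
            "source": ("id", "Account", "id"),
            "destination": ("to_id", "Account", "id"),
            "label": "Transfers",
        },
    ],
)
print(ddl)
```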

There are a few things the visualization does not yet support in preview: it renders up to 500 nodes per result set before truncating, and it does not support custom color-coding by node property without exporting to a separate tool. For exploratory analysis on moderately sized graphs, the built-in view is sufficient. For large-scale graph visualization, you would need to export query results and use a dedicated tool like Gephi or a notebook with a graph plotting library.
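One simple export path is a CSV edge list. The sketch below writes query result rows in that form using only the standard library. The `Source`, `Target`, and `Weight` headers follow the edge-list convention used by Gephi's spreadsheet import; treat that as an assumption and check it against your Gephi version. The sample rows are invented.

```python
import csv
import io


def edges_to_csv(rows):
    """Render (source, target, weight) rows as a CSV edge list string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical rows, e.g. src_id, dst_id, amount from a transfers query.
query_rows = [(1, 2, 500.0), (2, 3, 450.0)]
edge_csv = edges_to_csv(query_rows)
print(edge_csv)
```

Writing to a string keeps the example self-contained. To produce a file, open it with `newline=""` and pass the file object to `csv.writer` instead.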

Common use cases

Fraud detection is the most documented application. One financial institution that moved fraud network analysis to a graph approach reported identifying circular transfer rings across 12-hop paths that SQL queries had missed entirely, contributing to roughly 9.1 million pounds in prevented losses. BigQuery Graph makes this analysis available to analysts who already know SQL, without needing a dedicated graph database team.
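To illustrate what a circular transfer ring is, here is a small depth-limited search in plain Python. It is a sketch of the idea only, not the method used by the institution mentioned above or by BigQuery. The function name and the sample transfers are hypothetical.

```python
def find_transfer_rings(transfers, max_hops):
    """Return cycles of at most `max_hops` transfers.

    Each ring is listed once, starting and ending at its smallest
    account id, for example [1, 2, 3, 1].
    """
    adjacency = {}
    for src, dst in transfers:
        adjacency.setdefault(src, set()).add(dst)

    rings = []

    def walk(start, node, path):
        for nxt in sorted(adjacency.get(node, ())):
            if nxt == start:
                # Closing edge found: the ring uses len(path) transfers.
                rings.append(path + [start])
            elif nxt > start and nxt not in path and len(path) < max_hops:
                # Only visit ids larger than the start so that each ring
                # is reported once, from its smallest member.
                walk(start, nxt, path + [nxt])

    for start in sorted(adjacency):
        walk(start, start, [start])
    return rings


sample_transfers = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 3), (6, 7)]
rings = find_transfer_rings(sample_transfers, max_hops=3)
print(rings)
```

The hop limit matters: the number of candidate paths grows quickly with depth, which is why long rings are easy to miss with hand-written joins.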

Customer 360 is another strong fit. Model customers as nodes, purchases and interactions as edges, and run pattern queries to find customers with similar behavior clusters for targeting or churn prediction.
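One simple way to define "similar behavior" is Jaccard similarity over the set of products each customer bought: the size of the overlap divided by the size of the union. The sketch below uses that measure as a stand-in; it is one possible choice, not something BigQuery Graph prescribes, and the customer and product names are invented.

```python
def jaccard(a, b):
    """Share of items two sets have in common, between 0.0 and 1.0."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(customer, purchases):
    """Return (other_customer, score) with the highest Jaccard score."""
    scored = [
        (jaccard(purchases[customer], items), name)
        for name, items in purchases.items()
        if name != customer
    ]
    score, name = max(scored)
    return name, score


# Hypothetical customer -> purchased products mapping, i.e. the
# neighbors of each customer node along "purchased" edges.
purchases = {
    "alice": {"p1", "p2", "p3"},
    "bob": {"p2", "p3", "p4"},
    "carol": {"p9"},
}

best_match = most_similar("alice", purchases)
print(best_match)
```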

Data lineage tracking is also a natural use case. Map column-level lineage across pipelines by modeling tables as nodes and transformations as edges, then trace upstream dependencies for any output column in one query.
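The upstream trace is a breadth-first search that follows edges backwards. Here is a minimal sketch in Python, with hypothetical table names standing in for lineage nodes.

```python
from collections import deque


def upstream_dependencies(lineage_edges, target):
    """Return {node: hops} for everything `target` depends on."""
    parents = {}
    for src, dst in lineage_edges:
        parents.setdefault(dst, []).append(src)

    distances = {}
    queue = deque([(target, 0)])
    while queue:
        node, depth = queue.popleft()
        for parent in parents.get(node, []):
            # Record each upstream node once, at its shortest distance.
            if parent != target and parent not in distances:
                distances[parent] = depth + 1
                queue.append((parent, depth + 1))
    return distances


# Hypothetical lineage: (input, output) pairs, one per transformation.
lineage_edges = [
    ("raw.orders", "staging.orders"),
    ("raw.customers", "staging.customers"),
    ("staging.orders", "mart.revenue"),
    ("staging.customers", "mart.revenue"),
    ("raw.orders", "mart.order_count"),
]

deps = upstream_dependencies(lineage_edges, "mart.revenue")
print(deps)
```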

Costs and performance

BigQuery Graph queries are billed the same way as standard BigQuery queries: on bytes processed. The graph definition itself has no ongoing cost. A two-hop GRAPH_TABLE query over a 10-million-row transfers table typically scans the full table, so the same cost optimization rules apply as for regular BigQuery SQL: partition your edge tables by date or region where possible, and use column clustering on the keys you query most.
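A rough back-of-the-envelope estimate shows the scale involved. The sketch below assumes a single full scan of three 8-byte columns (`id`, `to_id`, `amount`) and an on-demand price of 6.25 USD per TiB. Both figures are assumptions for illustration; check current BigQuery pricing and your actual column types before relying on the result.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def estimate_scan_cost(row_count, bytes_per_row, price_per_tib):
    """Return (bytes_scanned, estimated_cost) for one full scan."""
    bytes_scanned = row_count * bytes_per_row
    cost = bytes_scanned / TIB * price_per_tib
    return bytes_scanned, cost


# Assumptions: 10 million rows, three referenced columns of 8 bytes
# each, and an assumed on-demand price of 6.25 USD per TiB.
bytes_scanned, cost = estimate_scan_cost(
    row_count=10_000_000,
    bytes_per_row=3 * 8,
    price_per_tib=6.25,
)
print(f"{bytes_scanned:,} bytes scanned, about ${cost:.4f}")
```

At this size a single query is cheap. The cost concern is repeated full scans over much larger edge tables, which is where partitioning and clustering help.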

During the preview period, Google has not published specific performance benchmarks. Early community reports, including posts on Medium and Google Codelabs, suggest that multi-hop queries that previously required 5-minute recursive SQL runs on large datasets can return in under 30 seconds using GQL path quantifiers, primarily because BigQuery can parallelize the graph traversal differently from a recursive CTE. Treat these figures as anecdotal until official benchmarks are available.

What BigQuery Graph does not replace

BigQuery Graph runs analytical queries on large datasets. It is not designed for transactional graph workloads that require low-latency single-record lookups at scale. For those workloads, Google offers Spanner Graph. If you need to query the same graph for both analytics and transactional use, you can federate both systems and join results in BigQuery.

Practical summary

BigQuery Graph adds graph pattern matching to a warehouse most data teams already use. Setup takes about 20 minutes if your tables are already in BigQuery: enable the feature, write a CREATE PROPERTY GRAPH statement, and start querying in GQL. The biggest immediate payoff is multi-hop relationship queries that previously required brittle recursive SQL. If you work with financial transactions, operational networks, or customer relationship data, the feature is worth testing during the preview window.

FAQ

Does BigQuery Graph move or copy my data?

No. A BigQuery property graph is a logical view over your existing tables. When you create one with CREATE PROPERTY GRAPH, BigQuery stores only the graph definition, not the data. Your source tables remain unchanged and no additional storage is consumed for the graph itself.

What is GQL and how is it different from SQL?

GQL (Graph Query Language) is an ISO standard language for querying property graphs. In BigQuery it is accessed via the GRAPH_TABLE function, which embeds GQL pattern syntax inside a standard SQL query. GQL uses parentheses for nodes and square brackets for edges, letting you express multi-hop path patterns like (a)-[:Transfers]->(b)-[:Transfers]->(c) directly, instead of writing multiple joins or recursive CTEs in SQL.

Is BigQuery Graph available to all BigQuery users?

BigQuery Graph is in public preview as of April 14, 2026, available in US and EU multi-regions. No additional sign-up is required beyond having BigQuery API access enabled in your Google Cloud project. Preview features may have usage limits and SLA differences from GA features. Check the Google Cloud release notes for region expansion updates.

How does BigQuery Graph compare to Neo4j or other dedicated graph databases?

BigQuery Graph is designed for analytical workloads on large datasets using SQL infrastructure you already have. Dedicated graph databases like Neo4j are optimized for transactional, low-latency lookups on smaller graphs. If your primary need is running analytical queries on billions of rows with graph relationships, BigQuery Graph avoids the overhead of maintaining a separate system. If you need sub-millisecond lookups or real-time graph traversal, a dedicated graph database is more appropriate.

Can I use BigQuery Graph with data from Google Sheets or CSV files?

Yes, indirectly. You first need to make the data available as BigQuery tables, using load jobs, streaming inserts, or external tables. Once the data is in BigQuery tables with node and edge structure, you can define a property graph on top of it. BigQuery supports external tables over Cloud Storage CSVs and Google Sheets, which can serve as the source tables for a property graph.
