GA4-Shopify Data Linking
How to connect Google Analytics 4 data with Shopify customer data in BigQuery.
Summary
| Link Method | Coverage | Match Rate | Use Case |
|---|---|---|---|
| transaction_id → orders.id | 100% of purchases | 94% | Purchase attribution |
| user_id → customers.id | 0.59% of events | 98.8% | User behavior tracking |
Architecture Overview
Method 1: Transaction ID (Purchases)
Best for: Purchase attribution, revenue analysis, conversion tracking
The Link
| GA4 Field | Shopify Field | Example |
|---|---|---|
| event_params.transaction_id | orders.id | 6062491893805 |
Query Pattern
-- Step 1: Export GA4 transactions (US region)
SELECT DISTINCT
(SELECT value.string_value FROM UNNEST(event_params)
WHERE key = "transaction_id") as transaction_id,
user_pseudo_id,
event_timestamp
FROM `eli-health-prod.analytics_361776673.events_*`
WHERE event_name = "purchase"
AND _TABLE_SUFFIX >= FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
-- Step 2: Match with Shopify orders (northamerica-northeast1 region)
SELECT
o.id as order_id,
o.order_number,
o.email,
JSON_EXTRACT_SCALAR(o.customer, '$.id') as customer_id,
JSON_EXTRACT_SCALAR(o.customer, '$.first_name') as first_name,
o.total_price,
o.created_at
FROM `eli-health-prod.eli_health_shopify.orders` o
WHERE o.id IN (/* transaction_ids from step 1 */)
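Because the two steps run in different regions, their results have to be combined outside a single query unless the synced dataset is used. The following is a minimal JavaScript sketch of that matching step, assuming both result sets have already been fetched as plain objects. The helper name `matchTransactions` and the row shapes are hypothetical, not part of the existing pipeline.

```javascript
// Hypothetical helper: match GA4 purchase rows to Shopify orders by ID.
// Row shapes are assumptions for illustration, not the actual export schema.
function matchTransactions(ga4Rows, shopifyOrders) {
  // Compare IDs as strings: the GA4 query reads transaction_id as a string
  // value, while the Shopify ID may arrive as a number.
  const ordersById = new Map(shopifyOrders.map((o) => [String(o.id), o]));
  const matched = [];
  const unmatched = [];
  const seen = new Set();
  for (const row of ga4Rows) {
    const id = row.transaction_id == null ? "" : String(row.transaction_id).trim();
    if (id === "" || seen.has(id)) continue; // skip blanks and duplicate events
    seen.add(id);
    const order = ordersById.get(id);
    if (order) matched.push({ ...row, transaction_id: id, order });
    else unmatched.push(id);
  }
  const total = matched.length + unmatched.length;
  return { matched, unmatched, matchRate: total === 0 ? 0 : matched.length / total };
}

const result = matchTransactions(
  [
    { transaction_id: "6062491893805", user_pseudo_id: "111.222" },
    { transaction_id: "6062491893805", user_pseudo_id: "111.222" }, // duplicate event
    { transaction_id: "test-order-1", user_pseudo_id: "333.444" },
  ],
  [{ id: 6062491893805, total_price: "129.00" }]
);
console.log(result.matchRate, result.unmatched);
```

Deduplicating on transaction_id before computing the rate keeps repeated purchase events from inflating either the matched or the unmatched count.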
Coverage Stats (Last 30 Days)
| Metric | Value |
|---|---|
| GA4 purchase events | 1,834 |
| Matched to Shopify orders | 94% |
| Unmatched (test orders, etc.) | 6% |
Method 2: User ID (Site Behavior)
Best for: User journey analysis, cross-session behavior
The Link
| GA4 Field | Shopify Field | Example |
|---|---|---|
| user_id | customers.id | 8614891192365 |
How user_id Gets Set
The user_id is populated only when both of the following are true:
- The user is logged into their Shopify account on eli.health
- The website calls gtag('set', 'user_id', customerId)
Coverage by Event Type
| Event | With user_id | Coverage |
|---|---|---|
| download_app_link_click | 31 | 13.2% |
| form_submit | 136 | 8.3% |
| contact_us_form_submit | 20 | 6.4% |
| scroll_depth_90% | 231 | 2.4% |
| page_view | 78 | 0.1% |
| session_start | 123 | 0.1% |
| purchase | 0 | 0% |
| add_to_cart | 0 | 0% |
Key insight: E-commerce events have 0% user_id because Shopify checkout doesn't pass it.
Query Pattern
-- Step 1: Get GA4 user_ids (US region)
SELECT DISTINCT SAFE_CAST(user_id AS INT64) as customer_id
FROM `eli-health-prod.analytics_361776673.events_*`
WHERE user_id IS NOT NULL AND user_id != ""
AND _TABLE_SUFFIX >= FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
-- Step 2: Match with Shopify customers (northamerica-northeast1 region)
SELECT id, email, first_name, last_name
FROM `eli-health-prod.eli_health_shopify.customers`
WHERE id IN (/* customer_ids from step 1 */)
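If the matching is done in application code rather than SQL, the same clean-up that SAFE_CAST performs is needed first. The sketch below is a loose JavaScript analog, assuming user_id values arrive as strings. The function names are hypothetical, and trimming whitespace is a choice made here rather than documented BigQuery behavior.

```javascript
// Hypothetical helper in the spirit of SAFE_CAST(user_id AS INT64): values
// that are not plain integers become null instead of raising an error.
function safeCastToId(userId) {
  if (typeof userId !== "string") return null;
  const trimmed = userId.trim();
  if (!/^-?\d+$/.test(trimmed)) return null;
  // BigInt normalizes forms like "007" without losing precision on long IDs.
  return BigInt(trimmed).toString();
}

// Count distinct valid GA4 user_ids and how many exist as Shopify customers.
function matchUserIds(ga4UserIds, customers) {
  const customerIds = new Set(customers.map((c) => String(c.id)));
  const unique = new Set();
  for (const raw of ga4UserIds) {
    const id = safeCastToId(raw);
    if (id !== null) unique.add(id);
  }
  const matched = [...unique].filter((id) => customerIds.has(id));
  return { unique: unique.size, matched: matched.length };
}

const stats = matchUserIds(
  ["8614891192365", "8614891192365", "", "guest-abc", "42"],
  [{ id: 8614891192365, email: "a@example.com" }]
);
console.log(stats); // { unique: 2, matched: 1 }
```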
Match Rate
| Metric | Value |
|---|---|
| Unique GA4 user_ids (30 days) | 259 |
| Matched to Shopify customers | 256 (98.8%) |
Cross-Region Sync
BigQuery cannot directly JOIN tables across regions. GA4 exports to the US multi-region, but Shopify, Klaviyo, and LoopWork data lives in northamerica-northeast1 (Montreal).
Solution: GA4 Cross-Region Sync Cloud Function
A Cloud Function syncs GA4 event tables from US to Montreal daily, enabling direct SQL JOINs.
Sync Details
| Setting | Value |
|---|---|
| Cloud Function | ga4-cross-region-sync |
| Schedule | Daily at 6 AM UTC |
| Method | Export → GCS (Parquet) → Import to Montreal |
| Tables per run | Up to 10 |
| Source dataset | analytics_361776673 (US) |
| Target dataset | analytics_ga4 (northamerica-northeast1) |
| Terraform module | tf/modules/global/ga4-sync/ |
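The "Tables per run" limit implies that each run has to choose which tables to copy. The sketch below shows one way that selection could work. It is an assumption for illustration, not the Cloud Function's actual logic: it presumes daily tables named events_YYYYMMDD and copies the oldest missing tables first. `selectTablesToSync` is a hypothetical name.

```javascript
// Hypothetical selection step for one sync run. The real function's ordering
// and naming rules are not documented here.
const MAX_TABLES_PER_RUN = 10; // from the "Tables per run" setting

function selectTablesToSync(sourceTables, targetTables, limit = MAX_TABLES_PER_RUN) {
  const alreadySynced = new Set(targetTables);
  return sourceTables
    .filter((name) => /^events_\d{8}$/.test(name)) // skips events_intraday_*
    .filter((name) => !alreadySynced.has(name))
    .sort() // YYYYMMDD suffixes sort chronologically as strings
    .slice(0, limit);
}

const batch = selectTablesToSync(
  ["events_20260103", "events_20260101", "events_intraday_20260104", "events_20260102"],
  ["events_20260101"]
);
console.log(batch); // [ 'events_20260102', 'events_20260103' ]
```

With a cap of 10 tables per daily run, a backlog of N missing days takes roughly N / 10 runs to clear under this scheme.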
Sync Status
Check which tables are synced:
SELECT MIN(table_name) as oldest, MAX(table_name) as newest, COUNT(*) as total
FROM `eli-health-prod.analytics_ga4.INFORMATION_SCHEMA.TABLES`
WHERE table_name LIKE 'events_%';
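The query above reports only the oldest table, the newest table, and a count, so a gap in the middle shows up only as a total that is lower than expected. The JavaScript sketch below lists the missing days explicitly, assuming the table names have been fetched into an array and follow the events_YYYYMMDD pattern. `findMissingDailyTables` is a hypothetical helper.

```javascript
// Hypothetical helper: list daily tables missing between the oldest and
// newest synced table. Names that do not match events_YYYYMMDD are ignored.
function findMissingDailyTables(tableNames) {
  const dates = tableNames
    .map((name) => /^events_(\d{4})(\d{2})(\d{2})$/.exec(name))
    .filter(Boolean)
    .map((m) => Date.UTC(Number(m[1]), Number(m[2]) - 1, Number(m[3])));
  if (dates.length === 0) return [];
  const present = new Set(dates);
  const start = Math.min(...dates);
  const end = Math.max(...dates);
  const DAY_MS = 24 * 60 * 60 * 1000; // UTC days have a fixed length
  const missing = [];
  for (let t = start; t <= end; t += DAY_MS) {
    if (!present.has(t)) {
      const ymd = new Date(t).toISOString().slice(0, 10).replace(/-/g, "");
      missing.push(`events_${ymd}`);
    }
  }
  return missing;
}

const gaps = findMissingDailyTables([
  "events_20260130",
  "events_20260202",
  "events_20260131",
]);
console.log(gaps); // [ 'events_20260201' ]
```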
Use Cases
1. Purchase Attribution
Question: Which marketing channels drive purchases?
-- GA4 purchase events with traffic source fields (last 30 days)
SELECT
(SELECT value.string_value FROM UNNEST(event_params)
WHERE key = "transaction_id") as transaction_id,
traffic_source.source,
traffic_source.medium,
traffic_source.name as campaign
FROM `eli-health-prod.analytics_361776673.events_*`
WHERE event_name = "purchase"
AND _TABLE_SUFFIX >= FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
Then join with Shopify to get customer details and LTV.
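As a rough illustration of that join, the JavaScript sketch below sums Shopify order totals per source/medium once both result sets are in memory. The row shapes, the `revenueByChannel` name, and the "(direct) / (none)" fallback label are assumptions for illustration. Revenue is accumulated in cents to avoid floating-point drift.

```javascript
// Hypothetical aggregation: Shopify revenue per GA4 source / medium.
function revenueByChannel(ga4Purchases, shopifyOrders) {
  const centsById = new Map(
    shopifyOrders.map((o) => [String(o.id), Math.round(Number(o.total_price) * 100)])
  );
  const totals = new Map();
  const counted = new Set();
  for (const p of ga4Purchases) {
    const id = String(p.transaction_id);
    if (!centsById.has(id) || counted.has(id)) continue; // unmatched or duplicate
    counted.add(id);
    const channel = `${p.source || "(direct)"} / ${p.medium || "(none)"}`;
    const entry = totals.get(channel) || { orders: 0, revenueCents: 0 };
    entry.orders += 1;
    entry.revenueCents += centsById.get(id);
    totals.set(channel, entry);
  }
  return totals;
}

const channelTotals = revenueByChannel(
  [
    { transaction_id: "1001", source: "google", medium: "cpc" },
    { transaction_id: "1002", source: "google", medium: "cpc" },
    { transaction_id: "1003", source: null, medium: null },
    { transaction_id: "9999", source: "facebook", medium: "paid_social" }, // no order
  ],
  [
    { id: 1001, total_price: "100.50" },
    { id: 1002, total_price: "49.50" },
    { id: 1003, total_price: "80.00" },
  ]
);
console.log(channelTotals.get("google / cpc")); // { orders: 2, revenueCents: 15000 }
```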
2. User Journey Before Purchase
Question: What pages do users view before buying?
-- Get user journey for purchasers (limited by user_id coverage)
SELECT
user_id,
event_name,
(SELECT value.string_value FROM UNNEST(event_params)
WHERE key = "page_location") as page,
event_timestamp
FROM `eli-health-prod.analytics_361776673.events_*`
WHERE user_id IN (/* purchaser user_ids */)
ORDER BY user_id, event_timestamp
3. Cart Abandonment Analysis
Question: Who added to cart but didn't purchase?
This is already handled by the native Klaviyo-Shopify integration. No GA4 linking needed.
Improving Coverage
To increase user_id coverage on eli.health:
Option 1: Set user_id on Login
// When user logs in via Shopify customer account
gtag('set', 'user_id', shopifyCustomerId);
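A slightly more defensive version of that call might look like the sketch below. It assumes the Shopify customer ID is available to the page script, and it takes the gtag function as a parameter so the example runs outside a browser. `setGa4UserId` is a hypothetical wrapper, not an existing function on the site.

```javascript
// Hypothetical wrapper around the gtag call. On the site, gtagFn would be
// window.gtag; here it is injected so the sketch can run anywhere.
function setGa4UserId(gtagFn, shopifyCustomerId) {
  if (shopifyCustomerId === null || shopifyCustomerId === undefined) return false;
  const id = String(shopifyCustomerId).trim();
  if (id === "") return false; // logged-out visitors: leave user_id unset
  gtagFn("set", "user_id", id);
  return true;
}

// Usage with a stand-in for gtag that records its calls.
const calls = [];
const fakeGtag = (...args) => calls.push(args);
setGa4UserId(fakeGtag, 8614891192365); // logged-in customer
setGa4UserId(fakeGtag, null); // guest: no call made
console.log(calls); // [ [ 'set', 'user_id', '8614891192365' ] ]
```

Skipping the call for guests, rather than sending an empty string, keeps the user_id IS NOT NULL AND user_id != "" filter in the Method 2 query meaningful.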
Option 2: Use Shopify Customer Events API
Shopify's Customer Events API can pass customer ID to GA4.
Option 3: First-Party Data Mode
Use GA4's enhanced conversions to match users post-hoc.
Data Volumes
| Dataset | Records | Update Frequency |
|---|---|---|
| GA4 events (30 days) | 642,213 | Real-time |
| Shopify customers | 43,628 | Daily (Airbyte) |
| Shopify orders | 8,704 | Daily (Airbyte) |
| GA4 user_ids (30 days) | 259 | Real-time |
Marketing Attribution Dashboard
The GA4-Shopify linking powers the Marketing Attribution Dashboard in eli-kpi, available at /marketing?range=7|30|60|90.
Sections (all loaded via AJAX, responding to date range selection)
| Section | Data Source | Purpose |
|---|---|---|
| Summary Stats | GA4 + Shopify | Users, purchasers, revenue, AOV, conversion rate |
| Channel Attribution | GA4 + Shopify | Revenue by source/medium |
| Conversion Funnel | GA4 | Session → cart → checkout → purchase by channel |
| New vs Returning | Shopify | Revenue split by customer type |
| Time to Purchase | GA4 + Shopify | Days from first visit to conversion, with AOV per window |
| US vs Canada | GA4 + Shopify | Geographic revenue breakdown |
| CAC & ROAS | Ads + Shopify | Cost per acquisition, return on ad spend |
| Sessions to Conversion | GA4 | How many sessions before purchase |
| Bounce vs Return | GA4 | Single-session vs multi-session conversion rates |
| High-Intent Abandoners | GA4 | Users who reached cart/checkout but didn't purchase |
| Lost Visitors | GA4 | Users who left without engaging |
| Pages Per Session | GA4 | Depth of engagement vs conversion |
| Multi-Touch Attribution | GA4 | First-touch, last-touch, and assisted attribution |
| New Users by Source | GA4 | First-time visitor acquisition channels |
| Cart Abandonment | GA4 + Shopify | Daily abandoned cart trends and recovery |
| Campaign Matrix | Ads | Spend, impressions, clicks, CTR, CPC by campaign |
| User Flow (Sankey) | GA4 + Shopify | Visual flow: source → landing page → cart → purchase |
URL Parameters
The dashboard supports direct linking via URL parameters:
/marketing?range=7 # Last 7 days
/marketing?range=30 # Last 30 days (default)
/marketing?range=60 # Last 60 days
/marketing?range=90 # Last 90 days
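Handling of the range parameter could be sketched as follows. The allowed values and the default of 30 come from the list above. Falling back to the default for anything else is an assumption, since the dashboard's actual behavior for invalid values is not documented here. `parseRange` is a hypothetical helper.

```javascript
const ALLOWED_RANGES = [7, 30, 60, 90];
const DEFAULT_RANGE = 30;

// Hypothetical parser for the dashboard's range parameter.
function parseRange(url) {
  // A placeholder base is supplied so relative paths like "/marketing?range=7" parse.
  const params = new URL(url, "https://example.invalid").searchParams;
  const raw = params.get("range");
  const value = Number.parseInt(raw ?? "", 10);
  // Require an exact match so inputs like "7days" are not silently accepted.
  return ALLOWED_RANGES.includes(value) && String(value) === raw ? value : DEFAULT_RANGE;
}

console.log(parseRange("/marketing?range=7")); // 7
console.log(parseRange("/marketing")); // 30
console.log(parseRange("/marketing?range=45")); // 30
```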
Attribution Model
How User Tracking Works
The dashboard uses two mechanisms to track users:
- GA4 First-Party Cookie (_ga → user_pseudo_id): A first-party cookie set by GA4's gtag.js on eli.health. This identifies a browser across sessions. It is not a third-party cookie, so browsers do not block it outright, but Safari's ITP caps its lifetime at 7 days, whereas Chrome keeps it for the cookie's full configured lifetime.
- UTM Parameters: When users arrive via paid campaigns or email links, UTM parameters (utm_source, utm_medium, utm_campaign) are captured by GA4 and stored in traffic_source.source, traffic_source.medium, and traffic_source.name.
Attribution Models Available
| Model | How It Works | Dashboard Section |
|---|---|---|
| Last-Click | Credits the traffic source of the session where the purchase happened | Channel Attribution |
| First-Touch | Credits the traffic source of the user's very first session | Multi-Touch Attribution |
| Assisted | Credits all channels the user touched between first visit and purchase | Multi-Touch Attribution |
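The Channel Attribution table uses last-click by default, read from GA4's traffic_source on the purchase event. In the GA4 BigQuery export, that record describes the source that first acquired the user, so it equals the purchase session's source only when the user converts in their first session; where session-level accuracy matters, check it against collected_traffic_source. The Multi-Touch Attribution section shows all three models side by side for comparison.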
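To make the three models concrete, the JavaScript sketch below applies them to one user's sessions. The input shape (a start time and a channel label per session, ending with the purchase session) and the `attribute` function are assumptions for illustration and do not reflect the dashboard's actual queries.

```javascript
// Hypothetical input: one user's sessions up to and including the purchase
// session, each with a start time and a "source / medium" channel label.
function attribute(sessions) {
  if (sessions.length === 0) return null;
  const ordered = [...sessions].sort((a, b) => a.startedAt - b.startedAt);
  return {
    firstTouch: ordered[0].channel, // channel of the very first session
    lastClick: ordered[ordered.length - 1].channel, // channel of the purchase session
    assisted: [...new Set(ordered.map((s) => s.channel))], // every distinct channel touched
  };
}

const models = attribute([
  { startedAt: 3, channel: "klaviyo / email" }, // purchase session
  { startedAt: 1, channel: "facebook / paid_social" },
  { startedAt: 2, channel: "google / organic" },
]);
console.log(models.firstTouch); // facebook / paid_social
console.log(models.lastClick); // klaviyo / email
```

This simplified last-click rule credits whatever channel the purchase session had, including direct traffic. Some attribution tools instead fall back to the last non-direct channel.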
Revenue Attribution
Revenue figures are real Shopify order totals, not GA4 estimated revenue. The join is:
GA4 purchase event → transaction_id → Shopify orders.id → total_price
This gives a 94% match rate. The unmatched 6% are typically test orders or orders that didn't fire GA4 events.
Behavioral Insights
Time to Purchase vs. Average Order Value
Analysis of 30-day data (January 2026) reveals that users who take longer to decide tend to spend more:
| Conversion Window | Purchasers | % of Total | AOV |
|---|---|---|---|
| Same session (< 1 hour) | 1,414 | 81.6% | $133.94 |
| Same day (1-24 hours) | 117 | 6.8% | $137.76 |
| Same week (1-7 days) | 128 | 7.4% | $152.82 |
| Same month (7-30 days) | 73 | 4.2% | $145.45 |
Key findings:
- 82% of purchasers convert in the same session, which suggests the product drives strong impulse conversion
- AOV increases 14% from same-session ($134) to same-week ($153) buyers
- Deliberate buyers (1-7 days) have the highest AOV, suggesting they may be selecting higher-value bundles
- Same-month buyers drop slightly to $145, possibly due to discount-driven re-engagement
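The bucketing behind this table can be sketched as follows, assuming each purchaser has a first-visit timestamp and a purchase timestamp in milliseconds. The exact boundary handling (for example, whether exactly 1 hour counts as same-session) and both function names are assumptions, not taken from the dashboard code.

```javascript
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Hypothetical bucketing of time between first visit and purchase.
function conversionWindow(firstVisitMs, purchaseMs) {
  const elapsed = purchaseMs - firstVisitMs;
  if (elapsed < HOUR_MS) return "Same session (< 1 hour)";
  if (elapsed < DAY_MS) return "Same day (1-24 hours)";
  if (elapsed < 7 * DAY_MS) return "Same week (1-7 days)";
  if (elapsed <= 30 * DAY_MS) return "Same month (7-30 days)";
  return "Beyond 30 days";
}

// Average order value per window, summed in cents to avoid float drift.
function aovByWindow(purchases) {
  const groups = new Map();
  for (const p of purchases) {
    const key = conversionWindow(p.firstVisitMs, p.purchaseMs);
    const g = groups.get(key) || { count: 0, totalCents: 0 };
    g.count += 1;
    g.totalCents += Math.round(p.totalPrice * 100);
    groups.set(key, g);
  }
  const result = {};
  for (const [key, g] of groups) {
    result[key] = { purchasers: g.count, aov: g.totalCents / g.count / 100 };
  }
  return result;
}

const summary = aovByWindow([
  { firstVisitMs: 0, purchaseMs: 30 * 60 * 1000, totalPrice: 100 },
  { firstVisitMs: 0, purchaseMs: 45 * 60 * 1000, totalPrice: 150 },
  { firstVisitMs: 0, purchaseMs: 3 * DAY_MS, totalPrice: 200 },
]);
console.log(summary["Same session (< 1 hour)"]); // { purchasers: 2, aov: 125 }

// The 14% figure above is the relative lift between the two table values.
const liftPct = ((152.82 - 133.94) / 133.94) * 100;
console.log(liftPct.toFixed(1)); // 14.1
```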
Implications for Marketing Strategy
- Top of funnel (ads, landing pages): Should target emotional, fast decision-making. 82% of buyers convert immediately, so the first impression matters most.
- Retargeting (1-7 day window): These users are deliberating and spending more. Retargeting creative should emphasize product value and bundle options rather than urgency.
- Lifecycle communications: Once subscribed, the product experience is analytical (hormone data, trends). Messaging should shift from emotional to informational.
Related Documents
- User Journey Data Architecture - End-to-end data flow including Klaviyo
- Data Pipeline - Overall data architecture
- Klaviyo Integration - Marketing automation
Document History
| Date | Author | Changes |
|---|---|---|
| 2026-01-27 | Chip | Add attribution model docs, behavioral insights (AOV by time-to-purchase), Cloud Function sync, marketing dashboard sections |
| 2026-01-26 | Chip | Initial investigation and documentation |