KPI Metrics Export Pipeline

Overview

This document outlines the implementation plan to export daily aggregated KPI metrics to BigQuery, making them available for external consumption (BI tools, partners, researchers) without exposing user-level data.

Use cases:

  • Basedash dashboards for marketing team
  • Anonymized data for external partners
  • Research datasets
  • Any reporting tool that can connect to BigQuery

Problem Statement

  • The KPI dashboard generates aggregate metrics on-demand but doesn't persist them
  • External tools (Basedash, Looker, etc.) can only connect to databases, not APIs
  • Partners and researchers need access to anonymized, aggregated data
  • User privacy must be maintained: only aggregate numbers, no PII

Solution Architecture

Data Flow

  1. Trigger: Cloud Scheduler calls /api/export-kpi endpoint daily at 6:00 AM UTC
  2. Compute: eli-kpi fetches metrics from source tables (already implemented in data-fetcher.ts)
  3. Persist: Write aggregated metrics to new BigQuery tables in kpi_metrics dataset
  4. Access: Basedash reads from these tables using existing basedash-reader-key.json

BigQuery Tables Schema

Dataset: eli-health-prod.kpi_metrics

All tables will have a common structure with report_date as the partition key.


Table 1: daily_engagement

Corresponds to "Overall Engagement Metrics" from the dashboard.

| Column | Type | Description |
| --- | --- | --- |
| report_date | DATE | Date this metric was recorded (partition key) |
| period_label | STRING | Human-readable period (e.g., "Today (Dec 23)") |
| period_start_date | DATE | Start of the period |
| period_end_date | DATE | End of the period |
| all_tests | INT64 | Total tests (all_measurements) |
| successful_tests | INT64 | Successful tests (successful_measurements) |
| successful_test_users | INT64 | Users with successful tests |
| active_users | INT64 | Users who took at least one test in period |
| total_users_to_date | INT64 | Cumulative user count |
| total_active_users_to_date | INT64 | Cumulative users who have ever tested |
| active_percentage | FLOAT64 | Active Users / Total Active Users To Date |
| active_change | FLOAT64 | Change from previous period (nullable) |
| avg_tests_per_active_user | FLOAT64 | successful_tests / active_users |
| created_at | TIMESTAMP | When this row was written |

Table 2: daily_advanced_metrics

Corresponds to "Advanced Metrics" from the dashboard.

| Column | Type | Description |
| --- | --- | --- |
| report_date | DATE | Date this metric was recorded (partition key) |
| period_label | STRING | Human-readable period |
| period_start_date | DATE | Start of the period |
| period_end_date | DATE | End of the period |
| total_sessions | INT64 | Total measurement sessions |
| successful_sessions | INT64 | Sessions with at least one success |
| adjusted_success_rate | FLOAT64 | % of sessions that succeeded |
| first_try_successes | INT64 | Sessions successful on first attempt |
| first_try_success_rate | FLOAT64 | % first try success |
| second_try_successes | INT64 | Sessions successful on second attempt |
| second_try_success_rate | FLOAT64 | % second try success |
| third_plus_try_successes | INT64 | Sessions successful on 3+ attempts |
| third_plus_try_success_rate | FLOAT64 | % third+ try success |
| error_gini_coefficient | FLOAT64 | Error distribution inequality |
| created_at | TIMESTAMP | When this row was written |
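
The error_gini_coefficient column summarizes how unevenly errors are distributed: 0 means errors are spread evenly, and values approaching 1 mean they are concentrated in a few categories. The production formula lives in eli-kpi's data-fetcher.ts; the sketch below is one common discrete formulation, shown only for intuition (the function name is illustrative):

```typescript
// One common discrete Gini formulation: sort ascending, then
// G = (2 * sum(i * x_i)) / (n * sum(x)) - (n + 1) / n, with 1-based i.
// This is a sketch for intuition; eli-kpi's exact formula may differ.
function giniCoefficient(counts: number[]): number {
  const sorted = [...counts].sort((a, b) => a - b);
  const n = sorted.length;
  const total = sorted.reduce((sum, x) => sum + x, 0);
  if (n === 0 || total === 0) return 0; // no errors: treat as perfectly even
  const weighted = sorted.reduce((sum, x, i) => sum + (i + 1) * x, 0);
  return (2 * weighted) / (n * total) - (n + 1) / n;
}

giniCoefficient([1, 1, 1, 1]); // even spread → 0
giniCoefficient([0, 0, 0, 4]); // fully concentrated → 0.75, i.e. (n - 1) / n
```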

Table 3: daily_errors

Corresponds to "Error Data" from the dashboard.

| Column | Type | Description |
| --- | --- | --- |
| report_date | DATE | Date this metric was recorded (partition key) |
| period_label | STRING | Human-readable period |
| period_start_date | DATE | Start of the period |
| period_end_date | DATE | End of the period |
| no_period_history | INT64 | Count of "no period history" errors |
| camera_too_far | INT64 | Count of "camera too far" errors |
| camera_too_close | INT64 | Count of "camera too close" errors |
| analysis_error | INT64 | Count of analysis errors |
| test_lines_not_found | INT64 | Count of "test lines not found" errors |
| test_not_found | INT64 | Count of "test not found" errors |
| total_errors | INT64 | Sum of all errors |
| created_at | TIMESTAMP | When this row was written |

Table 4: daily_funnel

Corresponds to "New User Funnel Metrics" from the dashboard.

| Column | Type | Description |
| --- | --- | --- |
| report_date | DATE | Date this metric was recorded (partition key) |
| period_label | STRING | Human-readable period |
| period_start_date | DATE | Start of the period |
| period_end_date | DATE | End of the period |
| accounts_created | INT64 | New accounts created in period |
| accounts_with_tests | INT64 | New accounts that took a test |
| conversion_rate | FLOAT64 | accounts_with_tests / accounts_created * 100 |
| created_at | TIMESTAMP | When this row was written |
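
Note that conversion_rate is undefined for periods in which no accounts were created. A minimal sketch of the guard the exporter should apply (the helper name is illustrative, not from the codebase):

```typescript
// Funnel conversion as a percentage; null when no accounts were created,
// so the BigQuery column stays NULL instead of NaN/Infinity.
function conversionRate(accountsWithTests: number, accountsCreated: number): number | null {
  if (accountsCreated === 0) return null;
  return (accountsWithTests / accountsCreated) * 100;
}

conversionRate(25, 100); // → 25
conversionRate(3, 0);    // → null (quiet period, no signups)
```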

Implementation Steps

Phase 1: BigQuery Setup (eli-devops)

  1. Create BigQuery dataset kpi_metrics in eli-health-prod
  2. Grant access to basedash-reader service account
  3. Create tables with partitioning on report_date

Terraform resources to add:

```hcl
# In eli-devops/terraform/bigquery.tf (or a new file)

resource "google_bigquery_dataset" "kpi_metrics" {
  dataset_id  = "kpi_metrics"
  project     = "eli-health-prod"
  location    = "US"
  description = "Daily aggregated KPI metrics for Basedash"

  # NOTE (assumption): declaring any access block makes Terraform manage the
  # dataset's full access list; add owner/editor entries here as needed.
  access {
    role          = "READER"
    user_by_email = "basedash-reader@eli-health-prod.iam.gserviceaccount.com"
  }
}

resource "google_bigquery_table" "daily_engagement" {
  dataset_id = google_bigquery_dataset.kpi_metrics.dataset_id
  table_id   = "daily_engagement"
  project    = "eli-health-prod"

  time_partitioning {
    type  = "DAY"
    field = "report_date"
  }

  schema = file("${path.module}/schemas/daily_engagement.json")
}

# Similar table resources for daily_advanced_metrics, daily_errors, daily_funnel
```
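
The Terraform above loads schemas/daily_engagement.json. A sketch of that file, derived from the Table 1 columns; the REQUIRED/NULLABLE modes are assumptions (active_change is nullable per the table, the rest are a judgment call):

```json
[
  { "name": "report_date",                "type": "DATE",      "mode": "REQUIRED", "description": "Date this metric was recorded (partition key)" },
  { "name": "period_label",               "type": "STRING",    "mode": "NULLABLE", "description": "Human-readable period" },
  { "name": "period_start_date",          "type": "DATE",      "mode": "NULLABLE", "description": "Start of the period" },
  { "name": "period_end_date",            "type": "DATE",      "mode": "NULLABLE", "description": "End of the period" },
  { "name": "all_tests",                  "type": "INT64",     "mode": "NULLABLE", "description": "Total tests" },
  { "name": "successful_tests",           "type": "INT64",     "mode": "NULLABLE", "description": "Successful tests" },
  { "name": "successful_test_users",      "type": "INT64",     "mode": "NULLABLE", "description": "Users with successful tests" },
  { "name": "active_users",               "type": "INT64",     "mode": "NULLABLE", "description": "Users who took at least one test in period" },
  { "name": "total_users_to_date",        "type": "INT64",     "mode": "NULLABLE", "description": "Cumulative user count" },
  { "name": "total_active_users_to_date", "type": "INT64",     "mode": "NULLABLE", "description": "Cumulative users who have ever tested" },
  { "name": "active_percentage",          "type": "FLOAT64",   "mode": "NULLABLE", "description": "Active users / total active users to date" },
  { "name": "active_change",              "type": "FLOAT64",   "mode": "NULLABLE", "description": "Change from previous period" },
  { "name": "avg_tests_per_active_user",  "type": "FLOAT64",   "mode": "NULLABLE", "description": "Successful tests / active users" },
  { "name": "created_at",                 "type": "TIMESTAMP", "mode": "REQUIRED", "description": "When this row was written" }
]
```

The schema files for the other three tables follow the same pattern, with columns taken from Tables 2-4.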

Phase 2: eli-kpi Code Changes

2.1 Create new file: src/bigquery-exporter.ts

```typescript
import { BigQuery } from '@google-cloud/bigquery';
import { KpiData, ErrorData, AdvancedMetricsData } from './types.js';

const BIGQUERY_PROJECT_ID = 'eli-health-prod';
const KPI_METRICS_DATASET = 'kpi_metrics';

interface ExportResult {
  table: string;
  rowsInserted: number;
  success: boolean;
  error?: string;
}

export async function exportDailyEngagement(
  reportDate: string,
  kpiData: KpiData[]
): Promise<ExportResult> {
  const bigquery = new BigQuery({ projectId: BIGQUERY_PROJECT_ID });
  const table = bigquery.dataset(KPI_METRICS_DATASET).table('daily_engagement');

  const rows = kpiData.map(row => ({
    report_date: reportDate,
    period_label: row.period,
    period_start_date: row.period_start_date,
    period_end_date: row.period_end_date,
    all_tests: row.all_measurements,
    successful_tests: row.successful_measurements,
    successful_test_users: row.successful_measurement_users,
    active_users: row.active_users,
    total_users_to_date: row.total_users_to_date,
    total_active_users_to_date: row.total_engaged_users_to_date,
    active_percentage: row.engaged_active_percentage,
    active_change: row.engaged_active_change,
    avg_tests_per_active_user: row.average_tests_per_active_user,
    created_at: new Date().toISOString()
  }));

  try {
    await table.insert(rows);
    return { table: 'daily_engagement', rowsInserted: rows.length, success: true };
  } catch (error) {
    return { table: 'daily_engagement', rowsInserted: 0, success: false, error: String(error) };
  }
}

export async function exportDailyAdvancedMetrics(
  reportDate: string,
  advancedData: AdvancedMetricsData[]
): Promise<ExportResult> {
  // TODO: mirror exportDailyEngagement for the daily_advanced_metrics table
  throw new Error('exportDailyAdvancedMetrics not yet implemented');
}

export async function exportDailyErrors(
  reportDate: string,
  errorData: ErrorData[]
): Promise<ExportResult> {
  // TODO: mirror exportDailyEngagement for the daily_errors table
  throw new Error('exportDailyErrors not yet implemented');
}

export async function exportDailyFunnel(
  reportDate: string,
  kpiData: KpiData[]
): Promise<ExportResult> {
  // TODO: calculate conversion rate and write to the daily_funnel table
  throw new Error('exportDailyFunnel not yet implemented');
}

export async function exportAllKpiMetrics(
  reportDate: string,
  kpiData: KpiData[],
  errorData: ErrorData[],
  advancedData?: AdvancedMetricsData[]
): Promise<ExportResult[]> {
  const results: ExportResult[] = [];

  results.push(await exportDailyEngagement(reportDate, kpiData));
  results.push(await exportDailyErrors(reportDate, errorData));
  results.push(await exportDailyFunnel(reportDate, kpiData));

  if (advancedData) {
    results.push(await exportDailyAdvancedMetrics(reportDate, advancedData));
  }

  return results;
}
```

2.2 Add API endpoint in src/server.ts

```typescript
// POST /api/export-kpi - Export daily KPIs to BigQuery for Basedash
app.post('/api/export-kpi', async (req, res) => {
  try {
    const { date } = req.body;
    const reportDate = date || new Date().toISOString().split('T')[0];

    console.log(`Exporting KPI metrics for ${reportDate}...`);

    // Fetch daily metrics (12 days of daily data)
    const kpiResult = await fetchKpiData(reportDate, 'day', 12);

    // Export to BigQuery
    const exportResults = await exportAllKpiMetrics(
      reportDate,
      kpiResult.kpiData,
      kpiResult.errorData,
      kpiResult.advancedMetricsData
    );

    const success = exportResults.every(r => r.success);
    const totalRows = exportResults.reduce((sum, r) => sum + r.rowsInserted, 0);

    res.json({
      success,
      reportDate,
      results: exportResults,
      summary: {
        tablesUpdated: exportResults.filter(r => r.success).length,
        totalRowsInserted: totalRows
      }
    });
  } catch (error) {
    console.error('Error exporting KPI metrics:', error);
    res.status(500).json({
      success: false,
      error: String(error)
    });
  }
});
```
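
The endpoint currently trusts req.body.date as-is, so a malformed string would flow straight into fetchKpiData. A small guard could reject bad input up front; this is a suggested sketch, and isValidReportDate is a hypothetical name, not in the current codebase:

```typescript
// Hypothetical validation helper: accepts only real YYYY-MM-DD dates.
// The round-trip through Date catches impossible dates like 2025-02-30.
function isValidReportDate(value: string): boolean {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) return false;
  const parsed = new Date(`${value}T00:00:00Z`);
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString().slice(0, 10) === value;
}

// Possible use at the top of the handler:
// if (date && !isValidReportDate(date)) {
//   return res.status(400).json({ success: false, error: 'date must be YYYY-MM-DD' });
// }
```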

2.3 Add CLI command in src/index.ts

```typescript
program
  .command('export')
  .description('Export KPI metrics to BigQuery for Basedash')
  .option('-d, --date <date>', 'Report date (YYYY-MM-DD)', new Date().toISOString().split('T')[0])
  .action(async (options) => {
    const { date } = options;
    console.log(`Exporting KPI metrics for ${date}...`);

    const kpiResult = await fetchKpiData(date, 'day', 12);
    const results = await exportAllKpiMetrics(
      date,
      kpiResult.kpiData,
      kpiResult.errorData,
      kpiResult.advancedMetricsData
    );

    console.log('Export results:', results);
  });
```
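
The testing plan invokes npm run export, so package.json needs a matching script. A sketch of that addition, assuming the TypeScript build emits dist/index.js (adjust to eli-kpi's actual build output):

```json
{
  "scripts": {
    "export": "node dist/index.js export"
  }
}
```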

Phase 3: Cloud Scheduler Setup (eli-devops)

Add to Terraform:

```hcl
resource "google_cloud_scheduler_job" "kpi_daily_export" {
  name        = "kpi-daily-export"
  description = "Export daily KPI metrics to BigQuery for Basedash"
  schedule    = "0 6 * * *" # 6:00 AM UTC daily
  time_zone   = "UTC"
  region      = "us-east1"

  http_target {
    http_method = "POST"
    uri         = "https://kpi.app.eli.health/api/export-kpi"

    headers = {
      "Content-Type" = "application/json"
    }

    # Cloud Scheduler has no built-in date substitution for request bodies, so
    # send an empty body and let /api/export-kpi default to today's date.
    body = base64encode(jsonencode({}))

    oidc_token {
      service_account_email = "kpi-service-us@eli-health-prod.iam.gserviceaccount.com"
    }
  }
```

Phase 4: Basedash Configuration

  1. Connect Basedash to BigQuery using existing basedash-reader-key.json
  2. Create dashboards from the new kpi_metrics tables
  3. Set up auto-refresh to pull latest data

Privacy Considerations

Data exported (safe for Basedash):

  • Daily aggregate counts (tests, sessions, errors)
  • Percentages and rates
  • Period labels and dates

Data NOT exported:

  • User emails
  • User IDs
  • Individual test results
  • Image paths
  • Error messages with user context

Testing Plan

  1. Local testing: Run npm run export -- --date 2025-12-23 to verify data export
  2. Verify in BigQuery: Query the kpi_metrics tables to confirm data
  3. Basedash verification: Create a test dashboard and verify data appears correctly

Rollback Plan

If issues occur:

  1. Disable Cloud Scheduler job
  2. Drop or truncate the kpi_metrics tables if needed
  3. No impact on existing KPI dashboard functionality

Timeline

| Step | Description |
| --- | --- |
| 1 | Create BigQuery dataset and tables (eli-devops) |
| 2 | Implement bigquery-exporter.ts in eli-kpi |
| 3 | Add API endpoint and CLI command |
| 4 | Deploy eli-kpi with new export functionality |
| 5 | Create Cloud Scheduler job |
| 6 | Configure Basedash dashboards |
| 7 | Backfill historical data (optional) |

Files to Create/Modify

New Files:

  • eli-kpi/src/bigquery-exporter.ts - Export logic
  • eli-devops/terraform/kpi_metrics.tf - BigQuery resources
  • eli-devops/terraform/schemas/daily_*.json - Table schemas

Modified Files:

  • eli-kpi/src/server.ts - Add /api/export-kpi endpoint
  • eli-kpi/src/index.ts - Add export CLI command
  • eli-kpi/package.json - Add npm run export script

Questions to Resolve

  1. Historical backfill: Should we backfill historical data? If so, how far back?
  2. Retention policy: How long should we keep historical metrics in BigQuery?
  3. Additional metrics: Are there other metrics the marketing team needs?
  4. Alert on failure: Should we alert if the daily export fails?

Approval

  • Reviewed by: _______________
  • Approved by: _______________
  • Implementation start date: _______________