KPI Metrics Export Pipeline

Overview

This document outlines the implementation plan to export daily aggregated KPI metrics to BigQuery, making them available for external consumption (BI tools, partners, researchers) without exposing user-level data.

Use cases:

  • Basedash dashboards for marketing team
  • Anonymized data for external partners
  • Research datasets
  • Any reporting tool that can connect to BigQuery

Problem Statement

  • The KPI dashboard generates aggregate metrics on-demand but doesn't persist them
  • External tools (Basedash, Looker, etc.) can only connect to databases, not APIs
  • Partners and researchers need access to anonymized, aggregated data
  • User privacy must be maintained: only aggregate numbers, no PII

Solution Architecture

Data Flow

  1. Trigger: Cloud Scheduler calls /api/export-kpi endpoint daily at 6:00 AM UTC
  2. Compute: eli-kpi fetches metrics from source tables (already implemented in data-fetcher.ts)
  3. Persist: Write aggregated metrics to new BigQuery tables in kpi_metrics dataset
  4. Access: Basedash reads from these tables using existing basedash-reader-key.json

BigQuery Tables Schema

Dataset: eli-health-prod.kpi_metrics

All tables will have a common structure with report_date as the partition key.


Table 1: daily_engagement

Corresponds to "Overall Engagement Metrics" from the dashboard.

| Column | Type | Description |
| --- | --- | --- |
| report_date | DATE | Date this metric was recorded (partition key) |
| period_label | STRING | Human-readable period (e.g., "Today (Dec 23)") |
| period_start_date | DATE | Start of the period |
| period_end_date | DATE | End of the period |
| all_tests | INT64 | Total tests (all_measurements) |
| successful_tests | INT64 | Successful tests (successful_measurements) |
| successful_test_users | INT64 | Users with successful tests |
| active_users | INT64 | Users who took at least one test in period |
| total_users_to_date | INT64 | Cumulative user count |
| total_active_users_to_date | INT64 | Cumulative users who have ever tested |
| active_percentage | FLOAT64 | Active Users / Total Active Users To Date |
| active_change | FLOAT64 | Change from previous period (nullable) |
| avg_tests_per_active_user | FLOAT64 | successful_tests / active_users |
| created_at | TIMESTAMP | When this row was written |

Table 2: daily_advanced_metrics

Corresponds to "Advanced Metrics" from the dashboard.

| Column | Type | Description |
| --- | --- | --- |
| report_date | DATE | Date this metric was recorded (partition key) |
| period_label | STRING | Human-readable period |
| period_start_date | DATE | Start of the period |
| period_end_date | DATE | End of the period |
| total_sessions | INT64 | Total measurement sessions |
| successful_sessions | INT64 | Sessions with at least one success |
| adjusted_success_rate | FLOAT64 | % of sessions that succeeded |
| first_try_successes | INT64 | Sessions successful on first attempt |
| first_try_success_rate | FLOAT64 | % first try success |
| second_try_successes | INT64 | Sessions successful on second attempt |
| second_try_success_rate | FLOAT64 | % second try success |
| third_plus_try_successes | INT64 | Sessions successful on 3+ attempts |
| third_plus_try_success_rate | FLOAT64 | % third+ try success |
| error_gini_coefficient | FLOAT64 | Error distribution inequality |
| created_at | TIMESTAMP | When this row was written |
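
The error_gini_coefficient column summarizes how unevenly errors are distributed: 0 means errors are spread evenly, and values approaching 1 mean they are concentrated in a few categories. The production formula lives in eli-kpi's data-fetcher.ts; the sketch below is one common discrete formulation, shown only for intuition (the function name is illustrative):

```typescript
// One common discrete Gini formulation: sort ascending, then
// G = (2 * sum(i * x_i)) / (n * sum(x)) - (n + 1) / n, with 1-based i.
// This is a sketch for intuition; eli-kpi's exact formula may differ.
function giniCoefficient(counts: number[]): number {
  const sorted = [...counts].sort((a, b) => a - b);
  const n = sorted.length;
  const total = sorted.reduce((sum, x) => sum + x, 0);
  if (n === 0 || total === 0) return 0; // no errors: treat as perfectly even
  const weighted = sorted.reduce((sum, x, i) => sum + (i + 1) * x, 0);
  return (2 * weighted) / (n * total) - (n + 1) / n;
}

giniCoefficient([1, 1, 1, 1]); // even spread → 0
giniCoefficient([0, 0, 0, 4]); // fully concentrated → 0.75, i.e. (n - 1) / n
```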

Table 3: daily_errors

Corresponds to "Error Data" from the dashboard.

| Column | Type | Description |
| --- | --- | --- |
| report_date | DATE | Date this metric was recorded (partition key) |
| period_label | STRING | Human-readable period |
| period_start_date | DATE | Start of the period |
| period_end_date | DATE | End of the period |
| no_period_history | INT64 | Count of "no period history" errors |
| camera_too_far | INT64 | Count of "camera too far" errors |
| camera_too_close | INT64 | Count of "camera too close" errors |
| analysis_error | INT64 | Count of analysis errors |
| test_lines_not_found | INT64 | Count of "test lines not found" errors |
| test_not_found | INT64 | Count of "test not found" errors |
| total_errors | INT64 | Sum of all errors |
| created_at | TIMESTAMP | When this row was written |

Table 4: daily_funnel

Corresponds to "New User Funnel Metrics" from the dashboard.

| Column | Type | Description |
| --- | --- | --- |
| report_date | DATE | Date this metric was recorded (partition key) |
| period_label | STRING | Human-readable period |
| period_start_date | DATE | Start of the period |
| period_end_date | DATE | End of the period |
| accounts_created | INT64 | New accounts created in period |
| accounts_with_tests | INT64 | New accounts that took a test |
| conversion_rate | FLOAT64 | accounts_with_tests / accounts_created * 100 |
| created_at | TIMESTAMP | When this row was written |
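
Note that conversion_rate is undefined for periods in which no accounts were created. A minimal sketch of the guard the exporter should apply (the helper name is illustrative, not from the codebase):

```typescript
// Funnel conversion as a percentage; null when no accounts were created,
// so the BigQuery column stays NULL instead of NaN/Infinity.
function conversionRate(accountsWithTests: number, accountsCreated: number): number | null {
  if (accountsCreated === 0) return null;
  return (accountsWithTests / accountsCreated) * 100;
}

conversionRate(25, 100); // → 25
conversionRate(3, 0);    // → null (quiet period, no signups)
```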

Implementation Steps

Phase 1: BigQuery Setup (eli-devops)

  1. Create BigQuery dataset kpi_metrics in eli-health-prod
  2. Grant access to basedash-reader service account
  3. Create tables with partitioning on report_date

Terraform resources to add:

```hcl
# In eli-devops/terraform/bigquery.tf (or a new file)

resource "google_bigquery_dataset" "kpi_metrics" {
  dataset_id  = "kpi_metrics"
  project     = "eli-health-prod"
  location    = "US"
  description = "Daily aggregated KPI metrics for Basedash"

  # NOTE (assumption): declaring any access block makes Terraform manage the
  # dataset's full access list; add owner/editor entries here as needed.
  access {
    role          = "READER"
    user_by_email = "basedash-reader@eli-health-prod.iam.gserviceaccount.com"
  }
}

resource "google_bigquery_table" "daily_engagement" {
  dataset_id = google_bigquery_dataset.kpi_metrics.dataset_id
  table_id   = "daily_engagement"
  project    = "eli-health-prod"

  time_partitioning {
    type  = "DAY"
    field = "report_date"
  }

  schema = file("${path.module}/schemas/daily_engagement.json")
}

# Similar table resources for daily_advanced_metrics, daily_errors, daily_funnel
```
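
The Terraform above loads schemas/daily_engagement.json. A sketch of that file, derived from the Table 1 columns; the REQUIRED/NULLABLE modes are assumptions (active_change is nullable per the table, the rest are a judgment call):

```json
[
  { "name": "report_date",                "type": "DATE",      "mode": "REQUIRED", "description": "Date this metric was recorded (partition key)" },
  { "name": "period_label",               "type": "STRING",    "mode": "NULLABLE", "description": "Human-readable period" },
  { "name": "period_start_date",          "type": "DATE",      "mode": "NULLABLE", "description": "Start of the period" },
  { "name": "period_end_date",            "type": "DATE",      "mode": "NULLABLE", "description": "End of the period" },
  { "name": "all_tests",                  "type": "INT64",     "mode": "NULLABLE", "description": "Total tests" },
  { "name": "successful_tests",           "type": "INT64",     "mode": "NULLABLE", "description": "Successful tests" },
  { "name": "successful_test_users",      "type": "INT64",     "mode": "NULLABLE", "description": "Users with successful tests" },
  { "name": "active_users",               "type": "INT64",     "mode": "NULLABLE", "description": "Users who took at least one test in period" },
  { "name": "total_users_to_date",        "type": "INT64",     "mode": "NULLABLE", "description": "Cumulative user count" },
  { "name": "total_active_users_to_date", "type": "INT64",     "mode": "NULLABLE", "description": "Cumulative users who have ever tested" },
  { "name": "active_percentage",          "type": "FLOAT64",   "mode": "NULLABLE", "description": "Active users / total active users to date" },
  { "name": "active_change",              "type": "FLOAT64",   "mode": "NULLABLE", "description": "Change from previous period" },
  { "name": "avg_tests_per_active_user",  "type": "FLOAT64",   "mode": "NULLABLE", "description": "Successful tests / active users" },
  { "name": "created_at",                 "type": "TIMESTAMP", "mode": "REQUIRED", "description": "When this row was written" }
]
```

The schema files for the other three tables follow the same pattern, with columns taken from Tables 2-4.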

Phase 2: eli-kpi Code Changes

2.1 Create new file: src/bigquery-exporter.ts

```typescript
import { BigQuery } from '@google-cloud/bigquery';
import { KpiData, ErrorData, AdvancedMetricsData } from './types.js';

const BIGQUERY_PROJECT_ID = 'eli-health-prod';
const KPI_METRICS_DATASET = 'kpi_metrics';

interface ExportResult {
  table: string;
  rowsInserted: number;
  success: boolean;
  error?: string;
}

export async function exportDailyEngagement(
  reportDate: string,
  kpiData: KpiData[]
): Promise<ExportResult> {
  const bigquery = new BigQuery({ projectId: BIGQUERY_PROJECT_ID });
  const table = bigquery.dataset(KPI_METRICS_DATASET).table('daily_engagement');

  const rows = kpiData.map(row => ({
    report_date: reportDate,
    period_label: row.period,
    period_start_date: row.period_start_date,
    period_end_date: row.period_end_date,
    all_tests: row.all_measurements,
    successful_tests: row.successful_measurements,
    successful_test_users: row.successful_measurement_users,
    active_users: row.active_users,
    total_users_to_date: row.total_users_to_date,
    total_active_users_to_date: row.total_engaged_users_to_date,
    active_percentage: row.engaged_active_percentage,
    active_change: row.engaged_active_change,
    avg_tests_per_active_user: row.average_tests_per_active_user,
    created_at: new Date().toISOString()
  }));

  try {
    await table.insert(rows);
    return { table: 'daily_engagement', rowsInserted: rows.length, success: true };
  } catch (error) {
    return { table: 'daily_engagement', rowsInserted: 0, success: false, error: String(error) };
  }
}

export async function exportDailyAdvancedMetrics(
  reportDate: string,
  advancedData: AdvancedMetricsData[]
): Promise<ExportResult> {
  // TODO: mirror exportDailyEngagement for the daily_advanced_metrics table
  throw new Error('exportDailyAdvancedMetrics not yet implemented');
}

export async function exportDailyErrors(
  reportDate: string,
  errorData: ErrorData[]
): Promise<ExportResult> {
  // TODO: mirror exportDailyEngagement for the daily_errors table
  throw new Error('exportDailyErrors not yet implemented');
}

export async function exportDailyFunnel(
  reportDate: string,
  kpiData: KpiData[]
): Promise<ExportResult> {
  // TODO: calculate conversion rate and write to the daily_funnel table
  throw new Error('exportDailyFunnel not yet implemented');
}

export async function exportAllKpiMetrics(
  reportDate: string,
  kpiData: KpiData[],
  errorData: ErrorData[],
  advancedData?: AdvancedMetricsData[]
): Promise<ExportResult[]> {
  const results: ExportResult[] = [];

  results.push(await exportDailyEngagement(reportDate, kpiData));
  results.push(await exportDailyErrors(reportDate, errorData));
  results.push(await exportDailyFunnel(reportDate, kpiData));

  if (advancedData) {
    results.push(await exportDailyAdvancedMetrics(reportDate, advancedData));
  }

  return results;
}
```

2.2 Add API endpoint in src/server.ts

```typescript
// POST /api/export-kpi - Export daily KPIs to BigQuery for Basedash
app.post('/api/export-kpi', async (req, res) => {
  try {
    const { date } = req.body;
    const reportDate = date || new Date().toISOString().split('T')[0];

    console.log(`Exporting KPI metrics for ${reportDate}...`);

    // Fetch daily metrics (12 days of daily data)
    const kpiResult = await fetchKpiData(reportDate, 'day', 12);

    // Export to BigQuery
    const exportResults = await exportAllKpiMetrics(
      reportDate,
      kpiResult.kpiData,
      kpiResult.errorData,
      kpiResult.advancedMetricsData
    );

    const success = exportResults.every(r => r.success);
    const totalRows = exportResults.reduce((sum, r) => sum + r.rowsInserted, 0);

    res.json({
      success,
      reportDate,
      results: exportResults,
      summary: {
        tablesUpdated: exportResults.filter(r => r.success).length,
        totalRowsInserted: totalRows
      }
    });
  } catch (error) {
    console.error('Error exporting KPI metrics:', error);
    res.status(500).json({
      success: false,
      error: String(error)
    });
  }
});
```
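
The endpoint currently trusts req.body.date as-is, so a malformed string would flow straight into fetchKpiData. A small guard could reject bad input up front; this is a suggested sketch, and isValidReportDate is a hypothetical name, not in the current codebase:

```typescript
// Hypothetical validation helper: accepts only real YYYY-MM-DD dates.
// The round-trip through Date catches impossible dates like 2025-02-30.
function isValidReportDate(value: string): boolean {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) return false;
  const parsed = new Date(`${value}T00:00:00Z`);
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString().slice(0, 10) === value;
}

// Possible use at the top of the handler:
// if (date && !isValidReportDate(date)) {
//   return res.status(400).json({ success: false, error: 'date must be YYYY-MM-DD' });
// }
```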

2.3 Add CLI command in src/index.ts

```typescript
program
  .command('export')
  .description('Export KPI metrics to BigQuery for Basedash')
  .option('-d, --date <date>', 'Report date (YYYY-MM-DD)', new Date().toISOString().split('T')[0])
  .action(async (options) => {
    const { date } = options;
    console.log(`Exporting KPI metrics for ${date}...`);

    const kpiResult = await fetchKpiData(date, 'day', 12);
    const results = await exportAllKpiMetrics(
      date,
      kpiResult.kpiData,
      kpiResult.errorData,
      kpiResult.advancedMetricsData
    );

    console.log('Export results:', results);
  });
```
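
The testing plan invokes npm run export, so package.json needs a matching script. A sketch of that addition, assuming the TypeScript build emits dist/index.js (adjust to eli-kpi's actual build output):

```json
{
  "scripts": {
    "export": "node dist/index.js export"
  }
}
```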

Phase 3: Cloud Scheduler Setup (eli-devops)

Add to Terraform:

```hcl
resource "google_cloud_scheduler_job" "kpi_daily_export" {
  name        = "kpi-daily-export"
  description = "Export daily KPI metrics to BigQuery for Basedash"
  schedule    = "0 6 * * *" # 6:00 AM UTC daily
  time_zone   = "UTC"
  region      = "us-east1"

  http_target {
    http_method = "POST"
    uri         = "https://kpi.app.eli.health/api/export-kpi"

    headers = {
      "Content-Type" = "application/json"
    }

    # Cloud Scheduler has no built-in date substitution for request bodies, so
    # send an empty body and let /api/export-kpi default to today's date.
    body = base64encode(jsonencode({}))

    oidc_token {
      service_account_email = "kpi-service-us@eli-health-prod.iam.gserviceaccount.com"
    }
  }
```

Phase 4: Basedash Configuration

  1. Connect Basedash to BigQuery using existing basedash-reader-key.json
  2. Create dashboards from the new kpi_metrics tables
  3. Set up auto-refresh to pull latest data

Privacy Considerations

Data exported (safe for Basedash):

  • Daily aggregate counts (tests, sessions, errors)
  • Percentages and rates
  • Period labels and dates

Data NOT exported:

  • User emails
  • User IDs
  • Individual test results
  • Image paths
  • Error messages with user context

Testing Plan

  1. Local testing: Run npm run export -- --date 2025-12-23 to verify data export
  2. Verify in BigQuery: Query the kpi_metrics tables to confirm data
  3. Basedash verification: Create a test dashboard and verify data appears correctly

Rollback Plan

If issues occur:

  1. Disable Cloud Scheduler job
  2. Drop or truncate the kpi_metrics tables if needed
  3. No impact on existing KPI dashboard functionality

Timeline

| Step | Description |
| --- | --- |
| 1 | Create BigQuery dataset and tables (eli-devops) |
| 2 | Implement bigquery-exporter.ts in eli-kpi |
| 3 | Add API endpoint and CLI command |
| 4 | Deploy eli-kpi with new export functionality |
| 5 | Create Cloud Scheduler job |
| 6 | Configure Basedash dashboards |
| 7 | Backfill historical data (optional) |

Files to Create/Modify

New Files:

  • eli-kpi/src/bigquery-exporter.ts - Export logic
  • eli-devops/terraform/kpi_metrics.tf - BigQuery resources
  • eli-devops/terraform/schemas/daily_*.json - Table schemas

Modified Files:

  • eli-kpi/src/server.ts - Add /api/export-kpi endpoint
  • eli-kpi/src/index.ts - Add export CLI command
  • eli-kpi/package.json - Add npm run export script

Questions to Resolve

  1. Historical backfill: Should we backfill historical data? If so, how far back?
  2. Retention policy: How long should we keep historical metrics in BigQuery?
  3. Additional metrics: Are there other metrics the marketing team needs?
  4. Alert on failure: Should we alert if the daily export fails?

Approval

  • Reviewed by: _______________
  • Approved by: _______________
  • Implementation start date: _______________