KPI Metrics Export Pipeline
Overview
This document outlines the implementation plan to export daily aggregated KPI metrics to BigQuery, making them available for external consumption (BI tools, partners, researchers) without exposing user-level data.
Use cases:
- Basedash dashboards for marketing team
- Anonymized data for external partners
- Research datasets
- Any reporting tool that can connect to BigQuery
Problem Statement
- The KPI dashboard generates aggregate metrics on-demand but doesn't persist them
- External tools (Basedash, Looker, etc.) can only connect to databases, not APIs
- Partners and researchers need access to anonymized, aggregated data
- User privacy must be maintained - only aggregate numbers, no PII
Solution Architecture
Data Flow
- Trigger: Cloud Scheduler calls the /api/export-kpi endpoint daily at 6:00 AM UTC
- Compute: eli-kpi fetches metrics from source tables (already implemented in data-fetcher.ts)
- Persist: Write aggregated metrics to new BigQuery tables in the kpi_metrics dataset
- Access: Basedash reads from these tables using the existing basedash-reader-key.json
BigQuery Tables Schema
Dataset: eli-health-prod.kpi_metrics
All tables will have a common structure with report_date as the partition key.
Table 1: daily_engagement
Corresponds to "Overall Engagement Metrics" from the dashboard.
| Column | Type | Description |
|---|---|---|
| report_date | DATE | Date this metric was recorded (partition key) |
| period_label | STRING | Human-readable period (e.g., "Today (Dec 23)") |
| period_start_date | DATE | Start of the period |
| period_end_date | DATE | End of the period |
| all_tests | INT64 | Total tests (all_measurements) |
| successful_tests | INT64 | Successful tests (successful_measurements) |
| successful_test_users | INT64 | Users with successful tests |
| active_users | INT64 | Users who took at least one test in period |
| total_users_to_date | INT64 | Cumulative user count |
| total_active_users_to_date | INT64 | Cumulative users who have ever tested |
| active_percentage | FLOAT64 | Active Users / Total Active Users To Date |
| active_change | FLOAT64 | Change from previous period (nullable) |
| avg_tests_per_active_user | FLOAT64 | successful_tests / active_users |
| created_at | TIMESTAMP | When this row was written |
Table 2: daily_advanced_metrics
Corresponds to "Advanced Metrics" from the dashboard.
| Column | Type | Description |
|---|---|---|
| report_date | DATE | Date this metric was recorded (partition key) |
| period_label | STRING | Human-readable period |
| period_start_date | DATE | Start of the period |
| period_end_date | DATE | End of the period |
| total_sessions | INT64 | Total measurement sessions |
| successful_sessions | INT64 | Sessions with at least one success |
| adjusted_success_rate | FLOAT64 | % of sessions that succeeded |
| first_try_successes | INT64 | Sessions successful on first attempt |
| first_try_success_rate | FLOAT64 | % first try success |
| second_try_successes | INT64 | Sessions successful on second attempt |
| second_try_success_rate | FLOAT64 | % second try success |
| third_plus_try_successes | INT64 | Sessions successful on 3+ attempts |
| third_plus_try_success_rate | FLOAT64 | % third+ try success |
| error_gini_coefficient | FLOAT64 | Error distribution inequality |
| created_at | TIMESTAMP | When this row was written |
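The error_gini_coefficient is the standard discrete Gini coefficient applied to error counts. A minimal sketch of the computation (what the dashboard feeds in, e.g. per-user or per-error-type counts, is an assumption here):

```typescript
// Gini coefficient over a list of non-negative counts: 0 means errors are
// spread evenly, values near 1 mean they are concentrated in a few buckets.
function giniCoefficient(counts: number[]): number {
  const sorted = [...counts].sort((a, b) => a - b);
  const n = sorted.length;
  const total = sorted.reduce((sum, x) => sum + x, 0);
  if (n === 0 || total === 0) return 0; // no data => no inequality

  // G = (2 * sum(i * x_i)) / (n * total) - (n + 1) / n, with i starting at 1
  let weighted = 0;
  sorted.forEach((x, i) => {
    weighted += (i + 1) * x;
  });
  return (2 * weighted) / (n * total) - (n + 1) / n;
}
```

For example, four buckets with equal error counts give a coefficient of 0, while all errors in one bucket out of four give 0.75.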
Table 3: daily_errors
Corresponds to "Error Data" from the dashboard.
| Column | Type | Description |
|---|---|---|
| report_date | DATE | Date this metric was recorded (partition key) |
| period_label | STRING | Human-readable period |
| period_start_date | DATE | Start of the period |
| period_end_date | DATE | End of the period |
| no_period_history | INT64 | Count of "no period history" errors |
| camera_too_far | INT64 | Count of "camera too far" errors |
| camera_too_close | INT64 | Count of "camera too close" errors |
| analysis_error | INT64 | Count of analysis errors |
| test_lines_not_found | INT64 | Count of "test lines not found" errors |
| test_not_found | INT64 | Count of "test not found" errors |
| total_errors | INT64 | Sum of all errors |
| created_at | TIMESTAMP | When this row was written |
Table 4: daily_funnel
Corresponds to "New User Funnel Metrics" from the dashboard.
| Column | Type | Description |
|---|---|---|
| report_date | DATE | Date this metric was recorded (partition key) |
| period_label | STRING | Human-readable period |
| period_start_date | DATE | Start of the period |
| period_end_date | DATE | End of the period |
| accounts_created | INT64 | New accounts created in period |
| accounts_with_tests | INT64 | New accounts that took a test |
| conversion_rate | FLOAT64 | accounts_with_tests / accounts_created * 100 |
| created_at | TIMESTAMP | When this row was written |
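The conversion_rate computation above can be sketched as a small pure function (the FunnelRow shape and rounding to two decimals are illustrative assumptions, not the final schema mapping):

```typescript
interface FunnelRow {
  accounts_created: number;
  accounts_with_tests: number;
  conversion_rate: number; // percentage; 0 when no accounts were created
}

// Builds one daily_funnel row, guarding against division by zero.
function buildFunnelRow(accountsCreated: number, accountsWithTests: number): FunnelRow {
  const rate = accountsCreated > 0 ? (accountsWithTests / accountsCreated) * 100 : 0;
  return {
    accounts_created: accountsCreated,
    accounts_with_tests: accountsWithTests,
    conversion_rate: Math.round(rate * 100) / 100, // round to 2 decimal places
  };
}
```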
Implementation Steps
Phase 1: BigQuery Setup (eli-devops)
- Create BigQuery dataset kpi_metrics in eli-health-prod
- Grant access to the basedash-reader service account
- Create tables partitioned on report_date
Terraform resources to add:
# In eli-devops/terraform/bigquery.tf (or new file)
resource "google_bigquery_dataset" "kpi_metrics" {
dataset_id = "kpi_metrics"
project = "eli-health-prod"
location = "US"
description = "Daily aggregated KPI metrics for Basedash"
access {
role = "READER"
user_by_email = "basedash-reader@eli-health-prod.iam.gserviceaccount.com"
}
}
resource "google_bigquery_table" "daily_engagement" {
dataset_id = google_bigquery_dataset.kpi_metrics.dataset_id
table_id = "daily_engagement"
project = "eli-health-prod"
time_partitioning {
type = "DAY"
field = "report_date"
}
schema = file("${path.module}/schemas/daily_engagement.json")
}
# Similar for daily_advanced_metrics, daily_errors, daily_funnel
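The Terraform above references schemas/daily_engagement.json. A sketch of what that file could contain, mirroring Table 1 (the NULLABLE/REQUIRED modes are assumptions):

```json
[
  { "name": "report_date", "type": "DATE", "mode": "REQUIRED" },
  { "name": "period_label", "type": "STRING", "mode": "NULLABLE" },
  { "name": "period_start_date", "type": "DATE", "mode": "NULLABLE" },
  { "name": "period_end_date", "type": "DATE", "mode": "NULLABLE" },
  { "name": "all_tests", "type": "INT64", "mode": "NULLABLE" },
  { "name": "successful_tests", "type": "INT64", "mode": "NULLABLE" },
  { "name": "successful_test_users", "type": "INT64", "mode": "NULLABLE" },
  { "name": "active_users", "type": "INT64", "mode": "NULLABLE" },
  { "name": "total_users_to_date", "type": "INT64", "mode": "NULLABLE" },
  { "name": "total_active_users_to_date", "type": "INT64", "mode": "NULLABLE" },
  { "name": "active_percentage", "type": "FLOAT64", "mode": "NULLABLE" },
  { "name": "active_change", "type": "FLOAT64", "mode": "NULLABLE" },
  { "name": "avg_tests_per_active_user", "type": "FLOAT64", "mode": "NULLABLE" },
  { "name": "created_at", "type": "TIMESTAMP", "mode": "REQUIRED" }
]
```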
Phase 2: eli-kpi Code Changes
2.1 Create new file: src/bigquery-exporter.ts
import { BigQuery } from '@google-cloud/bigquery';
import { KpiData, ErrorData, AdvancedMetricsData } from './types.js';
const BIGQUERY_PROJECT_ID = 'eli-health-prod';
const KPI_METRICS_DATASET = 'kpi_metrics';
interface ExportResult {
table: string;
rowsInserted: number;
success: boolean;
error?: string;
}
export async function exportDailyEngagement(
reportDate: string,
kpiData: KpiData[]
): Promise<ExportResult> {
const bigquery = new BigQuery({ projectId: BIGQUERY_PROJECT_ID });
const table = bigquery.dataset(KPI_METRICS_DATASET).table('daily_engagement');
const rows = kpiData.map(row => ({
report_date: reportDate,
period_label: row.period,
period_start_date: row.period_start_date,
period_end_date: row.period_end_date,
all_tests: row.all_measurements,
successful_tests: row.successful_measurements,
successful_test_users: row.successful_measurement_users,
active_users: row.active_users,
total_users_to_date: row.total_users_to_date,
total_active_users_to_date: row.total_engaged_users_to_date,
active_percentage: row.engaged_active_percentage,
active_change: row.engaged_active_change,
avg_tests_per_active_user: row.average_tests_per_active_user,
created_at: new Date().toISOString()
}));
try {
await table.insert(rows);
return { table: 'daily_engagement', rowsInserted: rows.length, success: true };
} catch (error) {
return { table: 'daily_engagement', rowsInserted: 0, success: false, error: String(error) };
}
}
export async function exportDailyAdvancedMetrics(
  reportDate: string,
  advancedData: AdvancedMetricsData[]
): Promise<ExportResult> {
  // Same insert pattern as exportDailyEngagement, mapping the
  // daily_advanced_metrics columns
  throw new Error('not yet implemented');
}
export async function exportDailyErrors(
  reportDate: string,
  errorData: ErrorData[]
): Promise<ExportResult> {
  // Same insert pattern, mapping the daily_errors columns
  throw new Error('not yet implemented');
}
export async function exportDailyFunnel(
  reportDate: string,
  kpiData: KpiData[]
): Promise<ExportResult> {
  // Compute conversion_rate, then insert into daily_funnel
  throw new Error('not yet implemented');
}
export async function exportAllKpiMetrics(
reportDate: string,
kpiData: KpiData[],
errorData: ErrorData[],
advancedData?: AdvancedMetricsData[]
): Promise<ExportResult[]> {
const results: ExportResult[] = [];
results.push(await exportDailyEngagement(reportDate, kpiData));
results.push(await exportDailyErrors(reportDate, errorData));
results.push(await exportDailyFunnel(reportDate, kpiData));
if (advancedData) {
results.push(await exportDailyAdvancedMetrics(reportDate, advancedData));
}
return results;
}
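As a sketch of the row mapping exportDailyErrors might use (the ErrorCounts field names below mirror the daily_errors schema; the real input shape lives in types.ts and may differ):

```typescript
interface ErrorCounts {
  no_period_history: number;
  camera_too_far: number;
  camera_too_close: number;
  analysis_error: number;
  test_lines_not_found: number;
  test_not_found: number;
}

// Builds one daily_errors row; total_errors is derived rather than stored upstream.
function buildErrorRow(reportDate: string, counts: ErrorCounts) {
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
  return {
    report_date: reportDate,
    ...counts,
    total_errors: total,
    created_at: new Date().toISOString(),
  };
}
```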
2.2 Add API endpoint in src/server.ts
// POST /api/export-kpi - Export daily KPIs to BigQuery for Basedash
app.post('/api/export-kpi', async (req, res) => {
try {
const { date } = req.body;
const reportDate = date || new Date().toISOString().split('T')[0];
console.log(`Exporting KPI metrics for ${reportDate}...`);
// Fetch daily metrics (12 days of daily data)
const kpiResult = await fetchKpiData(reportDate, 'day', 12);
// Export to BigQuery
const exportResults = await exportAllKpiMetrics(
reportDate,
kpiResult.kpiData,
kpiResult.errorData,
kpiResult.advancedMetricsData
);
const success = exportResults.every(r => r.success);
const totalRows = exportResults.reduce((sum, r) => sum + r.rowsInserted, 0);
res.json({
success,
reportDate,
results: exportResults,
summary: {
tablesUpdated: exportResults.filter(r => r.success).length,
totalRowsInserted: totalRows
}
});
} catch (error) {
console.error('Error exporting KPI metrics:', error);
res.status(500).json({
success: false,
error: String(error)
});
}
});
2.3 Add CLI command in src/index.ts
program
.command('export')
.description('Export KPI metrics to BigQuery for Basedash')
.option('-d, --date <date>', 'Report date (YYYY-MM-DD)', new Date().toISOString().split('T')[0])
.action(async (options) => {
const { date } = options;
console.log(`Exporting KPI metrics for ${date}...`);
const kpiResult = await fetchKpiData(date, 'day', 12);
const results = await exportAllKpiMetrics(
date,
kpiResult.kpiData,
kpiResult.errorData,
kpiResult.advancedMetricsData
);
console.log('Export results:', results);
});
Phase 3: Cloud Scheduler Setup (eli-devops)
Add to Terraform:
resource "google_cloud_scheduler_job" "kpi_daily_export" {
name = "kpi-daily-export"
description = "Export daily KPI metrics to BigQuery for Basedash"
schedule = "0 6 * * *" # 6:00 AM UTC daily
time_zone = "UTC"
region = "us-east1"
http_target {
http_method = "POST"
uri = "https://kpi.app.eli.health/api/export-kpi"
headers = {
"Content-Type" = "application/json"
}
body = base64encode(jsonencode({})) # empty body: the endpoint defaults to today's date (Cloud Scheduler does not substitute runtime variables into HTTP bodies)
oidc_token {
service_account_email = "kpi-service-us@eli-health-prod.iam.gserviceaccount.com"
}
}
}
Phase 4: Basedash Configuration
- Connect Basedash to BigQuery using the existing basedash-reader-key.json
- Create dashboards from the new kpi_metrics tables
- Set up auto-refresh to pull the latest data
Privacy Considerations
Data exported (safe for Basedash):
- Daily aggregate counts (tests, sessions, errors)
- Percentages and rates
- Period labels and dates
Data NOT exported:
- User emails
- User IDs
- Individual test results
- Image paths
- Error messages with user context
Testing Plan
- Local testing: Run npm run export -- --date 2025-12-23 to verify the data export
- Verify in BigQuery: Query the kpi_metrics tables to confirm the data landed
- Basedash verification: Create a test dashboard and verify data appears correctly
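For the BigQuery verification step, a query along these lines (table and column names as defined above) confirms that rows were written for a given date:

```sql
SELECT report_date, period_label, all_tests, successful_tests, active_users
FROM `eli-health-prod.kpi_metrics.daily_engagement`
WHERE report_date = "2025-12-23"
ORDER BY period_start_date;
```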
Rollback Plan
If issues occur:
- Disable Cloud Scheduler job
- Drop or truncate the kpi_metrics tables if needed
- No impact on existing KPI dashboard functionality
Timeline
| Step | Description |
|---|---|
| 1 | Create BigQuery dataset and tables (eli-devops) |
| 2 | Implement bigquery-exporter.ts in eli-kpi |
| 3 | Add API endpoint and CLI command |
| 4 | Deploy eli-kpi with new export functionality |
| 5 | Create Cloud Scheduler job |
| 6 | Configure Basedash dashboards |
| 7 | Backfill historical data (optional) |
Files to Create/Modify
New Files:
- eli-kpi/src/bigquery-exporter.ts - Export logic
- eli-devops/terraform/kpi_metrics.tf - BigQuery resources
- eli-devops/terraform/schemas/daily_*.json - Table schemas
Modified Files:
- eli-kpi/src/server.ts - Add the /api/export-kpi endpoint
- eli-kpi/src/index.ts - Add the export CLI command
- eli-kpi/package.json - Add an npm run export script
Questions to Resolve
- Historical backfill: Should we backfill historical data? If so, how far back?
- Retention policy: How long should we keep historical metrics in BigQuery?
- Additional metrics: Are there other metrics the marketing team needs?
- Alert on failure: Should we alert if the daily export fails?
Approval
- Reviewed by: _______________
- Approved by: _______________
- Implementation start date: _______________