Internal Security Systems Overview

Document ID: ELI-SEC-001
Document Title: Internal Security Systems Overview
Version No: 1.1
Date: September 5, 2025
Classification: Confidential

1. PURPOSE

This document provides a comprehensive overview of Eli Health's data flows, system architecture, and security safeguards to demonstrate compliance with PIPEDA, Law 25, GDPR, and HIPAA (where applicable). It serves as a reference for audits, Data Protection Impact Assessments (DPIAs), and incident response planning.

The document outlines the technical and organizational measures implemented to protect user data, particularly sensitive health-related information processed through our hormone monitoring system.

2. SCOPE

This document applies to all Eli Health systems, applications, and infrastructure that process, store, or transmit user data, including sensitive health-related information.

It covers:

The Hormone Analysis Engine (HAE) and supporting software services
Mobile applications (iOS and Android) and web interfaces
APIs, databases, and third-party integrations
Internal employee and contractor access to systems containing user data
Security controls, policies, and procedures required for GDPR, PIPEDA, Law 25, and HIPAA (where applicable)
Infrastructure components hosted on Google Cloud Platform (GCP)
Development and deployment pipelines via GitHub Actions
Monitoring and incident response systems

Excluded:

Marketing websites that do not process personal health data
Third-party systems outside Eli's control (unless a Data Processing Agreement is in place)
Test data that does not contain real user information

3. TERMS AND DEFINITIONS

Term	Definition
HAE	Hormone Analysis Engine - Core system for analyzing hormone test results
PII	Personally Identifiable Information - Data that can identify an individual
PHI	Personal Health Information - Health-related data about an individual (HIPAA term: Protected Health Information)
RBAC	Role-Based Access Control - Security approach that restricts system access based on each user's assigned role
DPO	Data Protection Officer - Person responsible for data protection compliance
GCP	Google Cloud Platform - Cloud infrastructure provider
Cloud Run	GCP serverless compute platform for containerized applications
Firebase	Google's mobile platform for authentication and app development
BigQuery	GCP's enterprise data warehouse for analytics
Cloud SQL	GCP's fully managed relational database service
WAF	Web Application Firewall - Security layer protecting web applications
OAuth	Open standard for access delegation
JWT	JSON Web Token - Compact, signed token format for transmitting claims between parties
TLS	Transport Layer Security - Cryptographic protocol for secure communications
IAM	Identity and Access Management - Framework for access control
VPC	Virtual Private Cloud - Isolated network environment
SIEM	Security Information and Event Management
DDoS	Distributed Denial of Service - Type of cyber attack

4. REFERENCES

Document ID	Document Title
SDD	Software Design Document
PDPROJ-02-DDP	Design and Development Plan
ELI-PRIV-001	Privacy Policy
ELI-SEC-002	Incident Response Plan
ELI-SEC-003	Data Retention Policy
ELI-SEC-004	Access Control Policy
ELI-DEV-001	Secure Development Guidelines
ELI-OPS-001	Infrastructure Operations Manual

5. RESPONSIBILITY

Role	Name/Department	Responsibilities
Chief Technology Officer	Engineering	Overall system security and architecture
Data Protection Officer	Legal/Compliance	GDPR/PIPEDA/Law 25 compliance
Infrastructure Lead	DevOps	Cloud infrastructure security
Security Engineer	Security Team	Security monitoring and incident response
Backend Lead	Engineering	API and data security
Mobile Lead	Engineering	Mobile app security
QA Lead	Quality Assurance	Security testing and validation
Compliance Officer	Compliance	Regulatory compliance and audits

6. BACKGROUND

The Eli System is a hormone measurement system that allows users to monitor their hormonal state at the time of testing and to track their hormone levels over time. The system is intended for use by persons aged 18 years and older.

System Components:

Hardware Component:
- Cartridge housing the lateral flow assay
- Lateral flow assay for hormone measurement
Software Components:
- Mobile application (iOS/Android) with camera for image capture
- Hormone Analysis Engine (HAE) for image analysis
- Backend API for data management
- Cloud infrastructure for storage and processing
- Web interfaces for data visualization (KPI Dashboard)
Data Processing:
- Image acquisition through mobile camera
- Analysis through HAE algorithms
- Storage in secure cloud database
- Display through mobile and web interfaces

Software Safety Classification:

The software safety classification is Class A: the software is used only to display results to the end user and cannot contribute to injury or damage to health. However, because health data is sensitive, we implement security measures equivalent to those for higher-risk classifications.

7. FLOW OF INFORMATION

7.1 System Architecture Overview

7.2 Authentication Architecture

7.3 Data Flow and Security Boundaries

7.4 BigQuery Data Governance

Implementation Update (December 2024): Table-level IAM access control has been implemented via Terraform to protect biometric data while maintaining operational access for authorized users.

Access Control Implementation

User Type	Access Level	Tables Accessible	Implementation
Admin Users	Dataset-level	ALL tables including `record`	`bigquery_admin_users` in Terraform
Admin Service Accounts	Dataset-level	ALL tables including `record`	`bigquery_admin_service_accounts` in Terraform
Readonly Users	Table-level	Public tables only (NOT `record`)	`bigquery_readonly_users` + `bigquery_public_tables`

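Current Admin Users (Production)

chip@eli.health
iannick@eli.health
thomas@eli.health
fannie@eli.health
kpi-service-us@eli-health-prod.iam.gserviceaccount.com (service account)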

Current Readonly Users (Production)

pious@eli.health (customer support)
media@videnglobe.com (marketing team)
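
The access rule in the table above can be illustrated with a minimal Python sketch. It is not the Terraform module itself: the principals and the public table names below are hypothetical placeholders, and only the `record` table and the three variable names come from this document.

```python
# Hypothetical stand-ins for the Terraform variables named in the table above.
BIGQUERY_ADMIN_USERS = {"admin@example.com"}          # placeholder principal
BIGQUERY_READONLY_USERS = {"support@example.com"}     # placeholder principal
BIGQUERY_PUBLIC_TABLES = {"kpi_daily", "app_events"}  # assumed table names
RESTRICTED_TABLES = {"record"}                        # biometric data


def can_read(principal: str, table: str) -> bool:
    """Return True if the principal may read the table under the rules above."""
    if principal in BIGQUERY_ADMIN_USERS:
        # Dataset-level grant: covers every table, including `record`.
        return True
    if principal in BIGQUERY_READONLY_USERS:
        # Table-level grant: only explicitly listed public tables.
        return table in BIGQUERY_PUBLIC_TABLES and table not in RESTRICTED_TABLES
    # Anyone not listed is denied by default.
    return False
```

Because read-only access is granted per table, a newly created table stays inaccessible to read-only users until it is added to the public list.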

7.5 Data Flow Descriptions

7.5.1 User Registration Flow

User downloads mobile app from App Store (iOS only currently)
User creates account with email/password
Firebase Authentication creates user account
Backend API creates user profile in Cloud SQL
User data encrypted and stored
Welcome email sent via SendGrid

7.5.2 Hormone Test Flow

User initiates test in mobile app
Camera captures image of test cartridge
Image sent to HAE API via secure HTTPS
HAE processes image using ML algorithms
Results stored in Cloud SQL database
Results returned to mobile app
Aggregated analytics (non-biometric) synchronized to BigQuery

7.5.3 Data Access Flow

User authenticates via Firebase (customers) or OAuth (internal users)
JWT issued with the user's permissions
API validates token for each request
RBAC determines data access level
Data retrieved from appropriate source
Response encrypted and sent to client
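
The token validation and RBAC steps above can be sketched as follows. This is an illustration under stated assumptions, not the production implementation: tokens issued by Firebase are signed with RS256 and should be verified with the provider's SDK, whereas this sketch uses HS256 so that it runs on the Python standard library alone. The role names, permission names, and the `role` claim are hypothetical.

```python
import base64
import hashlib
import hmac
import json
from typing import Dict, Set

# Hypothetical role-to-permission map, for illustration only.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "end_user": {"read_own_results"},
    "support": {"read_profile"},
}


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(claims: dict, secret: bytes) -> str:
    """Build a signed HS256 token (used here only to exercise validate_token)."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [_b64url(json.dumps(p, separators=(",", ":")).encode()) for p in (header, claims)]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def validate_token(token: str, secret: bytes, now: float) -> dict:
    """Check structure, algorithm, signature and expiry; return the claims."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, (header_b64 + "." + payload_b64).encode(), hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("exp", 0) <= now:
        raise ValueError("token expired")
    return claims


def is_allowed(claims: dict, permission: str) -> bool:
    """RBAC check: the permission must be granted to the role named in the token."""
    return permission in ROLE_PERMISSIONS.get(claims.get("role", ""), set())
```

A request handler would call `validate_token(token, secret, time.time())` on every request and then `is_allowed(claims, ...)` before touching any data source.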

7.6 Security Layers

Layer 1: Network Security

Cloud Armor WAF: Protects against OWASP Top 10 threats
DDoS Protection: Adaptive protection against volumetric attacks
SSL/TLS: All communications encrypted with TLS 1.3
VPC Isolation: Private network segments for services

Layer 2: Application Security

Authentication: Firebase Auth for mobile, OAuth 2.0 for web
Authorization: JWT-based with role permissions
Input Validation: All inputs sanitized and validated
API Rate Limiting: Prevents abuse and ensures availability

Layer 3: Data Security

Encryption at Rest: AES-256 storage-level encryption for all stored data (field-level encryption of PHI is addressed in section 7.8)
Encryption in Transit: TLS 1.3 for all communications
Key Management: Google Cloud KMS for key rotation
Data Segregation: Tenant isolation in multi-tenant architecture

Layer 4: Infrastructure Security

Container Security: Distroless images, vulnerability scanning
Secret Management: Google Secret Manager for credentials
IAM Policies: Least privilege access control
Audit Logging: Comprehensive logging to Cloud Logging
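
As an illustration of the API rate limiting listed under Layer 2, the sketch below shows a token bucket, one common way to implement such a limit. This document does not state which algorithm or limits are actually used, so the class name, capacity, and refill rate are assumptions.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at a steady rate."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock              # injectable so the limiter can be tested
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when the caller is over the limit."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

In practice one bucket would be kept per API key or per user, and a rejected request would receive an HTTP 429 response.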

7.7 Data Classification and Handling

Data Type	Classification	Storage	Encryption	Retention
User Profile	PII	Cloud SQL	AES-256	Account lifetime
Health Data	PHI	Cloud SQL	AES-256	7 years
Test Images	PHI	Cloud Storage	AES-256	90 days
Analytics	Aggregated	BigQuery	AES-256	Indefinite
Logs	Operational	Cloud Logging	AES-256	30-90 days
Backups	All Types	Cloud Storage	AES-256	30 days
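
The retention column above can be expressed as a small lookup, sketched here in Python. The dictionary keys are hypothetical identifiers; 7 years is approximated as 7 x 365 days, and logs use the upper bound of the 30-90 day range. Both simplifications are assumptions of this sketch.

```python
from datetime import date, timedelta
from typing import Dict, Optional

# Retention in days per data type; None means no fixed expiry
# (account lifetime or indefinite retention).
RETENTION_DAYS: Dict[str, Optional[int]] = {
    "user_profile": None,
    "health_data": 7 * 365,  # 7 years, ignoring leap days
    "test_images": 90,
    "analytics": None,
    "logs": 90,              # upper bound of the 30-90 day range
    "backups": 30,
}


def expiry_date(data_type: str, created: date) -> Optional[date]:
    """Date on which an item becomes eligible for deletion, or None if it never expires."""
    days = RETENTION_DAYS[data_type]
    return None if days is None else created + timedelta(days=days)


def is_expired(data_type: str, created: date, today: date) -> bool:
    expires = expiry_date(data_type, created)
    return expires is not None and today >= expires
```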

7.8 Encryption Strategy and Key Management

7.8.1 Current Architecture Analysis (September 2025)

After a comprehensive security review, we've identified critical issues with our current encryption implementation that require immediate remediation:

Current Issues:

Wrong Data Being Encrypted: The system currently encrypts PII (names, emails) rather than PHI (biometric data)
Key Storage Anti-Pattern: Encryption keys stored in the same database as encrypted data
Redundant PII Encryption: User emails already secured in Firebase, creating unnecessary complexity
Unencrypted PHI: Cortisol, progesterone, and other biometric readings stored without field-level encryption

Security Risk Assessment:

High Risk: Data encryption keys (DEKs) stored in data_encryption_key table alongside encrypted data
Medium Risk: If database is compromised, attacker gets both encrypted data and encrypted keys
Low Risk: Google KMS provides strong key encryption, but architecture weakens overall security

7.8.2 Recommended Encryption Strategy

What Should Be Encrypted (Priority Order):

Critical PHI (Must Encrypt):
- Hormone readings (cortisol, progesterone values)
- Biometric measurements (all test results)
- Health condition tags and symptoms
- Device pairing secrets
- Test images (already encrypted in Cloud Storage)
Non-Critical PII (Optional):
- User profiles managed by Firebase Authentication
- Emails and names can remain unencrypted in database for operational efficiency
- These are not PHI and don't require HIPAA-level protection

What Should NOT Be Encrypted:

User IDs and reference keys
Timestamps and metadata
Aggregated analytics data
System logs (sanitized)

7.8.3 Proper Key Management Architecture

Current (Problematic) Architecture:

Google KMS → Encrypts → DEKs (stored in same DB) → Encrypt → User Data

Recommended Architecture Option 1 - Secret Manager:

Google KMS → Encrypts → DEKs (in Secret Manager) → Encrypt → PHI Only
Firebase → Manages → User Authentication & PII
Cloud SQL → Stores → Encrypted PHI + Unencrypted operational data

Recommended Architecture Option 2 - Dedicated Key Service:

HashiCorp Vault / AWS KMS → Manages all keys
Application → Requests keys via API → Encrypts PHI
Database → Stores only encrypted PHI
Key rotation → Automated monthly

7.8.4 Next Steps

The immediate priority is to:

Stop encrypting PII fields that don't require it (names, emails)
Move encryption keys out of the database and into Google Secret Manager
Implement proper field-level encryption for all biometric data (cortisol, progesterone, etc.)
Establish a key rotation policy and procedures

7.8.5 Compliance Alignment

HIPAA Requirements (US Market):

Encryption at Rest: ✅ Already provided by Cloud SQL
Encryption in Transit: ✅ TLS 1.3 implemented
Key Management: ⚠️ Needs improvement (keys in same DB)
PHI Protection: ❌ Not currently encrypting biometric data
Access Controls: ✅ IAM and RBAC implemented

PIPEDA/Law 25 Requirements (Canadian Market):

Reasonable Safeguards: Current encryption insufficient for health data
Data Minimization: Should only encrypt what's necessary
Breach Notification: Easier with proper PHI encryption

7.8.6 Technical Implementation Details

Current Encryption (To Be Deprecated):

// Current - Encrypting wrong data
encryptedEmail = encrypt(user.email, DEK)
encryptedName = encrypt(user.firstName, DEK)
// Biometric data stored in plain text - WRONG!

Recommended Encryption:

// Recommended - Encrypt PHI only
user.email = plaintext // Operational data, secured by database encryption
user.firstName = plaintext // Not PHI
biometric.cortisolValue = encrypt(value, DEK) // PHI - must encrypt
biometric.progesteroneValue = encrypt(value, DEK) // PHI - must encrypt
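
The same PHI-only rule can be sketched in Python. The field names are hypothetical, and the `encrypt` argument is a placeholder for an authenticated cipher keyed with a DEK (for example AES-256-GCM from a vetted library); the sketch shows only which fields are selected, not the cipher.

```python
from typing import Callable, Dict

# Hypothetical column names; the real schema is not shown in this document.
PHI_FIELDS = {"cortisol_value", "progesterone_value", "health_tags"}

# Placeholder type for the cipher: takes plaintext bytes, returns ciphertext bytes.
Cipher = Callable[[bytes], bytes]


def encrypt_phi_fields(row: Dict[str, object], encrypt: Cipher) -> Dict[str, object]:
    """Return a copy of the row with PHI fields encrypted and all other fields unchanged."""
    out = dict(row)
    for name in PHI_FIELDS:
        if name in row:
            out[name] = encrypt(str(row[name]).encode("utf-8"))
    return out
```

Keeping the field list in one place makes it straightforward to audit which columns are treated as PHI.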

Key Storage Migration:

# Current (BAD)
Database Table: data_encryption_key
├── User.firstName (key)
├── User.lastName (key)
└── User.email (key)

# Recommended (GOOD)
Google Secret Manager:
├── biometric-data-key-2025-09
├── health-tags-key-2025-09
└── device-pairing-key-2025-09
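
The key names above appear to follow a `<purpose>-key-<YYYY>-<MM>` pattern, which fits a monthly rotation. A minimal sketch of deriving the current and previous key names, assuming that pattern holds (the function names are hypothetical):

```python
from datetime import date

# Purposes taken from the recommended Secret Manager layout above.
KEY_PURPOSES = ("biometric-data", "health-tags", "device-pairing")


def key_name(purpose: str, on: date) -> str:
    """Name of the key that is current for the given purpose and date."""
    if purpose not in KEY_PURPOSES:
        raise ValueError("unknown key purpose: " + purpose)
    return f"{purpose}-key-{on.year:04d}-{on.month:02d}"


def previous_key_name(purpose: str, on: date) -> str:
    """Name of last month's key, still needed to decrypt data written before rotation."""
    year, month = (on.year - 1, 12) if on.month == 1 else (on.year, on.month - 1)
    return key_name(purpose, date(year, month, 1))
```

Storing the key name alongside each ciphertext lets older records remain readable after rotation without re-encrypting everything at once.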

7.8.7 Incident Response Considerations

With proper PHI encryption:

Data Breach Impact: Limited to metadata, PHI remains protected
Key Compromise: Can rotate keys without data loss
Compliance Reporting: Clear delineation of protected vs. unprotected data
Recovery Time: Faster with separated key management

7.8.8 Monitoring and Auditing

Key metrics to track:

Number of failed decryption attempts (indicates key issues)
Key rotation compliance (monthly target)
PHI access patterns (unusual access = potential breach)
Encryption performance impact (less than 50ms overhead target)

7.9 Third-Party Integrations

Service	Purpose	Data Shared	Security Measures
Firebase	Authentication	Email, User ID	OAuth 2.0, encrypted
SendGrid	Email delivery	Email, Name	API key auth, TLS
Sentry	Error monitoring	Stack traces	Sanitized, no PII
Google Analytics	Usage analytics	Anonymous usage	IP anonymization
Terra API	Wearable data	Health metrics	OAuth, encrypted

7.9 Access Control Matrix

Role	Mobile App	Backend API	HAE API	Database	BigQuery	Infrastructure
End User	Full	Via App	Via App	No	No	No
Support	View Only	Read	No	Read	No	No
Developer	Full	Full	Full	Dev Only	No	Dev Only
Data Team	No	No	No	Read Only	Query Only	No
Board Members	No	No	No	No	Read Only	No
DevOps	No	Deploy	Deploy	All	Admin	Full
Admin	Full	Full	Full	Full	Full	Full
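
For automated checks, the matrix above can be encoded as a lookup table. The sketch below copies the cell values from the table; the lower-case system identifiers and the function names are assumptions.

```python
from typing import Dict, Tuple

SYSTEMS = ("mobile_app", "backend_api", "hae_api", "database", "bigquery", "infrastructure")

# One row per role, columns in the same order as SYSTEMS.
ACCESS_MATRIX: Dict[str, Tuple[str, ...]] = {
    "End User":      ("Full", "Via App", "Via App", "No", "No", "No"),
    "Support":       ("View Only", "Read", "No", "Read", "No", "No"),
    "Developer":     ("Full", "Full", "Full", "Dev Only", "No", "Dev Only"),
    "Data Team":     ("No", "No", "No", "Read Only", "Query Only", "No"),
    "Board Members": ("No", "No", "No", "No", "Read Only", "No"),
    "DevOps":        ("No", "Deploy", "Deploy", "All", "Admin", "Full"),
    "Admin":         ("Full", "Full", "Full", "Full", "Full", "Full"),
}


def access_level(role: str, system: str) -> str:
    """Return the matrix cell for a role and system; unknown roles or systems get "No"."""
    row = ACCESS_MATRIX.get(role)
    if row is None or system not in SYSTEMS:
        return "No"
    return row[SYSTEMS.index(system)]


def has_access(role: str, system: str) -> bool:
    return access_level(role, system) != "No"
```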

7.10 Monitoring and Alerting

Real-time Monitoring

Application Performance: Cloud Monitoring dashboards
Security Events: Cloud Security Command Center
Error Tracking: Sentry integration
Uptime Monitoring: Synthetic checks every minute

Alert Channels

Critical: PagerDuty (24/7 on-call)
High: Email to engineering team
Medium: Slack notifications
Low: Daily summary reports

Key Security Metrics

Failed authentication attempts
API rate limit violations
WAF blocked requests
Unusual data access patterns
Infrastructure changes
Certificate expiration

7.11 Incident Response

Incident Classification

P0 (Critical): Data breach, system compromise
P1 (High): Service outage, authentication failure
P2 (Medium): Performance degradation, minor security issue
P3 (Low): Non-critical bugs, documentation issues

Response Team

Incident Commander: Coordinates response
Technical Lead: Implements fixes
Communications: User and stakeholder updates
Legal/Compliance: Regulatory notifications

Response Procedures

Detection: Automated alert or user report
Triage: Assess severity and impact
Containment: Isolate affected systems
Investigation: Root cause analysis
Remediation: Fix and patch
Recovery: Restore normal operations
Post-mortem: Lessons learned

7.12 Compliance Controls

GDPR Compliance

Right to access: Data export API
Right to deletion: Account deletion workflow
Data portability: JSON/CSV export formats
Consent management: Granular permissions
Privacy by design: Minimal data collection

PIPEDA Compliance

Accountability: Designated privacy officer
Consent: Clear opt-in mechanisms
Limited collection: Only necessary data
Safeguards: Technical and organizational measures
Openness: Transparent privacy policy

Law 25 (Quebec) Compliance

Privacy officer: Designated for Quebec
Impact assessments: Regular DPIAs
Incident notification: Within 72 hours
Consent for minors: Age verification
Data residency: Canadian data centers option

7.13 Security Testing

Automated Testing

SAST: Static code analysis in CI/CD
DAST: Dynamic security testing weekly
Dependency Scanning: Daily vulnerability checks
Container Scanning: Image vulnerability assessment

Manual Testing

Penetration Testing: Annual third-party assessment
Code Reviews: All PRs reviewed for security
Security Audits: Quarterly internal audits
Compliance Audits: Annual compliance review

7.14 Business Continuity

Backup Strategy

Database: Daily automated backups, 30-day retention
Code: Git repositories with multiple remotes
Infrastructure: Terraform state in GCS with versioning
Secrets: Backed up in separate project

Disaster Recovery

RTO: 4 hours for critical services
RPO: 1 hour for data loss
Failover: Automated to secondary region
Testing: Quarterly DR drills

High Availability

Multi-zone: Services deployed across zones
Auto-scaling: Based on load patterns
Load balancing: Global load balancer
Health checks: Continuous monitoring

7.15 Comprehensive Logging and Monitoring System

This section details our comprehensive logging infrastructure for tracking, debugging, and auditing all system activities across mobile and backend services.

7.15.1 Logging Architecture Overview

Core Components

Mobile Logging: Structured logging from iOS/Android applications
Backend Logging: Centralized logging from all API services
Trace ID System: End-to-end request tracking across all services
Log Aggregation: Google Cloud Logging for centralized storage
Log Analysis: Real-time analysis and alerting capabilities

Logging Strategy

Implementation Highlights:

All mobile requests are logged with detailed context
Backend services capture all API requests and responses
Unique trace IDs enable full request lifecycle tracking
Error logs include full stack traces and context
Success messages logged for audit trail
Performance metrics captured for optimization

7.15.2 Mobile Application Logging

iOS/Android Logging Configuration

Current Implementation:

Log Levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
Log Format: Structured JSON with timestamp, trace ID, user ID, event type
Local Storage: 7-day rolling buffer on device
Upload Strategy: Batched uploads every 5 minutes or on critical errors
Privacy: PII/PHI automatically redacted before upload

Mobile Log Fields:

{
  "timestamp": "2025-09-18T10:30:00Z",
  "trace_id": "550e8400-e29b-41d4-a716-446655440000",
  "user_id": "encrypted_user_id",
  "device_info": {
    "platform": "iOS",
    "version": "17.0",
    "app_version": "1.2.3"
  },
  "event": "hormone_test_initiated",
  "metadata": {...}
}
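
A minimal sketch of how an entry in this format might be built with redaction applied before upload. The list of sensitive keys, the e-mail pattern, and the function names are assumptions; the actual mobile client is not written in Python, so this only illustrates the logic.

```python
import json
import re
from datetime import datetime, timezone

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SENSITIVE_KEYS = {"email", "first_name", "last_name", "cortisol_value"}  # assumed list


def redact(value):
    """Recursively mask sensitive keys and any e-mail address embedded in strings."""
    if isinstance(value, dict):
        return {k: "[REDACTED]" if k in SENSITIVE_KEYS else redact(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    if isinstance(value, str):
        return EMAIL_RE.sub("[REDACTED]", value)
    return value


def build_log_entry(trace_id: str, user_id: str, event: str, metadata: dict, now: datetime) -> str:
    """Serialize one structured log entry; `now` must be timezone-aware."""
    entry = {
        "timestamp": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "trace_id": trace_id,
        "user_id": user_id,
        "event": event,
        "metadata": redact(metadata),
    }
    return json.dumps(entry)
```

Redacting on the device, before the entry leaves it, means that a mistake in a log call cannot leak PII or PHI to the backend.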

7.15.3 Backend Service Logging

API Request/Response Logging

Comprehensive Tracking:

Request Logging: Method, endpoint, headers, body (sanitized)
Response Logging: Status code, response time, body (sanitized)
Error Logging: Full stack traces, error codes, recovery actions
Success Logging: Completion status, performance metrics

Backend Log Structure:

{
  "timestamp": "2025-09-18T10:30:00Z",
  "trace_id": "550e8400-e29b-41d4-a716-446655440000",
  "service": "backend-api",
  "method": "POST",
  "endpoint": "/api/v1/test-results",
  "status": 200,
  "duration_ms": 145,
  "user_id": "encrypted_user_id",
  "request_size": 2048,
  "response_size": 512
}

7.15.4 Trace ID Implementation

End-to-End Request Tracking

Trace ID Flow:

Generation: Mobile app generates unique trace ID for each user action
Propagation: Trace ID included in all API headers
Service Chaining: Backend services pass trace ID to all downstream calls
HAE Integration: Image processing includes trace ID in all logs
Database Operations: Trace ID logged with all database queries
Response Path: Same trace ID used for response logging

Benefits:

Complete request lifecycle visibility
Easy debugging of complex workflows
Performance bottleneck identification
User journey reconstruction
Incident investigation efficiency

7.15.5 Log Aggregation and Storage

Google Cloud Logging Configuration

Current Setup:

Log Router: Filters and routes logs to appropriate sinks
Log Buckets: Separate buckets for different log types
Retention Policies:
- Error logs: 90 days
- Request/response logs: 30 days
- Debug logs: 7 days
- Audit logs: 7 years
Access Control: IAM-based with audit trail

Log Sinks Configuration:

Error Logs → Critical Errors Bucket (90-day retention)
API Logs → Request/Response Bucket (30-day retention)
Audit Logs → Compliance Bucket (7-year retention)
Performance Logs → Metrics Bucket (30-day retention)

7.15.6 Log Analysis and Monitoring

Real-time Analysis

Monitoring Dashboards:

Error Rate Dashboard: Real-time error tracking by service
Performance Dashboard: API latency and throughput metrics
User Journey Dashboard: Trace ID based flow visualization
Security Dashboard: Failed auth attempts, suspicious patterns

Alert Configuration:

Error rate > 1% triggers immediate alert
Response time > 2s triggers performance alert
Failed authentication patterns trigger security alert
Service unavailability triggers critical alert
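
The threshold rules above can be sketched as a single evaluation function. The aggregation window and the use of the 95th-percentile response time are assumptions, since this document does not say how either is measured; the alert strings are illustrative.

```python
from dataclasses import dataclass
from typing import List

ERROR_RATE_THRESHOLD = 0.01      # alert when error rate exceeds 1%
RESPONSE_TIME_THRESHOLD_S = 2.0  # alert when response time exceeds 2 s


@dataclass
class WindowStats:
    """Aggregated figures for one monitoring window (window length is an assumption)."""
    requests: int
    errors: int
    p95_response_time_s: float
    service_available: bool


def evaluate_alerts(stats: WindowStats) -> List[str]:
    """Return the alerts that the thresholds above would raise for this window."""
    alerts: List[str] = []
    if not stats.service_available:
        alerts.append("critical: service unavailable")
    if stats.requests and stats.errors / stats.requests > ERROR_RATE_THRESHOLD:
        alerts.append("immediate: error rate above 1%")
    if stats.p95_response_time_s > RESPONSE_TIME_THRESHOLD_S:
        alerts.append("performance: response time above 2s")
    return alerts
```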

7.15.7 Log Security and Compliance

Data Protection

Security Measures:

Encryption: All logs encrypted at rest and in transit
PII Redaction: Automatic removal of sensitive data
Access Logging: All log access is audited
Retention Compliance: Automatic deletion per policy

Compliance Features:

GDPR-compliant data retention
PIPEDA audit trail requirements
HIPAA-compliant PHI handling
Law 25 transparency requirements

7.15.8 Troubleshooting with Logs

Using Trace IDs for Debugging

Step-by-Step Process:

Identify Issue: User reports problem or monitoring detects anomaly
Locate Trace ID: Find in error report or user session
Query Logs: Search all logs with specific trace ID
Analyze Flow: Review complete request lifecycle
Identify Root Cause: Pinpoint exact failure point
Resolution: Apply fix and verify through logs

Example Query:

SELECT timestamp, service, message, metadata
FROM logs
WHERE trace_id = '550e8400-e29b-41d4-a716-446655440000'
ORDER BY timestamp ASC

7.15.9 Performance Insights from Logs

Metrics Derived from Logs

API Response Times: P50, P95, P99 latencies
Error Rates: By endpoint, user segment, time period
User Patterns: Feature usage, journey completion rates
System Health: Service availability, dependency health
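
As an illustration of how the latency percentiles might be derived from the `duration_ms` values in the backend logs, the sketch below uses the nearest-rank method. The choice of method is an assumption; monitoring tools often interpolate instead.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]
```

For example, with durations of 1 ms through 100 ms, `percentile(durations, 95)` selects the 95th smallest sample.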

7.15.10 Mobile to Backend Logging Flow

Complete Logging Architecture

Key Features of Logging Flow

Mobile Side:

Async Batching: Logs batched up to 200, sent every 5 minutes or on critical errors
Offline Support: Logs cached in AsyncStorage when offline
Rate Limiting: 2-second minimum between flushes, 1-second between batches
Authentication Required: Logs only sent when user is authenticated
Automatic Retry: Failed logs added back to buffer for next flush
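
The mobile-side behaviour above can be sketched as a small buffer class. This is an illustration in Python rather than the mobile client's own code: "batched up to 200" is interpreted as a maximum batch size, the 2-second and 1-second rate limits are omitted for brevity, and the class and callback names are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, List

MAX_BATCH = 200          # entries per upload
FLUSH_INTERVAL_S = 300   # 5 minutes


class LogBuffer:
    def __init__(self, send: Callable[[List[dict]], bool], clock: Callable[[], float]) -> None:
        self._send = send      # returns True when the upload succeeded
        self._clock = clock
        self._pending: Deque[dict] = deque()
        self._last_flush = clock()

    @property
    def pending_count(self) -> int:
        return len(self._pending)

    def add(self, entry: dict, authenticated: bool) -> None:
        """Buffer the entry; flush if it is critical or the interval has elapsed."""
        self._pending.append(entry)
        critical = entry.get("level") == "CRITICAL"
        due = self._clock() - self._last_flush >= FLUSH_INTERVAL_S
        if authenticated and (critical or due):
            self.flush()

    def flush(self) -> None:
        """Upload pending entries in batches; a failed batch goes back to the front."""
        self._last_flush = self._clock()
        while self._pending:
            size = min(MAX_BATCH, len(self._pending))
            batch = [self._pending.popleft() for _ in range(size)]
            if not self._send(batch):
                self._pending.extendleft(reversed(batch))  # keep original order for retry
                break
```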

Trace ID Implementation:

Generation: UUID v4 format (32 hexadecimal digits, written as 36 characters including hyphens) generated on mobile
Propagation: Included in all API headers as X-Trace-Id
Correlation: Same trace_id used across mobile → backend → HAE
User Context: Both user_id and username included for complete tracking
Span Support: Sub-operations tracked with span_id within same trace
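
A minimal sketch of trace ID generation and propagation as described above. The `X-Trace-Id` header name comes from this document; the 16-character `span_id` format and the rule of starting a new trace when the incoming ID is missing or malformed are assumptions.

```python
import uuid
from typing import Dict

TRACE_HEADER = "X-Trace-Id"


def new_trace_id() -> str:
    """UUID v4 in canonical form, e.g. 550e8400-e29b-41d4-a716-446655440000."""
    return str(uuid.uuid4())


def new_span_id() -> str:
    """Identifier for a sub-operation within a trace (format is an assumption)."""
    return uuid.uuid4().hex[:16]


def outgoing_headers(incoming: Dict[str, str]) -> Dict[str, str]:
    """Reuse the caller's trace ID when it is a valid UUID v4, otherwise start a new trace."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in incoming.items()}
    trace_id = lowered.get(TRACE_HEADER.lower(), "")
    try:
        valid = uuid.UUID(trace_id).version == 4
    except ValueError:
        valid = False
    return {TRACE_HEADER: trace_id if valid else new_trace_id()}
```

Each service would attach the result of `outgoing_headers` to its downstream calls, so the same `trace_id` appears in mobile, backend, and HAE logs.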

Backend Processing:

Middleware Chain: TraceMiddleware → UserContextMiddleware → HttpLoggingMiddleware
Enrichment: Backend adds server-side context (timestamps, service names)
Unified Format: All logs normalized to consistent JSON structure
Error Tracking: Full stack traces with trace_id correlation

Storage & Analysis:

Centralized Storage: All logs aggregated in Google Cloud Logging
Trace-based Queries: Can query entire request lifecycle using trace_id
User Journey Tracking: Complete user action flow from mobile to completion
Performance Analysis: Request timings across all services
Error Correlation: Link errors across mobile, backend, and HAE

7.16 Database Backup and Disaster Recovery

This section details our comprehensive backup and disaster recovery strategy for all critical systems and data stores.

7.16.1 Google Cloud Platform (GCP) Infrastructure

Current Backup Configuration

Platform: Google Cloud Platform (Multi-Region)
Production Region: us-east1 (South Carolina, USA)
Development Region: northamerica-northeast1 (Montreal, Canada)
Backup Frequency: Continuous and automated
Retention Policy: Varies by service (detailed below)
Access Control: Role-based via IAM

GCP Services Backup Details:

Cloud SQL PostgreSQL (Primary Database)

Current Settings (Verified via Terraform):

Automated Backups: ENABLED
Backup Frequency: Daily at 3:00 AM UTC
Retention Period: 7 daily backups (1 week)
Point-in-Time Recovery: ENABLED
Transaction Log Retention: 7 days
Backup Location: Same region as database (us-east1 for production, northamerica-northeast1 for development)
SSL/TLS: Required for all connections
Deletion Protection: ENABLED (both Terraform and GCP Console)

Recovery Capabilities:

Point-in-time recovery to any second within the last 7 days
Full backup restoration from any of the 7 daily backups
Cross-region restoration supported
Automated failover in case of zone failure

Cloud Storage (Object Storage)

Current Configuration:

Versioning: Enabled on all production buckets
Soft Delete: 30-day retention for deleted objects
Cross-Region Replication: Configured for critical buckets
Lifecycle Policies:
- Test images: 90-day retention
- Logs: 30-day retention
- User uploads: Indefinite retention

Cloud Run Services

Deployment Strategy:

Blue-Green Deployments: Zero-downtime updates
Revision History: Last 100 revisions retained
Traffic Management: Gradual rollout capabilities
Rollback: Instant rollback to previous revisions

Infrastructure as Code (Terraform)

State Management:

Backend: Google Cloud Storage with versioning
State Locking: Enabled to prevent concurrent modifications
State Backup: Automatic versioning in GCS
Separate Environments:
- Development: eli-health-dev bucket
- Staging: eli-health-staging bucket
- Production: eli-health-prod bucket

7.16.2 GitHub

Repository Backup Strategy

Current Implementation:

Distributed Version Control: Every clone is a full backup
Multiple Remotes: Repositories mirrored across team members
Branch Protection: Main branches protected from force pushes
Commit History: Full history preserved indefinitely

Disaster Recovery:

RPO: Near-zero (last push to remote)
RTO: Minutes (clone from any team member)
Access Control: 2FA required for all contributors

Recommended Enhancements:

Implement automated daily backups to GCS
Set up GitHub repository archiving
Enable GitHub Advanced Security features (requires GitHub Enterprise or paid add-on):
- Code Scanning: Automatically detect security vulnerabilities in code
- Secret Scanning: Find accidentally committed API keys and passwords
- Dependency Review: Identify vulnerable dependencies in pull requests
- Security Alerts: Get notifications about known vulnerabilities

7.16.3 Firebase Analytics & Authentication

Firebase Analytics Export Configuration

Current Setup:

BigQuery Export: ENABLED
Export Frequency: Hourly export (overwrites previous data - no history kept)
Data Retention:
- Firebase Console: 2 months rolling window
- BigQuery: Only latest snapshot available (overwrites hourly)
Export Includes: All events, user properties, and audiences

Data Recovery:

Can only restore most recent hourly snapshot
No historical recovery available due to overwrite pattern
Firebase Console retains 2 months of data independently

Firebase Authentication Backup

Current Setup:

BigQuery Export: Hourly export (overwrites previous data)
User Data: Email, UID, metadata exported
History: No historical backups maintained
Risk: Data corruption would propagate within 1 hour

Recommended Improvements:

Implement dated snapshots (e.g., users_2025_09_18) instead of overwriting
Maintain 7-30 days of historical snapshots
Enable point-in-time recovery for user data
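
The dated-snapshot recommendation can be sketched as two helpers: one that names a snapshot for a given day and one that selects snapshots older than the retention window. The naming pattern follows the `users_2025_09_18` example above; the function names and the pruning rule are assumptions.

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List


def snapshot_name(prefix: str, day: date) -> str:
    """Table name for the snapshot taken on `day`, e.g. users_2025_09_18."""
    return f"{prefix}_{day:%Y_%m_%d}"


def snapshots_to_delete(names: Iterable[str], prefix: str, today: date, keep_days: int) -> List[str]:
    """Return snapshot names for `prefix` that are older than `keep_days` days."""
    cutoff = today - timedelta(days=keep_days)
    stale = []
    for name in names:
        if not name.startswith(prefix + "_"):
            continue
        try:
            day = datetime.strptime(name[len(prefix) + 1:], "%Y_%m_%d").date()
        except ValueError:
            continue  # not a dated snapshot; leave it alone
        if day < cutoff:
            stale.append(name)
    return sorted(stale)
```

A scheduled job could create today's snapshot and then drop whatever `snapshots_to_delete` returns, with `keep_days` set between 7 and 30.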

Access Control:

Firebase Console: OAuth-based access
BigQuery: IAM-controlled with audit logging

7.16.4 Sentry (Error Monitoring)

Data Retention Policy

Current Configuration:

Error Events: 90-day retention (Sentry Team plan)
Performance Data: 30-day retention
Attachments: 30-day retention
Issue History: Indefinite

Backup Strategy:

Critical Errors: Automatically forwarded to Slack for permanent record
Export Capability: API access for data export
GCP Synchronization: All Sentry errors synchronized to Google Cloud Logging for advanced query analysis
Dual Storage: Errors stored in both Sentry and GCP for redundancy

Advanced Analysis Features:

Errors from Sentry available in GCP Log Explorer
Can correlate with other system logs using trace IDs
BigQuery integration for complex error pattern analysis
Long-term retention in GCP beyond Sentry's 90-day limit

Disaster Recovery:

Sentry is a SaaS platform with its own DR
Full error history maintained in Google Cloud Logging
Can reconstruct complete error history from GCP logs
Dual storage ensures no data loss if either platform is unavailable

7.16.5 BigQuery (Data Warehouse)

Backup and Recovery Features

Current Implementation:

Automatic Backups: Managed by Google (7-day time travel)
Table Snapshots: Can be created for long-term retention
Dataset Copies: Scheduled copies to backup datasets
Time Travel: Query data from up to 7 days ago

Data Governance (Updated December 2024):

Table-Level Access Control: Implemented via Terraform to protect biometric data
Admin Users (Dataset-level access to ALL tables):
- chip@eli.health, iannick@eli.health, thomas@eli.health, fannie@eli.health
- kpi-service-us@eli-health-prod.iam.gserviceaccount.com
Readonly Users (Table-level access to PUBLIC tables only):
- pious@eli.health (customer support)
- media@videnglobe.com (marketing team)
- These users CANNOT access the record table (biometric data)
Audit Logging: All queries logged in Cloud Audit Logs
Data Classification:
- record table: Contains PHI/biometric data - RESTRICTED ACCESS
- Other tables: Operational data - accessible by readonly users

Security Implementation:

Access control managed via Terraform (bigquery-iam module)
Changes require code review and Terraform apply
No manual IAM changes permitted (Infrastructure as Code)
Automatic new table protection when added to restricted list

Disaster Recovery:

RPO: Near-zero (streaming inserts)
RTO: Immediate (multi-region availability)
Export Options: Scheduled exports to Cloud Storage

7.16.6 PostgreSQL (Cloud SQL)

Advanced Configuration

Security Features:

SSL/TLS: Enforced for all connections
IAM Authentication: Enabled for service accounts
Private IP: Available via VPC peering
Automated Patches: Security updates auto-applied

Current Setup:

Regional Configuration: Single zone deployment (us-east1)
Connection Pooling: Node.js connection pool via TypeORM (configured via POSTGRES_POOL_SIZE environment variable)
High Availability: Not currently configured (single zone)
Read Replicas: Not implemented
Automatic Failover: Zone failure recovery only

Database Access:

ORM: TypeORM with PostgreSQL driver (pg)
Connection Pool: Managed by TypeORM, not pgBouncer
Pool Size: Configurable via environment variable
Connection Timeout: Default TypeORM settings

Monitoring:

Cloud Monitoring: CPU, memory, disk metrics
Query Insights: Performance analysis enabled
Alert Policies: Configured for critical metrics
Error Monitoring: Database errors tracked through application logs and pushed to GCP Cloud Logging

7.16.7 Comprehensive Disaster Recovery Plan

Recovery Objectives by Service

Service	RPO (Recovery Point Objective)	RTO (Recovery Time Objective)	Backup Method
Cloud SQL PostgreSQL	1 second (PITR)	Less than 5 minutes	Automated + PITR
Cloud Storage	Near-zero	Immediate	Versioning + Replication
BigQuery	Near-zero	Immediate	Time Travel + Snapshots
GitHub	Last commit	Less than 10 minutes	Distributed VCS
Firebase Analytics	24 hours	Not applicable	BigQuery Export
Sentry	Real-time	SaaS managed	Cloud Logging backup
Cloud Run	Current revision	Less than 1 minute	Revision history
Secrets	Version-controlled	Less than 5 minutes	Secret Manager versions

Disaster Recovery Procedures

Scenario 1: Database Corruption or Deletion

Immediate Response:
- Stop write operations to prevent further corruption
- Assess the extent of data loss
Recovery Steps:
- For recent corruption (less than 7 days): Use Point-in-Time Recovery
- For older issues: Restore from daily backup
- Validate data integrity post-restoration
Post-Recovery:
- Run data consistency checks
- Update audit logs
- Conduct post-mortem analysis

Scenario 2: Regional Outage

Detection: Automated monitoring alerts
Failover Process:
- Cloud SQL: Automatic failover to standby
- Cloud Run: Traffic routing to healthy region
- Storage: Access via multi-region configuration
Communication: Update status page and notify users

Scenario 3: Security Breach

Containment:
- Revoke compromised credentials immediately
- Enable emergency access controls
Assessment:
- Review audit logs for unauthorized access
- Identify affected data and systems
Recovery:
- Restore from known-good backups
- Rotate all credentials and keys
- Implement additional security measures

Testing and Validation

Quarterly DR Drills:

Full database restoration test
Regional failover simulation
Security incident response exercise
Communication protocol validation

Monthly Validation:

Backup integrity checks
Recovery procedure documentation review
Access control audits
Monitoring system tests

Key Personnel and Responsibilities

Role	Primary Responsibility	Backup Personnel
Incident Commander	Coordinate DR response	CTO / VP Engineering
Database Admin	Database restoration	Senior Backend Engineer
Infrastructure Lead	Service failover	DevOps Engineer
Security Lead	Security assessment	Security Engineer
Communications	User/stakeholder updates	Product Manager

Recovery Runbooks

Detailed runbooks are maintained in the private ops repository for:

PostgreSQL point-in-time recovery
BigQuery dataset restoration
Cloud Run service rollback
Secret rotation procedures
GitHub repository recovery

7.16.8 Continuous Improvement

Regular Reviews

Quarterly: DR plan review and updates
Semi-Annual: Full DR simulation
Annual: Third-party DR audit

Metrics Tracking

Backup success rate (target: 99.9%)
Recovery test success rate (target: 100%)
Mean time to recovery (target: less than RTO)
Data integrity validation (target: 100%)
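
Once drill results are recorded, the targets above can be checked mechanically. The sketch below is illustrative: the DrillResult shape and the function names are assumptions, and the 5-minute RTO in the usage note is the Cloud SQL figure from the recovery objectives table in Section 7.16.7.

```typescript
// Hypothetical record of one recovery drill.
interface DrillResult {
  succeeded: boolean;
  recoveryMinutes: number;
}

// Percentage of drills that succeeded (0 when no drills were run).
function successRate(results: DrillResult[]): number {
  if (results.length === 0) return 0;
  const ok = results.filter((r) => r.succeeded).length;
  return (ok / results.length) * 100;
}

// Mean recovery time in minutes across successful drills only.
function meanTimeToRecovery(results: DrillResult[]): number {
  const ok = results.filter((r) => r.succeeded);
  if (ok.length === 0) return Number.NaN;
  return ok.reduce((sum, r) => sum + r.recoveryMinutes, 0) / ok.length;
}

// True when every drill succeeded and the mean recovery time beats the RTO.
function meetsTargets(results: DrillResult[], rtoMinutes: number): boolean {
  return successRate(results) === 100 && meanTimeToRecovery(results) < rtoMinutes;
}
```

Two successful drills of 3 and 4 minutes give a mean time to recovery of 3.5 minutes, which meets a 5-minute RTO. A single failed drill causes meetsTargets to return false regardless of timing.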

8. APPENDICES

Appendix A: Security Checklist

Development Phase

Secure coding guidelines followed
Input validation implemented
Authentication/authorization checked
Sensitive data identified and protected
Security tests written
Code review completed

Deployment Phase

Operations Phase

Appendix B: Contact Information

Role	Contact	Availability
Security Team	security@eli.health	24/7
Privacy Officer	privacy@eli.health	Business hours
Infrastructure	devops@eli.health	On-call
Compliance	compliance@eli.health	Business hours
Legal	legal@eli.health	Business hours

Appendix C: Tool Reference

Tool	Purpose	Access
GCP Console	Infrastructure management	IAM controlled
Firebase Console	Authentication management	Admin only
Sentry	Error monitoring	Developer access
PagerDuty	Incident management	On-call team
GitHub	Code repository	Team access
Terraform	Infrastructure as code	DevOps only

Appendix D: Regulatory References

GDPR: Regulation (EU) 2016/679
PIPEDA: Personal Information Protection and Electronic Documents Act
Law 25: Quebec Bill 64, An Act to modernize legislative provisions as regards the protection of personal information
HIPAA: Health Insurance Portability and Accountability Act (US)
ISO 27001: Information security management systems
SOC 2: Service Organization Control 2

Document Control

Version	Date	Author	Changes
1.0	2025-08-18	Engineering Team	Initial version
1.1	2025-09-05	Engineering Team	Added Section 7.8 - Encryption Strategy and Key Management: Identified critical issues with current encryption, documented proper PHI vs PII encryption approach, proposed key management architecture improvements, and defined immediate next steps for remediation
1.2	2025-09-16	Engineering Team	Added Section 7.16 - Database Backup and Disaster Recovery: Comprehensive documentation of backup strategies, disaster recovery procedures, and retention policies for all critical systems including GCP, GitHub, Firebase Analytics, Sentry, BigQuery, and PostgreSQL
1.3	2025-09-18	Engineering Team	Added Section 7.15 - Comprehensive Logging and Monitoring System: Detailed documentation of mobile and backend logging infrastructure, trace ID implementation for end-to-end request tracking, log aggregation, analysis capabilities, and troubleshooting procedures. Removed future enhancements section.
1.4	2025-09-18	Engineering Team	Updated Firebase Analytics/Auth, Sentry, and PostgreSQL sections with current implementation details. Added comprehensive Mermaid diagram showing complete mobile-to-backend logging flow with trace ID correlation. Clarified PostgreSQL uses TypeORM pooling (not pgBouncer) and single-zone deployment (not HA).
1.5	2025-12-01	Engineering Team	BigQuery Table-Level Access Control: Implemented Terraform-managed IAM for biometric data protection. Updated Section 7.4 (BigQuery Data Governance) with new access control diagram showing admin vs readonly user access tiers. Updated Section 7.16.5 with detailed access control implementation. The `record` table containing biometric/PHI data is now restricted to admin users only; readonly users can access all other tables.
1.6	2025-12-02	Engineering Team	Corrected BigQuery Data Replication Description: Fixed diagrams in Sections 7.1 and 7.3 to accurately reflect that biometric data IS replicated to BigQuery but protected via role-based access control (not excluded from replication). Changed "NO Biometric Data" to "Biometric Data Protected". Updated readonly user roles: pious@eli.health is customer support (not board member), media@videnglobe.com is marketing team (not external analyst).

Next Review Date: March 2026
Document Owner: Chief Technology Officer
Classification: Confidential - Internal Use Only

This document contains confidential and proprietary information of Eli Health Inc. Unauthorized distribution or disclosure is strictly prohibited.
