AWS Infrastructure Memo
Date: 2026-05-14
Scope: AWS footprint backing rovn.to, passport.rovn.to, app.rovn.to, and passport.rovn.to/mcp.
Posture: Pre-launch · live core rails · BAA executed at the AWS account level · SOC 2 Type II in progress.
1. Region and BAA posture
- Region: us-east-2 (Ohio). HIPAA-eligible. (Per data/deploy-receipts.json.)
- Why us-east-2: HIPAA-eligible region with full Bedrock model availability for the AI chain; intentional separation from us-east-1 outage correlation; cost delta vs cheapest US region is < 5%.
- BAA: AWS Business Associate Addendum executed at the AWS Organization level. All PHI-handling services in the footprint are inside AWS's published HIPAA-eligible list. The BAA matrix below names every service we actually invoke.
- PHI residency rule: PHI never leaves us-east-2. Cloudflare carries only marketing surfaces. The AI chain (AWS Bedrock → Anthropic Claude (Haiku 4.5) under BAA → Rōvn backend on ECS) keeps Claude model traffic inside the AWS BAA boundary under Zero Data Retention (see AI_ARCHITECTURE_MEMO.md).
2. BAA-covered services we invoke
| Service | Use | Status |
|---|---|---|
| ECS Fargate | FastAPI app containers (Passport, facility workflow layer, Verified API, MCP) | LIVE |
| RDS for PostgreSQL | Primary OLTP database, 75+ migrations applied | LIVE |
| S3 | Audit chain bucket (Object Lock), source-receipt bucket, PHI document bucket | LIVE |
| KMS | Customer-managed keys for RDS, S3, Secrets Manager | LIVE |
| Secrets Manager | All credentials, API keys, JWT signing keys | LIVE |
| Cognito | Worker authentication (migrations 073, 075) | LIVE |
| ALB (Application Load Balancer) | Front of ECS service · TLS termination | LIVE |
| CloudWatch (logs + metrics + alarms) | Observability | LIVE |
| CloudFront | Static asset CDN in front of S3 | LIVE |
| WAF | OWASP Core Rule Set in front of ALB | LIVE |
| Lambda | Scheduled re-verification cadence triggers, source-adapter fan-out | LIVE (limited) |
| SES | Worker / hospital transactional email | LIVE |
| SNS | Internal fan-out for source-adapter notifications | LIVE |
| AWS HealthLake | FHIR R4 store for payer enrollment data exchange | TARGET (Y2) |
| Bedrock | Anthropic Claude (Haiku 4.5) executor path under BAA, production AI chain | LIVE |
Every service above is in AWS's published HIPAA-eligible list. We have not turned on services that are not BAA-covered for PHI workloads; this is enforced by an account-level Service Control Policy.
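The allow-list enforcement described above could be expressed as a Service Control Policy along the following lines. This is a minimal sketch, not the policy actually deployed: the IAM action namespaces in ALLOWED_SERVICE_PREFIXES are an assumed mapping of the matrix rows, and build_allowlist_scp is a hypothetical helper. A working policy would also have to permit supporting namespaces (for example iam, sts, ec2) that the matrix does not list.

```python
import json

# Assumed mapping of the matrix rows to IAM action namespaces (illustrative only).
ALLOWED_SERVICE_PREFIXES = [
    "ecs", "rds", "s3", "kms", "secretsmanager", "cognito-idp",
    "elasticloadbalancing", "logs", "cloudwatch", "cloudfront", "wafv2",
    "lambda", "ses", "sns", "healthlake", "bedrock",
]


def build_allowlist_scp(prefixes):
    """Return an SCP document that denies every action outside the allow-list."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyServicesOutsideAllowList",
                "Effect": "Deny",
                # NotAction inverts the match: everything NOT listed is denied.
                "NotAction": [f"{prefix}:*" for prefix in sorted(prefixes)],
                "Resource": "*",
            }
        ],
    }


policy = build_allowlist_scp(ALLOWED_SERVICE_PREFIXES)
print(json.dumps(policy, indent=2))
```

Using a Deny statement with NotAction means a newly launched AWS service is blocked by default until someone adds it to the list, which matches the posture the memo describes.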
3. Compute
                          Internet
                              │
                      ┌───────▼────────┐
                      │   Cloudflare   │
                      │  (marketing,   │
                      │   DDoS, bot)   │
                      └───────┬────────┘
                              │
                       ┌──────▼──────┐
                       │   AWS WAF   │ ← OWASP CRS
                       │  (managed)  │
                       └──────┬──────┘
                              │
                       ┌──────▼──────┐
                       │     ALB     │ ← TLS 1.3
                       │  (multi-AZ) │
                       └──────┬──────┘
                              │
           ┌──────────────────┼──────────────────┐
           │                  │                  │
    ┌──────▼─────┐     ┌──────▼─────┐     ┌──────▼─────┐
    │ ECS task A │     │ ECS task B │     │ ECS task C │
    │   AZ us-   │     │   AZ us-   │     │   AZ us-   │
    │  east-2a   │     │  east-2b   │     │  east-2c   │
    └────────────┘     └────────────┘     └────────────┘
- ECS Fargate running FastAPI containers built via CodeBuild → ECR (see DEPLOYMENT_OVERVIEW.md).
- Task definition parameterized: CPU 1024, memory 2048 baseline; auto-scaled on ALB request count + CPU utilization.
- Minimum task count: 2 across AZs. Max: 12. Auto-scaling targets 60% CPU steady-state.
- Active task definition family: rovn-passport-api. As of 2026-05-27 we are on revision 288 (image prod-202605270526-ai-competitive-fix).
- Public surfaces served by the same fleet: passport.rovn.to, passport.rovn.to/mcp, app.rovn.to, plus internal API routes under /api/*.
- OpenAPI docs: disabled in production (app/main.py sets docs_url=None when environment == "production").
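To make the scaling bounds concrete, the sketch below approximates how a target-tracking policy at 60% CPU would size the fleet between the 2-task floor and the 12-task ceiling. The proportional formula is a simplification (the real Application Auto Scaling service also applies cooldowns and alarm evaluation periods), and the function name is hypothetical.

```python
import math

MIN_TASKS = 2      # minimum task count from the memo
MAX_TASKS = 12     # maximum task count from the memo
TARGET_CPU = 60.0  # steady-state CPU target, in percent


def desired_task_count(current_tasks: int, observed_cpu: float) -> int:
    """Approximate target tracking: scale capacity in proportion to load."""
    raw = math.ceil(current_tasks * observed_cpu / TARGET_CPU)
    # Clamp to the configured floor and ceiling.
    return max(MIN_TASKS, min(MAX_TASKS, raw))


for tasks, cpu in [(4, 90.0), (2, 15.0), (10, 95.0)]:
    print(f"{tasks} tasks at {cpu}% CPU -> {desired_task_count(tasks, cpu)} tasks")
```

Under these assumptions, four tasks running at 90% CPU would grow to six, an idle fleet would never drop below two, and a heavily loaded fleet would stop at twelve.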
4. Database
- Engine: PostgreSQL on RDS, single-region, multi-AZ.
- Schema posture: 75+ shipped migrations through 075_worker_authkit.sql, plus 2026_04_14_audit_log_harden.sql hotfix. Forward-only. Alembic-style.
- PHI columns: encrypted at rest via pgcrypto extension. KMS customer-managed key. Examples: SSN-tail fragments, DOB, document content snippets.
- Backups: AWS RDS automated daily snapshots, 30-day retention. Point-in-time recovery (PITR) enabled.
- Read replicas: PARTIAL; single primary today. Read-replica plan documented in RUNBOOK.md; trigger is sustained read RPS, not buyer-driven.
- Connection pool: PgBouncer fronts the cluster; per-task pool kept narrow to avoid connection storms on cold start.
- Sensitive-query review: all PHI-touching queries reviewed; raw SQL gated behind ORM models. Query logs scrubbed before export.
5. Storage
Three distinct S3 buckets, three distinct policies:
| Bucket purpose | Encryption | Object Lock | Retention | Notes |
|---|---|---|---|---|
| Audit chain | KMS (CMK) | COMPLIANCE mode | 7 years | Hash-chained rows mirrored object-by-object; even AWS root cannot delete during retention |
| Source receipts | KMS (CMK) | Governance mode | 7 years | Per-source verification artifacts (state BON HTML, NPDB JSON, etc.); migrations 062, 068 |
| PHI documents | KMS (CMK) | None | Per record-class policy | Worker-uploaded license images, ID docs; expires per expirables_reminder_log cadence |
S3 bucket policies block public access at the account level (S3 Block Public Access ON). VPC endpoints are used so that ECS-to-S3 traffic does not transit the public internet.
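The audit bucket's tamper evidence rests on hash chaining: each row commits to the hash of the row before it, so altering any earlier row invalidates every later hash. A minimal sketch of that idea follows. The row layout, the all-zero genesis value, and the use of SHA-256 over canonical JSON are assumptions for illustration, not the production schema.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first row


def row_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous row's hash together with this row's canonical JSON."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_row(chain: list, payload: dict) -> None:
    """Append a row that commits to the hash of the current chain tip."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev_hash": prev, "hash": row_hash(prev, payload)})


def verify_chain(chain: list) -> bool:
    """Recompute every hash from the genesis value; any edit breaks the chain."""
    prev = GENESIS
    for row in chain:
        if row["prev_hash"] != prev or row["hash"] != row_hash(prev, row["payload"]):
            return False
        prev = row["hash"]
    return True


chain = []
append_row(chain, {"event": "license_verified", "worker_id": 1})
append_row(chain, {"event": "document_uploaded", "worker_id": 1})
print(verify_chain(chain))
```

Object Lock in COMPLIANCE mode complements the chain: the chain makes tampering detectable, while the lock prevents deletion or overwrite of the mirrored objects during retention.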
6. Networking
- VPC: single VPC, multi-AZ, private + public subnets.
- Public subnets: ALB, NAT Gateway.
- Private subnets: ECS tasks, RDS, ElastiCache (future).
- NACL + Security Group posture: least-privilege. RDS only reachable from ECS task SGs. Secrets Manager via VPC endpoint.
- TLS: TLS 1.3 at ALB. Forward secrecy enforced via cipher suite policy ELBSecurityPolicy-TLS13-1-2-2021-06 (no TLS 1.0 / 1.1).
- WAF: AWS Managed Rules, Core Rule Set + Known Bad Inputs + SQL Injection rule group. Per-IP rate-limit at 2000 req / 5 min on /api/*.
- Egress: controlled via NAT Gateway. Only allow-listed FQDNs reachable for source-adapter outbound (Nursys, NPDB, OIG, etc.). Route table audited quarterly.
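The per-IP limit above is enforced by AWS WAF rather than application code, but the rule's behavior can be illustrated with a small sliding-window counter. This sketch models only the 2000-requests-per-5-minutes threshold; WAF's actual counting and evaluation window are managed by the service and may differ in detail. The class name is hypothetical.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window_seconds`."""

    def __init__(self, limit: int = 2000, window_seconds: float = 300.0):
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter()
print(limiter.allow("203.0.113.10", now=0.0))
```

Passing the timestamp in explicitly keeps the sketch deterministic; a live implementation would read a monotonic clock instead.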
7. Identity and access (IAM)
- Root account: unused for daily ops. MFA hardware key. Break-glass procedure documented in RUNBOOK.md.
- Console access: IAM Identity Center (SSO) only. MFA enforced on every principal. Console session caps at 8 hours.
- Programmatic access: short-lived role assumption via STS. No long-lived access keys for humans.
- ECS task role: task-scoped IAM role; permissions trimmed to the exact RDS, S3 buckets, Secrets Manager paths, KMS keys, CloudWatch log groups it needs.
- Deploy role: claude-deploy IAM user is the only entity (other than break-glass humans) authorized to ECR push and ECS service update. Scoped to PowerUser + IAMFull + ArtifactSync. Rotated and audited per reference_rovn_deploy_auth.md.
8. Encryption
- At rest: KMS customer-managed keys for RDS, S3 (all three buckets), Secrets Manager. Key rotation: AWS-managed annual rotation enabled.
- In transit: TLS 1.3 at every public edge. mTLS NOT yet required between ECS and RDS (RDS in private subnet + SG isolation is the control today; mTLS to RDS is a TARGET item for SOC 2 Type 2).
- Field-level: pgcrypto for the small set of PHI columns where row-level retrieval needs decryption inside the app.
9. Secrets management
- AWS Secrets Manager is the only authorized secret store.
- Rule: no secrets in environment variables in source. ECS task definitions reference Secrets Manager ARNs.
- Rotation: 90-day rotation policy on Anthropic, Persona, Checkr, WorkOS, Stripe, Cognito, and database master keys. Rotation events are written to the audit chain.
- Drift detection: Drata monitors for any IAM principal granted secretsmanager:GetSecretValue outside the allow-list.
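The rule that task definitions reference Secrets Manager ARNs rather than carry plaintext values lends itself to an automated check. The sketch below inspects a container definition shaped like the ECS containerDefinitions JSON, where environment holds plaintext name/value pairs and secrets holds valueFrom references. The sample ARNs, the variable names, and the keyword heuristic are all hypothetical.

```python
SENSITIVE_HINTS = ("KEY", "SECRET", "TOKEN", "PASSWORD")  # assumed heuristic


def plaintext_secret_names(container_def: dict) -> list:
    """Return env var names that look sensitive but are set in plaintext."""
    return [
        env["name"]
        for env in container_def.get("environment", [])
        if any(hint in env["name"].upper() for hint in SENSITIVE_HINTS)
    ]


def non_secrets_manager_refs(container_def: dict) -> list:
    """Return secret names whose valueFrom is not a Secrets Manager ARN."""
    return [
        secret["name"]
        for secret in container_def.get("secrets", [])
        if not secret["valueFrom"].startswith("arn:aws:secretsmanager:")
    ]


# Hypothetical container definition with one violation of each kind.
example = {
    "name": "api",
    "environment": [
        {"name": "ENVIRONMENT", "value": "production"},
        {"name": "STRIPE_API_KEY", "value": "placeholder"},  # plaintext secret
    ],
    "secrets": [
        {
            "name": "DATABASE_URL",
            "valueFrom": "arn:aws:secretsmanager:us-east-2:111122223333:secret:example/db",
        },
        {
            "name": "LEGACY_TOKEN",  # references Parameter Store, not Secrets Manager
            "valueFrom": "arn:aws:ssm:us-east-2:111122223333:parameter/legacy",
        },
    ],
}

print(plaintext_secret_names(example))
print(non_secrets_manager_refs(example))
```

A check like this could run in CI against rendered task definitions; the name-based heuristic will miss secrets with innocuous names, so it supplements review rather than replacing it.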
10. Observability
- CloudWatch Logs: structured JSON. PHI-redaction filter at the log shipper (PHI fields stripped before write).
- CloudWatch Alarms: P0 alarms wired to Slack + PagerDuty. Examples: 5xx rate > 1% over 5 min, RDS CPU > 85% sustained, ECS service running tasks < desired for > 2 min.
- X-Ray: distributed trace on ECS for inbound API requests. Sampling at 10% steady-state, 100% on /admin/* and /audit/*.
- Sentry: application error tracking. Boot wiring in app/main.py:1-9. PHI scrubber on every breadcrumb.
- Health endpoint: /health for ECS health checks. Synthetic prober from us-east-2 hits it every 30s to confirm cross-region reachability.
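A redaction filter of the kind described for CloudWatch Logs can be sketched as a recursive scrub over the structured record before it is serialized. The field names in PHI_FIELDS are assumptions chosen to match the PHI examples named in this memo (SSN-tail fragments, DOB, document snippets); the real filter's field list is not given here.

```python
import json

PHI_FIELDS = {"ssn_tail", "dob", "document_snippet"}  # hypothetical field names
REDACTED = "[REDACTED]"


def scrub(value):
    """Return a copy of value with PHI fields replaced at any nesting depth."""
    if isinstance(value, dict):
        return {
            key: REDACTED if key.lower() in PHI_FIELDS else scrub(inner)
            for key, inner in value.items()
        }
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return value


def to_log_line(record: dict) -> str:
    """Serialize a record as one JSON log line with PHI stripped before write."""
    return json.dumps(scrub(record), sort_keys=True)


record = {
    "event": "worker_updated",
    "worker": {"id": 7, "dob": "1990-01-01", "docs": [{"ssn_tail": "1234"}]},
}
print(to_log_line(record))
```

Because scrub builds new containers instead of mutating its input, the application can keep using the original record after logging it.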
11. Backup and disaster recovery
- Multi-AZ: RDS multi-AZ failover enabled. Failover RTO observed ~90s in dry runs.
- Snapshots: automated daily RDS snapshots, 30-day retention. Manual snapshots taken before every migration deploy.
- PITR window: 35 days (the RDS maximum for automated backup retention).
- Audit chain replay: S3 Object Lock COMPLIANCE-mode bucket means audit chain survives even worst-case account compromise; replay verified quarterly.
- DR runbook: documented in RUNBOOK.md. RPO target: 15 min. RTO target: 4 hours. PARTIAL; quarterly fire drill schedule is set; first full fire drill is on the post-close roadmap.
- Cross-region copy: snapshots and audit bucket replicated nightly to us-east-2 for cold standby. PARTIAL.
12. HealthLake posture (TARGET)
AWS HealthLake (FHIR R4) is in the BAA matrix and reserved for the Y2 payer-enrollment workflows where Rōvn participates in FHIR-based credentialing exchange (CAQH ProView, Availity). It is not load-bearing today; current payer-enrollment flows use the per-payer adapters (039_payer_enrollment_foundation.sql, 041_payer_ops_v2.sql, 056_payer_enrollment_extensions.sql). HealthLake light-up is the trigger for the Y2 expansion narrative in 04_data_room/06_financial/3_CASE_MODEL_SUMMARY.md.
13. What this memo does not claim
- We do not claim any HIPAA certification language; we claim HIPAA-aligned posture with BAA available, with every dependency in the AWS HIPAA-eligible service list and a signed AWS BAA.
- We do not claim SOC 2 attestation; Drata evidence collection is in progress, and the Type 1 audit window opens 2026-Q3.
- We do not claim cross-region active-active; DR posture is multi-AZ active + cross-region cold standby.
- We do not claim mTLS to RDS today; it is a TARGET for the SOC 2 Type 2 cycle.
End of memo.