Problem
A sales-account-planning prototype had been built by a small team of AWS engineers at a hackathon and then drifted into the hands of real account teams, who liked it. By the time I picked it up, it had real users, real data, and none of the production guarantees you need to keep that situation safe.
Specifically: a single environment, SSO patched together, no audit trail of who edited which customer's plan, deploys done from a laptop, no documented restore process.
The asks were clear: get this into production properly without losing the team that uses it daily, and add the features they actually need.
Process
I treated it as a refactor with a deadline, not a rewrite:
- Multi-environment first. Split the deployment into dev, staging and prod with a CDK stack per environment, gated by per-stage approval. Stopped all laptop deploys.
- SSO done properly. Replaced the patched auth with Cognito federation against the corporate identity provider. Roles and group-based permissions are modelled in the app, not the IdP.
- Audit trail. Every plan edit is captured through DynamoDB Streams as a structured event and archived to S3 for retention. The UI gained a "history" panel so account managers could see who changed what.
- UX for the hot paths. Worked with three power users to identify the workflows they hit daily. Reduced clicks on the most common path from 9 to 3, fixed the table virtualisation that made large accounts slow, added keyboard shortcuts for the muscle-memory users.
Through every change I kept the existing users informed and shipped behind feature flags so the prototype kept working until each piece was ready.
Outcome
The tool moved into a properly staged production deployment with SSO, audit, and the operational runbooks the on-call team needs. UX improvements led to noticeable increases in active usage from account teams. It is no longer a "prototype that became real"; it's a maintained internal product.
Technical Deep Dive (for engineers)
Stack
React + TypeScript (Vite) → S3 + CloudFront
│
▼
API Gateway (HTTP API) + Lambda
│
├─ DynamoDB (single table: PLANS, ACCOUNTS, USERS, AUDIT)
├─ DynamoDB Streams → Lambda → S3 audit archive
└─ Cognito (federated IdP) for auth
Single-table DynamoDB
Five entity types lived in one table:
| Entity | PK | SK | GSI1PK |
|---|---|---|---|
| Plan | PLAN#<id> | PLAN#<id> | ACCOUNT#<accountId> |
| Account | ACCOUNT#<id> | ACCOUNT#<id> | — |
| User | USER#<id> | USER#<id> | EMAIL#<email> |
| AuditEntry | PLAN#<id> | AUDIT#<timestamp>#<id> | USER#<userId> |
| Comment | PLAN#<id> | COMMENT#<timestamp>#<id> | — |
Single-table design isn't dogma — it was the right call here because every read pattern was already keyed by either the plan or the account, and we didn't need ad-hoc analytics queries (those went to the S3 export).
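The key layout in the table above can be sketched as a handful of key-builder helpers plus one query shape. This is a minimal illustration, not the project's actual code: the function names, the table name passed in, and the use of ISO-8601 UTC timestamps (which sort lexicographically) are all assumptions. The returned object uses the standard DynamoDB Query parameter names and is shaped for a document-style client that accepts plain values.

```typescript
// Hypothetical key builders mirroring the PK / SK / GSI1PK columns above.
type Key = { PK: string; SK: string; GSI1PK?: string };

const planKey = (planId: string, accountId: string): Key => ({
  PK: `PLAN#${planId}`,
  SK: `PLAN#${planId}`,
  GSI1PK: `ACCOUNT#${accountId}`, // lets GSI1 list all plans for an account
});

const auditKey = (
  planId: string,
  isoTimestamp: string, // assumed ISO-8601 UTC so entries sort chronologically
  entryId: string,
  userId: string,
): Key => ({
  PK: `PLAN#${planId}`,
  SK: `AUDIT#${isoTimestamp}#${entryId}`,
  GSI1PK: `USER#${userId}`, // lets GSI1 list everything one user changed
});

const commentKey = (planId: string, isoTimestamp: string, commentId: string): Key => ({
  PK: `PLAN#${planId}`,
  SK: `COMMENT#${isoTimestamp}#${commentId}`,
});

// Query input for "all items of one kind under a plan", newest first.
function planChildrenQuery(
  tableName: string,
  planId: string,
  kind: "AUDIT" | "COMMENT",
) {
  return {
    TableName: tableName,
    KeyConditionExpression: "PK = :pk AND begins_with(SK, :prefix)",
    ExpressionAttributeValues: {
      ":pk": `PLAN#${planId}`,
      ":prefix": `${kind}#`,
    },
    ScanIndexForward: false, // descending sort-key order
  };
}
```

Because the plan item and its children share a partition key, a single query with a sort-key prefix returns one kind of child without touching the others.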
CDK structure
One stack per environment, each configured from per-stage values in cdk.context.json. Constructs were composed:

    new SalesPlanningStack(app, `sales-planning-${stage}`, {
      stage,
      env: ENV_CONFIG[stage],
      ssoConfig: SSO_CONFIG[stage],
      retention: stage === 'prod' ? RetentionDays.ONE_YEAR : RetentionDays.ONE_MONTH,
    });
CDK over Terraform here because the rest of the AWS internal tooling ecosystem was on CDK and the team's JavaScript skills made stack-level edits trivial.
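One way to keep the per-stage wiring safe is to validate the stage name before any stack is instantiated. The sketch below is library-free and rests on assumptions: `parseStage` and `stackSettings` are hypothetical helpers, the retention values are expressed in plain days (365 and 30, standing in for the one-year and one-month settings), and the choice of which stages need approval is a guess at what "per-stage approval" meant. In a CDK app the raw value would typically come from `app.node.tryGetContext("stage")`.

```typescript
// The three environments named in the write-up.
const STAGES = ["dev", "staging", "prod"] as const;
type Stage = (typeof STAGES)[number];

// Reject anything that is not a known stage, e.g. a typo in `-c stage=...`.
function parseStage(value: unknown): Stage {
  const match = STAGES.find((s) => s === value);
  if (match === undefined) {
    throw new Error(`Unknown stage: ${String(value)}`);
  }
  return match;
}

// Hypothetical per-stage settings derived from the stage name alone.
function stackSettings(stage: Stage) {
  return {
    stackName: `sales-planning-${stage}`,
    retentionDays: stage === "prod" ? 365 : 30,
    requiresApproval: stage !== "dev", // assumption: only dev deploys ungated
  };
}
```

Failing fast on an unknown stage means a mistyped context value can never synthesize a stack with fallback settings.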
Audit pipeline
DynamoDB Streams captured every write. A small Lambda transformed each stream record into a structured audit event (who, what changed, before/after) and wrote it to a separate audit DynamoDB table partitioned by plan, plus an S3 archive for compliance retention.
The history-panel UI queried the audit table directly, scoped to the plan being viewed.
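The transform step might look roughly like the following. It is a sketch under stated assumptions, not the Lambda that shipped: the record type is a hand-written subset of the stream record shape (string, number and boolean attributes only), `toAuditEvent` is a hypothetical name, and the stream is assumed to be configured with the NEW_AND_OLD_IMAGES view type so both images are present on updates. Stream records do not carry caller identity, so the "who" is assumed to come from an `updatedBy` attribute that the API writes onto each item.

```typescript
// Minimal subset of the DynamoDB attribute-value wire format.
type AttrValue = { S: string } | { N: string } | { BOOL: boolean };
type Image = Record<string, AttrValue>;
type Plain = string | number | boolean;

interface StreamRecord {
  eventName: "INSERT" | "MODIFY" | "REMOVE";
  dynamodb: { Keys: Image; OldImage?: Image; NewImage?: Image };
}

interface AuditEvent {
  planKey: string;
  action: StreamRecord["eventName"];
  changedBy: string;
  changes: Record<string, { before?: Plain; after?: Plain }>;
}

// Convert one wire-format attribute into a plain JS value.
function plain(v: AttrValue | undefined): Plain | undefined {
  if (v === undefined) return undefined;
  if ("S" in v) return v.S;
  if ("N" in v) return Number(v.N);
  return v.BOOL;
}

function toAuditEvent(record: StreamRecord): AuditEvent {
  const before = record.dynamodb.OldImage ?? {};
  const after = record.dynamodb.NewImage ?? {};
  const fields = Array.from(
    new Set(Object.keys(before).concat(Object.keys(after))),
  );

  // Keep only attributes whose value actually differs between images.
  const changes: AuditEvent["changes"] = {};
  for (const field of fields) {
    const b = plain(before[field]);
    const a = plain(after[field]);
    if (b !== a) changes[field] = { before: b, after: a };
  }

  // Assumption: the writer stamps `updatedBy`. On REMOVE only the old image
  // exists, so this falls back to the last editor, not the deleter.
  const actorImage = record.dynamodb.NewImage ?? record.dynamodb.OldImage ?? {};
  return {
    planKey: String(plain(record.dynamodb.Keys["PK"])),
    action: record.eventName,
    changedBy: String(plain(actorImage["updatedBy"]) ?? "unknown"),
    changes,
  };
}
```

Storing only the changed fields keeps each audit entry small and lets a history panel render a before/after pair per field without diffing in the browser.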
UX work
The biggest single win was killing a modal-on-modal-on-modal pattern that the prototype had inherited from a UI experiment. Users were drowning in nested dialogs to edit a plan. Replacing it with an inline edit-in-place pattern reduced the most common workflow (update an account's status and add a note) from 9 clicks across 3 modals to 3 clicks inline.
Trade-offs
- No major rewrite. Tempting to start fresh with the production design we wanted. Wrong call here — the existing users would have lost their tool while we rebuilt. Refactor in place, ship behind flags.
- Federated SSO over per-app users. More effort up front, but the alternative was a separate identity store that would have failed compliance review immediately.
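Keeping roles and permissions in the app rather than the IdP can be as small as a lookup from group name to allowed actions. The sketch below is illustrative only: the group names, permission strings and function names are hypothetical, and it assumes the user's group list has already been resolved (from a token claim or from the app's own user record).

```typescript
// Hypothetical permission vocabulary for the planning tool.
type Permission = "plan:read" | "plan:edit" | "plan:history";

// Hypothetical mapping from group name to granted permissions.
const ROLE_PERMISSIONS: Record<string, readonly Permission[]> = {
  "account-manager": ["plan:read", "plan:edit", "plan:history"],
  viewer: ["plan:read"],
};

// Union of permissions across all of a user's groups.
function permissionsFor(groups: readonly string[]): Set<Permission> {
  const granted = new Set<Permission>();
  groups.forEach((group) => {
    // Unknown groups grant nothing rather than raising an error.
    (ROLE_PERMISSIONS[group] ?? []).forEach((p) => granted.add(p));
  });
  return granted;
}

function can(groups: readonly string[], permission: Permission): boolean {
  return permissionsFor(groups).has(permission);
}
```

With this shape, changing what a role may do is an application deploy rather than a change request against the corporate identity provider.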
