Twenty-plus brand sites, three environments per brand, one team. The only way that works is if the infrastructure looks the same for every brand and the deploy story is identical across them. This is how that's built on AWS.
The problem
Brand work in a regulated industry has a particular shape: every site needs a staging environment for the team, a UAT environment for the client, and production behind a public domain. Each environment needs predictable URLs, repeatable deploys, and rollback that doesn't require a phone call to an engineer at midnight.
Before the current topology, environments drifted between brands. Some had staging on subdomains, some on subpaths. Some used CloudFront, some didn't. Deploys were a manual mix of S3 syncs and CloudFront invalidations. Rolling back a bad release meant either re-deploying the previous build manually or hoping the cache TTL would expire before the client noticed.
I needed a single deploy story that worked for every brand without per-project special-casing. One pattern. Twenty-plus brands. Three environments each.
The architecture
Every brand sits behind the same AWS pattern: Route 53 for DNS, CloudFront for delivery and edge caching, S3 for storage, and Lambda@Edge for any edge logic (redirects, A/B routing, header rewrites). Gatsby builds emit a folder of static assets; deploys are S3 syncs followed by targeted CloudFront invalidations.
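As a rough illustration of the edge-logic piece, here is a minimal sketch of a Lambda@Edge viewer-request handler that issues redirects. It assumes the Python runtime, and the REDIRECTS table and its paths are hypothetical; the actual rules used on the brand sites are not described here.

```python
# Hypothetical redirect table; the real per-brand rules are not given in this write-up.
REDIRECTS = {
    "/old-page": "/new-page",
    "/promo": "/offers/current",
}


def handler(event, context):
    """Viewer-request handler: redirect known legacy paths, pass everything else through."""
    request = event["Records"][0]["cf"]["request"]
    # Normalise a trailing slash so "/old-page/" and "/old-page" match the same rule.
    uri = request["uri"].rstrip("/") or "/"
    target = REDIRECTS.get(uri)
    if target is None:
        # Returning the request unchanged lets CloudFront continue to the cache or origin.
        return request
    # Returning a response object short-circuits the request at the edge.
    return {
        "status": "301",
        "statusDescription": "Moved Permanently",
        "headers": {
            "location": [{"key": "Location", "value": target}],
        },
    }
```

Keeping the rules in a plain dictionary means the same handler code can ship to every brand, with only the table differing.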
Environments
Each brand has three environments mapped to three subdomain conventions: staging.brand.com for engineering, uat.brand.com for client review, and brand.com for production. Each environment is its own S3 bucket fronted by its own CloudFront distribution. They share zero state. Promoting a build is just deploying the same artifact to the next environment's bucket.
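The naming convention is regular enough to derive from the brand domain alone. The sketch below shows one way to express it; the Environment class and the bucket naming scheme are assumptions made for illustration, not the names actually in use.

```python
from dataclasses import dataclass

ENVIRONMENTS = ("staging", "uat", "production")


@dataclass(frozen=True)
class Environment:
    brand_domain: str
    name: str

    @property
    def hostname(self) -> str:
        # Production lives on the bare domain; the other two get a subdomain.
        if self.name == "production":
            return self.brand_domain
        return f"{self.name}.{self.brand_domain}"

    @property
    def bucket(self) -> str:
        # Hypothetical scheme: one bucket per environment, named after its hostname.
        return f"{self.hostname}-site"


def environments_for(brand_domain: str) -> list[Environment]:
    """Return the three environments every brand gets, in promotion order."""
    return [Environment(brand_domain, name) for name in ENVIRONMENTS]


for env in environments_for("brand.com"):
    print(env.name, env.hostname, env.bucket)
```

Because every value is computed from the brand domain, there is nothing per-brand to remember when promoting a build from one environment to the next.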
Deploys
A deploy is an aws s3 sync of the build artifact into the target environment's bucket, followed by a CloudFront invalidation targeted at the paths that changed. No active-version flag to flip, no symlink dance, no in-place rewrites. While the upload runs, CloudFront keeps serving its cached copy of each page; once the invalidation has gone through, the next request for an invalidated path is fetched fresh from S3. There's no maintenance window, and because the invalidation is only issued after the sync has finished, viewers of cached pages don't see a half-uploaded release.
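The two steps can be sketched as follows. This is an illustration under assumptions, not the team's actual script: it assumes each build comes with a manifest mapping file paths to content hashes, that directory URLs are served from index.html, and that --delete is wanted on the sync. The function names are hypothetical.

```python
def invalidation_paths(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return the URL paths whose content differs between two build manifests.

    Each manifest maps a path relative to the build root (for example
    "about/index.html") to a content hash. Removed files are included so
    that stale cached copies are evicted as well.
    """
    changed = {k for k in set(previous) | set(current) if previous.get(k) != current.get(k)}
    paths = set()
    for key in changed:
        paths.add("/" + key)
        if key == "index.html" or key.endswith("/index.html"):
            # Assumption: "/about/" is served from "about/index.html", so the
            # directory URL is cached separately and needs invalidating too.
            paths.add("/" + key[: -len("index.html")])
    return sorted(paths)


def deploy_commands(build_dir: str, bucket: str, distribution_id: str,
                    paths: list[str]) -> list[list[str]]:
    """Build the CLI invocations for one deploy: sync first, then invalidate."""
    commands = [["aws", "s3", "sync", build_dir, f"s3://{bucket}", "--delete"]]
    if paths:
        # Skip the invalidation entirely when nothing changed.
        commands.append(["aws", "cloudfront", "create-invalidation",
                         "--distribution-id", distribution_id, "--paths", *paths])
    return commands
```

Each returned command is an argv list, so it can be passed to subprocess.run(cmd, check=True) in order, or simply printed for a dry run.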
Rollback
Rollback is the same flow run against an older build. Build artifacts are kept in CI, indexed by commit, so "roll back to last Tuesday's release" is the same sync + invalidate against the previous artifact. The rollback path is the deploy path. That symmetry is the single most important property of the system: any engineer who has done a deploy has already done a rollback.
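To make the symmetry concrete, here is a minimal sketch in which deploy and rollback are literally the same function called with a different commit. The artifact index (a mapping from commit to an unpacked build directory), the release function, and the blanket "/*" invalidation are all simplifying assumptions for illustration.

```python
from typing import Callable


def release(artifacts: dict[str, str], commit: str, bucket: str,
            distribution_id: str, run: Callable[[list[str]], object]) -> None:
    """Ship the stored build for `commit` to one environment.

    `artifacts` is a hypothetical index from commit SHA to the directory where
    CI unpacked that build. `run` receives each command as an argv list.
    """
    try:
        build_dir = artifacts[commit]
    except KeyError:
        raise LookupError(f"no stored artifact for commit {commit}") from None
    run(["aws", "s3", "sync", build_dir, f"s3://{bucket}", "--delete"])
    # Simplification: invalidate everything rather than computing changed paths.
    run(["aws", "cloudfront", "create-invalidation",
         "--distribution-id", distribution_id, "--paths", "/*"])


# Dry run: collect the commands instead of executing them.
artifacts = {"abc123": "/builds/abc123", "def456": "/builds/def456"}
commands: list[list[str]] = []
release(artifacts, "def456", "brand.com-site", "E123EXAMPLE", commands.append)  # deploy
release(artifacts, "abc123", "brand.com-site", "E123EXAMPLE", commands.append)  # rollback
```

In real use, run could be a small wrapper around subprocess.run with check=True; nothing else changes between the two calls.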
Key decisions
Static over server-rendered
Brand sites don't need request-time logic. Content updates happen on a schedule, not per-visitor. Gatsby builds emit static HTML that CloudFront caches at the edge, with no application server behind it. The result is sites that survive traffic spikes without any scaling work, cost almost nothing to run, and have a tiny attack surface.
One pattern, every brand
The temptation with per-brand work is to optimise each one. I chose the opposite: every brand gets the same topology even if a particular brand could technically use less. The cost is some over-provisioning. The benefit is that any engineer can deploy any brand without project-specific knowledge, and a new brand goes from zero to three environments in hours, not days.
AWS API over a hosted CI deploy plugin
Plenty of CI tools have one-click S3+CloudFront integrations. I built the deploy logic directly against the AWS API because the team needed visibility into what was happening at each step (which S3 prefix, which invalidation, which distribution) and the canned integrations hide that. When something goes wrong in production, the team can read the logs.
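One way to get that visibility is to emit a structured log line per step. The sketch below is an assumption about what such logging could look like; the log_step helper and the field names are hypothetical.

```python
import json
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("deploy")


def log_step(step: str, **fields: str) -> str:
    """Emit one JSON line describing a deploy step and return it."""
    line = json.dumps({"step": step, **fields}, sort_keys=True)
    log.info(line)
    return line


# Hypothetical values, shown only to illustrate the shape of the output.
log_step("s3_sync", bucket="staging.brand.com-site", prefix="/")
log_step("invalidation", distribution_id="E123EXAMPLE", paths="/about/ /about/index.html")
```

One line per step, with the bucket, prefix, and distribution spelled out, is usually enough to reconstruct what a deploy touched after the fact.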
Outcomes
The topology has been quietly running for years. New brands onboard onto the same pattern, the same deploy story works for every site, and rollback is genuinely boring. The best infrastructure is the kind that doesn't ask for attention. This one asks for very little.
What I'd do differently
The per-brand AWS resources should be managed end-to-end with Terraform or CDK. They were created over time, some via the console, some via the API. Bringing them all under infrastructure-as-code would make new-brand onboarding faster and disaster recovery easier. This is a known debt; the cleanup has been deferred because the system runs reliably.
I'd also add automated visual regression to the deploy flow. A bad build can technically deploy clean (no errors) but render wrong in production. A screenshot-diff step between staging and the proposed prod artifact would catch the small class of bugs that pass code review but fail the human eye.
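The comparison at the heart of such a step could be as simple as the sketch below. It assumes the screenshots have already been captured and decoded into rows of RGB tuples by some other tool; the function names and the 1% threshold are hypothetical.

```python
Pixel = tuple[int, int, int]
Image = list[list[Pixel]]


def diff_ratio(a: Image, b: Image, tolerance: int = 0) -> float:
    """Fraction of pixels where any colour channel differs by more than `tolerance`."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        # A change in dimensions is treated as a full visual change.
        return 1.0
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    differing = sum(
        1
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
        if any(abs(ca - cb) > tolerance for ca, cb in zip(pa, pb))
    )
    return differing / total


def passes_visual_check(staging: Image, candidate: Image, max_ratio: float = 0.01) -> bool:
    """Gate a deploy: allow it only if at most `max_ratio` of pixels changed."""
    return diff_ratio(staging, candidate) <= max_ratio
```

A non-zero tolerance helps absorb anti-aliasing noise between renders, and the threshold would need tuning per page before it could block a deploy without false alarms.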