The Boring Pattern That Made Our Deploys 10x Safer (It's Feature Flags. But Hear Me Out.)

Yeah, I know. Feature flags. Groundbreaking.

Stick with me for a minute, because this isn’t a post about what feature flags are. If you’ve been in software for more than fifteen minutes, you know what they are. This is about a specific way of using them that fundamentally changed how we think about shipping, and it’s something we’ve started bringing into every client engagement because the results are too good to keep to ourselves.

The thing that changed

We were working with a team that had a deployment problem. Not the sexy kind. No fire, no outage, no dramatic incident. The quiet kind. Every deploy was a small gamble. The test suite passed, staging looked fine, but production has a way of surprising you. The team deployed during low-traffic windows. They had a rollback plan. They watched dashboards nervously for an hour after every push.

Nothing was broken. Everything was just slightly anxious.

The team had feature flags. They used them the way most teams do: to hide unfinished features from users. A flag goes up when the feature starts development, comes down when the feature launches. Pretty standard.

What we proposed was different: decouple every deploy from every release. Not just for big features. For everything. Bug fixes, refactors, copy changes, config updates. Everything goes out behind a flag, everything gets activated separately from the deploy, and everything can be deactivated in seconds without a rollback.

The team’s initial reaction was reasonable: “that’s a lot of flags.” It is. But the trade-off is worth examining.

What deploy/release decoupling actually looks like

Here’s the workflow. A developer finishes a change. The change is wrapped in a flag that defaults to off. The PR gets reviewed and merged. It deploys to production with the next deploy cycle. At this point, nothing has changed for any user. The code is in production, but it’s inert.
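To make “wrapped in a flag that defaults to off” concrete, here’s a minimal sketch. It assumes a plain in-memory flag store; the names FLAGS, is_enabled, new-search-ranking, and the two ranking functions are all made up for illustration and don’t come from any particular flag library.

```python
# Hypothetical in-memory flag store. A real system would read flag state from
# a flag service or config store that can change without a deploy.
FLAGS: dict[str, bool] = {}


def is_enabled(flag_name: str) -> bool:
    """Return the flag's state, treating unknown flags as off."""
    return FLAGS.get(flag_name, False)


def rank_results_old(results: list[str]) -> list[str]:
    # Existing behaviour: alphabetical order.
    return sorted(results)


def rank_results_new(results: list[str]) -> list[str]:
    # New behaviour being shipped dark: shortest titles first.
    return sorted(results, key=lambda r: (len(r), r))


def rank_results(results: list[str]) -> list[str]:
    # The new path is deployed but inert until the flag is turned on.
    if is_enabled("new-search-ranking"):
        return rank_results_new(results)
    return rank_results_old(results)


print(rank_results(["pear", "fig", "apple"]))  # flag unset, so the old path runs
```

The important detail is the default: a flag nobody has configured yet behaves as off, so merging and deploying the change does nothing on its own.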

When the team is ready to release, they turn the flag on. Typically for a small percentage of users first, maybe 5%, maybe internal users only. They watch the metrics. If something’s off, they turn the flag back off. No deploy. No rollback. No waiting for CI/CD. Just a toggle. The code is still there, still deployed, doing nothing.

If the metrics look good, they ramp up. 25%. 50%. 100%. The feature is now fully released. They leave the flag in place for a week in case something surfaces at scale, then they clean it up. Remove the flag, remove the conditional, ship the cleanup.
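One common way to implement the ramp is to hash each user into a stable bucket and compare it to the current percentage. The sketch below is an assumption about how this could look, not a description of any specific tool; Rollout, bucket, is_enabled_for, and the user IDs are hypothetical names.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Rollout:
    enabled: bool = False                 # master switch: off means off for everyone
    percent: int = 0                      # share of general users, 0..100
    allow_users: frozenset = frozenset()  # e.g. internal users, always included


def bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled_for(rollout: Rollout, flag_name: str, user_id: str) -> bool:
    if not rollout.enabled:
        return False  # turning the flag off overrides everything else
    if user_id in rollout.allow_users:
        return True   # internal users see the change before anyone else
    return bucket(flag_name, user_id) < rollout.percent


users = [f"user-{n}" for n in range(1000)]
internal = frozenset({"user-0", "user-1"})

for percent in (0, 5, 25, 50, 100):
    rollout = Rollout(enabled=True, percent=percent, allow_users=internal)
    count = sum(is_enabled_for(rollout, "new-search-ranking", u) for u in users)
    print(f"{percent:>3}% -> {count} of {len(users)} users see the new path")
```

Because the bucket depends only on the flag name and the user ID, a user who got the new path at 5% keeps it at 25% and 50%. Including the flag name in the hash means different flags pick different early cohorts, so the same users aren’t always first in line.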

This is not new. The large tech companies have been doing this for years. What’s underappreciated is how transformative it is for smaller teams that aren’t operating at FAANG scale but are still dealing with deployment anxiety.

The three things that surprised us

Deploys stopped being events. When every new code path sits behind an inactive flag, a deploy has very little room to break anything, and the deploy itself becomes boring. The team went from deploying once a week during a designated window to deploying multiple times a day, whenever code was ready. Nobody watched dashboards. Nobody held their breath. The deploy was just code moving to a server. The release, the moment users are affected, happened separately, intentionally, with a plan.

The psychological shift was bigger than the technical one. Engineers stopped associating “deploy” with “risk.” Velocity increased not because any individual step got faster, but because the friction and anxiety around shipping disappeared.

Incidents became non-events. The team had a bad release about three weeks after adopting this pattern. A new search algorithm that performed well in testing caused timeout issues for a specific class of queries in production. Under the old model, this would have been a rollback: revert the PR, run the pipeline, wait for the deploy, confirm the fix, write the post-mortem.

Under the new model, someone turned off the flag. Total time to resolution: forty-five seconds. The search went back to the previous algorithm. Users experienced maybe ninety seconds of degraded search. The team spent the next two days fixing the issue in development, re-activated the flag for internal users, confirmed the fix, and ramped it back up to all users over a day.

Same bug, same code, completely different incident experience. No emergency. No late night. No rollback dance.
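The reason the flag flip beats a rollback is that flag state lives outside the deployed code and is read at request time. Here’s a rough sketch of that idea, assuming flags are kept in a JSON file. FileFlagStore, the file layout, and the search function are hypothetical, and a real setup would more likely use a flag service with caching.

```python
import json
import tempfile
from pathlib import Path


class FileFlagStore:
    """Reads flag state from a JSON file on every check (hypothetical design)."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def _load(self) -> dict:
        try:
            data = json.loads(self.path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # Missing or unreadable flag file: treat every flag as off.
            return {}
        return data if isinstance(data, dict) else {}

    def is_enabled(self, flag_name: str) -> bool:
        return bool(self._load().get(flag_name, False))

    def set(self, flag_name: str, enabled: bool) -> None:
        flags = self._load()
        flags[flag_name] = enabled
        self.path.write_text(json.dumps(flags), encoding="utf-8")


def search(store: FileFlagStore, query: str) -> str:
    # Both algorithms stay deployed; the flag picks one per request.
    if store.is_enabled("new-search-algorithm"):
        return f"new:{query}"
    return f"old:{query}"


flag_file = Path(tempfile.mkdtemp()) / "flags.json"
store = FileFlagStore(flag_file)

store.set("new-search-algorithm", True)
print(search(store, "shoes"))             # new:shoes
store.set("new-search-algorithm", False)  # the "rollback": no pipeline involved
print(search(store, "shoes"))             # old:shoes
```

Note the failure mode: if the flag data can’t be read, every flag is treated as off, so the system falls back to the old, known-good path rather than the new one.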

Product decisions got better. This one was unexpected. When releasing is a gradual ramp instead of a binary launch, you naturally start treating every release as a small experiment. The team started looking at metrics during the ramp-up phase, not just “is it broken?” but “is it better?” They killed two features during ramp-up that weren’t broken but also weren’t improving anything. Under the old model, those features would have launched fully and lingered for months before anyone questioned whether they were worth keeping.

Feature flags didn’t just make deploys safer. They created a culture of evidence-based releasing where the default question wasn’t “is this ready to ship?” but “is this making things better for users?”

The cost is real, and it’s manageable

The obvious objection is flag debt. More flags means more conditional code paths, more complexity, more things to manage. This is true and it has to be addressed.

The pattern that works: flags have owners and flags have expiration dates. When a flag is created, it gets an owner (the person or team responsible for the change) and a cleanup date (typically two weeks after full ramp-up, which covers the week of watching at 100% plus time to ship the removal). The cleanup date goes on the team’s board like any other work item. It’s not glamorous. It’s housekeeping. But it’s the difference between a manageable system and a codebase riddled with dead conditional branches that nobody is sure they can remove.

We also distinguish between release flags (temporary, tied to a specific change, cleaned up after release) and operational flags (permanent, used for things like graceful degradation or A/B testing). Release flags are the majority. They come and go. Operational flags are rarer and get the same treatment as any other long-lived infrastructure: documented, reviewed, maintained.
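Owners, cleanup dates, and the release/operational split are easy to track in a small registry. The following is a sketch under assumptions: FlagKind, FlagRecord, overdue_flags, the team names, and the dates are all invented for illustration, and the 14-day default just mirrors the “two weeks after full ramp-up” rule of thumb.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class FlagKind(Enum):
    RELEASE = "release"          # temporary, removed after the change ships
    OPERATIONAL = "operational"  # long-lived, maintained like infrastructure


@dataclass(frozen=True)
class FlagRecord:
    name: str
    owner: str
    kind: FlagKind
    fully_ramped_on: Optional[date] = None  # None until the flag hits 100%
    cleanup_after_days: int = 14            # assumption: two weeks after ramp-up

    @property
    def cleanup_due(self) -> Optional[date]:
        # Only release flags that have fully ramped get a cleanup date.
        if self.kind is not FlagKind.RELEASE or self.fully_ramped_on is None:
            return None
        return self.fully_ramped_on + timedelta(days=self.cleanup_after_days)


def overdue_flags(flags: list, today: date) -> list:
    """Release flags whose cleanup date has passed, oldest due date first."""
    due = [f for f in flags if f.cleanup_due is not None and f.cleanup_due < today]
    return sorted(due, key=lambda f: f.cleanup_due)


registry = [
    FlagRecord("new-search-algorithm", "search-team", FlagKind.RELEASE, date(2024, 3, 1)),
    FlagRecord("checkout-copy-fix", "payments-team", FlagKind.RELEASE, date(2024, 3, 20)),
    FlagRecord("profile-refactor", "accounts-team", FlagKind.RELEASE),
    FlagRecord("disable-recommendations", "platform-team", FlagKind.OPERATIONAL),
]

for flag in overdue_flags(registry, today=date(2024, 3, 25)):
    print(f"{flag.name}: owner {flag.owner}, cleanup was due {flag.cleanup_due}")
# prints: new-search-algorithm: owner search-team, cleanup was due 2024-03-15
```

A report like this can run in CI or on a schedule and feed the team’s board, so the cleanup shows up as a work item with a name attached instead of relying on someone remembering.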

The overhead is real. It’s also dramatically less than the overhead of deployment anxiety, rollback procedures, incident response, and the velocity tax of a team that’s scared to ship.

Try it on one thing

If you’re skeptical (reasonable), try it on one feature. Pick something that’s coming up in your roadmap. Wrap it in a flag. Deploy it inactive. Ramp it up deliberately. See what the experience feels like compared to your normal deploy cycle.

If it doesn’t change anything, you’ve lost a few hours. If it changes how your team thinks about shipping, you’ve gained something that compounds forever.

We write about the patterns and tools we actually use. This is one of them. More at steadfastdigital.io/articles.