I think there's a reasonable middle ground between having feature flags in a JSON file that you have to redeploy to change and using an (often expensive) feature-flags-as-a-service platform: roll your own simple system.
A relational database lookup against primary keys in a table with a dozen records is effectively free. Heck, load the entire collection at the start of each request - through a short lived cache if your profiling says that would help.
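To make that concrete, here is a minimal sketch of the "load the whole table, behind a short-lived cache" idea. Everything in it is an assumption for illustration: the `feature_flags` table, its `name`/`enabled` columns, the `FlagStore` class and the 5 second TTL are made up, and SQLite stands in for whatever relational database you actually run.

```python
import sqlite3
import time


class FlagStore:
    """Loads every flag row in one query and caches the result briefly."""

    def __init__(self, conn, ttl_seconds=5.0, clock=time.monotonic):
        self._conn = conn
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the TTL can be tested
        self._cache = {}
        self._loaded_at = None       # None means "never loaded"

    def _load_all(self):
        # A dozen rows keyed by primary key: one cheap query per refresh.
        rows = self._conn.execute("SELECT name, enabled FROM feature_flags")
        return {name: bool(enabled) for name, enabled in rows}

    def is_enabled(self, name, default=False):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._cache = self._load_all()
            self._loaded_at = now
        return self._cache.get(name, default)


# Demo: an in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feature_flags (name TEXT PRIMARY KEY, enabled INTEGER NOT NULL)"
)
conn.execute("INSERT INTO feature_flags VALUES ('new_checkout', 1), ('dark_mode', 0)")
conn.commit()

flags = FlagStore(conn, ttl_seconds=5.0)
print(flags.is_enabled("new_checkout"))
```

Flipping a flag is then a plain `UPDATE` on that table, picked up once the TTL expires. Whether the cache is worth having at all is a question for your profiler, as said above.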
Once you start getting more complicated (flags enabled for specific users etc) you should consider build-vs-buy more seriously, but for the most basic version you really can have no-deploy-changes at minimal cost with minimal effort.
There are probably good open source libraries you can use here too, though I haven't gone looking for any in the last five years.
Telling them that my auth provider isn't out, but the thing I use to show them a blue button vs a red button is.
Oof.
The combination of these two has been all we've ever needed. User segmentation, A/B testing, pilot soft launch etc are all easy.
I have no idea why that is.
I have no idea why anyone would actually do that in real life. Feature flags are something so trivial that you can implement them from scratch in a few hours, tops — and that includes some management UI.
Early on at Notion we used simple percent rollout in Redis, then we built our own flag & experimentation system, but as our needs got more complex we ended up switching to a 3rd party rather than dedicating a team to keep building out the internal system.
We will probably hit a scale in a few years where it makes sense to bring this back in house but there’s certainly a sweet spot for the 3rd party version between the 50-500 engineer mark for SaaS companies.
We have a FF as a service platform and a big "value add" is that we can turn on and off features at the client level with it.
But, unfortunately, it's not the only mechanism for this, and it is also being used for actual feature flags and not just client-specific configuration.
I'm personally a MUCH bigger fan of putting feature flags in a configuration file that you deploy either with the application or through some mechanism like Kubernetes configs. It's faster, easier to manage, and makes it really easy to quickly answer the question "What's turned on, why, and for how long?" That matters because a core part of managing feature flags is deleting them and the old code path once you are confident things are "right".
The biggest headache of our FF-as-a-service platform is that this is really not clear, and we OFTEN end up with years-old feature flags that are on, with the old code path still existing even though it's unexercised.
But at high throughput, you might want something with dedicated professional love. Ten thousand feature flags, being checked at around 2 (or 200) million RPS from multiple deployments... I don't want to be the team with that as their side project. And once you're talking a team of three to six engineers to build all this out, maybe it makes sense to just buy something for half a million a year. Assuming it can actually fit your model.
But these approaches are insane for companies above a certain size, where individuals are being hired and fired regularly, security matters, and feature flags are in the critical path of revenue.
Last time I looked at LaunchDarkly Enterprise licensing, it started at $50k/year, and included SAML.
Now that sounds like a lot, but if you're well past the startup stage, you need a tiny team to manage your homegrown platform. Maybe you have other things for them to do as well, but you probably need 3 people devoting at least 25% of their time to this, in order to maintain. So that's at least $175k/year in the USA, and if your company is growing, then probably the opportunity cost is higher.
Permanent per-customer configuration is not a feature flag. It's also best not to have too many per-customer configurations.
Roll your own. Seriously.
Feature flags are such an easy thing that there should be a robust and completely open source offering not tied to B2B SaaS. Until then, do it in house.
My team built a five nines feature flag system that handled 200k QPS from thousands of services, active-active, local client caching, a robust predicate DSL for matching various conditions, percent rollout, control plane, ACLs, history, everything. It was super robust and took half an engineer to maintain.
We ultimately got roped into the "build vs buy" / "anti-weirdware" crosshairs from above. Being tasked with migrating to LaunchDarkly caused more outages, more headache, and more engineering hours spent. We were submitting fixes to LaunchDarkly's code, fixing the various language client integrations, and writing our own Ruby batching and multiprocessing. And they charged us way more for the pleasure.
Huge failure of management.
I've been out of this space for some years now, but someone should "Envoy" this whole problem and be done with it. One service, optional sidecars, all the language integrations. Durable failure and recovery behavior. Solid UX. This shouldn't be something you pay for. It should be a core competency and part of your main tooling.
The middle ground is a JSON file that is copied up and periodically refreshed. We (Sentry) moved from a managed product to just a YAML file with feature flags that is pushed to all containers.
The benefit of just changing a file is that you have a lot of freedom in how you deal with it (e.g. leave comments) and you have the history of who flipped it and for which reason.
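For anyone wondering what "periodically refreshed" can look like on the application side, here is a rough sketch. It is not Sentry's implementation; the `FileFlags` class and the flat JSON object of booleans are assumptions for illustration, and it presumes whatever pushes the file replaces it atomically (write to a temp file, then rename).

```python
import json
import os


class FileFlags:
    """Re-reads a JSON flag file whenever its modification time changes."""

    def __init__(self, path):
        self._path = path
        self._mtime = None   # mtime of the version currently loaded
        self._flags = {}

    def refresh(self):
        # Cheap check: only parse the file again if it was touched.
        mtime = os.stat(self._path).st_mtime_ns
        if mtime != self._mtime:
            with open(self._path, encoding="utf-8") as fh:
                self._flags = json.load(fh)
            self._mtime = mtime

    def is_enabled(self, name, default=False):
        return bool(self._flags.get(name, default))
```

You would call `refresh()` on a timer or at the start of a request, then read flags with `is_enabled("some_flag")`. The file itself lives in version control, which is where the comments and the who-flipped-it history come from.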
No-one ever changes the bloody things and it's just an extra thing to go wrong. If it only loads on startup, it achieves nothing over a bog standard config file. If it loads every request you've just incurred a 5% overhead on every call.
And it ALWAYS ends up filled with crap that doesn't work anymore. Because unlike config files, no one clears it up.
Worse still is when people haven't made it injectable and then it means unit tests rely on a real database, or it blocks getting a proper CI/CD pipeline working.
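By "injectable" I mean something along these lines. This is only a sketch: `FlagProvider`, `StaticFlags`, `button_color` and the `blue_button` flag are hypothetical names, not from any particular framework.

```python
from typing import Mapping, Protocol


class FlagProvider(Protocol):
    """Anything that can answer "is this flag on?"."""

    def is_enabled(self, name: str) -> bool: ...


class StaticFlags:
    """In-memory provider for unit tests; no database involved."""

    def __init__(self, values: Mapping[str, bool]):
        self._values = dict(values)

    def is_enabled(self, name: str) -> bool:
        return self._values.get(name, False)


def button_color(flags: FlagProvider) -> str:
    # The provider is passed in rather than fetched from a global,
    # so a test can hand over StaticFlags instead of a real database.
    return "blue" if flags.is_enabled("blue_button") else "red"


print(button_color(StaticFlags({"blue_button": True})))
```

The production code gets a database-backed provider with the same `is_enabled` method, and the unit tests never need to know it exists.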
I end up having to pick the damn thing out of the app.
Use a config file like everyone else that's probably built into the framework you're using.
To be honest, most of the time I've seen it, the app was written by people who clearly did not know their language/framework.
I'm not saying it's you, but that's been my honest experience of config in the db, it's generally been a serious code smell that the whole app will be bad.
In my experience, feature flagging is more application-level than system-level. What I mean by that is, feature flagging is for stuff like: roll this feature out to 10% of users, or to users in North America, or to users who have opted into beta features; enable this feature and report conversion metrics (aka A/B testing); enable this experimental speedup for 15 minutes so we can measure the performance increase. It's stuff that you want to change at runtime, through centralized tooling with e.g. auditing and alerting, without restarting all of your application servers. It's a bit different than config for like "what's the database host and user", stuff that you don't want to change after initialization (generally).
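The "roll this out to 10% of users" case is commonly done by hashing the user into a stable bucket. A minimal sketch, assuming string user IDs; the function names and the choice of SHA-256 are mine, not from any specific product:

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in the range 0..99."""
    # Including the flag name means a user who lands in the first 10%
    # for one flag is not automatically in the first 10% for every flag.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def in_rollout(flag_name: str, user_id: str, percent: int) -> bool:
    """True if this user falls inside the first `percent` buckets."""
    return rollout_bucket(flag_name, user_id) < percent


print(in_rollout("new_feed", "user-42", 10))
```

Because the bucket depends only on the flag name and the user ID, a user keeps the same answer across requests and servers, and raising the percentage from 10 to 50 only ever adds users. The percentage itself is the part you want to change at runtime through the central tooling.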
Regarding the article though, early on your deployment pipeline should be fast enough that updating a hardcoded JSON file and redeploying is just as easy as updating a feature flag, so I agree it's not something to invest in if you're still trying to get your first 1000 users.
Granted, not for all software. And there's something to be said about a config file that you can just replace at deployment. But that's something that varies a lot from one environment to another.
Our config files are stored in their own repo. Pushes to the master branch trigger a Jenkins job that copies the config files to a GCP bucket.
On startup, each machine pulls this config from GCS and everything just works.
It's not a 'redeployment' in the sense that we don't push new images on each config change.
Just starting with them and learning to improve your application of them is the best way to learn, too.
There is one book on feature flags that was written a while back. Some of the independently published books by experienced tech folks out there are a goldmine.
Feature Flags by Ben Nadel is one such book for me. There is an online version that is free as well. Happy to learn about others.
If you start doing it for sub-groups, hard agree, but this is a space where it almost always pays dividends to roll your own first. The size of a company that needs to consider adding feature flags (versus one that already has them) is typically one at which building your own is quicker, cheaper, and most importantly, simpler.
Have people still not bought into the whole 12-factor config thing?
Have seen the pattern many times:
Hard-code values in code -> configure via env -> configure slow things via env and fast things via redis -> configure almost everything via a config management system
I do not want to reboot every instance in a fleet of 2000 nodes just to enable a new feature for a new batch of beta testers. How do I express that in an env var anyways? What if I have 100s of flags I need to control?
In other cases I need some set of nodes to behave one way, and some set of nodes to behave another way - say the nodes in us-west-2 vs the nodes in eu-central-1. Do I really want to teach my deploy system the exhaustive differences of configuration between environments? No I want my orchestration and deploy layer to be as similar as possible between regions, and push almost everything besides region & environment identification into the app layer - those two can be env vars because they basically never change for the life of the cluster.
There are several aspects of deployments that are in contention with each other: safety, deployment latency, and engineering overhead are how I'd break it down. Every deployment process is a tradeoff between these factors.
What I (maybe naively) think you're advocating is writing more end-to-end tests, which moves the needle towards safety at the expense of the other factors. In particular, having end-to-end tests that are materially better than well-written k8s health checks (which you already have, right?) is pretty hard. They might be flaky, they might depend on a lot of specifics of the application that are subject to change, and they might just not be prioritized. In my experience, the highest-value end-to-end tests are based on learned experiences of what someone already saw go wrong once. Writing comprehensive tests before the feature is even out results in many low-quality tests, which are an enormous drain on productivity: you have to write them, maintain them, and deal with the flaky ones. It is better, I think, to have non-comprehensive end-to-end tests that provide as much value as possible for the lowest overhead on human resources. And the safety tradeoff we make there can be mitigated by having the feature behind a flag.
My whole thesis, really, is that by using feature flags you can make better tradeoffs between these than you otherwise could.
It's one of the two big reasons. First is the ability to roll out features gradually and separate deployments from feature release, and second is the ability to turn new features off when something goes wrong. Even part of the motivation of A/B testing is de-risking.
One, as others have called out, is the ability to control rollout (and rollback) without needing a deployment. Think mobile apps and the rollout friction. If something goes wrong, you need a way to turn off the offending functionality quickly without having to go through another deployment or a mobile app review.
Second, is to be able to understand the impact of the rollout. Feature flags can easily measure how the rollout of one feature can affect the rest of the system - whether it is usage, crash rates, engagement, or further down the funnel, revenue. It’s a cheat code for quickly turning every rollout into an experiment. And you don’t need a large sample size for catching some of these.
By having this power, you will find yourself doing more of it, which I believe is good.
If you don’t have much traffic, and can live with having to redeploy to flip the switch, then fine, stick it in a config file.
But I clicked through expecting a defence of hard coding feature flags in the source code (`if true` or `if customerEmail.endsWith("@importantcustomer.com")`). I very much don't approve of this.
There also seems to be feature flags in the sense of toggling on and off features. Then hard-coding makes less sense.
That's actually the only sense of "feature flag" I was aware of before this discussion.
> Hard-coding that kind of feature flag makes sense. Because a later revision will delete the feature flags outright (removing the branches).
Yup. And, AFAIK, is what "feature flag" means.
> There also seems to be feature flags in the sense of toggling on and off features. Then hard-coding makes less sense.
So "feature flag" has now taken on -- taken over? -- the meaning of just plain "flag" (or "switch" or "toggle" or whatever), as in ordinary everyday run-time configuration? What is this development supposed to be good for? We used to have two distinct distinguishable terms for two distinct distinguishable things; now we apparently don't any more. So we've lost a bit of precision from the language we use to discuss this stuff. Have we, in exchange, gained anything?
That was a stupid dependency.
It might be enough to test new features with a limited audience (beta build, test deployments for stakeholders/qa).
If done correctly this solution can be easily extended to use a feature flag management tool, or a config file.
PS: removing new features during build/tree-shaking/etc adds some additional security. In some cases even disabled features could pose a security risk. Disabled features are often not perfectly tested yet.
Redeploying takes time. Sometimes you want to disable something quick. Having a way to disable that feature without deploys is amazing in those cases.
That being said, there's really no need to rely on a dedicated service for this. We use our in-house CRM, but we also have Amplitude for more complex cases (like progressive rollout).
Context: I run a FF company (https://prefab.cloud/)
There are multiple distinct benefits to be had from feature flagging. Because it's the "normal" path, most FF products bundle them all together, but it's useful to split them out.
- The code / libraries for evaluating rules.
- The UI for creating rules, targeting & roll outs.
- The infrastructure for hosting the flags and providing real-time updates.
- Evaluation tracking / debugging to help you verify what's happening.
If you don't need #1 and #2 there, you might decide to DIY and build it yourself, but I think you shouldn't have to. Most feature flag tools today are usable in an offline mode. For Prefab it is: https://docs.prefab.cloud/docs/how-tos/offline-mode You can just do a CLI command to download the flags. Then boot the client off a downloaded file. With our pricing model that's totally free because we're really hardly doing anything for you. Most people use this functionality for CI environments, but I think it's a reasonable way to go for some orgs. It has 100% reliability and that's tough to beat.
You can do that if you DIY too, but there's so many nice to haves in actually having a tool / UI that has put some effort into it that I would encourage people not to go down that route.
>Hardoced feature flags
Think the author obviously meant "hardcoded" here.
Anyways, recently, this has been really hard to sell teams on in my experience. At some point "feature flag" became equivalent to having an entire SaaS platform involved (even for systems where interacting with another SaaS platform makes little sense). I can't help but wonder if this problem is "caused" by the up-coming generation of developers' lived experience with everything always being "online" or having an external service for everything.
In my opinion, your feature flag "system" (at least in aggregate) needs to be layered. Almost to act as "release valves."
Some rules or practices I do:
* Environment variables (however you want to define or source them) can and should act as feature flags.
* Feature flag your feature flag systems. Use an environment variable (or other sourced metadata, even an HTTP header) to control where your program is reading from.
* The environment variables should both take priority if they're defined AND act as a fallback in case of detected or known service disruption with more configurable feature flag systems (such as an internal DB or another SaaS platform).
* Log the hell out of feature flags; telemetry will keep things clean (how often flags are read, and how often they're changed).
* Categorize your feature flags. Is this a "behavioral" feature flag or a functional one (i.e., to help keep the system stable)? Use whatever qualifiers make sense for your team and system.
* Remove "safety" flags for new features/releases after you have enough data to prove the release is stable.
* Remove unused "behavior" flags once a year.
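One way to read the layering rules above, as a rough sketch. The `FF_` prefix, the `resolve_flag` function and the `remote_lookup` callable are all assumptions for illustration. It covers the "environment takes priority" half, and when nothing is set and the remote source is down it falls back to a default baked into the code rather than to a separate fallback variable.

```python
import os

_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def _env_flag(name, environ):
    """Return True/False if FF_<NAME> is set to something parseable, else None."""
    raw = environ.get("FF_" + name.upper())
    if raw is None:
        return None
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    return None  # unparseable values are treated as unset


def resolve_flag(name, remote_lookup, environ=os.environ, default=False):
    """Environment override first, then the remote source, then the default."""
    override = _env_flag(name, environ)
    if override is not None:
        return override
    try:
        return bool(remote_lookup(name))
    except Exception:
        # The remote flag source (internal DB, SaaS) is unavailable:
        # behave predictably instead of failing the request.
        return default


print(resolve_flag("new_ui", lambda name: True, environ={"FF_NEW_UI": "off"}))
```

Here `remote_lookup` stands for whatever the more configurable system is. Setting `FF_NEW_UI=off` in the environment is the release valve: it wins even when the remote system says the flag is on.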
My $0.02
> Anyways
I think you obviously meant "anyway".
Depending on your infra, that can already make them toggleable without a redeployment: a restart of the apps/containers on the new envvars is enough.
Having them in a separate file would be useful if you need to be able to reload the flags upon receiving SIGUSR1 or something.
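A sketch of the SIGUSR1 variant, assuming a Unix host and a JSON file of booleans; the `ReloadableFlags` class and `install` helper are hypothetical names. The handler only sets a marker, and the actual file read happens on the next flag check.

```python
import json
import signal


class ReloadableFlags:
    """Flag file that is re-read after a reload has been requested."""

    def __init__(self, path):
        self._path = path
        self._flags = {}
        self._reload_requested = True  # forces the initial load

    def request_reload(self, signum=None, frame=None):
        # Keep the signal handler tiny: just mark that a reload is due.
        self._reload_requested = True

    def is_enabled(self, name, default=False):
        if self._reload_requested:
            self._reload_requested = False
            with open(self._path, encoding="utf-8") as fh:
                self._flags = json.load(fh)
        return bool(self._flags.get(name, default))


def install(flags):
    # signal.SIGUSR1 is Unix-only, and signal.signal() must be called
    # from the main thread of the main interpreter.
    signal.signal(signal.SIGUSR1, flags.request_reload)
```

After `install(flags)`, something like `kill -USR1 <pid>` makes the next `is_enabled` call pick up the edited file without a restart.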
That's not what I'd call hardcoding, it's a startup-time configuration option. Hardcoding is, well, "hard coding", as in changing something in the source code of the affected component, in particular with compiled languages (with interpreted languages the distinction is a bit mushy).
And then for compilation there is the question whether it is a build system option (with some kind of build user interface) or "actual" hardcoding buried somewhere.
Also, there is a connection to be drawn here to loadable/optional software components. Loading or not loading something can be implemented either as a startup-time or as a runtime decision.
With that said, I think that LaunchDarkly and the like are a bit expensive and heavyweight for many orgs, and leaving too many feature flags lying around can become serious debt. It totally makes sense to start with something lighter weight, e.g. an env var or a quick homegrown feature in ActiveAdmin.
In a Rails app you have an ./app/plans folder with Ruby files, each containing a class that represents each feature in a plan.
There are code examples at https://github.com/rubymonolith/featureomatic if you want to have a look.
I've used it for a few apps now and it's been pretty useful. When you do need to plug a database into it, you can do so by having the class return the values from a database.
As a solo dev, something lightweight and cost-effective like this is attractive. Deployment is a CLI command or PR merge away.
Would I recommend it for the 200 person engineering org that deploys at most once a week? Probably not.
A bit surprised people roll their own implementation when this is easily available, and not that expensive. At least in Azure.
The primary purpose of feature flags is to provide a way to change system behavior dynamically, without needing a deploy.
That's not feature flags; it's just ordinary configuration. (Actually, seems many contributions to this discussion get those mixed up. Maybe even TFA itself.)