This is a much harder problem than people realize.
If you have a fixed set of machines that need secrets, then encrypting a bag of secrets to each machine's public key works OK.
But in auto-scaling / automated / ephemeral scenarios, it doesn't work. You need an RBAC scheme for machines that builds layers of trust: each machine is placed into a role by a trusted service, script, or person; communication between the machines and the secrets service happens over verified TLS; and every access to, or modification of, a secret is recorded for audit purposes. People and machines should both be treated as first-class actors.
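As a concrete sketch of the machine-to-service leg: mutual TLS means the client verifies the service and the service verifies the machine's identity before any secret moves. The URL, certificate paths, and API shape below are placeholders, not any real service's interface.

```shell
# Sketch only: secrets.example.com and all paths are illustrative.
# --cacert: the machine verifies it is talking to the real service.
# --cert/--key: the service verifies which machine (and thus which
# role) is asking, so RBAC and audit can key off that identity.
curl --cacert /etc/pki/secrets-ca.pem \
     --cert   /etc/pki/host.pem \
     --key    /etc/pki/host-key.pem \
     https://secrets.example.com/secrets/prod/db/password
```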
Furthermore, secrets should be kept off permanent media; per the 12-factor guidelines, they should come from environment variables.
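A minimal sketch of that pattern, with `fetch_secret` standing in for whatever client your secrets service provides (it's a stub here, not a real command):

```shell
#!/bin/sh
# Stub: in reality this would call the secrets service's client.
fetch_secret() { printf 's3cr3t'; }

# The secret is injected into the child process's environment only;
# it never lands in a file or in shell history.
run_app() {
  DB_PASSWORD="$(fetch_secret prod/db/password)" \
    sh -c 'printf "db password is %s" "$DB_PASSWORD"'
}

run_app
```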
Don't entangle secrets management with other tools like configuration management; otherwise you impede yourself from switching architectures down the road.
Don't create workflows that only ops can control, leaving developers out in the cold, or you are increasing organizational friction.
And if your secrets management processes are opaque to security and compliance people, then they won't have the same level of trust that they would have in a transparent system.
Here's an example of how we approach the problem: http://blog.conjur.net/chef-cookbook-uploads-with-conjur
This makes using ssh-agent with a reasonable timeout incredibly painful.
So you're left with either re-entering your passphrase every 5/10/15 minutes, or basically never. Using smartcards for humans and TPMs for servers is a step in the right direction, but it seems ssh-agent is still missing this basic functionality, or am I missing something?
Organizations that want 2-factor auth are typically setting up bastion / jump hosts that require a second factor like a phone-delivered one-time password. This can be configured through the PAM stack.
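For instance, with an OTP PAM module (pam_google_authenticator is one common choice; the exact module depends on your OTP provider), the bastion's sshd can be made to demand both the key and the one-time password. The fragments below are illustrative, not a complete configuration:

```
# /etc/pam.d/sshd (fragment): require an OTP during authentication
auth required pam_google_authenticator.so

# /etc/ssh/sshd_config (fragment): public key AND the PAM-driven prompt
AuthenticationMethods publickey,keyboard-interactive
ChallengeResponseAuthentication yes
UsePAM yes
```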
Once on the bastion, the user can get to other machines within the accessible network using their passwordless ssh key. In effect, each bastion serves as a mini-perimeter.
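Client-side, the hop can be codified in ~/.ssh/config so users don't have to script it by hand (hostnames and the user are placeholders; ProxyJump needs OpenSSH 7.3+):

```
# ~/.ssh/config
Host bastion
    HostName bastion.example.com
    User alice

Host internal-*
    ProxyJump bastion
    IdentityFile ~/.ssh/id_ed25519
```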
And yes, people spend a lot of time entering their second factor. Dozens of times per day is not unusual.
Re-reading your question, I'm not really answering it. But maybe this anecdote is useful in some way :-)
Wouldn't it be better to generate the key in the same place it will be used? Transferring private keys over the network smells bad to me. Is there some requirement for a user to have only one key pair active at a time? If so that is bad. Each "client" environment you use should be able to upload a public key whenever it's convenient.
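That's the normal OpenSSH workflow: generate the pair on the machine that will use it and ship only the public half. The filename and comment below are just examples.

```shell
#!/bin/sh
# Generate the key where it will be used; only the .pub file ever
# leaves the box. mktemp is used here just to keep the sketch tidy.
keyfile="$(mktemp -d)/id_ed25519_ci"
ssh-keygen -q -t ed25519 -N '' -C 'ci@build-host' -f "$keyfile"
# Register "$keyfile.pub" with the service; the private key in
# "$keyfile" is never transferred anywhere.
```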
It's important to note that the 'user' here in Hosted Chef is not a person; it is an identity in the Chef server that is allowed to upload cookbooks. Its scope is limited to only that.
Rotating the deploy user's key when using Hosted Chef is a one-step process, using knife and Conjur together:
```
knife user reregister "conjurbot" | conjur variable values add hostedchef/conjurbot/private_key
```
The stdout of `knife user reregister` is the private key so you can update the variable in Conjur without even seeing the value. You could run this in a cron job if you wanted. Your CI system responsible for uploading cookbooks will pull the new private key next time it runs.
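The cron variant is just the same pipeline on a schedule; the identity name and variable path here are the ones from the example above:

```
# crontab fragment: rotate the Hosted Chef key nightly at 02:00
0 2 * * * knife user reregister "conjurbot" | conjur variable values add hostedchef/conjurbot/private_key
```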
Again, not ideal that Hosted Chef only allows you one keypair per user but we can minimize the threat by rotating the key frequently.
http://blog.conjur.net/conjur-4-4-released
One of their stipulations for the audit was that we not use it for promotional purposes, so I guess an NDA is required to discuss the details.
The tech we use for encryption of secrets is definitely open source here: https://github.com/conjurinc/slosilo
Conjur isn't built on in-house cryptographic software; it uses trusted open-source tools: OpenSSL, PAM, and so on.
Most of our work is open source: https://github.com/conjurinc and https://github.com/conjur-cookbooks
Now, I'm not going to defend these govt. standards as up-to-date or comprehensive. But they're a good philosophical reference for how to manage keys/secrets. Some COTS technologies (which I won't advertise here) try to automate/enforce strong key management for infra, but are typically only affordable for enterprise deployments.
It's much easier to feel comfortable handing out secrets if each of them has a fixed lifespan. It reduces anxiety greatly.
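A toy sketch of the idea, with the lease tracked client-side purely for illustration (a real secrets service would track `issued_at` and enforce the TTL server-side):

```shell
#!/bin/sh
# Illustrative only: in practice the service owns these values.
issued_at=$(date +%s)
ttl=3600   # the secret is good for one hour, then it is simply dead

check_lease() {
  age=$(( $(date +%s) - issued_at ))
  if [ "$age" -ge "$ttl" ]; then
    echo expired
  else
    echo valid
  fi
}

check_lease   # a holder of a leaked copy can do nothing after the hour
```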
- A MongoHQ support person has access to data in a customer's database.
- CircleCI stores everything in the MongoHQ database, which is used to deploy/control customer servers.
- CircleCI customers' CI-controlled environments are mixed with production environments.
I'm guessing everyone just expects most companies, especially those with maybe just Series A financing or close to it, to employ this level of security paranoia?
We all just pretty much assume that they're doing the right thing(TM) with regard to security even after we've seen, time and again, that this is certainly not the case.
The established enterprise hosting companies have security-infrastructure teams that are larger than the entire staff of most startups. Draw your own conclusions about how thorough those startups are with regard to security.