Living Dangerously with Crypto in the Assimilation Project – How Many Keys?

This article is now obsolete.  It has been replaced by a newer post.

This article has been kept for historical purposes, to document some of the bad ideas I went through on the way to better ones 😉

It outlines our original (not very good) thoughts on how to handle keys and key management for our unique problems in a pragmatic and effective way.  Although we will use crypto libraries with well-proven algorithms, we will use them in slightly unconventional ways.  So, get your crypto buddies, grab a beverage (adult or otherwise), put on your thinking cap, and think hard about how we plan to approach these challenges.  Although I’ve tried to think all this through, I’m not a crypto expert – which is why I’m asking for your help.

The first article in this series described some of the unique secure communication challenges that the Assimilation project faces – and provided background to help you understand why we are headed the way we are.

How Many Keys?

One of the first questions to ask is – how many keys do we plan on maintaining?  For 100K servers, there seem to be three possible answers to this question.

  • 100K+1 Keys – that is, a unique key for each nanoprobe, plus a public/private key pair for the CMA.  Note that the CMA key must be a public/private key pair; otherwise, compromise of any nanoprobe would result in immediate compromise of every system in the environment – an enterprise-wide root exploit.
  • 2 Keys – one public/private key pair for the CMA, and a common shared key for all the nanoprobes
  • 1 Key – simply a single public/private key pair for the CMA – nanoprobes do not have keys

Let’s examine the pros and cons of each of these possibilities in turn…

100K+1 Keys

Pros

It is impossible for one nanoprobe to impersonate another.  Nanoprobes are safe from receiving commands from other nanoprobes or random bad guys on the network.  Commands from the CMA to the nanoprobes are safe from being observed.  Results from the nanoprobes to the CMA are confidential.

Cons

Extremely high overhead, very difficult to set up correctly. Practical complexities of cloning one machine from another, revoking keys and restoring from backups are likely to make this a nightmare in practice – not to mention difficult to code correctly in the first place.  Site-wide installation processes have to be changed and coordinated with the Assimilation software so that a unique key is generated and installed for each machine, and this same key is registered with the Assimilation infrastructure.  The more dynamic the environment, the more likely this is to be troublesome – for example in a cloud environment.

Comments

Each machine has to have a unique key generated for it when the nanoprobe software is first installed. Both keys on a machine can be installed with the same mechanisms that are used for installing the nanoprobe software. Because of the supervisory role that the CMA plays in this environment, there is no need for nanoprobe keys to be public keys.  A shared key is sufficient – and computationally faster.

2 Keys

Pros

Relatively simple to set up initially.  Revoking the shared nanoprobe key is messy, but not as difficult to manage as in the 100K+1 case.  Nanoprobes are safe from receiving commands from other nanoprobes or random bad guys on the network.  Manageable on an ongoing basis with modest effort.  Not too difficult to code in the first place.  Results from the nanoprobes to the CMA are confidential.

Cons

Anyone with root privileges on any machine can masquerade as a nanoprobe – not just at install time.  Anyone with root privileges on any machine is able to decrypt messages to any nanoprobe.  The common nanoprobe key is likely to be subject to frequent compromise – particularly in multi-tenant environments.  In a multi-tenant environment, the key is also more likely to be compromised without the knowledge of the central management staff.

Comments

Because of the supervisory role that the CMA plays in this environment, there is no need for the common nanoprobe key to be a public key.  A shared key is sufficient and computationally faster.  Both keys can be installed with the same mechanism as is used for installing the nanoprobe software.
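
To make the “common shared key” idea concrete, here is a minimal sketch of what using it might look like, assuming PyNaCl (the Python bindings for libsodium, which comes up again in the comments below). The key and message contents are made up for illustration; this is not the Assimilation code.

```python
# Minimal sketch (not Assimilation code) of the shared nanoprobe key idea:
# one symmetric key, installed on every machine, used for authenticated
# encryption of nanoprobe traffic.
import nacl.secret
import nacl.utils

# Generated once, then installed on every nanoprobe along with the software.
shared_key = nacl.utils.random(nacl.secret.SecretBox.KEY_SIZE)
box = nacl.secret.SecretBox(shared_key)

# Symmetric (secret-key) authenticated encryption: fast, and simple to set up.
ciphertext = box.encrypt(b"discovery results from some nanoprobe")

# ...but anyone holding shared_key can decrypt (or forge) this traffic,
# which is exactly the multi-tenant weakness described above.
plaintext = box.decrypt(ciphertext)
```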

1 Key

Pros

Simple to set up initially.  Nanoprobes are safe from receiving commands from other nanoprobes or random bad guys on the network.  Manageable on an ongoing basis with modest effort.  Simplest to code in the first place.  There’s only one key to manage – so updating it when compromised is simpler.  With proper precautions in managing the CMA, compromise of the CMA secret key is much less likely than compromise of a shared nanoprobe key.  Results from the nanoprobes to the CMA are confidential.

Cons

Anyone able to forge packets on any machine can masquerade as a nanoprobe.  Commands to the nanoprobes are sent in the clear.

Comments

The CMA public key can be installed on each machine with the same mechanism used for installing the nanoprobe software.

My Current Thinking

The “2 key” solution seems the weakest of the three.  If any machine suffers a root compromise, then all machines are compromised.  In a multi-tenant environment, it comes to resemble security theater more than real security, because of the multiple organizations involved and the traditional unwillingness of one organization to tell another that it has been compromised.  The amount of additional security it provides over the “1 Key” solution seems minimal.

The 100K+1 solution seems like a huge undertaking on its own – and will require significant testing and maturation time.  Although it has advantages over the “1 Key” solution, it is complex, and will be difficult to get right.  Perhaps if the project had people dedicated to developing it (volunteers are always accepted!), it would be possible to create a smooth and usable “100K+1” solution.  Even so, this isn’t a foregone conclusion – because of the complexity of integrating with local installation procedures.

This leaves the “1 Key” solution as my favorite for an initial implementation.

More On “1-Key” Disadvantages

The advantages of the “1 Key” solution are obvious – so let’s look at the disadvantages in more detail.

CMA Commands Sent In The Clear

The CMA commands are well-known, and quite predictable.  Sniffing requests to monitor certain applications will reveal that the machine has that application on it.  If all packets sent by the CMA can be sniffed, over time this could lead to knowing what applications are on which machines.  Because these commands are only sent out when a machine reboots or a new service starts, obtaining this information will be very slow.  Information about what versions of software or what security settings the machine has cannot be determined this way.  This problem can be mitigated by taking special care and precautions with the CMA and its switches.  This is highly desirable in any case, as the CMA is a very high-value target for any attacker.

Anyone Can Masquerade As A Nanoprobe

This means that the data in the CMA database can be compromised by an attacker.  Although the value to an attacker of doing this is not obvious, here are a few possible motivations:

  • Hide the fact that a machine has been compromised.  If attackers have obtained root privileges on a machine, they can compromise the nanoprobe on that machine much more directly than this.  If they are doing it from another machine in the infrastructure, the resulting anomalies risk raising an alert and bringing human attention to the problem.  Most attackers would avoid doing this.
  • Denial of service – wreaking havoc in IT management.  There are lots of ways to cause denial of service.  Since the data in the database is all discovered, triggering rediscovery after eliminating the threat will correct the problem.  However, it could be a distraction to cover other activity.
  • Increase the chance of administrators taking inappropriate actions.  Not much you can say about this.

How Could “1 Key” Possibly Work?

At first thought, having a single public/private key pair somehow doesn’t seem like enough.  Let’s see how I envision it working:

  • Commands sent by the CMA are signed with the CMA’s private key
  • Commands received by nanoprobes are validated using the CMA’s public key
  • Results sent from the nanoprobes to the CMA are encrypted using the CMA’s public key
  • Results received by the CMA from nanoprobes are decrypted using the CMA’s private key

This seems to do what I expect it to do – and what I assumed above.
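
To make the four steps above concrete, here is a minimal sketch of the round trip, assuming PyNaCl (the Python bindings for libsodium). It is illustrative only, not the Assimilation code: the key and message names are invented, and note that libsodium uses separate key pairs for signing (Ed25519) and encryption (Curve25519), so the single “CMA key pair” described above becomes two closely related pairs in practice.

```python
# Minimal sketch (not Assimilation code) of the "1 Key" message flow using
# PyNaCl / libsodium primitives.  All names and messages here are made up.
from nacl.public import PrivateKey, SealedBox
from nacl.signing import SigningKey

# CMA side: generate the long-term keys.  (libsodium uses separate signing
# and encryption key pairs, so "the CMA key pair" is really two pairs here.)
cma_signing_key = SigningKey.generate()        # private half: signs commands
cma_verify_key = cma_signing_key.verify_key    # public half: shipped to every nanoprobe
cma_private_key = PrivateKey.generate()        # private half: decrypts results
cma_public_key = cma_private_key.public_key    # public half: shipped to every nanoprobe

# 1. Commands sent by the CMA are signed with the CMA's private key.
command = b"MONITOR nginx"                     # hypothetical command
signed_command = cma_signing_key.sign(command)

# 2. Commands received by nanoprobes are validated using the CMA's public key.
#    verify() raises BadSignatureError if the command was forged or altered.
verified_command = cma_verify_key.verify(signed_command)

# 3. Results sent from the nanoprobes to the CMA are encrypted using the CMA's
#    public key.  A SealedBox lets a sender with no key of its own encrypt to
#    the CMA, which fits the "nanoprobes have no keys" design.
result = b"nginx is running"
encrypted_result = SealedBox(cma_public_key).encrypt(result)

# 4. Results received by the CMA are decrypted using the CMA's private key.
decrypted_result = SealedBox(cma_private_key).decrypt(encrypted_result)
assert decrypted_result == result
```

The important property is the asymmetry: the public halves installed on every nanoprobe can only verify commands and encrypt results; they cannot be used to forge commands or to decrypt results.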

Follow On

Originally I thought I could cover all the interesting decisions concerning our encryption strategy in two articles.  I was mistaken.    There’s just too much to discuss.  So, there will definitely be at least one more article on this subject.  Please check out the next chapter in “Living Dangerously with Crypto in the Assimilation Project”.  As always, these will be announced on our blog and on @OSSAlanR.

Having said all this, and revealed my foolish thoughts for all to see, I’m looking forward to your comments on this – and particularly to hearing what I’ve overlooked.  Your comments in the blog would be very much appreciated.  If you have an aversion to comment forms, email alanr@unix.sh and I’ll incorporate your thoughts (I even have a GPG key [717A640E]), or join the Assimilation development mailing list here.

4 thoughts on “Living Dangerously with Crypto in the Assimilation Project – How Many Keys?”

  1. On the point about signing the CMA commands with the CMA’s private key, so that they can be authenticated with the public key, why stop with that? Why not encrypt the commands with the private key? That would deny an adversary the plain-text commands. Of course, since all the nanoprobes would have the public key, it might not be hard for an adversary to get it, but in that case upon detection you would need to establish a new key pair anyway.

  2. Charles – you make a good point. If an attacker has an attack that lets him or her _own_ a router, but doesn’t yet have an admin (root) level attack on any servers, then encrypting the messages would make sniffing them useless. On the one hand, it raises the bar a little, and on the other hand, it normally takes a long time of listening to CMA commands to hear anything particularly interesting.

    As a downside, encrypting packets with known content makes breaking the crypto easier.  I don’t know if signing them has a similar effect or not.  On the other hand, I suspect breaking crypto of this strength is beyond the capability of most attackers.  Does anyone else have a thought or knowledge on this?

    When we encrypt, we could always add a field with random contents to the data going back and forth.  It would be ignored by the nanoprobes and the CMA (both sides already ignore things they aren’t expecting).  Does anyone think this is worthwhile?

    Charles: Thanks for your thoughtful comments! I’ll incorporate them into the blog post.

    • Something else I learned – the libsodium encryption code incorporates a cryptographic nonce to make it harder to use a known-plaintext attack on the cipher. So, it turns out that libsodium already takes care of that for you.
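
    To illustrate that point, here is a tiny sketch assuming PyNaCl (the Python libsodium bindings); it is not the Assimilation code. Because a fresh random nonce is used for every message, encrypting the same predictable command twice produces two different ciphertexts.

```python
# Illustration only: libsodium-style Box encryption picks a fresh random
# nonce per message, so identical plaintexts yield different ciphertexts.
from nacl.public import PrivateKey, Box

cma_key = PrivateKey.generate()
probe_key = PrivateKey.generate()
box = Box(probe_key, cma_key.public_key)

command = b"MONITOR nginx"      # hypothetical, highly predictable plaintext
c1 = box.encrypt(command)       # a random nonce is generated internally
c2 = box.encrypt(command)       # a different random nonce
assert c1 != c2                 # same plaintext, different ciphertexts
```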

  3. I got this comment by email from someone involved with security – so I’ll share it with you. My replies to his comments are in [square brackets].

    Regarding the statement that encrypted data doesn’t compress as well as encrypting compressed data: I would argue that properly encrypted data should not compress at all (otherwise, it comments poorly on its entropy), and may actually increase in size due to the encryption overhead. I’d strengthen the statement. [Thanks! I strengthened it]

    Typo: “DTLs”. [Fixed]

    Bias: I generally favor the higher degree of security afforded by the 100K+1 keys choice, and would argue that effort to make that work effectively would be well spent. Modularity is important as always, and is likely to go far in helping that effort. [I certainly understand that it’s better. The difficulty isn’t the computer science of it, it’s the “making it no harder to use than 1-key” that’s hard. It’s really more of a resource constraint in the short term]. The converse is to build in mechanisms to at least detect, and then deal with compromised machines – that sounds more difficult to do properly. Key revocation sounds tricky or at least a bit unsure. I need to put more thought into this part.