Crypto background for the Assimilation project

Since its inception, the open source Assimilation project has been concerned with security, and paranoid at every opportunity. Like a lot of software, it has serious security concerns simply because of what it’s designed to do. On the one hand, our nanoprobes run on every server in the enterprise and exercise root privileges – creating a potentially dangerous attack surface. On the other hand, we incrementally create a high-value database which has fine-grained and up-to-date information about everything in the environment – software versions, ports, services, IP and MAC addresses, known security vulnerabilities – a veritable treasure map for an attacker. This article details why cryptography is essential for communication in this environment, and some unique aspects of the problem we’re solving that affect how we use it. It is our hope that our readers (this means you!) will give us a thorough review of how we’re using cryptography in our software – in this article and the next.

Although we’ve known about these risks from the start and designed security into the architecture accordingly, only recently have we had the opportunity to begin filling in those portions of the architecture with real code.

The primary reason an attacker would want the discovery data that we collect is that it would let them choose attacks on other systems that are likely to succeed the first time. In addition, they can tell which machines in the infrastructure are likely to hold the data they are interested in. They know which machines are database servers (and what kind), which are web servers, which are running custom in-house software, which are development machines, which are running security scans and so on.

An attacker able to subvert the command protocol from the CMA to the nanoprobes could potentially obtain root permissions on any machine they wanted to attack – potentially including the CMA itself, which runs a nanoprobe of its own.

A few characteristics of our architecture are worth understanding, because they shape how we use cryptography.

  • There are two roles in our architecture
    • The CMA – central system which gives orders to nanoprobes throughout the enterprise.  Does not need root privileges.
    • Nanoprobes – agents running on ideally every system in the enterprise.  Many of the tasks they perform require root privileges.  Nanoprobes only take actions on request from the CMA.
  • There are potentially hundreds of thousands of nanoprobes managed by a single CMA instance.
  • Nanoprobes only communicate with the CMA when they have an exception to report.  It would be normal for a nanoprobe to not communicate with the CMA for weeks or months at a time.
  • All communication is via UDP with a reliable user-level protocol on top.
  • The CMA/nanoprobe communication is a Command, Control and Intelligence (C2I) protocol.  By design, information exchange is infrequent and low volume.
  • Our protocol avoids being noisy on the network.
  • Nanoprobes collect data which is used to create a detailed, comprehensive map of the enterprise.  The collection of all the data from all the nanoprobes is incredibly security sensitive, and the data from a single nanoprobe discovery result is potentially security sensitive.
  • Nanoprobe discovery data sometimes exceeds 100K bytes of uncompressed JSON.
  • One of the major functions of the CMA is to alert support staff of outages, problems and anomalies.
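
To make the size constraint in this list concrete: a single UDP datagram over IPv4 carries at most 65,507 bytes of payload, so a discovery result of more than 100K bytes cannot travel uncompressed in one packet. The following is a minimal Python sketch of compressing a JSON document and splitting it into datagram-sized chunks. It is not the project's actual wire format; the names `fragment` and `reassemble`, the 1,400-byte `MAX_CHUNK`, and the sample data are all assumptions made for illustration.

```python
import json
import zlib

# Hypothetical chunk size: small enough to fit a typical 1500-byte Ethernet MTU
# after IP and UDP headers. Not taken from the Assimilation source.
MAX_CHUNK = 1400


def fragment(discovery: dict, max_chunk: int = MAX_CHUNK) -> list[bytes]:
    """Serialize, compress, and split a discovery result into small chunks."""
    raw = json.dumps(discovery, sort_keys=True).encode("utf-8")
    packed = zlib.compress(raw)
    return [packed[i:i + max_chunk] for i in range(0, len(packed), max_chunk)]


def reassemble(chunks: list[bytes]) -> dict:
    """Inverse of fragment(); assumes all chunks are present and in order."""
    return json.loads(zlib.decompress(b"".join(chunks)).decode("utf-8"))


# Made-up sample data, sized to exceed 100K bytes of uncompressed JSON.
sample = {"packages": [{"name": f"pkg{i}", "version": f"1.{i}.0"} for i in range(4000)]}
chunks = fragment(sample)
assert reassemble(chunks) == sample
```

The sketch leaves out sequence numbers, acknowledgements and retransmission; delivering every chunk, in order, is the job of the reliable user-level protocol that sits on top of UDP.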

Here are a few things that come to mind from this list:

  • The existence of multiple roles corresponds nicely with public key cryptographic techniques.
  • Attackers would likely prefer to avoid things which might cause the CMA to report an anomaly.

It is my belief that attackers would likely have these priorities in attacking the Assimilation software.

  1. Subvert the nanoprobes over the wire – and bend them to their will.
  2. Obtain direct access to the data in the CMA database – or otherwise subvert the CMA directly.
  3. Obtain as much discovery data as possible over the wire.

This series of articles concentrates on the first and third items.

These facts lead us to two priorities, in order, for cryptographic communication:

  1. Authenticating “command” communication from the CMA to nanoprobes – to avoid takeover of your servers by way of our ubiquitous nanoprobes.
  2. Keeping discovery communication confidential (nanoprobe to CMA) – to maintain the confidentiality of the discovery data.  Although this data is normally sent infrequently over the wire, and much of it is not security-sensitive, some of it is quite sensitive.  In addition, when systems reboot, they send all their discovery data to the CMA – and the CMA itself hears data from all machines.
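
The first priority amounts to a simple rule: a nanoprobe should verify where a command came from before acting on it. The sketch below illustrates that rule in Python using an HMAC from the standard library. This is a symmetric stand-in chosen only to keep the example self-contained; it is not the scheme the project uses or proposes, and with the public key techniques mentioned earlier the shared key would give way to a signature check. The names `seal_command` and `accept_command`, the key, and the command fields are hypothetical.

```python
import hashlib
import hmac
import json

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def seal_command(key: bytes, command: dict) -> bytes:
    """CMA side: serialize a command and prepend an authentication tag."""
    body = json.dumps(command, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def accept_command(key: bytes, packet: bytes):
    """Nanoprobe side: return the command only if its tag verifies, else None."""
    tag, body = packet[:TAG_LEN], packet[TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        return None
    return json.loads(body.decode("utf-8"))


key = b"demo-key-for-illustration-only"
packet = seal_command(key, {"action": "discover", "target": "ports"})
forged = packet[:-1] + bytes([packet[-1] ^ 0x01])  # flip one bit of the body

assert accept_command(key, packet) == {"action": "discover", "target": "ports"}
assert accept_command(key, forged) is None
```

Note that the check happens before the body is parsed, so unauthenticated bytes never reach the command handler. The sketch does nothing to stop an attacker from replaying a previously valid packet; that would need something extra, such as a sequence number covered by the tag.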

This is a basic overview of a few facts concerning our problem.  In the followup article, I go over how we are planning on approaching it – so we can get a detailed review of our overall approach. If you know any cryptography experts, then by all means invite them to come join the fun!

If this description has left questions in your mind – then let’s get going on a great conversation using the comment box below – and please read part four (and ignore the now-obsolete parts two and three).  As always, any future updates will be announced on our blog and on @OSSAlanR. If you have an aversion to comment forms – email your comments to alanr@unix.sh and I’ll incorporate them – I even have a GPG key [717A640E]. Better yet, join the Assimilation development mailing list.

Please note: I reserve the right to delete comments that are offensive or off-topic.
