Running SaltStack in multi-master mode


Recently I ran into some problems when trying to use multiple salt masters in combination with the salt mine, and thought I'd share my experiences.

Background

Like Puppet

The salt mine can be compared to PuppetDB; it's a place to store things like (custom) facts for use on other nodes/instances. The classic use-case for this is a monitoring setup like Nagios that needs to be configured for each additionally deployed service/instance with that service's/instance's particulars. So let's say you deploy a new MySQL instance and automatically slave it to an existing cluster: you want the IP address, database name and maybe some other details to be configured on the monitoring server. Puppet does this with something called "exported resources", which basically sends data to a central location (the PuppetDB) from which it can then be collected by other machines when Puppet runs there.
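On the salt side, the collecting half of that pattern is a mine.get call. As a minimal sketch using salt's Python client (the minion names and the network.ip_addrs mine function are just examples, assuming it has been set up in mine_functions):

    import salt.client

    # Runs on the master (needs access to the master config/keys).
    local = salt.client.LocalClient()

    # Ask the monitoring minion to collect the mined IP addresses of all
    # database minions, e.g. to template a Nagios config from them.
    result = local.cmd('monitor01', 'mine.get', ['db*', 'network.ip_addrs'])
    print(result)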

But different

Where Puppet uses a database (PostgreSQL by default) for storage, Salt by default just stores serialized (using msgpack) data in the local filesystem on the master (and, until recently, this was the only option).
Although this is a much simpler setup requiring fewer moving parts, the obvious downside is scalability; the local filesystem is just that: local. So when you want to use the salt mine in conjunction with a multi-master setup, you are going to run into trouble.
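To see what that local storage looks like on a single master: each minion's mine data lives in one msgpack blob, roughly like this (the path and minion id are examples; check /var/cache/salt/master on your own master):

    import msgpack

    # Default location of the mine cache on the master; the minion id
    # 'db01.example.com' is just an example.
    path = '/var/cache/salt/master/minions/db01.example.com/mine.p'

    with open(path, 'rb') as fh:
        mine_data = msgpack.unpackb(fh.read(), raw=False)

    # One dict per minion, keyed by mine function name.
    print(mine_data.get('network.ip_addrs'))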

The Problem

Decoupling the cache

If you want to use the mine functionality with a multi-master setup, you have to decouple the mine cache/data from a specific salt master and make it available to all of them. The good news is that the developers recently (as of 11 Jan 2017) added the ability to plug in other storage backends, like consul or redis. The not-so-good news is that when I switched to consul, it became apparent that salt doesn't handle reads from this cache efficiently.
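For reference: switching the master over boils down to setting "cache: consul" (plus the consul.host/consul.port connection options) in the master config. Once it's running you can peek at what salt actually writes with a few lines of python-consul; the "salt" key prefix below is an assumption, so check your own deployment:

    import consul

    c = consul.Consul(host='127.0.0.1', port=8500)

    # One key per minion, each holding the full serialized blob.
    index, keys = c.kv.get('salt', recurse=True, keys=True)
    for key in keys or []:
        print(key)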

What's wrong

As I stated earlier, salt uses msgpack to store its data. The addition of a pluggable storage backend didn't change that; it just made it possible to use something like consul instead of the filesystem. In practice that means that in consul (which in this case is basically just a key-value store) a key is created per minion, with the serialized data (the facts, if you will) as the value. This in itself is not really the best way of doing it in my opinion; it would be much cleaner to not serialize the data at all and just create a key-value pair per fact. The way it is implemented now, when salt needs to retrieve a fact like "hostname", it fetches the whole serialized object from consul, deserializes it, takes the "hostname" key from it and throws away the rest.
This problem is exacerbated by the fact that there is no in-memory caching by the salt master whatsoever. This means that every time you do a lookup in your salt or pillar code, you hit this backend, which in turn fetches all the data and immediately discards most of it. Now take into account that the serialized object containing all the facts is ~20kB, and you may begin to see the problem.
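In pseudo-ish Python, the read path amounts to the following pattern (a sketch, not salt's actual code; the key layout is a guess):

    import consul
    import msgpack

    c = consul.Consul()

    def get_mine_value(minion_id, fun):
        # Fetch ONE fact -- but pay for the whole ~20kB blob every time.
        index, res = c.kv.get('salt/minions/{0}/mine'.format(minion_id))
        if res is None:
            return None
        data = msgpack.unpackb(res['Value'], raw=False)  # deserialize everything
        return data.get(fun)                             # keep one key, drop the rest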

That's a lot of data

To illustrate the problem, let's assume that, like us, you do ~25 mine lookups in your highstate per minion (which, if you have a large number of formulae, is not that extreme imo) and you have 200 minions (which is a pretty small setup; keep in mind saltstack prides itself on its scalability into the thousands).
When we do "salt '*' state.apply", each minion's run does 5,000 fetches from consul (25 lookups, each iterating over all 200 minions' ~20kB blobs), totalling ~100MB per minion, because every minion iterates over all the data. Across 200 minions that makes for a total of ~20GB being fetched from the cache. Ouch :(
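Back-of-the-envelope, with the numbers above:

    lookups_per_minion = 25   # mine lookups in one highstate
    minions = 200
    blob_kb = 20              # size of one minion's serialized mine data

    consul_hits_per_minion = lookups_per_minion * minions        # 5000
    mb_per_minion = consul_hits_per_minion * blob_kb / 1024.0    # ~98 MB
    total_gb = mb_per_minion * minions / 1024.0                  # ~19 GB
    print(consul_hits_per_minion, mb_per_minion, total_gb)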

The solution

As of now I see two ways forward (which are not mutually exclusive), both described in a GitHub issue:

1) Make it so that, in the context of one run, we only fetch the data once and keep it in memory until the run is finished (see the sketch below).
2) Do not store a serialized dictionary as a single value in a key-value store, but store a key-value pair per fact instead.
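Option 1) could be as simple as memoizing each minion's blob for the duration of a run. A minimal sketch of that idea (again using the assumed consul key layout from before, not salt's actual internals):

    import consul
    import msgpack

    class RunScopedMineCache(object):
        """Fetch each minion's mine blob at most once per run."""

        def __init__(self):
            self._consul = consul.Consul()
            self._blobs = {}  # minion_id -> deserialized mine data

        def get(self, minion_id, fun):
            if minion_id not in self._blobs:
                index, res = self._consul.kv.get(
                    'salt/minions/{0}/mine'.format(minion_id))
                raw = res['Value'] if res else None
                self._blobs[minion_id] = (
                    msgpack.unpackb(raw, raw=False) if raw else {})
            return self._blobs[minion_id].get(fun)

One instance of this per run would cut the 5,000 consul hits per minion down to 200, since each minion's blob is only fetched once.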

Until this is fixed, I see no realistic way of using the mine with a multi-master setup.

Update

This issue seems to be addressed in https://github.com/saltstack/salt/issues/40429. I haven't tested it yet, but it's nice to see it's being picked up.
