Intro

During a recent project I was tasked with an interesting challenge: a fully automated deployment pipeline, initially based on SUSE Manager [3] 2.x, was to be migrated to Uyuni [4]. In the course of the project we also migrated a good portion of the custom automation code we had developed to a Salt-based approach with states, reactors and beacons. This blog post describes how we use Salt to create virtual machines on automatically deployed hypervisors.

Scenario

The customer is a multi-national company with retail stores in many cities. For every store, a server with a custom configuration needs to be deployed and shipped out. To simplify replacements and hardware supportability, the ‘server’ itself is not installed on the bare metal but runs as a virtual machine on a Xen-based [5] hypervisor. This does add overhead, because every server needs to be installed ‘twice’: first we install the Xen hypervisor (based on openSUSE Leap [2]), and directly after that we create and install the virtual machine (also based on openSUSE Leap) which will run the workload specified by the customer.

The nitty-gritty details

The system we developed features a few core components. One is a central API bridge, which connects the different aspects of the system, for example infrastructure services like DNS, Salt, the customer's internal ERP/CRM, as well as Uyuni itself. Another component is a custom web interface through which a user may initiate or cancel a deployment. After the user starts a new deployment, our API bridge talks to the Uyuni/cobbler [13] API, which then creates a custom PXE boot configuration for the specified host. The target system then initiates the automated installation (based on AutoYaST [10]). During and after the installation the system communicates with our API bridge to report its current status, and the bridge coordinates and initiates the following steps. After the hypervisor reaches its ‘highstate’ (Salt-speak for: the system has enforced the desired configuration), we start the deployment of the virtual machine, which will then function as the central server for the given retail store. Before the migration project we would create the virtual machine through the integrated SUSE Manager XMLRPC API [11] and initiate another AutoYaST-based installation. This worked fine, but there was certainly room for improvement, for example in terms of installation time.
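For context, the pre-migration approach looked roughly like the sketch below. It uses the Spacewalk-style XMLRPC API that SUSE Manager exposes; the URL, credentials, host ID, profile name and sizing are made up, and the exact provisionVirtualGuest signature should be checked against the API documentation of your version:

# rough sketch of the old approach via the SUSE Manager XMLRPC API
# (all values are illustrative; verify the provisionVirtualGuest
# signature against the API docs of your SUSE Manager version)
from xmlrpc.client import ServerProxy

client = ServerProxy("https://susemanager.example.com/rpc/api")
key = client.auth.login("admin", "secret")

host_id = 1000010000  # system ID of the hypervisor in SUSE Manager
client.system.provisionVirtualGuest(
    key, host_id, "vm1.example.com", "leap_guest_autoyast_profile",
    8192,  # memory in MB
    4,     # virtual CPUs
    32     # disk size in GB
)
client.auth.logout(key)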

Salt to the rescue

With Salt we have a powerful and flexible automation platform to achieve our goals and to simplify the infrastructure of the deployment process. Instead of relying on cobbler or the SUSE Manager APIs, we use Salt reactors [7] and salt-virt [8] to accelerate the preparation of the central store servers. After the hypervisor has finished its installation and has been successfully configured for Xen, we send a notification to the Salt message bus. This triggers our API bridge, which then uses salt-virt to create the virtual machine. The following snippets show an over-simplified version of the code in use, which runs inside the API bridge on the Salt master.
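Before that, here is a minimal sketch of how the freshly configured hypervisor could announce itself on the Salt event bus; the event tag and payload are made-up examples:

# runs on the hypervisor minion once it is fully configured for Xen
import salt.client

caller = salt.client.Caller()

# fire an event onto the Salt event bus; a reactor on the master maps
# this tag to our API bridge, which then creates the virtual machine
# (tag and data are illustrative only)
caller.cmd(
    "event.send",
    "b1/hypervisor/ready",
    {"hypervisor": "myhypervisor.example.com"},
)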

First, we import the necessary Python modules:

import salt.client
import salt.wheel
import salt.config
import logging
import time

Then we initialize a LocalClient in order to talk to Salt and execute commands, and we configure the logging module:

# You should cancel the process if this fails
cl = salt.client.LocalClient()

# simple logging bootstrap for this example
LOGLEVEL_NAME = "DEBUG"
LOGLEVEL = getattr(logging, LOGLEVEL_NAME, None)
if not isinstance(LOGLEVEL, int):
    raise ValueError("Invalid loglevel: %s (try: INFO or DEBUG)" % LOGLEVEL_NAME)

log_format = "%(asctime)s [%(levelname)-8s] %(message)s"
LOG = logging.getLogger(__name__)
LOG.setLevel(LOGLEVEL)

# create a console log handler and configure it
ch = logging.StreamHandler()
ch.setLevel(LOGLEVEL)
formatter = logging.Formatter(log_format)
ch.setFormatter(formatter)
LOG.addHandler(ch)

Now we define how the virtual machine should be configured. In the production code we also handle custom pillar data for the new minion, which is later used to help configure the customer workload. Additionally, we create an autosign entry for the new virtual machine in order to automate the whole Salt bootstrap procedure (a sketch of that follows after the configuration below).

hypervisor = "myhypervisor.example.com"
vm_name = "vm1.example.com"

# define a disk for our VM
disk_definition = [
    {
        "image": "http://internal_obs.example.com/vm_image.raw",
        "name": "system",
        "pool": "default",
        "size": 32 * 1024 # Disk size in MiB
    }
]

# configure VNC, for remote console view
# in addition to serial console
graphics_definition = {
    "type": "vnc",
    "listen": {
        "type": "address",
        "address": "localhost"
    }
}

# Complete configuration
init_kwargs = {
    'serial_type': 'pty',
    'console': True,
    'disks': disk_definition,
    'images': "/var/my_virtualmachines/",
    'connection': 'xen:///',
    'seed': True,
    'graphics': graphics_definition,
    'start': False,
}
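The autosign entry mentioned earlier boils down to putting the new minion ID into a file the master consults during key acceptance. A minimal sketch, assuming the master sets autosign_grains_dir to /etc/salt/autosign_grains (the matching minion-side setting shows up later in the image configuration):

# allow the new VM to be auto-accepted based on its 'id' grain
# (assumes autosign_grains_dir: /etc/salt/autosign_grains in the
# master configuration; adjust the path to your setup)
autosign_file = "/etc/salt/autosign_grains/id"
with open(autosign_file, "a") as f:
    f.write(vm_name + "\n")
LOG.debug("Added autosign entry for %s", vm_name)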

This section purges pre-existing minion keys. The salt.wheel [9] module is useful for all aspects of key management. First, we load the current configuration of the salt-master, which is then passed to the instance of salt.wheel.WheelClient. With that we are able to list and manage minion keys on our master.

reset_salt_key = False
try:
    # Try to purge old minion key if reset_salt_key is set to True
    # A failure here should not jeopardize the whole process. We can
    # always manually fix key issues later on
    # more info on WheelClient:
    # https://docs.saltstack.com/en/latest/ref/wheel/all/salt.wheel.key.html
    if reset_salt_key:
        LOG.debug("Removing old minion key for %s", vm_name)
        master_opts = salt.config.master_config('/etc/salt/master')
        wheel = salt.wheel.WheelClient(master_opts)
        ret = wheel.cmd('key.delete', [vm_name,])
        LOG.info("Removed old minion key for %s", vm_name)
    else:
        LOG.info("Not removing old minion key for %s", vm_name)
except Exception as e:
    LOG.error(
        "Failed to remove old minion key for %s, continuing: %s", vm_name, e
    )
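The same WheelClient can also be used to inspect the key state before deciding whether to purge anything, for example (a small sketch, not part of the snippet above):

# small sketch: check whether an accepted key for the VM already exists
# key.list_all returns accepted, pending, rejected and denied keys
master_opts = salt.config.master_config('/etc/salt/master')
wheel = salt.wheel.WheelClient(master_opts)
keys = wheel.cmd('key.list_all')
if vm_name in keys.get('minions', []):
    LOG.debug("An accepted key for %s already exists", vm_name)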

And now on to the creation of the virtual machine itself. The virt.init command takes the complete definition of our virtual machine and uses libvirt to create it. Because of certain race conditions we first define the virtual machine and then try several times to start it (virt.start):

# Using start=False and a second command to power up the VM
# because start=True failed reproducibly when called via the
# command line. There is a nasty bug hidden somewhere, but this
# workaround simplifies this project for now.
# more info at
# https://docs.saltstack.com/en/latest/topics/virt/index.html

GUEST_VCPUS = 4      # This would be calculated based on the hypervisor hardware
GUEST_MEMORY = 8192  # This as well

ret = cl.cmd(
    hypervisor,
    'virt.init',
    [ vm_name, GUEST_VCPUS, GUEST_MEMORY ],
    kwarg=init_kwargs,
    full_return=True
)
LOG.debug('virt.init/return value from salt: %s', ret)
if ret[hypervisor]['retcode'] != 0:
    raise RuntimeError(
        "Failed to create virtual machine"
    )

max_tries = 2
for attempt in range(1, max_tries + 1):
    ret = cl.cmd(
        hypervisor,
        'virt.start',
        [vm_name,],
        full_return=True
    )
    LOG.debug('virt.start/try %i, return value from salt: %s', attempt, ret)
    if ret[hypervisor]['retcode'] == 0:
        break
    if attempt >= max_tries:
        raise RuntimeError(
            "Failed to start virtual machine"
        )
    time.sleep(5)

LOG.info('Created VM %s on hypervisor %s', vm_name, hypervisor)
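As an additional sanity check one can ask the hypervisor whether the domain is actually running via the virt.vm_state function. A small sketch, not part of the production code:

# optional sanity check: virt.vm_state returns something like
# {'vm1.example.com': 'running'} for the queried domain
ret = cl.cmd(hypervisor, 'virt.vm_state', [vm_name], full_return=True)
LOG.debug('virt.vm_state/return value from salt: %s', ret)
if ret[hypervisor]['ret'].get(vm_name) != 'running':
    LOG.warning('VM %s is defined but not running yet', vm_name)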

The base of the virtual machine is rather static, as we do not have to differentiate between hardware platforms. Because of that we chose an image-based deployment approach, which speeds up the deployment tremendously.

We based our image on the default JeOS (Just enough Operating System) Kiwi [6] definition from the openSUSE project (available here [12]). We adjust the package selection (to include the salt-minion package, for example) and configure Salt:

# This is part of config.sh

## B1 SALT BOOTSTRAP CONFIG
cat <<EOF >/etc/salt/minion.d/uyuni.conf
server_id_use_crc: adler32
enable_legacy_startup_events: False
enable_fqdns_grains: False
autosign_grains:
  - id
grains:
  susemanager:
    activation_key: 1-opensuse_leap_15.1_xenguest
EOF

# Apply a special config to the Salt minion which helps with bootstrapping
# It should be deleted after the bootstrap is done!
cat <<EOF >/etc/salt/minion.d/b1-bootstrap.conf
startup_states: sls
sls_list:
  - provisioning_helper.firstboot
EOF

## END B1 SALT BOOTSTRAP CONFIG

The provisioning_helper.firstboot state handles the initial bootstrap of the machine. This part is crucial because we need to do the following:

  1. Reboot the system
  2. Apply the ‘Highstate’ of the virtual machine
  3. Inform the deployment system about the status
  4. Reboot the system and verify its startup
  5. Remove bootstrap data

Only if we were able to enforce the desired state (as defined in Salt) and the VM rebooted successfully do we complete the deployment process and inform the user.
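Step 3 could be as simple as a small HTTP callback from the VM to our API bridge. A minimal sketch, with a made-up URL and payload (the real interface of our API bridge is not part of this post):

# rough sketch of step 3: report the deployment status to the API bridge
# (URL and payload are illustrative only)
import json
import socket
import urllib.request

def report_status(status):
    payload = json.dumps({"host": socket.getfqdn(), "status": status}).encode()
    req = urllib.request.Request(
        "https://apibridge.example.com/api/v1/deployment/status",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status

report_status("highstate_applied")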

At a high level, the complete process works something like this:

[Diagram: high-level overview of the deployment process. Own illustration, CC-BY-SA]

Conclusion

Salt is a very flexible platform that can be used to automate a multitude of processes. Getting started with some custom Python code is easy, and the states and reactor system is robust. Salt's message bus in particular makes it easy to react to things happening in your infrastructure. This was my first real deep dive into Salt (beyond the usual configuration management with states and reactors), and I can say that I love it, despite some quirks that surface here and there. I guess that's just the way it is with every non-trivial software project.

Mattias Giese
Mattias Giese works as a Linux consultant and trainer at B1 Systems. When he isn't involved in systems management and automation projects, gluing together a plethora of different tools to create efficient workflows, he likes to mess around with chic mechanical keyboards and tweak the configuration of his tools in order to achieve zen laziness.
