Extending Ansible – plugins, part 2

Under the hood of the d2c.io service we use Ansible a lot: from creating and provisioning cloud VMs to orchestrating Docker containers and user apps.

In the previous article we gave an overview of the plugin types supported by Ansible and wrote several plugins of our own: test, filter, action and callback. In this article we will dive deeper…

Константин Суворов
Ansible ninja

Callback with «mutation»

The most common use cases for callbacks are logging and notification – we discussed such a plugin in the previous article. But callback plugins can also influence the playbook workflow instead of just passively monitoring events.

At D2C we use task tags extensively so that we can execute only specific tasks within roles. For example, the build tag runs all tasks needed to build a service from scratch, and the update-config tag runs only configuration-related tasks. Out of the box, Ansible can apply only a single set of tags to the whole playbook, which is inconvenient for some use cases.

Let’s look at the process of setting up Master-Slave replication for MySQL:

  • update the configuration of the standalone server so that it becomes the master
  • make a database dump
  • provision a second server and set it up as a replication slave
  • restore the database dump on the slave
  • clean up the dump and any remaining temporary data

Every task from this list has its own tags. We can make a single playbook with three plays: prepare master, create slave, clean up. But we can’t apply a different set of tags to each play, because we can provide only a single tags argument for a playbook! Let’s make our scenario possible with the following callback plugin:

from ansible.plugins.callback import CallbackBase
from ansible.parsing.yaml.objects import AnsibleUnicode
from ansible.compat.six import string_types

import json
import os

class CallbackModule(CallbackBase):

    CALLBACK_VERSION = 2.0
    CALLBACK_NAME = 'use_tags'

    def __init__(self):
        super(CallbackModule, self).__init__()
        self.tmp_context = None
        self.warn = not os.environ.get('ANSIBLE_D2C_NO_WARN')

    def v2_playbook_on_play_start(self, play):

        vm = play.get_variable_manager()

        extra_vars = vm.extra_vars
        enable_use_tags = False
        if 'enable_use_tags' in extra_vars:
            if extra_vars['enable_use_tags']:
                enable_use_tags = True

        play_vars = vm.get_vars(play._loader, play=play)

        if enable_use_tags:
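            # only_tags belongs to the PlayContext saved in set_play_context,
            # so changing it in place changes the tags Ansible uses for this play
            tags = self.tmp_context.only_tags
            tags.clear()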
            if 'use_tags' in play_vars:
                use_tags = play_vars['use_tags']
                if isinstance(use_tags, (string_types, AnsibleUnicode)):
                    use_tags = [t.strip() for t in use_tags.split(',')]
                if isinstance(use_tags, list):
                    for t in use_tags:
                        tags.add(t)
                else:
                    tags.add('all')
                    self._display.display(' [INFO]: "use_tags" variable is set, but unparsable (type "{}" is not a list or a string): {}'.format(type(use_tags),use_tags), color='cyan')
            else:
                self._display.display(' [INFO]: "use_tags" variable is not set, but "enable_use_tags" is set', color='cyan')
                tags.add('all')
            if self.warn:
                self._display.warning('Tags modified to: {}'.format(json.dumps(list(tags))))

    def set_play_context(self, play_context):
        self.tmp_context = play_context

As you can see, we override the v2_playbook_on_play_start method, which is called after play initialization and before task execution.

We use the extra variable enable_use_tags as a flag that tells the plugin whether to modify tags at runtime. If it is set, the tags for each play are taken from the play variable use_tags.
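To make the handling of use_tags easier to follow, here is a minimal standalone sketch of the same normalization the plugin performs. The function name normalize_use_tags is ours and is not part of Ansible, and plain str is assumed instead of string_types and AnsibleUnicode so that the snippet runs without Ansible installed.

```python
def normalize_use_tags(use_tags):
    """Return a set of tag names; fall back to {'all'} for unusable input.

    Hypothetical helper that mirrors the branching inside the callback.
    """
    if isinstance(use_tags, str):
        # "build, replication-init" -> ["build", "replication-init"]
        use_tags = [t.strip() for t in use_tags.split(',')]
    if isinstance(use_tags, list):
        return set(use_tags)
    # Neither a string nor a list: behave as if no tag filter was given
    return {'all'}


print(sorted(normalize_use_tags('build, replication-init')))
print(sorted(normalize_use_tags(['update-config'])))
print(sorted(normalize_use_tags(42)))
```

A comma-separated string and a YAML list end up as the same kind of set, and anything else degrades to the all tag, just as the plugin does after printing its info message.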

Tags and other runtime information are stored inside the PlayContext object, but it is not passed to v2_playbook_on_play_start as a parameter. To work around this, note that the Ansible queue manager checks whether a callback has a set_play_context method and, if it does, calls it with the PlayContext as a parameter.

Knowing that the PlayContext object is mutable and that Ansible always works on a single play at any given time, we can do the following:

  • define tmp_context helper inside our plugin class
  • save reference to current context inside set_play_context method using tmp_context
  • inside v2_playbook_on_play_start we can check enable_use_tags and use_tags and modify PlayContext (to be precise, we get the mutable set of tags via self.tmp_context.only_tags and modify it in place)
  • print modification warnings (to inform unaware users)
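The steps above rely on nothing more than a shared reference to a mutable object. The sketch below shows that pattern in isolation. FakePlayContext, TagRewriter and notify are hypothetical stand-ins written for this example, not Ansible classes, and the hasattr check only imitates what the text says the queue manager does.

```python
class FakePlayContext:
    """Hypothetical stand-in for PlayContext; it only holds the tag set."""
    def __init__(self, only_tags):
        self.only_tags = set(only_tags)


class TagRewriter:
    """Hypothetical callback-like object that keeps a reference to the context."""
    def __init__(self):
        self.tmp_context = None

    def set_play_context(self, play_context):
        # Store a reference, not a copy
        self.tmp_context = play_context

    def on_play_start(self, use_tags):
        tags = self.tmp_context.only_tags  # the very set the context owns
        tags.clear()
        tags.update(use_tags)


def notify(callbacks, play_context):
    """Imitates the caller: hand the context only to callbacks that ask for it."""
    for cb in callbacks:
        if hasattr(cb, 'set_play_context'):
            cb.set_play_context(play_context)


context = FakePlayContext(['all'])
rewriter = TagRewriter()
notify([rewriter, object()], context)
rewriter.on_play_start(['build', 'replication-init'])
print(sorted(context.only_tags))
```

Because the set is modified in place, the owner of the context sees the new tags without the callback ever returning anything.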

With this plugin installed we can do the following:

ansible-playbook -e enable_use_tags=1 make_mysql_slave.yml

- hosts: master
  vars:
    use_tags: update-configs, replication-init, replication-sync
  roles:
    - mysql
- hosts: slave
  vars:
    use_tags: build, replication-init, replication-sync
  roles:
    - mysql
- hosts: all
  vars:
    use_tags: replication-sync-cleanup
  roles:
    - mysql

This way Ansible uses a different set of tags for each play, which lets us deliver complex orchestration scenarios as single playbooks!

Connection

These plugins are used to connect to different kinds of target hosts. As a very simplified overview, a connection plugin must provide methods to connect to and disconnect from a remote host, send a file, and execute a remote command. Out-of-the-box examples of connection plugins: local, ssh (the default), winrm, docker.
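To give a feeling for that contract, here is a toy sketch that "connects" to the local machine. ToyLocalConnection is a hypothetical class written for this example. It does not inherit from any Ansible base class, and its method names are only illustrative of the operations listed above.

```python
import shutil
import subprocess
import sys


class ToyLocalConnection:
    """Hypothetical stand-in showing the operations a connection plugin covers."""

    def __init__(self):
        self.connected = False

    def connect(self):
        # A real plugin would open an SSH session, a WinRM channel, etc.
        self.connected = True
        return self

    def exec_command(self, argv):
        # Run a command on the "remote" side and return (rc, stdout, stderr)
        proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        return proc.returncode, proc.stdout, proc.stderr

    def put_file(self, in_path, out_path):
        # Send a file to the target; for a local target this is a plain copy
        shutil.copyfile(in_path, out_path)

    def fetch_file(self, in_path, out_path):
        # Retrieve a file from the target
        shutil.copyfile(in_path, out_path)

    def close(self):
        self.connected = False


conn = ToyLocalConnection().connect()
rc, out, err = conn.exec_command([sys.executable, '-c', 'print("hello")'])
print(rc, out.decode().strip())
conn.close()
```

Everything Ansible does on a target host is built from these few operations, which is why swapping the connection plugin is enough to reach a completely different kind of system.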

If you have unique target systems (e.g. a proprietary virtualisation system) you will have to write a plugin from scratch. But if you only need to add some minor functionality to an existing plugin, you can subclass the out-of-the-box plugin and override the required methods.

Let’s take an SSH connection with port knocking enabled. It is an ordinary ssh session, except that you have to “knock” on a sequence of ports before attempting to connect to port 22 (otherwise the server will reject your connection).

Here is the ssh plugin modification (place it into ./connection_plugins/ssh_pkn.py):

from ansible.plugins.connection.ssh import Connection as ConnectionSSH
from ansible.errors import AnsibleError
from socket import create_connection
from time import sleep

try:
    from __main__ import display
except ImportError:
    from ansible.utils.display import Display
    display = Display()

class Connection(ConnectionSSH):

    def __init__(self, *args, **kwargs):

        super(Connection, self).__init__(*args, **kwargs)
        display.vvv("SSH_PKN (Port KNock) connection plugin is used for this host", host=self.host)

    def set_host_overrides(self, host, hostvars=None):

        if 'knock_ports' in hostvars:
            ports = hostvars['knock_ports']
            if not isinstance(ports, list):
                raise AnsibleError("knock_ports parameter for host '{}' must be list!".format(host))

            delay = 0.5
            if 'knock_delay' in hostvars:
                delay = hostvars['knock_delay']

            for p in ports:
                display.vvv("Knocking to port: {0}".format(p), host=self.host)
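                try:
                    # Only the connection attempt matters; close the socket if it succeeds
                    create_connection((self.host, p), 0.5).close()
                except Exception:
                    # A refused or timed-out knock is expected, so ignore it
                    pass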
                display.vvv("Waiting for {0} seconds after knock".format(delay), host=self.host)
                sleep(delay)

We use the set_host_overrides method, which allows a plugin to alter its behavior based on host/group variables. This method is called only for new connections, so with reusable ssh sessions we won’t “knock” too often.

Here’s an inventory example for this plugin:

[pkn]
myserver ansible_host=my.server.at.example.com
[pkn:vars]
ansible_connection=ssh_pkn
knock_ports=[8000,9000]
knock_delay=2

We tell Ansible to use the ssh_pkn connection plugin for every host in the pkn group. The set_host_overrides method then “finds” knock_ports inside hostvars and triggers a TCP connection attempt for each port in the knock_ports list, waiting knock_delay seconds after each one. We use a try/except block around the generic socket.create_connection call because raw sockets, which could send just SYN packets, would require root permissions for the ansible user, and that is unacceptable.
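The knocking itself does not depend on Ansible, so it can be tried on its own. Below is a minimal sketch of the same loop as a standalone function. The name knock, its return value and the 127.0.0.1 address in the usage line are our assumptions for illustration, and Python 3 is assumed, where connection errors and timeouts are subclasses of OSError.

```python
import socket
import time


def knock(host, ports, delay=0.5, timeout=0.5):
    """Attempt a TCP connection to each port in order.

    Returns a list of booleans telling whether each attempt connected.
    Hypothetical helper; the plugin above ignores the outcome entirely.
    """
    results = []
    for port in ports:
        try:
            sock = socket.create_connection((host, port), timeout)
        except OSError:
            # Refused or timed out: the knock was still sent
            results.append(False)
        else:
            sock.close()
            results.append(True)
        time.sleep(delay)
    return results


# Knock on two ports of the local machine without waiting between attempts
print(knock('127.0.0.1', [8000, 9000], delay=0))
```

For a real port-knocking daemon the result list is mostly False, and that is fine: the daemon reacts to the incoming SYN packets, not to a completed handshake.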

Strategy

Strategy plugins define the task execution order for remote hosts and also do a lot of local under-the-hood work: dynamic facts, host states (healthy/failed/unreachable), callbacks. You can find an overview of the out-of-the-box strategy plugins in the previous article.

There’s a handful of strategy plugins on the Internet. For example, there is the rejected pull request 18460 with a strategy plugin that allows injecting tasks (the idea was to allow flexible customization of a base playbook).

Let’s create a variation of the linear plugin (place it into ./strategy_plugins/step_critical.py):

from ansible.plugins.strategy.linear import StrategyModule as LinearStrategyModule
import os

try:
    from __main__ import display
except ImportError:
    from ansible.utils.display import Display
    display = Display()

class StrategyModule(LinearStrategyModule):

    def __init__(self, tqm):
        super(StrategyModule, self).__init__(tqm)
        display.vv('Safenet strategy: will give a prompt at critical tasks!')
        force_step = os.environ.get('ANSIBLE_FORCE_STEP', None)
        if force_step and force_step.lower() in ['1','y','yes','true','on']:
            display.vv('Safenet: "step" option is forced via environment!')
            self._step = True

    def _take_step(self, task, host=None):

        v = task.get_vars()
        ret = True
        if 'is_critical' in v:
            if v['is_critical']:
                display.vv('Safenet: critical task detected!')
                return super(StrategyModule, self)._take_step(task, host)
        return ret

This plugin changes Ansible’s behaviour with the --step option as follows: Ansible asks whether to proceed only for tasks that have the is_critical variable set to True, rather than for every task as it does with the default linear strategy.
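The decision the overridden method makes can be reduced to a few lines. The sketch below is a standalone illustration. take_step and fake_prompt are hypothetical names, and the real prompt is replaced by a function that always answers "no".

```python
def take_step(task_vars, ask):
    """Return True to run the task; only critical tasks trigger the prompt."""
    if task_vars.get('is_critical'):
        return ask()
    # Non-critical tasks proceed without any question
    return True


prompts = []


def fake_prompt():
    # Stand-in for the interactive question; the operator answers "no"
    prompts.append('asked')
    return False


print(take_step({}, fake_prompt))                     # non-critical task
print(take_step({'is_critical': True}, fake_prompt))  # critical task
print(len(prompts))                                   # how many times we asked
```

Only the critical task reaches the prompt, so a long playbook in step mode stops just at the points where a mistake would hurt.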

Additionally, we check the ANSIBLE_FORCE_STEP environment variable to enable “step mode” (as an alternative to the --step option).
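The environment check is easy to get subtly wrong, because values such as 0 or no are non-empty strings and therefore truthy in Python. Here is a small sketch of the parsing used in the plugin. The helper name env_flag and its environ parameter are our additions for easier testing.

```python
import os

# The same spellings the strategy plugin accepts
TRUTHY = ('1', 'y', 'yes', 'true', 'on')


def env_flag(name, environ=None):
    """Return True only if the variable holds one of the accepted spellings."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    return bool(value) and value.lower() in TRUTHY


print(env_flag('ANSIBLE_FORCE_STEP', {'ANSIBLE_FORCE_STEP': 'Yes'}))
print(env_flag('ANSIBLE_FORCE_STEP', {'ANSIBLE_FORCE_STEP': 'no'}))
print(env_flag('ANSIBLE_FORCE_STEP', {}))
```

Passing a plain dictionary instead of os.environ keeps the check independent of the shell the example runs in.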

You can test this strategy with the following playbook:

---
- hosts: localhost
  strategy: step_critical
  gather_facts: no
  tasks:
    - name: Ensure user exists
      debug:
        msg: user_module
    - name: Drop database
      debug:
        msg: db_module
      vars:
        is_critical: yes
    - name: Ensure permissions
      debug:
        msg: permission_module

Let’s make a checkpoint here. In our two articles about extending Ansible we’ve covered all plugin types available in Ansible 2.3, and you now have examples for most of them to play with.

Send us your questions or ideas for follow-up articles about plugins.

In the meantime we are starting to prepare an article about extending Ansible with custom modules. Stay tuned…