Delegation, Rolling Updates, and Local Actions
==============================================

.. contents:: Topics

Being designed for multi-tier deployments since the beginning, Ansible is great at doing things on one host on behalf of another, or doing local steps with reference to some remote hosts.

This is particularly useful when setting up continuous deployment infrastructure or performing zero-downtime rolling updates, where you might be talking with load balancers or monitoring systems.

Additional features allow for tuning the order in which things complete, and assigning a batch window size for how many machines to process at once during a rolling update.

This section covers all of these features. For examples of these items in use, `please see the ansible-examples repository <https://github.com/ansible/ansible-examples/>`_. There are quite a few examples of zero-downtime update procedures for different kinds of applications.

You should also consult the :doc:`modules` section; various modules like 'ec2_elb', 'nagios', 'bigip_pool', and 'netscaler' dovetail neatly with the concepts mentioned here.

You'll also want to read up on :doc:`playbooks_reuse_roles`, as the 'pre_tasks' and 'post_tasks' sections of a play are the places where you would typically call these modules.

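For instance, a rolling update play can drop each batch of hosts out of a pool in 'pre_tasks', apply a role, and add them back in 'post_tasks'. This is only a minimal sketch: the 'webserver' role name is a placeholder, and the pool scripts are the same hypothetical helpers used in the delegation example later in this document::

    ---
    - hosts: webservers
      serial: 5

      pre_tasks:
        - name: take the host out of the load balancer pool
          # illustrative helper script, as in the delegation example below
          command: /usr/bin/take_out_of_pool {{ inventory_hostname }}
          delegate_to: 127.0.0.1

      roles:
        # placeholder role that performs the actual update
        - webserver

      post_tasks:
        - name: add the host back to the load balancer pool
          command: /usr/bin/add_back_to_pool {{ inventory_hostname }}
          delegate_to: 127.0.0.1
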
.. _rolling_update_batch_size:

Rolling Update Batch Size
`````````````````````````

.. versionadded:: 0.7

By default, Ansible will try to manage all of the machines referenced in a play in parallel. For a rolling update
use case, you can define how many hosts Ansible should manage at a single time by using the 'serial' keyword::

    - name: test play
      hosts: webservers
      serial: 3

In the above example, if we had 100 hosts, 3 hosts in the 'webservers' group would
complete the play entirely before moving on to the next 3 hosts.

In Ansible 1.8 and later, the 'serial' keyword can also be specified as a percentage, which is applied to the total number of hosts in a
play in order to determine the number of hosts per pass::

    - name: test play
      hosts: webservers
      serial: "30%"

If the number of hosts does not divide equally into the number of passes, the final pass will contain the remainder. For example, with 100 hosts and a 'serial' value of "30%", the passes would contain 30, 30, 30, and finally 10 hosts.

As of Ansible 2.2, the batch sizes can be specified as a list, as follows::

    - name: test play
      hosts: webservers
      serial:
        - 1
        - 5
        - 10

In the above example, the first batch would contain a single host, the next would contain 5 hosts, and (if there are any hosts left),
every following batch would contain 10 hosts until all available hosts are used.

It is also possible to list multiple batch sizes as percentages::

    - name: test play
      hosts: webservers
      serial:
        - "10%"
        - "20%"
        - "100%"

You can also mix and match the values::

    - name: test play
      hosts: webservers
      serial:
        - 1
        - 5
        - "20%"

.. note::
     No matter how small the percentage, the number of hosts per pass will always be 1 or greater.

.. _maximum_failure_percentage:

Maximum Failure Percentage
``````````````````````````

.. versionadded:: 1.3

By default, Ansible will continue executing actions as long as there are hosts in the group that have not yet failed.

In some situations, such as with the rolling updates described above, it may be desirable to abort the play when a
certain threshold of failures has been reached. To achieve this, as of version 1.3 you can set a maximum failure
percentage on a play as follows::

    - hosts: webservers
      max_fail_percentage: 30
      serial: 10

In the above example, if more than 3 of the 10 servers in the group were to fail, the rest of the play would be aborted.

.. note::
     The percentage set must be exceeded, not equaled. For example, if serial were set to 4 and you wanted the task to abort
     when 2 of the systems failed, the percentage should be set at 49 rather than 50.


.. _delegation:

Delegation
``````````

.. versionadded:: 0.7

This isn't actually rolling update specific but comes up frequently in those cases.

If you want to perform a task on one host with reference to other hosts, use the 'delegate_to' keyword on a task.
This is ideal for placing nodes in a load balanced pool, or removing them. It is also very useful for controlling outage windows.
Be aware that it does not make sense to delegate all tasks; 'debug', 'add_host', 'include', and similar actions always execute on the controller.
Using this with the 'serial' keyword to control the number of hosts executing at one time is also a good idea::

    ---

    - hosts: webservers
      serial: 5

      tasks:

        - name: take out of load balancer pool
          command: /usr/bin/take_out_of_pool {{ inventory_hostname }}
          delegate_to: 127.0.0.1

        - name: actual steps would go here
          yum: name=acme-web-stack state=latest

        - name: add back to load balancer pool
          command: /usr/bin/add_back_to_pool {{ inventory_hostname }}
          delegate_to: 127.0.0.1

These commands will run on 127.0.0.1, which is the machine running Ansible. There is also a shorthand syntax that you can use on a per-task basis: 'local_action'. Here is the same playbook as above, but using the shorthand syntax for delegating to 127.0.0.1::

    ---

    # ...

      tasks:

        - name: take out of load balancer pool
          local_action: command /usr/bin/take_out_of_pool {{ inventory_hostname }}

        # ...

        - name: add back to load balancer pool
          local_action: command /usr/bin/add_back_to_pool {{ inventory_hostname }}

A common pattern is to use a local action to call 'rsync' to recursively copy files to the managed servers.
Here is an example::

    ---
    # ...

      tasks:

        - name: recursively copy files from management server to target
          local_action: command rsync -a /path/to/files {{ inventory_hostname }}:/path/to/target/

Note that you must have passphrase-less SSH keys or an ssh-agent configured for this to work; otherwise, rsync
will need to ask for a passphrase.

The `ansible_host` variable (`ansible_ssh_host` in 1.x or specific to ssh/paramiko plugins) reflects the host a task is delegated to.
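
For example, you can print this variable from within a delegated task to confirm which machine the task would connect to. This is only a minimal sketch; 'lb.example.com' is just a placeholder delegation target::

    - name: show which host a delegated task targets
      # prints the connection host swapped in by delegation
      debug: msg="delegated to {{ ansible_host }}"
      delegate_to: lb.example.com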

.. _delegate_facts:

Delegated facts
```````````````

.. versionadded:: 2.0

By default, any facts gathered by a delegated task are assigned to the `inventory_hostname` (the current host) instead of the host which actually produced the facts (the delegated-to host).
In 2.0, the directive `delegate_facts` may be set to `True` to assign the task's gathered facts to the delegated host instead of the current one::

    - hosts: app_servers
      tasks:
        - name: gather facts from db servers
          setup:
          delegate_to: "{{item}}"
          delegate_facts: True
          with_items: "{{groups['dbservers']}}"

The above will gather facts for the machines in the dbservers group and assign the facts to those machines and not to app_servers.
This way you can lookup `hostvars['dbhost1']['ansible_default_ipv4']['address']` even though dbservers were not part of the play, or were left out by using `--limit`.
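
For instance, a play targeting app_servers could then use one of those delegated facts directly. In this minimal sketch, 'dbhost1' stands for any host in the dbservers group::

    - hosts: app_servers
      tasks:
        - name: use a fact gathered on a delegated host
          debug:
            msg: "db address is {{ hostvars['dbhost1']['ansible_default_ipv4']['address'] }}"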

.. _run_once:

Run Once
````````

.. versionadded:: 1.7

In some cases there may be a need to only run a task one time and only on one host. This can be achieved
by configuring "run_once" on a task::

    ---
    # ...

      tasks:

        # ...

        - command: /opt/application/upgrade_db.py
          run_once: true

        # ...

This can be optionally paired with "delegate_to" to specify an individual host to execute on::

    - command: /opt/application/upgrade_db.py
      run_once: true
      delegate_to: web01.example.org

When "run_once" is not used with "delegate_to" it will execute on the first host, as defined by inventory,
2015-12-10 15:22:37 +00:00
in the group(s) of hosts targeted by the play - e.g. webservers[0] if the play targeted "hosts: webservers".
2014-05-15 15:47:17 +00:00
2015-12-10 15:22:37 +00:00
This approach is similar to applying a conditional to a task such as::

    - command: /opt/application/upgrade_db.py
      when: inventory_hostname == webservers[0]

.. note::
     When used together with "serial", tasks marked as "run_once" will be run on one host in *each* serial batch.
     If it's crucial that the task is run only once regardless of "serial" mode, use the
     :code:`when: inventory_hostname == ansible_play_hosts[0]` construct.

.. _local_playbooks:

Local Playbooks
```````````````

It may be useful to run a playbook locally, rather than by connecting over SSH. This can be useful
for assuring the configuration of a system by putting a playbook in a crontab. This may also be used
to run a playbook inside an OS installer, such as an Anaconda kickstart.

To run an entire playbook locally, just set the "hosts:" line to "hosts: 127.0.0.1" and then run the playbook like so::

    ansible-playbook playbook.yml --connection=local

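For the crontab case mentioned above, one option is to install the schedule with Ansible's own 'cron' module. This is a minimal sketch; the schedule, job name, and playbook path are purely illustrative::

    - name: re-apply local configuration every 30 minutes
      cron:
        name: "ansible local config"
        minute: "*/30"
        # illustrative path to the playbook run locally
        job: "ansible-playbook /path/to/playbook.yml --connection=local"
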
Alternatively, a local connection can be used in a single playbook play, even if other plays in the playbook
use the default remote connection type::

    - hosts: 127.0.0.1
      connection: local

.. _interrupt_execution_on_any_error:

Interrupt execution on any error
````````````````````````````````

With the 'any_errors_fatal' option, any failure on any host in a multi-host play will be treated as fatal and Ansible will exit immediately without waiting for the other hosts.

Sometimes 'serial' execution is unsuitable: the number of hosts is unpredictable (because of dynamic inventory), and speed is crucial (simultaneous execution is required), but all tasks must be 100% successful to continue playbook execution.

For example, consider a service located in many datacenters, with some load balancers to pass traffic from users to the service. There is a deploy playbook to upgrade the service's deb packages. The playbook has the following stages:

- disable traffic on load balancers (which must be turned off simultaneously)
- gracefully stop the service
- upgrade software (this step includes tests and starting the service)
- enable traffic on the load balancers (which should be turned on simultaneously)

The service can't be stopped while the load balancers are still "alive"; they must be disabled first. Because of this, the second stage can't be played if any server failed in the first stage.

For datacenter "A", the playbook can be written this way::

    ---
    - hosts: load_balancers_dc_a
      any_errors_fatal: True

      tasks:
        - name: 'shutting down datacenter [ A ]'
          command: /usr/bin/disable-dc

    - hosts: frontends_dc_a

      tasks:
        - name: 'stopping service'
          command: /usr/bin/stop-software

        - name: 'updating software'
          command: /usr/bin/upgrade-software

    - hosts: load_balancers_dc_a

      tasks:
        - name: 'Starting datacenter [ A ]'
          command: /usr/bin/enable-dc

In this example Ansible will start the software upgrade on the front ends only if all of the load balancers are successfully disabled.

.. seealso::

   :doc:`playbooks`
       An introduction to playbooks
   `Ansible Examples on GitHub <https://github.com/ansible/ansible-examples>`_
       Many examples of full-stack deployments
   `User Mailing List <http://groups.google.com/group/ansible-devel>`_
       Have a question? Stop by the google group!
   `irc.freenode.net <http://irc.freenode.net>`_
       #ansible IRC chat channel