Ansible is quickly becoming the de facto configuration management tool. It's very easy to use, has a ton of modules, and does not need a management server, as it can be run right from your laptop. The only requirement on the target machine is a reasonably modern version of Python. But what if Python is not (yet) installed?
But I thought Python ships with Linux by default
Yes, for the most part. Most major Linux distributions have Python 2, Python 3, or both installed by default. However, if you launch an AWS EC2 instance from one of the Ubuntu AMIs (in this example it's ami-0bbe6b35405ecebdb), you'll notice that Python is not installed.
OK, I will just install it and then run Ansible
That might be a perfect choice if your project is hosted on one or two machines. But what if you have a dozen or more? Having to manually log in to each and every host and run sudo apt install python is the opposite of automation. A DevOps engineer should have a better option.
The raw module
Ansible is a Python package, just like numpy and others. Any module that Ansible runs against the target host uses Python internally to make the required changes. But what if Python is not installed in the first place? Fortunately, Ansible has the raw module, which sends a command to the host over SSH without going through the module subsystem, so it does not need Python on the target. If you have a look at the module's page at https://docs.ansible.com/ansible/devel/modules/raw_module.html, you'll see the official definition:
This is useful and should only be done in a few cases. A common case is installing python on a system without python installed by default.
Preparing Ansible to connect to the EC2 instance
Before starting to write the playbook, we'll need to make some changes to how Ansible connects to remote hosts.
Create a new file in your current directory and call it ansible.cfg. By default, Ansible will read this file and obey any setting in it, overriding the defaults defined in /etc/ansible/ansible.cfg. My file looks as follows:

[defaults]
host_key_checking = False
private_key_file = ~/.ssh/mykey.pem
remote_user = ubuntu

The first setting disables SSH host key checking. When you connect to a host over SSH for the first time, you are warned that the remote host's key is about to be added to the list of known hosts, and you have to type yes to establish the connection. When dealing with lots of hosts in a contained, trusted environment, such a prompt wastes a lot of time unnecessarily.
The second setting defines the location of the SSH private key used to connect to the remote host.
The last setting instructs Ansible to use ubuntu as the username when connecting to the remote host. By default, Ansible uses the name of the currently logged-in local user to establish the connection.
Now we need to create the hosts file. I've created a group called development (a [development] header followed by the host on its own line) containing my EC2 instance's IP address.
Running the playbook
Now, the important part. The playbook that will be used for configuring our target host:

- hosts: development
  become: yes
  gather_facts: no
  pre_tasks:
    - name: 'install python'
      raw: 'sudo apt-get -y install python'

Notice that you MUST set gather_facts to no if Python is not installed yet, as fact gathering uses Python to collect information about the target host.
When we need to install Python, this must be specified in the pre_tasks section of the playbook. You want this specific task to always be the first thing that Ansible does once it connects to the remote host. This becomes even more important if you use Ansible roles, because the tasks defined in roles are executed before the tasks listed in the playbook's own tasks section.
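The execution order described above can be sketched as follows. This is only an illustration: the role name webserver and the final debug task are hypothetical and are not part of the original playbook.

```yaml
- hosts: development
  become: yes
  gather_facts: no
  pre_tasks:
    # 1. Runs first: the only step that works without Python on the target
    - name: 'install python'
      raw: 'sudo apt-get -y install python'
  roles:
    # 2. Runs second: role tasks normally rely on Python-based modules
    - webserver   # hypothetical role name, used for illustration only
  tasks:
    # 3. Runs last, after every role has finished
    - name: 'runs after all roles'
      debug:
        msg: 'pre_tasks and roles have completed'
```

If the Python installation were placed under tasks instead of pre_tasks, the role would run before it and its first Python-based module would fail.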
This is where the raw module comes into action. Obviously, we are using a plain Ubuntu command, apt-get. We're even using sudo to make sure that the operation is carried out with root privileges, even though we already set become: yes.
Running the playbook is as easy as issuing the following command:
ansible-playbook -i hosts deploy.yml
Double-check that Python got installed by logging in to the host and running the python interpreter (for example, to print its version).
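The check can also be done from Ansible itself, without logging in. The sketch below assumes the same development group and a hypothetical file name, verify.yml. It relies on the fact that the ping module, unlike raw, needs a working Python on the target.

```yaml
# verify.yml (hypothetical file name)
- hosts: development
  gather_facts: no
  tasks:
    # ping is a regular Python-based module: it only succeeds
    # when Ansible can find and run Python on the remote host
    - name: 'confirm that Python-based modules work'
      ping:
```

Run it the same way as the main playbook: ansible-playbook -i hosts verify.yml. A failed task here suggests that Python is still missing or is not where Ansible expects it.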
Drawbacks of this approach
Ansible was designed to use Python for almost all of its work. Using the raw module has some disadvantages of its own:
- It is not idempotent. Configuration management tools are meant to be run thousands of times without repeating changes that have already been made to the target machines. This is called idempotence. The raw module, of course, does not offer this, so each time this playbook runs, the sudo apt-get -y install python command is run again unnecessarily.
- You cannot gather important information about the target machine. The gather_facts setting can be very important if you intend to use machine-related information later in the playbook, but it has to be set to no if the playbook is going to be used to install Python.
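The first drawback can be softened by letting the shell decide whether anything needs to be done. The following is a minimal sketch, not the original playbook: it assumes that the interpreter ends up at /usr/bin/python, and the BOOTSTRAPPED marker and the python_bootstrap variable name are made up for this example.

```yaml
- hosts: development
  become: yes
  gather_facts: no
  pre_tasks:
    - name: 'install python only if it is missing'
      # Assumption: the python package places the interpreter at /usr/bin/python.
      # The echo prints a marker only when the installation actually ran.
      raw: 'test -e /usr/bin/python || (sudo apt-get -y install python && echo BOOTSTRAPPED)'
      register: python_bootstrap
      # Report "changed" only when the marker appears in the command output
      changed_when: "'BOOTSTRAPPED' in python_bootstrap.stdout"
```

The command is still sent to the host on every run, but apt-get is skipped when Python is already present, and the task is reported as changed only when something was installed.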
Possible workarounds for the drawbacks
You can create a separate playbook that will just make sure that all target machines have Python installed. Subsequent Ansible tasks can be added to another playbook where gather_facts can be turned on and the Python installation command won't be run over and over.
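A minimal sketch of such a second playbook is shown below. The file name site.yml is hypothetical, and it assumes that the Python installation stays in deploy.yml and has already been run once against each new host.

```yaml
# site.yml (hypothetical name): day-to-day configuration, run after deploy.yml
- hosts: development
  become: yes
  gather_facts: yes   # safe here because Python is already on the target
  tasks:
    - name: 'show the distribution reported by fact gathering'
      debug:
        msg: "{{ ansible_distribution }} {{ ansible_distribution_version }}"
```

With this split, deploy.yml only needs to be run when a new machine joins the group, while site.yml can be run as often as needed.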
Another workaround is not directly related to Ansible: you can create your own AMI that has Python installed and use this image whenever you need to spawn new EC2 instances on which Ansible is going to be used.