How Does Ansible Work?
Imagine that you are a Linux system administrator working for a small financial services company. Being in the finance sector, your company is always worried about the security of its applications and servers.
Your company recently took the critical step of firing an engineer on charges of leaking financial and business-critical data to an outsider.
Let us assume that you have now been assigned the task of removing the recently fired employee's user account from all the servers in your critical environment. Since this is critical, security-related work, you do not have much time to plan it out. There are around 50 servers in your environment, and the account should be removed from all of them as soon as possible.
Coming from a pure Linux background, without much prior experience with configuration management tools like Chef and Puppet, your team generally depends on manual configuration and updates (well, the “IT manager” should have enforced configuration management in the first place, but this being a critical situation, there is not much time to think about that right now).
You now start thinking of quick solutions to this problem. Any seasoned Linux system administrator would think of writing a script: a script that removes the user from the list of servers that you specify.
That’s the only quick and obvious solution out there.
Alright, let's get this moving. The script is going to be simple and straightforward. It will take a file as an argument, and that file will contain the list of IP addresses of all the servers in your environment from which you are going to remove the fired employee’s user.
OK, so we now have an idea to move forward with: a script that takes a file containing the list of IP addresses as an argument. Fine, but how do we do the real execution?
You do not have any sort of agent on these servers that will accept your command and execute it to remove the specified user. Any solution that we use has to be secure as well.
You might have already guessed it. The one thing that is running on all the servers without a doubt is SSH.
We can use SSH to log in and execute the command to remove the user on all the servers listed in the file. Something like the below should work.
ssh -i privatekey.pem admin@$serverip "sudo userdel -r -f rogueuser"
"serverip" is a variable in the above command, which will be replaced with each IP address found in the file that we pass as an argument. "rogueuser" is an imaginary username that we are trying to remove, and "admin" is the username with which we log in to all the servers.
So we solved two problems.
- The agent problem. We do not need any sort of agent to execute our command on the servers, because SSH is available on all of them by default.
- The security problem. SSH is secure: it is built on public-key cryptography and is as trustworthy as the SSL protocol used by configuration management tools like Chef and Puppet.
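The per-server substitution of serverip in the ssh command can also be sketched in Python. This is only a minimal sketch: the helper names read_hosts and build_ssh_command are hypothetical, the sample addresses are made up, and the key file, login user and username to remove are the ones from the command above. It only builds the argument lists and does not connect anywhere.

```python
import shlex

def read_hosts(text):
    """Return one host per non-empty line, ignoring surrounding whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]

def build_ssh_command(host, user_to_remove, key="privatekey.pem", login="admin"):
    """Build the ssh argument list for one host (hypothetical helper)."""
    # Quote the username so it is passed safely to the remote shell.
    remote = "sudo userdel -r -f " + shlex.quote(user_to_remove)
    return ["ssh", "-i", key, login + "@" + host, remote]

# Made-up sample standing in for the contents of the server list file.
sample = "10.0.0.1\n10.0.0.2\n\n"
commands = [build_ssh_command(h, "rogueuser") for h in read_hosts(sample)]
for argv in commands:
    print(" ".join(argv))
```

An argument list like this could be handed to something like subprocess.run, which avoids building one long shell string by hand.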
We still have one main problem left to sort out. The bash script that we are trying to implement will have to iterate through the list of IP addresses specified in the file. The problem is this:
If you write a bash for loop (see the example below), it will execute the command on the servers one by one. This is not a problem as such for our example, but if the number of servers is large, it might take quite some time to finish (as the command is executed on one server at a time before proceeding to the next).
The below script can be executed as ./scriptname serverlist.txt (where serverlist.txt contains the list of servers, one IP address or hostname per line).
#!/bin/bash
for i in $(cat $1); do
    echo "Removing User from $i"
    ssh -i privatekey.pem admin@$i "sudo userdel -r -f rogueuser"
done
Of course, there are ways to solve this problem as well. You can use something like GNU Parallel to execute the command on all the servers listed in the file at the same time.
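To get a feel for the difference between one-by-one and parallel execution, here is a small sketch. It does not use GNU Parallel; it swaps in a Python thread pool instead, and the function fake_remote_command is a hypothetical stand-in that just sleeps for 0.1 seconds rather than opening an SSH connection. The server names are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_remote_command(host):
    """Stand-in for the ssh call: pretend every server takes 0.1 s."""
    time.sleep(0.1)
    return host

hosts = ["server%d" % n for n in range(1, 6)]

# One server at a time, like the bash for loop.
start = time.perf_counter()
sequential = [fake_remote_command(h) for h in hosts]
sequential_time = time.perf_counter() - start

# All servers at once, one worker thread per server.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
    parallel = list(pool.map(fake_remote_command, hosts))
parallel_time = time.perf_counter() - start

print("sequential: %.2fs, parallel: %.2fs" % (sequential_time, parallel_time))
```

With five simulated servers, the sequential run has to wait for each sleep in turn, while the parallel run waits for roughly one.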
Rather than using a bash script, the thought of implementing this in Python strikes you. The script could then be reused for any such operation in the future, and could also become a pet project to be improved later with additional features. Python being a relatively easy language to pick up, some of your colleagues also agree with the idea.
After a lot of googling and testing, you and your colleagues come up with the below in Python.
import multiprocessing as mp
import paramiko

key = "key.pem"  # hardcoded SSH private key file
cmd = "sudo userdel -r -f rogueuser"  # hardcoded command to execute

def get_list_of_servers(filename):
    with open(filename, 'r') as f:
        lines = f.read().splitlines()
    return lines

def login_and_exec(host, command, key):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())  # accept unknown host keys
    ssh.connect(host, username="admin", key_filename=key)
    stdin, stdout, stderr = ssh.exec_command(command)
    outlines = stdout.read()
    print(outlines.decode())

list_s = get_list_of_servers("listofservers.txt")
procs = [mp.Process(target=login_and_exec, args=(i, cmd, key)) for i in list_s]
for p in procs:
    p.start()
The first line imports the module called "multiprocessing", which gives our Python script the capability to run commands in parallel (so whether you have 50 servers or 500, the total time stays close to that of the slowest server instead of growing with every server added, as long as the machine running the script can cope with that many processes).
The second line imports the module called "paramiko", which gives our script the capability to speak SSH. It is a widely used Python module for dealing with the SSH protocol.
As this is the first version of the script, you and your colleagues have hardcoded the SSH private key file and the command to execute, along with the name of the file containing the list of servers (see lines 4, 5, and 20).
Lines 7 and 12 define functions. The first one gets the list of servers from the file, and the second connects to a server and executes the command on it.
Line 21 creates a list of processes, one for each server in the file, and line 23 starts all those processes.
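One natural next step for such a script would be to cap how many servers are contacted at once and to record which ones failed. The following is a hedged sketch of that idea, not part of the script above: it uses threads instead of processes, run_on_all is a hypothetical helper, and fake_runner is a made-up stand-in for login_and_exec so that the example runs without any real servers.

```python
from concurrent.futures import ThreadPoolExecutor

def run_on_all(hosts, runner, max_workers=10):
    """Call runner(host) for every host, at most max_workers at a time.

    Returns a dict mapping host -> (succeeded, output or error message).
    """
    def safe(host):
        try:
            return host, (True, runner(host))
        except Exception as exc:  # one bad host must not stop the rest
            return host, (False, str(exc))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(safe, hosts))

def fake_runner(host):
    """Hypothetical stand-in for login_and_exec; fails for one host."""
    if host == "10.0.0.3":
        raise ConnectionError("connection timed out")
    return "user removed"

results = run_on_all(["10.0.0.1", "10.0.0.2", "10.0.0.3"], fake_runner, max_workers=2)
for host, (ok, output) in sorted(results.items()):
    print(host, "OK" if ok else "FAILED", output)
```

Passing the runner in as a parameter keeps the fan-out logic separate from the SSH details, so a real paramiko-based function could be dropped in later.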
Actually, I forgot to mention one thing. You and your colleagues have come up with a very rudimentary first version of Ansible. Voila. Yes, that's right: this is how Ansible executes things across a list of servers. Of course it has a lot more features, modules, and functionality, but under the hood Ansible does the job in essentially the same way as the Python script we saw. If you do not believe it, you can simply clone the Ansible git repo and look at the very first commit (as shown below).
#git clone https://github.com/ansible/ansible.git
#cd ansible
#git reset --hard f31421576b00f0b167cdbe61217c31c21a41ac02
f31421576b00f0b167cdbe61217c31c21a41ac02 is the initial commit of the Ansible code base (thanks to Michael DeHaan, who started it :) ). It is quite easy to understand compared to the current code base, which has grown into one of the most widely used configuration management tools.
Ansible works on a push-based model, which is quite clear now from the Python script we saw: it initiates the connection to all the servers and executes the command. Puppet and Chef run agents on all servers (not SSH), which use the SSL protocol to connect to a master and pull the configuration. Because of this, both Puppet and Chef introduce the complexity of configuring an agent on every server.
In the above diagram, modules are nothing but self-sufficient features that the user can leverage to execute things. For example, unlike in our script, you will never say "yum install httpd" when using Ansible. Instead, you will say the below.
- name: install the latest httpd
  package:
    name: httpd
    state: latest
Ansible will then use yum or apt-get, whichever is appropriate for the host, to install the package named httpd (the package name itself still has to be one that the host's package manager knows). A module is nothing but the Python code that lets you write things like the example shown above.
With the package module available, Ansible will know what to do when you execute the example shown above. Much as Puppet uses Facter and Chef uses Ohai, Ansible gathers "facts" about each host through its setup module to get the full details of the host (details like OS version, Linux kernel version, CPU, memory, IP address, hostname, and many more).
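The idea of collecting host details and then choosing a package manager from them can be illustrated with a toy sketch. This is not Ansible's actual code: gather_facts and choose_package_manager are hypothetical helpers, the facts come from Python's standard library rather than from Ansible, and the lookup table of OS families is an assumption made for illustration.

```python
import platform
import socket

def gather_facts():
    """Collect a few host details using only the standard library."""
    return {
        "system": platform.system(),          # e.g. "Linux"
        "kernel": platform.release(),         # kernel release string
        "architecture": platform.machine(),   # e.g. "x86_64"
        "hostname": socket.gethostname(),
    }

# Hypothetical lookup table: OS family -> package manager command.
PACKAGE_MANAGERS = {"RedHat": "yum", "Debian": "apt-get"}

def choose_package_manager(os_family):
    """Pick the install command for a host based on its OS family."""
    try:
        return PACKAGE_MANAGERS[os_family]
    except KeyError:
        raise ValueError("no package manager known for " + os_family)

facts = gather_facts()
print(facts)
print(choose_package_manager("RedHat"))
print(choose_package_manager("Debian"))
```

The point of the sketch is only the shape of the logic: details are collected once per host, and the generic request ("install this package") is translated into a host-specific command.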
- user:
    name: rogueuser
    state: absent
    remove: yes
Similarly, when you execute the above, Ansible will use the user module to remove rogueuser from the system. The user module knows what action to take when it sees parameters like name, state, and remove.
By the way, did you notice something? The file syntax that we are using above is YAML. Ansible understands YAML and uses it for almost everything.
Modules like package and user that Ansible uses all follow idempotency. What does that mean? Well, it is the main principle of configuration management. It means that even if you execute the same YAML file above multiple times, the result should be the same (basically it should check whether the user named rogueuser is present; if not, do nothing, and if yes, remove it).
Similarly, if the package is installed, do nothing; if not, install it (execute multiple times and get an identical result). With Ansible, you simply specify the end state that you want. The module will then make sure that the end state is achieved.
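Idempotency is easy to see in a few lines of Python. The sketch below is an illustration only and not how Ansible's modules are written: ensure_user_absent and ensure_package_present are hypothetical functions, and plain Python sets stand in for the users and packages on a host.

```python
def ensure_user_absent(users, name):
    """Make sure name is not in the users set; report whether anything changed."""
    if name not in users:
        return False      # already in the desired state, do nothing
    users.remove(name)
    return True           # the state had to be changed

def ensure_package_present(packages, name):
    """Make sure name is in the packages set; report whether anything changed."""
    if name in packages:
        return False
    packages.add(name)
    return True

users = {"admin", "rogueuser"}
packages = set()

first_user_run = ensure_user_absent(users, "rogueuser")
second_user_run = ensure_user_absent(users, "rogueuser")
first_package_run = ensure_package_present(packages, "httpd")
second_package_run = ensure_package_present(packages, "httpd")

print(first_user_run, second_user_run, users)
print(first_package_run, second_package_run, packages)
```

Each function describes an end state rather than an action, so the second call finds nothing to do and leaves the sets untouched.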
We now know that Ansible works on a push-based model, meaning there is no agent sitting on the server and pulling anything from a central server, as there is with Puppet or Chef. Whenever we like, we simply tell Ansible to execute something on the list of servers provided via a file (similar to our rudimentary example script :) ).
You can execute Ansible from a server using the shell, or using tools like Jenkins, or use AWX, the recently open-sourced upstream project of Ansible Tower.
Puppet has Modules, Chef has Cookbooks, Ansible has Playbooks.
What is a playbook? It is nothing but a collection of the snippets that we saw previously: the package and user YAML examples. A playbook is a YAML file that lists such tasks, which together do something on a server, for example install and configure something. Basically, it is a series of resources (like package, file, copy, user, etc.) that will achieve our end result in the correct order.
So generally what happens is that you execute a list of playbooks on a server. Each playbook has self-sufficient code to achieve one particular thing. For example, a playbook for MySQL will install it and configure it.
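To make the idea of "a series of resources run in order" concrete, here is a toy sketch of a playbook runner. It is an illustration under stated assumptions rather than Ansible's implementation: the task format, the MODULES table, run_playbook, and the two handler functions are all hypothetical, and a dictionary of sets stands in for a real host.

```python
def package_module(state, params):
    """Toy 'package' handler: ensure a package name is in the installed set."""
    installed = state["packages"]
    if params["name"] in installed:
        return False
    installed.add(params["name"])
    return True

def user_module(state, params):
    """Toy 'user' handler: only supports state=absent."""
    users = state["users"]
    if params["state"] == "absent" and params["name"] in users:
        users.remove(params["name"])
        return True
    return False

MODULES = {"package": package_module, "user": user_module}

def run_playbook(tasks, state):
    """Run tasks in order; return (task name, 'changed' or 'ok') pairs."""
    report = []
    for task in tasks:
        changed = MODULES[task["module"]](state, task["params"])
        report.append((task["name"], "changed" if changed else "ok"))
    return report

playbook = [
    {"name": "install httpd", "module": "package",
     "params": {"name": "httpd", "state": "latest"}},
    {"name": "remove rogueuser", "module": "user",
     "params": {"name": "rogueuser", "state": "absent"}},
]
host_state = {"packages": set(), "users": {"admin", "rogueuser"}}
first_run = run_playbook(playbook, host_state)
second_run = run_playbook(playbook, host_state)
print(first_run)
print(second_run)
```

Running the same list of tasks a second time reports every task as "ok", which ties the playbook idea back to the idempotency described earlier.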
Comments
ansible
Hi,
Thanks for sharing your knowledge.
Can we enable Ansible for host communication from one network to a different network?
For example, my Ansible server is on 10.1.x.x and the remote host is on the 192.168.1.x network.
Currently we follow a manual process for deployment:
1. Pushing the package from the 10.1.x.x network to the gateway server machine
2. From the gateway server machine to the 192.168.1.x network
Please note that due to network policy, we are not able to communicate directly from a 10.x.x.x machine to a 192.x.x.x machine.
Question:
Instead of the manual process, can we accomplish this with Ansible given the network setup (different networks)?
Manual Process:
Login to tomcat machine
stop the tomcat service
take the backup of webapps
copy the war file from build machine (10.x.x.x) to tomcat instance (192.x.x.x)
extract the war
copy the resource file
start the tomcat instance
Regards
Saravanan ND