Difference between Hypervisor Virtualization and Container Virtualization

By Sarath Pillai

Recently a new technology (well, not really a new technology, but a new way of implementing one) has gained a lot of traction in the open source community, and many major players in the industry have adopted it as part of their upcoming releases. It's none other than Docker. The traction gained by this open source product was so high that it became the darling of packaging applications inside a small container, and it became the hottest trend in application development, deployment and testing within months of its initial release.

 

Docker solves one of the main problems that system administrators and developers have faced for years: "It was working on dev and QA, so why the hell is it not working in the production environment?" Most of the time the cause is a version mismatch of some library, a few packages not being installed, and so on. This is where Docker steps in: it builds an image of the entire application with all its dependencies and ships it to your target environment or server. So in short, if the app worked on your local system, it should work anywhere else (because you are shipping the entire thing).

 

You might be thinking that hypervisor based virtualization can also solve this problem of "was working in dev and QA but not in production", by taking an image of an entire virtual host and launching a new virtual instance from it (which is what we generally do in AWS or OpenStack). Agreed, that can be done. But a container is so lightweight that you do not have to go through the hassle of setting up a whole new host just for your app. In fact, it takes only a few seconds to pull a container image from a registry and start it. That's the main advantage of using Docker container virtualization. We will not be discussing Docker in depth here, simply because it needs special attention and requires a series of posts to cover it.

 

In this post we will discuss the differences between hypervisor based virtualization and container based virtualization.

 

Well the general term virtualization can be defined as follows…

 

It's nothing but a method or technique used to run an operating system on top of another operating system, so that the hardware resources are more fully utilized and shared by each of the operating systems running on top of the base operating system.

 

The basic idea behind hypervisor based virtualization is to emulate the underlying physical hardware and create virtual hardware (with your desired resources, such as processor and memory). An operating system is then installed on top of this newly created virtual hardware. So this type of virtualization is basically operating system agnostic. In other words, you can have a hypervisor running on a Windows system create virtual hardware and have Linux installed on that virtual hardware, and vice versa.

 

The main thing to understand about hypervisor based virtualization is that everything is done at the hardware level. This means that if the base operating system (the operating system on the physical server, which has the hypervisor running) has to modify anything in the guest operating system (which runs on the virtual hardware created by the hypervisor), it can only modify the hardware resources, and nothing else.

 


Hypervisor based virtualization

 

A hypervisor is also called a Virtual Machine Monitor (VMM). This is because the hypervisor sits between the guest operating system and the real physical hardware. The hypervisor controls the allocation of resources to the guest operating systems running on top of the physical hardware.

 

The main added advantage of virtualization is fuller utilization of hardware resources (which are costly). Imagine a situation where you have a physical server with 10 GB of RAM, an 8-core processor, and a 1 Gbps NIC, and you are using that server for your organization's internal website and an FTP server that your employees use to share files. The server will be idle most of the time, because there is no heavy usage of that internal website and FTP server. All the hardware resources dedicated to it sit idle most of the time, which is a waste of computing resources.

Using virtualization, you can easily create multiple virtual machines on that server and allocate each of them only the amount of hardware resources it requires (your website and FTP server might not need more than 1 GB of memory each). The remaining capacity can then go to virtual machines used for other purposes.
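
To make the arithmetic concrete, here is a minimal Python sketch of carving the example server above into virtual machines. The `Host` class, the `allocate` function, the VM names and the one core per VM are hypothetical and only for illustration; the sketch also assumes no overcommitting of resources, which real hypervisors often allow.

```python
from dataclasses import dataclass


@dataclass
class Host:
    ram_gb: int
    cores: int


def allocate(host, vms):
    """Return (ram_gb, cores) left on the host after carving out the VMs.

    `vms` maps a VM name to a (ram_gb, cores) tuple.
    """
    used_ram = sum(ram for ram, _ in vms.values())
    used_cores = sum(cores for _, cores in vms.values())
    if used_ram > host.ram_gb or used_cores > host.cores:
        raise ValueError("VMs request more than the host has")
    return host.ram_gb - used_ram, host.cores - used_cores


# The server from the example: 10 GB of RAM and 8 cores.
host = Host(ram_gb=10, cores=8)

# Hypothetical VMs for the internal website and the FTP server.
vms = {"intranet-web": (1, 1), "ftp": (1, 1)}

free_ram, free_cores = allocate(host, vms)
print(f"Left for other VMs: {free_ram} GB RAM, {free_cores} cores")
```

Under these assumptions, most of the machine stays free for other virtual machines instead of sitting idle behind two lightly used services.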

 

Even within virtualization, there are two different types mainly used in the industry. One is hosted virtualization and the other is bare metal virtualization. Hosted virtualization is what we just discussed (i.e. a software hypervisor installed inside the base operating system, which in turn does the resource allocation and monitoring). Examples include VMware Workstation and Microsoft's Virtual PC.
 
Although hosted virtualization is cheaper, it has limitations, especially in terms of performance. The performance impact is mainly due to the fact that there are multiple memory and CPU managers involved for a guest operating system. First there is the memory and CPU manager that is part of the base operating system (on which our hypervisor sits), and then the hypervisor itself has a CPU and memory manager. These multiple managers can at times cause real overhead, which results in a noticeable performance impact.
 
 
Bare metal virtualization has one major difference compared to hosted virtualization: the hypervisor sits directly on top of the hardware. The hardware device drivers are part of the hypervisor, and there is only one memory and CPU manager, which is part of the hypervisor sitting directly on top of the hardware. This is why it's called bare metal virtualization.
 
VMware ESX and Citrix XenServer are good examples of bare metal virtualization.
 

 

What is Container Virtualization?

 

While discussing hosted and bare metal virtualization, one common thing we found was that both of them work at the hardware level (basically, they virtualize hardware resources). Container virtualization, however, is done at the operating system level rather than the hardware level. The main thing that needs to be understood about container virtualization is...

 

Each container (let's call it a guest operating system) shares the kernel of the base system.

 

Container Based Virtualization

Now you can guess the main advantage of container based virtualization over hosted and bare metal virtualization, which is quite obvious: as every container sits on top of the same kernel and shares most of the base operating system, containers are much smaller and more lightweight than a virtualized guest operating system. And because they are lightweight, an operating system can have many containers running on top of it, compared to the limited number of guest operating systems that you can run.

 

So before going ahead with the container based approach, we need to understand why there is a need for it. The real needs that triggered virtualization were the following:
 
  • Isolation of application environments, 
  • Resource isolation,
  • All of these without impacting performance,
  • Sharing a common thing between virtualized hosts, or containers must be easy. 

 

Although the hypervisor based approach to virtualization does provide complete isolation for your applications, it has a huge overhead (the overhead of allocating resources and the overhead of managing the size of a virtual machine). Another complexity is sharing. Sharing in a virtualized environment (between guest operating systems) is very similar to sharing between independent systems, because virtualized hosts are not aware of each other, and the only methods of sharing are the traditional ones, such as the network or a shared file system.
 
 
As the container is sharing the kernel with the base system, you can see the processes that are running inside the container from the base system. However when you are inside the container, you will only be able to see its own processes. 

 

 
Now the basic principle to understand about a container is that, without any virtual hardware emulation, containers can provide a separated environment similar to virtualization, where every container can run its own operating system while sharing one single kernel. Each container still has its own network stack, file system, etc.
 
This isolation of each container is provided by two major Linux kernel features called cgroups and namespaces.
 
 
If you think about it logically, what we need apart from the shared kernel in container based virtualization is a different view of the system for each container (meaning they must somehow be isolated). There is a feature in the Linux kernel that creates new namespaces for each container. In simple layman's terms, namespaces enable applications to run in their own isolated environments, with separate process ID lists, network devices, file systems, user lists, etc.
 
So if you look from outside (from the base system), you will see a container as a process, but inside the container you get the feeling of being inside a virtual machine, completely separate from the rest of the system. This is because you are in an entirely new namespace of its own.
 
In other words, you can have the same PID, for example 100, inside container 1, container 2 and container N. This is because each container resides in its own PID namespace.
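
The idea of "same PID in different containers" can be sketched with a toy model in Python. This is not how the kernel implements it; the `PidNamespace` class and the process names are hypothetical and only mimic the behaviour described above: each namespace hands out its own PIDs, while the base system still sees every process under a PID of its own.

```python
import itertools


class PidNamespace:
    """Toy model of a PID namespace: hands out its own PIDs, starting at 1."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self._next_pid = itertools.count(1)
        self.processes = {}  # local PID -> command name

    def spawn(self, command):
        """Register a process here and in every ancestor namespace.

        Returns the PID as seen from inside this namespace.
        """
        local_pid = next(self._next_pid)
        self.processes[local_pid] = command
        ancestor = self.parent
        while ancestor is not None:
            # The ancestor sees the same process under a PID of its own.
            ancestor_pid = next(ancestor._next_pid)
            ancestor.processes[ancestor_pid] = f"{command} ({self.name})"
            ancestor = ancestor.parent
        return local_pid


host = PidNamespace("host")
host.spawn("init")

container1 = PidNamespace("container1", parent=host)
container2 = PidNamespace("container2", parent=host)

pid_a = container1.spawn("nginx")
pid_b = container2.spawn("mysqld")

print("PID inside container1:", pid_a)
print("PID inside container2:", pid_b)
print("container1 sees:", container1.processes)
print("host sees:", host.processes)
```

Both containers report the same local PID for their first process, each container sees only its own process, and the base system sees all of them.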
 
The initial state of a Linux system is "one single namespace for everything (networks, PIDs, devices, etc.)". For containers, what we do is slice it into a different namespace for each container, so that each one runs in isolation.
 
To understand namespaces further, take the example of a directory. You cannot have the same file name for different files in a single directory, but you can have the same file name inside different directories. Similarly, processes with the same ID and users with the same number and name can be present in different containers (as they are in different namespaces). A few of the namespaces that you can have in Linux are listed below.
 
 
  • PID (Isolated process id's)
  • Net (Network interfaces)
  • mnt (Mounting file system)
  • UTS (Hostname separation)
  • User (Users)
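
On a Linux machine you can look at the namespaces a process belongs to under `/proc/<pid>/ns`, where each entry is a symbolic link whose target identifies the namespace. Below is a small Python sketch that reads them; the helper name `list_namespaces` is made up for this post, and on systems without `/proc` it simply returns an empty result.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace type: identifier} for a process, or {} if unavailable."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        entries = os.listdir(ns_dir)
    except OSError:
        return result  # not Linux, no such process, or /proc not mounted
    for entry in sorted(entries):
        try:
            result[entry] = os.readlink(os.path.join(ns_dir, entry))
        except OSError:
            continue  # no permission to inspect this entry
    return result


for ns_type, identifier in list_namespaces().items():
    print(f"{ns_type:18} {identifier}")
```

Two processes that show the same identifier for a given type are in the same namespace of that type; a process inside a container will normally show different identifiers from a process on the base system.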

 

If you are a system administrator, I am sure you have already heard of something called "chroot". It's a method of creating a root file system in which your process runs isolated from others. In simple terms, it's done so that the process cannot access any other data on the system (an added level of security at the file system level). I am sure you have chrooted Apache, or some other web server, for security reasons: it limits access to the system, because the root of the process becomes the directory you specify, so the process has no access to the real root directory. Container based virtualization takes this idea and extends it to each and every resource that a process or application requires (the ones mentioned above). This is done to such an extent that you can have multiple operating systems running with different applications on top, all on the same kernel.
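
The effect of a chroot on path lookups can be illustrated with a toy Python model. The real operation is the `chroot` system call (available in Python as `os.chroot`, which needs root privileges); the function below only imitates how paths seen by the jailed process map to real paths. The name `resolve_in_jail` and the jail directory are hypothetical, and symlinks are deliberately ignored.

```python
import posixpath


def resolve_in_jail(jail_root, requested_path):
    """Map a path as seen by the jailed process to the real path on the host.

    Toy model only: it ignores symlinks, which a real chroot has to deal with.
    """
    # Anchor the request at "/" so that any leading ".." collapses:
    # inside a chroot, "/.." is still "/".
    inside = posixpath.normpath(posixpath.join("/", requested_path))
    return posixpath.normpath(posixpath.join(jail_root, inside.lstrip("/")))


jail = "/var/www/jail"  # hypothetical directory used as the new root

print(resolve_in_jail(jail, "/etc/passwd"))
print(resolve_in_jail(jail, "../../etc/shadow"))
print(resolve_in_jail(jail, "/"))
```

Every path the process asks for, including attempts to climb out with "..", ends up below the jail directory, which is exactly the "root of the process becomes your given directory" behaviour described above.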

 

What more do you need for an isolated environment than a different namespace for each application? The above list covers almost all application environment requirements.

 

Now there must be something that takes care of resource allocation to these containers. This is done by cgroups (control groups) in the Linux kernel. There is an excellent Red Hat doc explaining cgroups.

The original idea of cgroups came from Google engineers, who had to limit resource utilization (CPU, memory, etc.) for different process groups.

 

Related: Understanding Processes in Linux

 

By default on a Linux system, all processes are descendants of the init process, which means all processes are part of a single tree structure. In the cgroup model, different process groups can exist on a single system. So instead of the single process tree of the default Linux model, the cgroup model can have several separate trees of groups (each with its own parents, and children that inherit certain attributes from their parents), with resource limits applied per group. Now you might have an idea of how cgroups and namespaces are leveraged by container based virtualization.
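
As a rough illustration of the group hierarchy described above, here is a toy Python model of cgroups with a memory limit. It is a sketch, not the kernel's implementation: the `Cgroup` class, the group names and the numbers are all hypothetical. The one behaviour it tries to capture is that a group can never get more than its parents allow.

```python
class Cgroup:
    """Toy model of a cgroup: a named group of PIDs with an optional memory cap."""

    def __init__(self, name, memory_limit_mb=None, parent=None):
        self.name = name
        self.memory_limit_mb = memory_limit_mb  # None means "no limit set here"
        self.parent = parent
        self.pids = set()

    def effective_memory_limit_mb(self):
        """Tightest limit among this group and its ancestors (None = unlimited)."""
        limits = []
        group = self
        while group is not None:
            if group.memory_limit_mb is not None:
                limits.append(group.memory_limit_mb)
            group = group.parent
        return min(limits) if limits else None


root = Cgroup("root")
containers = Cgroup("containers", memory_limit_mb=2048, parent=root)
web = Cgroup("web", memory_limit_mb=512, parent=containers)
db = Cgroup("db", parent=containers)  # no limit of its own

web.pids.add(100)
db.pids.add(200)

for group in (root, containers, web, db):
    print(group.name, sorted(group.pids), group.effective_memory_limit_mb())
```

Here the `web` group is capped by its own limit, while `db` has no limit of its own and therefore inherits the cap of the `containers` group it belongs to.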

 

Related: Linux Booting process Explained

 

Examples of container based virtualization include LXC, OpenVZ, Parallels Virtuozzo, Solaris Containers, Docker (uses LXC), HP-UX Containers, etc.

You might be wondering why there is so much hype around Docker if container based virtualization has been around for a long time. It is because of the excellent API it brings along for managing containers.

 
  • The main advantage of using Docker for container based virtualization is its ability to easily build and ship containers. Yes, you can ship your applications to anybody, or to a remote server, with ease. The entire container is shipped, so whatever worked in your local environment should work in your target environment.
  • You get a Git-like version control feature for your containers. That's right: make changes to your application, commit them, and upload them to your repository or ship the latest version.
  • An excellent community shared repository of containers, where you can get ready-made containers with your required open source application on top. It's as simple as installing a package using apt-get or yum, so it's easy and fast to pull containers from registries (which are like yum repositories for packages) and get running in minutes.
  • Reuse: any existing container built by others can be modified to make your own version of it. Docker has concepts like the base image, on top of which you can add your own stuff and configuration for your custom application.
 
These are some of the points I could remember; I mean, those are the high order bits. The reason I wrote this blog post is that I will be beginning a tutorial series on Docker containers.

 


Comments

Are sun virtual box and oracle VM virtual box examples of hypervisors?

Thanks for shedding light on something which has lot of ambiguity..

This tutorial gave me a good understanding on the basics of virtualization and containers. Thanks!

REALLY NICE article - clear, logical and easy to understand; with related buzzwords explained.THANKS for all the effort!!!

After reading this article, I know the differences now between Virtualization and Containers. Thanks for sharing this knowledge. Would expect more on examples also :)

That well explained the distinction between virtual machine and virtual container. Thanks

Thank you very much for such great article. This is what I want to look. Very simple and well explained, I really love your awesome work. This save me a lot of time when I searched for figuring out what are differences between docker and vagrant (and other visualization engines). That is what I'm here.

So thank you again, keep doing your great works.

Thank you so much!
