Traditional application deployment was based on a single-server approach: one application installed per physical server, wasting resources because components such as RAM and CPU were never fully utilized. There was also considerable vendor lock-in, making it hard to move applications from one hardware vendor to another. Then hypervisor-based virtualization was introduced, and the concept of the virtual machine (VM) was born.
We still deployed physical servers but introduced hypervisors on the physical host, enabling the installation of multiple VMs on a single server. Each VM is isolated with its own operating system. Hypervisor-based virtualization introduced better resource pooling, as one physical server could now be divided into multiple VMs, each hosting a different type of application. This was leaps and bounds better than single-server deployments. The VM approach also increased agility and scalability, as an application within a VM can be scaled simply by spinning up more VMs on any physical host. While hypervisor-based virtualization was a step in the right direction, running a guest operating system for each application is resource-intensive: each VM still requires its own RAM, CPU, storage and an entire guest OS, all of which consume resources. We needed a lightweight tool that kept the scalability and agility benefits of the VM-based approach. That lightweight tool is container-based virtualization, and Docker is at its forefront.
Containers offer a capability similar to that of object-oriented programming: they let you build composable, modular building blocks, making it easier to design distributed systems.
The application landscape has changed from a monolithic design to one built from microservices. Today applications are developed continuously. Patches usually touch only certain parts of the application, and the whole is assembled from loosely coupled components rather than the tightly coupled components of the past. The application stack is broken into pieces spread over multiple servers and locations, all requiring cross-communication. For example, users connect to a presentation layer, the presentation layer connects to some kind of shopping cart, and the shopping cart connects to a catalog library. All of these components are potentially stored on different servers, maybe even in different data centers. An application built from a number of such small parts is said to be composed of microservices, and each component or microservice can be put into a lightweight container – a scaled-down VM. The result is a complex distributed software stack: loosely coupled components that may change independently, running on a variety of hardware, including test machines, in-house clusters and cloud deployments. The web front end may be Ruby on Rails, the API endpoints Python 2.7, static content served by Nginx, backed by a variety of databases. We have a very complex stack on top of a variety of hardware devices.
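As a sketch of that shopping example, each tier could run in its own container using the Docker CLI of the era this article describes. The image and container names below are purely illustrative, not real published images:

```shell
# Hypothetical three-tier shop: each microservice in its own container.
# The shop/* image names are placeholders for your own builds.
docker run -d --name catalog shop/catalog
docker run -d --name cart --link catalog:catalog shop/cart
docker run -d -p 80:80 --name web --link cart:cart shop/web
```

Only the web tier publishes a port to the outside world; the cart and catalog are reachable solely by the containers linked to them.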
While the traditional monolithic application is likely to remain for some time, containers still offer a way to modernize the operational model for those traditional stacks. Monolithic and container-based applications can live side by side.
The complexity of applications and the requirements of scalability and agility have led us to the market of container-based virtualization. Container-based virtualization uses the host operating system's kernel to run multiple guest instances. Each guest instance (container) gets its own root filesystem, process tree and network stack. Containers let you package up an application with all its parts in an isolated environment; the abstraction is complete enough that there is no need to install the application's dependencies on the host. Docker (first based on LinuX Containers but now powered by runC) separates the application from the infrastructure using container technology, much as VMs separate the operating system from bare metal. It builds a layer of isolation in software that reduces the burden of human communication and certain workflows. A good way to understand containers is to accept that they are not VMs – they are simple wrappers around a single Unix process. A container includes everything it needs to run (runtime, code, libraries and so on).
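A quick way to see this isolation for yourself, assuming a local Docker daemon and the official busybox image:

```shell
# Start an interactive shell in an isolated container; --rm cleans
# the container up when the shell exits.
docker run --rm -it busybox sh
# Inside, the shell sees its own root filesystem, and in its
# isolated PID namespace `ps` shows sh running as PID 1 rather
# than the host's full process table.
```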
Isolation, or variants of it, has been around for a while. Mount namespaces appeared in the 2.4 kernel and user namespaces in 3.8. These technologies allow the kernel to create partitions and isolate PIDs. Linux Containers (LXC) started in 2008, and Docker was introduced in January 2013, with a public 1.0 release in 2014. We are now at version 1.9, which has some new networking enhancements. Docker makes use of Linux kernel namespaces and control groups. Namespaces provide the isolated workspace that we call a container; in effect, they fool the container into believing it has the machine to itself. There is a PID namespace for process isolation, MOUNT for storage isolation and NET for network-level isolation.
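The kernel exposes each process's namespace memberships under /proc, which gives a simple way to see these building blocks on any modern Linux host, no Docker required:

```shell
# Each entry is a namespace this process belongs to; two processes
# that share a namespace see the same inode number here.
ls -l /proc/self/ns/
# The PID-namespace identifier for the current shell, printed in
# the form pid:[<inode>]:
readlink /proc/self/ns/pid
```

Docker creates fresh entries in these same files for every container it starts.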
Containers are placed by schedulers. The task of a scheduler is to start containers on the correct host and then connect them together. It must also manage container failover and handle container scalability when there is too much data for a single instance to process. Popular container schedulers include Docker Swarm, Apache Mesos and Google's Kubernetes. How the correct host is selected depends on the scheduler used. Docker Swarm, for example, has three strategies – spread, binpack and random. Spread selects the node running the fewest containers, disregarding their state. Binpack selects the host with the fewest resources available, i.e. the most packed. Random selects a host at random.
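With the standalone Swarm of this era, the strategy is chosen when the Swarm manager is started. A hedged sketch, where `<cluster-token>` is a placeholder obtained from an earlier `swarm create`:

```shell
# Start a classic Swarm manager that places new containers on the
# most-packed host first. <cluster-token> is a placeholder, not a
# real token.
docker run -d -p 4000:4000 swarm manage -H :4000 \
    --strategy binpack token://<cluster-token>
```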
Starting a container is much faster than starting a VM – a lightweight container can start in as little as 300 ms. Initial tests on Docker revealed that a newly created container from an existing image takes up only 12 kilobytes of disk space, where a VM could take up thousands of megabytes. Containers are so lightweight because they are just reference points to a layered filesystem image. Container deployment is also very fast and network-efficient: less data needs to travel across the network and storage fabrics, so elastic applications that have frequent state changes can be built more efficiently. Both Docker and Linux containers fundamentally change application consumption.
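Both halves of this claim can be seen with everyday commands, assuming a local daemon and any pulled image such as busybox:

```shell
# The layered image that containers will reference:
docker history busybox
# Per-container disk usage; the SIZE column counts only each
# container's thin writable layer, not the shared image layers.
docker ps -s
```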
As a side note, not all workloads are suitable for containers; to support multi-cloud environments, heavyweight workloads such as databases are often put into VMs.
Networking is very different in Docker from what we are used to. Networks are domains that interconnect sets of containers, so if you are given access to a network, you can reach all containers on that network. External access to other networks or containers must be granted explicitly through rules and port mappings. Every network is backed by a driver, be it a bridge or an overlay driver. These Docker-native drivers can be swapped out for any ecosystem driver; the team at Docker views them as pluggable batteries. Docker also uses the concept of scope – local (the default) and global. A local-scope network is visible only on one host, while a global-scope network has visibility across the entire cluster. If your driver is a global-scope driver, your network belongs to the global scope; a local-scope driver corresponds to the local scope.
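A minimal sketch of a user-defined, local-scope network; the names app-net, web and db are illustrative:

```shell
# Create a bridge-backed (local scope) network.
docker network create -d bridge app-net
# Containers attached to app-net can reach each other directly...
docker run -d --name db --net app-net redis
docker run -d --name web --net app-net -p 8080:80 nginx
# ...but from outside the network only the published port 8080 on
# the host is reachable.
```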
docker0 is the default bridge. Docker has now extended this into bundles of multiple networks, each with its own independent bridge. Different bridges cannot talk to each other directly; each is a private, isolated network offering micro-segmentation and multi-tenancy features. The only way for them to communicate is via the host namespace and port mappings that are administratively controlled. Docker multi-host networking is a new feature in 1.9. A multi-host network comprises a number of Docker hosts that form a cluster. Each host runs a number of containers, and the hosts form the cluster by pointing to the same key-value (KV) store (ZooKeeper, for example). The KV store that you point to is the one that defines your cluster. Multi-host networking enables the creation of different topologies and lets a container belong to a number of networks. The KV store may itself be another container, allowing you to stay in a 100% container world.
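A hedged sketch of wiring this up, using ZooKeeper as the KV store and an illustrative host name (kvstore):

```shell
# On every host in the cluster, point the daemon at the same KV
# store; kvstore:2181 is an illustrative ZooKeeper address.
docker daemon --cluster-store=zk://kvstore:2181 \
              --cluster-advertise=eth0:2376
# On any one host, create a global-scope overlay network; it becomes
# visible on all hosts that share the KV store.
docker network create -d overlay multi-host-net
docker run -d --net multi-host-net --name svc1 nginx
```

Containers attached to multi-host-net can then reach each other across hosts, regardless of which physical machine they land on.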