Any developer who works on several non-trivial projects struggles with maintaining a development environment that has to be different for each of them.
For example, suppose we have three different projects. The first is an outdated, almost unmaintainable example of spaghetti code. It depends on files whose locations are hardcoded and scattered across the whole file system. To make things even worse, it requires the insecure and unmaintained PHP 5.2, plus a few rarely used PHP extensions. The second project is fresh meat: it targets the newest PHP 5.6, keeps some non-trivial business logic in the database, and uses PostgreSQL as its backend. The third project uses Drupal, so we need a MySQL database, a memcache server as the cache backend, and PHP in version 5.4.
How do we switch smoothly between these projects? How do we avoid juggling dependencies and configuration, and memorizing every step required to start each project? What if we need those environments only for a short period of time, for example when we are just trying to debug some critical bug?
A quite natural solution is virtualization. Thanks to the VirtualBox project you can create a separate, independent virtual machine for each of these projects and call the problem solved. Every major OS has a few popular platforms available: VirtualBox, VMware, Parallels for Apple users, and Microsoft's Virtual PC.
Unfortunately, these solutions aren't perfect: performance is much worse than on bare metal, especially I/O performance.
What can we do about that? There are a few alternatives, and in this entry I'll focus on Docker, a performant and lightweight solution that uses operating-system-level virtualization (containers), with barely any performance overhead.
What is this "Docker" thing?
Docker is a tool that uses the process isolation mechanism delivered by the Linux kernel (Linux Containers, LXC) and adds a high-level interface to that low-level machinery. This interface lets you run and distribute applications easily across many different machines.
It might sound complicated, but it isn't :) As usual in a Linux-based OS, it's a CLI tool; if you are a click addict, it's not for you :)
OK, but what makes this tool different?
It uses Linux containers to run processes independently from the base operating system (in a sandbox). Newer versions of Docker (> 0.9) can use different backends, but I haven't verified those alternative solutions.
Containers: in big simplification, a mechanism for boxing an application together with its dependencies into a single package that can be distributed between different machines. For example, a container can include a base operating system, a LAMP stack, and all the files required to run our lovely Drupal.
It uses a file system (AUFS) that supports the copy-on-write feature. This means that when you create a new container, Docker doesn't copy the whole file system structure, only those parts that will be changed. Additional disk resources are consumed lazily (only when required). Thanks to that, creating a new container is a matter of seconds.
You can build containers on top of each other: using one container (for example with a Drupal distribution) as a base, you can create a second one that is mostly identical to its base, but with a small set of modifications.
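As a quick sketch of that idea, you can start an interactive shell in a base container, make your modifications, and save the result as a new image with `docker commit` (the image name `my-drupal-base` is illustrative, and `<container id>` stands for the ID printed by `docker ps`):

```
$ docker run -i -t my-drupal-base /bin/bash
  (install the few extra packages we need, then exit)
$ docker commit <container id> my-drupal-variant
```

Thanks to copy-on-write, the new image shares all unchanged layers with its base and stores only the difference.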
Fast shared directories. My experience with VirtualBox was painful; the performance of shared disk resources was very bad.
An important fact is that a container is not the counterpart of a virtual machine. It's only a box of files needed to execute an application. When we run a container, the Linux kernel isolates the application executed inside it, so it can work with a completely different configuration than our host system. The only cost we pay is the resources required by the running application plus some additional disk space used by its libraries. Crucially, we don't run a second virtual machine with a full operating system, only the application and its direct dependencies.
This means that, for example, for Drupal all we run is the WWW server and the database inside a container. At any moment we can stop the container and even purge it completely from our filesystem. This approach has limitations: in containers we can only run a Linux stack; for other operating systems we still need full virtualization.
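The whole lifecycle of such a throwaway environment looks roughly like this (the container name `drupal-db` is illustrative):

```
$ docker run -d --name drupal-db mysql    # start the database in the background
$ docker ps                               # list running containers
$ docker stop drupal-db                   # stop it at any moment
$ docker rm drupal-db                     # and purge it from the filesystem
```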
In my daily routine such a solution works perfectly and has successfully replaced the Vagrant + VirtualBox pair. The problems I met with those two tools are gone for good; for my use cases Docker is much better.
Assuming we use one of the popular Linux distributions, installation is very easy. The Docker maintainers provide a script that takes care of configuring and installing all dependencies.
$ sudo -s
# apt-get install wget
# wget -qO- https://get.docker.com/ | sh
After executing the commands above, the script will add a repository to your system's repository list and will download and install the newest stable version of Docker. The basic user interface is the docker command. To execute it without being logged in as root, you have to add your working account to the docker group. To do so (under Ubuntu), run:
$ sudo usermod -aG docker [your account name]
And that ends the Docker installation and configuration process.
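To verify that everything works, log out and back in (so the new group membership takes effect) and ask the freshly installed daemon for its version, then run a trivial command inside a container:

```
$ docker version
$ docker run ubuntu /bin/echo 'Hello from a container'
```

The first run downloads the ubuntu base image; subsequent runs start in seconds.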
Unfortunately, installation on non-Linux operating systems is more complicated and out of the scope of this blog post. If you are interested, please check the Docker project documentation.
This blog entry contained only a very general description of the idea behind Docker. In the next parts of this series I'll show how to use the tool in practice and cover other aspects of Docker.