In my last post, I mentioned that a good use case for containers is managing application deployments. I’m going to go a step further here and state that containers are the next evolution of application packaging and application run time separation.
Packaging in the time of Docker
In the beginning, an application was created from code dumped into file system locations by highly sophisticated mechanisms like copy and paste. The operating system had utilities to generate the executable binary and provide the operational environment. There was some separation of responsibility: while the application knew how to interact with the operating system, it was primarily the job of the operating system to handle things like devices. Perhaps the application included some other reusable code, delivered as a library by the same sophisticated copy / paste means. Any change to the operating system or application meant rebuilding everything.
The next advance in application packaging was “the archive”, long live tarballs! This made the exact transcription of an application’s code to another system easier. The creation of common executable formats also made it possible to move a binary built on one system to another (identical) system. The code was still generally carried along, in case there were some unaccounted-for differences. Here start the ideas of application packaging.
We can zip through the years to the “packaging wars” of Linux, Java, and other platforms and languages. What we see as a common thread is an increase in what needs to be packaged with “the application”: the creation of ways to track dependencies, the creation of frameworks to provide dependencies, etc. We see larger definitions of “the application”, since we now need to take into account not just the developer’s code but the supporting libraries as well. Middleware frameworks are created to increase re-usability and reduce duplication while providing a way to run more applications concurrently. Build systems can track, compose, and deliver complete artifacts. We still have some dependencies on the operating environment, but those provide common structures and highly desirable features like ABI guarantees within major versions.
As applications change, conflicts can arise in versions of dependencies. Patches applied to frameworks can have unexpected side effects for some (but not all) hosted applications. Our operating environments are software all the way down, so this is all inevitable. Docker containers expand the boundaries of “the application” one more ring. Using the model of a single application service in a container, we can create a PostgreSQL application package that depends only on very specific, very common capabilities of an operating environment, for example running in a Fedora image on a RHEL 7 server. Fedora normally ships a much newer kernel than RHEL, and the container runs on the host’s RHEL kernel, but since we include the needed system level libraries in the container, our PostgreSQL will still run as expected. If we close the capabilities gap and run PostgreSQL in a RHEL 7 container on a RHEL 7 server, we can create a smaller container footprint, as we don’t need to provide as many common resources.
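To make the dependency conflict concrete, here is a minimal sketch in Python. The application names and library versions are made up for illustration. It checks which libraries are pinned at different versions by applications sharing one operating environment, then runs the same check with each application packaged alone, the way a single-service container would carry its own copies.

```python
from collections import defaultdict


def find_conflicts(apps):
    """Return {library: {version: [apps]}} for libraries pinned at more than one version."""
    wanted = defaultdict(lambda: defaultdict(list))
    for app, deps in apps.items():
        for lib, version in deps.items():
            wanted[lib][version].append(app)
    # A library is only a problem when the apps disagree on its version.
    return {lib: dict(versions) for lib, versions in wanted.items() if len(versions) > 1}


# Hypothetical applications and the library versions they were built against.
apps = {
    "billing": {"libssl": "1.0.1", "libxml2": "2.9.1"},
    "reports": {"libssl": "1.0.2", "libxml2": "2.9.1"},
}

# Everything installed side by side on one host: the two apps disagree on libssl.
shared_host = find_conflicts(apps)

# One application per package: each carries its own libraries, so nothing collides.
per_container = {name: find_conflicts({name: deps}) for name, deps in apps.items()}

print(shared_host)
print(per_container)
```

The check itself is trivial. The point is where the boundary sits: once the libraries move inside the package, the shared operating environment no longer has to satisfy both applications at once.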
So Docker (and really any other format) containers provide “universal” application packaging by expanding the definition of what is included in the box when we say “application”.
Separation in the time of Docker
The early applications were tied to specific systems for the simple reason that they were built there. But we could run many applications on a system, since we had fancy time sharing means to do so. Even with portable binaries, it was common to run multiple applications and just understand that sometimes your application ran slower as it competed for resources. As businesses came to rely on applications and performance mattered more, we began to separate applications on physical resources to guarantee performance and, almost as critically, to ensure that the poor behavior of one application didn’t have a negative impact on another. This is the original “noisy neighbor” problem. It begins the expansion of the server room and leads to the underutilized bare metal deployments we know and love.
While it guarantees performance, physical separation isn’t the most cost effective solution. Underutilization costs real dollars in both acquisition and maintenance. Virtualization is our attempt at maintaining that application run time separation while solving the utilization problem. The ability to constrain a server by software to certain utilization levels allows us to bring noisy neighbors together with some sound proofing. Utilization starts going back up, and densities increase, making our money work harder in the IT space. In fact, virtualization works so well at reducing costs that we begin to overcommit resources, and virtual servers begin spreading like kudzu.
The primary problem of virtual machine sprawl is one of management. When system resources were expensive, it was easier to manage the operating environment and applications of each individual system. We may need a few tools, but it’s reasonably simple. There may even be some built in redundancy to handle exactly these management issues. When the contents of the virtual system don’t change, but creating one becomes cheaper, the idea that we can effectively manage individual operating systems and applications begins to strain and eventually breaks. Enter the cloud.
The primary difference between virtualization and cloud computing is the treatment of the virtual resource. Cloud is based on the idea that virtual machines have become so cheap, it is a better use of administrator time to replace an operating system and / or application rather than attempt to patch and maintain it in situ. Docker is born from the same mental model of management, and a container provides a means of separation similar to a virtual machine. These are the days of the “immutable infrastructure” components.
Docker containers can restrict access to resources like memory and CPU on RHEL via cgroups. Containers provide unique views of the host to the contained application, like blinders restricting a horse. The Linux capability mechanisms can remove the ability for containers to do things that can be done on a full operating system, like mount a file system. Containers don’t have the full emulation path overhead that enterprise virtualization platforms work so hard to mitigate; there isn’t a virtual NIC, there’s a unique representation, by the kernel, of the networking stack. We don’t have a full boot environment either, remember, just the expanded new boundaries of our application.
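As a rough illustration of those knobs, the sketch below assembles a `docker run` command line in Python without executing it. The helper name `docker_run_args` and the image name `example/app` are hypothetical. The flags themselves (`--memory`, `--cpu-shares`, `--cap-drop`) are standard `docker run` options: the first two are enforced through cgroups, the last through Linux capabilities.

```python
import shlex


def docker_run_args(image, memory=None, cpu_shares=None, drop_caps=()):
    """Build (but do not run) a docker run command with resource and capability limits."""
    args = ["docker", "run", "--rm"]
    if memory is not None:
        # Hard memory limit for the container, e.g. "512m".
        args += ["--memory", memory]
    if cpu_shares is not None:
        # Relative CPU weight when containers compete for CPU time.
        args += ["--cpu-shares", str(cpu_shares)]
    for cap in drop_caps:
        # Remove a Linux capability from the container's processes.
        args += ["--cap-drop", cap]
    args.append(image)
    return args


# "example/app" is a placeholder image name, not a real image.
cmd = docker_run_args("example/app", memory="512m", cpu_shares=512, drop_caps=["ALL"])
print(shlex.join(cmd))
```

Note that Docker’s default capability set already leaves out `CAP_SYS_ADMIN`, which `mount` requires, so the mount restriction applies even if you drop nothing yourself. Dropping `ALL` simply goes further for applications that need no special privileges.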
Putting it together operationally
What happens when you combine the expanded definition of what it means to package an application with the knobs and know-how to provide good separation between applications? You get an operating platform that allows for independent maintenance of extremely lightweight operating systems. You get complete control of application internals in the hands of the developers. You get a means to run applications in a wide range of environments. You get the ability to further leverage virtualization as a cost improvement, while gaining manageability by moving toward interchangeable parts.
While there are new tools and new work flows, the fundamental thing to understand is that we haven’t really changed the overall process. Developers still build and maintain applications; it’s just that the things they used in the past to build apps are now in the packaging instead of being operational runtime problems. Finding and installing appropriate versions of shared libraries isn’t an interesting problem to spend time and money on. Operations engineers still focus on delivering a stable, resilient environment for applications, but can do so without focusing on the care and feeding of each and every server. Applying patches to individual OS instances isn’t an interesting problem to spend time and money on.
And the more interesting problems you are solving, the more valuable you are to your organization.