Data protection for containers: Why, and how to do Docker backup
Containers have been around for many years, but the use of container technology has been popularised in the last five years by Docker. The Docker platform provides a framework to create, configure and launch applications in a much simpler way than using the native features of the Linux and Windows operating systems on which those applications run.
An application is a set of binary files that run on top of an operating system. The application makes calls via the operating system to read and write data to persistent storage or to respond to requests from across the network.
Over the past 15 years, the typical method of application deployment has been to run applications within a virtual machine (VM). VMs take effort to build and manage. They need patching and have to be upgraded. Virtual machines can also attract licensing charges, such as operating system and application licences per VM, so they have to be managed efficiently.
Containers provide a much more lightweight way to run applications. Rather than dedicating an entire VM to each application, containers allow multiple applications to run on the same operating system instance, isolated from each other by segregating the set of processes that make up each application.
Containers were designed to run microservices, be short-lived and not require persistent storage. Data resiliency was meant to be handled by the application, but this has proved impractical. As a result, containers can now be easily launched with persistent storage volumes or made to work with other forms of shared storage.
Container data protection
A container is started from a container image that contains the binary files needed to run the application. At launch time, parameters can be passed to the container to configure components such as databases or network ports. This includes attaching persistent data volumes to the container or mapping file shares.
In the world of virtual machines, the VM and the data are backed up.
Backing up the virtual machine itself is mainly a matter of convenience. For example, if the VM is corrupted or individual files are deleted, they can be recovered. Alternatively, the whole VM and its data can be brought back quickly. In practice though, with a well-configured system, it may be quicker to rebuild the VM from a gold master and configure it using automation or scripts.
With containers, rebuilding the application from code is even quicker, making it unnecessary to back up the container itself. In fact, because of the way containers are started by platforms such as Docker, the effort to recover a container backup would probably be much greater than simply starting a new container from the image. The platform simply isn’t designed to recover pre-existing containers.
So, while a running container instance doesn’t need to be backed up, the base image and configuration data do. Without them, the application can’t be restarted. The same applies to implementing a disaster recovery strategy. Restarting an application elsewhere (eg, in the public cloud or another datacentre) also needs access to the container image and runtime configuration. These components need to be highly available and replicated or accessible across locations.
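To make the point about images and runtime configuration concrete, the following is a minimal Python sketch rather than a definitive procedure. It assumes the launch parameters for one application are kept in a small JSON file that can be stored and replicated alongside the exported image. The image name, container name, environment variable, port and volume are all hypothetical, and the script only builds the `docker save` and `docker run` command lines, it does not execute them.

```python
import json
import shlex

# Hypothetical runtime configuration for one containerised application.
# Every name and value below is illustrative, not taken from a real system.
EXAMPLE_CONFIG = {
    "image": "registry.example.com/shop/db:1.4",
    "name": "shop-db",
    "env": {"POSTGRES_DB": "shop"},
    "ports": {"5432": "5432"},  # host port -> container port
    "volumes": {"shop-db-data": "/var/lib/postgresql/data"},  # volume -> mount path
}


def run_command(config):
    """Build the `docker run` argument list that recreates the container."""
    args = ["docker", "run", "--detach", "--name", config["name"]]
    for key, value in sorted(config.get("env", {}).items()):
        args += ["--env", f"{key}={value}"]
    for host_port, container_port in sorted(config.get("ports", {}).items()):
        args += ["--publish", f"{host_port}:{container_port}"]
    for volume, mount_path in sorted(config.get("volumes", {}).items()):
        args += ["--volume", f"{volume}:{mount_path}"]
    args.append(config["image"])
    return args


def image_save_command(config, archive_path):
    """Build the `docker save` argument list that exports the image to a tar file."""
    return ["docker", "save", "--output", archive_path, config["image"]]


def write_config(config, path):
    """Store the runtime configuration as JSON so it can be backed up."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2, sort_keys=True)


def read_config(path):
    """Load a previously stored runtime configuration."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe to try.
    print(shlex.join(image_save_command(EXAMPLE_CONFIG, "/backup/shop-db.tar")))
    print(shlex.join(run_command(EXAMPLE_CONFIG)))
```

With the image archive and the JSON file replicated to a second location, the same `docker run` line can be regenerated there. The contents of the persistent volume are application data and would still need their own backup.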