Singularity Containers for HPC & Deep Learning
#Containerization as a concept of isolating application processes while sharing the same operating system (OS) kernel has been around since the beginning of this century. It started its journey from as early as Jails from the @FreeBSD era. Jails heavily leveraged the chroot environment but expanded capabilities to include a virtualized path to other system attributes such as storage, interconnects and users. #SolarisZones and #AIX Workload Partitions also fall into a similar category. Since then, the advent and advancement in technologies such as cgroups, systemd and user-namespaces greatly improved the security and isolation of containers when compared to their initial implementations. The next step was to abstract the complexity involved in containerization by adding user-friendly features such as templates, libraries and language bindings. LXC, an OS container meant to run multiple services, accomplished this task. In the late 2000s, @Docker, an application container that was meant to run a single service, took it one step further by creating an entire ecosystem of developer tools and registries for container images, improving the user experience of containers and making containerization technology accessible to novice Linux users. This led to widespread adoption. The evolving nature of this domain led to the creation of a standardization process called Open Container Initiative in 2015. Containerization drastically improves the “time to results” metric of many applications by eliminating several roadblocks in the path to production. These roadblocks arise from issues pertaining to portability, reproducibility, dependency hell, isolation, configurability and security. For example, applications in fast-paced growth areas, such as high performance computing (HPC) which include verticals and workloads like molecular dynamics, computational fluid dynamics, MIMD Lattice Computation codes, life sciences and deep learning (DL), are very complicated to build and run optimally, due to very frequent updates to the application codes, its dependencies and supporting libraries. Container images include the applications themselves and their development environments, which aids developers in porting their applications from their desktops to datacenters. This also helps with version control, making it easier to understand which versions of the software package and dependencies lead to a particular result. This, in turn, helps to manage dependency hell, which refers to the entire application failing if one of its dependencies fails. Another real-world example is collaboration between researchers, where containers help with reproducibility of the results. In short, instead of spending time debugging the software environment, researchers can focus on the science. Containers are orders of magnitude faster to spin up than virtual machines (VM) since they do not have a hypervisor providing a virtual hardware layer and a guest OS on top of the host OS as an independent entity. For a majority of HPC and DL applications, an un-tuned full VM may experience performance degradation when compared to bare metal. The overhead and performance degradation of a container when compared to an equivalent fully subscribed bare metal workload performance is nonexistent. All of these features have promoted the deep proliferation of container technologies.