Enabling Portable and Secure Computing Environments for High-Performance Workloads
To streamline workflows, enhance productivity, and save time, engineers and developers in enterprises and high-performance computing (HPC) organizations alike often turn to container runtimes. Containerization is helpful because it enables teams to bundle software into a self-contained unit that functions consistently across multiple computing environments. This blog post provides a quick primer on the history of containers and introduces Singularity containers, a powerful tool for deploying performance-intensive workloads.
Why Do Containers Exist?
Before diving into Singularity containers, it’s useful to understand the concept of containers in general and a bit of their history.
Containers are lightweight, portable, and self-contained environments that can run applications and their dependencies. Like virtual machines (VMs), they provide isolated environments, but containers are more efficient because they share the host operating system kernel instead of running a full guest operating system.
Containers exist to solve the problem of software dependencies. Developing and running applications often involves multiple software packages, each with its own set of dependencies. Installing and managing these dependencies can be time-consuming and error-prone, especially when working across different computing environments. Containers allow users to package all the necessary software and dependencies into a single unit that can be easily transported and run in any compatible computing environment.
The origins of containers can be traced back to 1979 with the introduction of the UNIX chroot concept. Specifically, the chroot system call in the UNIX V7 operating system allowed a process and its child processes to change their root directory to a new location in the file system, so that they could see only the files beneath that location. This gave each process an isolated view of the file system. The call was later incorporated into BSD in 1982.
In the early 2000s, various hosting providers developed a shared environment using FreeBSD jails to separate their own services from those of their customers for security and ease of administration. With jails, a system administrator can partition a FreeBSD system into smaller, independent systems, each of which can be configured with its own IP address and unique settings.
Around 2005, Sun Microsystems released Solaris Containers, which combined system resource controls with the boundary separation provided by zones. Zones could also leverage ZFS features such as snapshots and cloning.
In 2006, Google introduced Process Containers, which were designed to limit and isolate resource usage (CPU, memory, etc.) for a set of processes. They were later renamed “Control Groups (cgroups)” and eventually merged into the Linux kernel 2.6.24.
In 2008, IBM engineers created LXC (LinuX Containers), the first complete implementation of a Linux container manager. It leveraged cgroups and Linux namespaces and worked on Linux without requiring any kernel patches.
The natural evolution of these tools and the need to both manage containers and maintain separate instances led to the release of Docker in 2013, which ultimately became the de facto container runtime for enterprise applications.
In 2017, container technology gained significant momentum as leading cloud and container platform companies focused on endorsing Kubernetes, an open-source container scheduler and orchestration tool. This move solidified Kubernetes’ position as the go-to container orchestration technology.
Despite all of the progress and evolution in the container space, however, there was still a gap in addressing the challenges of sharing and running scientific workloads in HPC environments.
Introducing Singularity Containers
Researchers at Lawrence Berkeley National Laboratory created Singularity containers as an open-source project in 2015. Their goal was to bring containers and reproducibility to scientific computing and the high-performance computing (HPC) world, addressing the challenges of sharing and running scientific workloads across different computing environments and multi-tenant systems. Singularity containers are similar to Docker containers but have some key differences that make them better suited for scientific research.
Understanding Singularity vs. Docker
One of the key differences between Singularity and Docker is that Singularity containers are designed to run as a non-privileged user. This is important for security reasons, as running containers as root can potentially give attackers access to the host system. Singularity offers several benefits over Docker:
- Verifiable reproducibility and security. Singularity uses cryptographic signatures, an immutable container image format known as the Singularity Image Format (SIF), and in-memory decryption.
- Integration over isolation by default. Easily make use of GPUs, high-speed networks, and parallel file systems on a cluster or server.
- Mobility of compute. The single-file SIF container format is easy to transport and share.
- A simple, effective security model. You are the same user inside a container as outside and cannot gain additional privileges on the host system by default.
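As a rough illustration of the points above, the following Python sketch builds a `singularity exec` command line and, only when a `singularity` binary and the image are actually present, compares the user ID seen inside the container with the one outside. The image name `analysis.sif` and both helper functions are hypothetical. `exec`, `--nv` (NVIDIA GPU support), and `--bind` are existing SingularityCE options, but their behavior should be checked against the documentation for your installed version.

```python
import os
import shlex
import shutil
import subprocess


def build_exec_argv(image, command, use_gpu=False, binds=()):
    """Return the argv list for `singularity exec`; nothing is executed here."""
    argv = ["singularity", "exec"]
    if use_gpu:
        argv.append("--nv")  # expose the host's NVIDIA GPUs and driver libraries
    for host_path, container_path in binds:
        argv += ["--bind", f"{host_path}:{container_path}"]
    argv.append(image)
    argv += list(command)
    return argv


def uid_inside_matches_host(image):
    """Compare the UID inside the container with the host UID.

    Returns None when Singularity or the image is not available.
    """
    if shutil.which("singularity") is None or not os.path.exists(image):
        return None
    result = subprocess.run(
        build_exec_argv(image, ["id", "-u"]),
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip()) == os.getuid()


if __name__ == "__main__":
    # Hypothetical image, script, and paths, for illustration only.
    argv = build_exec_argv(
        "analysis.sif", ["python3", "train.py"],
        use_gpu=True, binds=[("/scratch/data", "/data")],
    )
    print(shlex.join(argv))
    print(uid_inside_matches_host("analysis.sif"))
```

Keeping the command construction separate from its execution makes it possible to inspect the command on a machine where Singularity is not installed.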
Singularity Image Format (SIF)
The Singularity Image Format (SIF) is a container image format designed specifically for Singularity containers. It allows users to easily share Singularity container files and distribute them across different computing environments. In essence, SIF is a single file and specification that supports the bundling of metadata and enables image signing, verification, and encryption. Sylabs is also planning to introduce support for the encapsulation of OCI data and configurations in SIF.
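A minimal sketch of the signing and verification workflow, assuming a PGP key pair has already been created (for example with `singularity key newpair`). The file name `analysis.sif` and the helper functions are hypothetical. `singularity sign` and `singularity verify` are existing subcommands, though signing may prompt interactively for a key passphrase.

```python
import shutil
import subprocess


def sign_and_verify_commands(image):
    """Return the two command lines that sign a SIF file and then verify it."""
    return [
        ["singularity", "sign", image],    # attach a signature to the SIF file
        ["singularity", "verify", image],  # check the signature(s) on the file
    ]


def sign_and_verify(image):
    """Run both steps; return False if Singularity is not installed."""
    if shutil.which("singularity") is None:
        return False
    for argv in sign_and_verify_commands(image):
        # check=True raises CalledProcessError if signing or verification fails
        subprocess.run(argv, check=True)
    return True


if __name__ == "__main__":
    for argv in sign_and_verify_commands("analysis.sif"):
        print(" ".join(argv))
```

Because the signature travels inside the same single file as the container, a recipient can run the verify step on the image they received without any separate signature file.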
Use Cases for Singularity Containers
Users rely on Singularity for a wide range of use cases, including the following.
High-performance computing
Singularity is used extensively in HPC environments to provide a consistent and portable software environment across different clusters and architectures. Singularity containers natively support resource managers, InfiniBand, MPI, GPUs, parallel file systems, and more on HPC clusters, ensuring that the application and its dependencies are properly configured.
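One common way to combine MPI with Singularity is to let the host's MPI launcher start one container process per rank. The sketch below only assembles such a command line. The image name, program path, and helper function are hypothetical, and it assumes the MPI library inside the container is compatible with the one on the host.

```python
import shlex


def build_mpi_launch(image, program, ranks, launcher="mpirun"):
    """Return argv for launching `ranks` MPI processes, each inside the container.

    `launcher` is the host-side launcher, e.g. "mpirun" or Slurm's "srun";
    both accept `-n` for the number of processes/tasks.
    """
    if ranks < 1:
        raise ValueError("ranks must be at least 1")
    return [launcher, "-n", str(ranks), "singularity", "exec", image, program]


if __name__ == "__main__":
    # Hypothetical image and binary path, for illustration only.
    print(shlex.join(build_mpi_launch("solver.sif", "/opt/solver/bin/solve", 4)))
    print(shlex.join(build_mpi_launch("solver.sif", "/opt/solver/bin/solve", 4, launcher="srun")))
```

In this arrangement the resource manager and launcher stay on the host, while the application and its libraries come from the container image.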
Artificial intelligence and machine learning
AI practitioners use Singularity for a variety of workloads, including computer vision, natural language processing, and machine learning. By providing a way to package GPU-enabled AI models and their dependencies, Singularity makes it easy to deploy them on different platforms and architectures. This can help simplify the process of deploying models in production.
Bioinformatics
Singularity containers are used in bioinformatics to facilitate collaboration and reproducibility of analyses, as well as to overcome challenges such as dependency management, software installation, and cross-platform compatibility. For example, users can rely on Singularity containers to run bioinformatics tools such as BLAST, Bowtie2, BWA, and GATK on different platforms while reducing issues associated with complex overlapping dependencies.
Quantum circuit simulation
Quantum circuit simulators mimic the quantum phenomena of superposition and entanglement. These simulators help researchers from various domains and sectors develop and test algorithms for quantum computers. Researchers use the reproducibility, portability, and isolation features of Singularity containers to accelerate quantum simulation workloads.
Electronic Design Automation (EDA)
Singularity containers can help decouple EDA toolchains from operating systems. EDA tools often require specific operating system versions and libraries, which can limit their compatibility and usability across different platforms. Singularity can encapsulate these tools and their dependencies in a portable container that can run on any Linux system where Singularity is installed.
Streamline Your HPC Workflows with Singularity and SIF
Singularity and the SIF file format offer a powerful opportunity to streamline your workflows, increase productivity, and collaborate more effectively with other teams while providing verifiable reproducibility and security. Whether you’re just getting started with containers or looking for a more advanced tool to help you take your work to the next level, Singularity is worth exploring. At Sylabs, we offer multiple versions of Singularity, including SingularityCE, SingularityPRO, and Singularity Enterprise, that are designed to support the gamut of organizational sizes and needs.
If you’re interested in getting started with Singularity containers, check out SingularityCE, the free and open-source version of the Singularity runtime.