December 5th, 2022 12:00

What is Container Storage Interface (CSI) and how does Dell use it?

The container storage interface (CSI) is a specification aimed at providing an industry standard for how storage providers can interoperate with container orchestration (CO) systems. 

So, what does this mean? 
 
Before we dive in, a very brief bit of history will help paint the whole picture. We are going to start in 2013, even though the concept of Linux containers for isolation had been around for some time before that. Around then, the use of Linux containers was popularized by a developer-oriented tool called Docker. The core value proposition of Docker was that it made it simple for developers to isolate their application dependencies, build their applications into lightweight Linux container images and distribute them. This, along with a shift toward DevOps, simplified the development process enough that organizations could use a streamlined pipeline to get their applications into production. Shortly after this, container orchestration systems arrived. Apache Mesos, Mesosphere and Kubernetes were the dominant options in the mid-to-late 2010s; however, it was Kubernetes that came out on top. Why does this matter? Well, containers were originally thought of as stateless applications that kept data only in tmpfs (temporary filesystems) and volatile memory, and did not need to persist data beyond the container lifecycle.

The reality was that as adoption of container orchestration systems like Kubernetes grew, more people came to see them as the way to run all types of applications, not just stateless ones. Stateful applications quickly became first-class applications running in containers. The challenge now was how to provide storage resources to containers so that these applications could persist data beyond container lifecycle events like start, stop, failure and rescheduling. There were several approaches to providing storage to containers prior to the creation and adoption of CSI.

Runtime Volumes: Runtime volumes, of which Docker Volumes are the best-known implementation, are data locations that live outside the container's own filesystem and are managed by the container runtime itself. Data stored in them survives container lifecycle events, but they were not really meant for enterprise workflows: they were not designed for substantial amounts of data, and they lacked features like failover to other nodes, replication, deduplication, encryption and more.

Host Path / Bind Mounts: Bind mounts were often used as a straightforward way to provide persistence because they let users point a container at a location on the host system that could store much more data. Bind mounts were also easy to understand and use, and existing storage locations such as an attached and mounted iSCSI LUN could be bound into a container. This was great but also came with limitations. Container orchestrators manage the lifecycle of containers and often reschedule them onto other hosts for availability. When that happened, the data would be left behind on its original host, exposed, vulnerable and a bit useless, while the rescheduled container sat in an infinite crash loop because it could not find its data.
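
The rescheduling problem is easy to see in a toy model. The sketch below is purely illustrative: it treats each node's local disk as a Python dictionary, and the node names and path are made up for the example.

```python
# Toy model: each node has its own local filesystem (a dict of path -> data).
# Node names and paths are hypothetical, chosen only for illustration.
nodes = {
    "node-a": {},
    "node-b": {},
}


def write_via_bind_mount(node, host_path, data):
    """A container writing to a bind mount really writes to that node's local disk."""
    nodes[node][host_path] = data


def read_via_bind_mount(node, host_path):
    """Return the data at host_path on this node, or None if it is not there."""
    return nodes[node].get(host_path)


# The container starts on node-a and persists data through a bind mount.
write_via_bind_mount("node-a", "/mnt/appdata/db.sqlite", "customer records")

# The orchestrator reschedules the container onto node-b.
data_after_reschedule = read_via_bind_mount("node-b", "/mnt/appdata/db.sqlite")
data_left_behind = read_via_bind_mount("node-a", "/mnt/appdata/db.sqlite")

print(data_after_reschedule)  # None: node-b has never seen this path
print(data_left_behind)       # the data is still sitting on the original host
```

Nothing in the bind mount itself knows that the container moved, which is why the data stays stranded on the first host.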

Storage Plugins: Storage plugins were developed to overcome some of the issues with host paths and runtime volumes described above. They allowed software automation to manage the creation, deletion, attachment and mounting of volumes on the hosts where containers ran. Plugins could integrate with the container orchestration system of choice and manage storage automatically through a set of APIs or an SDK. The issue was that each orchestration system (Docker, Docker Swarm, Mesos, Kubernetes) had its own way of doing this. For Kubernetes, these were called in-tree plugins, because vendors contributed code upstream to the main Kubernetes repository. This led to a bloated codebase that was hard to maintain, and vendors could not keep their own release schedules for their plugins.

Enter Container Storage Interface 

CSI’s roots run deep in Dell’s history. The community around Kubernetes decided it was time for a universal specification that provides interoperability between container orchestration systems and storage plugin providers. CSI currently aims to standardize dynamic provisioning and deprovisioning of a volume, attaching a volume to and detaching it from a host, mounting and unmounting a volume on a host, consumption of both block and mountable volumes, creating and deleting a snapshot, and provisioning a new volume from a snapshot. CSI benefits multiple container orchestrators; however, we saw the following benefits to the Kubernetes ecosystem:

  • Provides a standard specification for how all container orchestrators can use storage.
  • Decouples the Kubernetes codebase and releases from plugins, which allows independent release cycles and support from plugin providers.
  • Allows the Kubernetes ecosystem to focus on first-class Kubernetes components and an API for storage, not the plugins themselves.
  • Lets issues and bugs related to storage be fixed out-of-band from the upstream Kubernetes codebase.
  • Improves the overall security posture of the upstream Kubernetes codebase.
  • Lowers the barriers to implementing new CSI-compatible drivers.

In fact, this has had such an impact that Kubernetes has an official effort to migrate all in-tree plugins to CSI, and the core CSI migration feature reached GA in Kubernetes 1.25.
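
To make the operations CSI standardizes more concrete, here is a minimal in-memory sketch. The method names mirror RPC names from the CSI specification (CreateVolume, ControllerPublishVolume, NodePublishVolume, CreateSnapshot and so on), but everything else is an assumption made for illustration: a real driver serves these calls over gRPC with structured request and response messages, and the attach-before-mount check here is a simplification.

```python
import itertools


class ToyCsiDriver:
    """In-memory stand-in for a storage plugin (illustrative only).

    Method names follow RPC names from the CSI specification; arguments
    and return values are simplified assumptions, not the real messages.
    """

    def __init__(self):
        self._ids = itertools.count(1)
        self.volumes = {}    # volume_id -> {"size_gib", "node", "target"}
        self.snapshots = {}  # snapshot_id -> source volume_id

    def create_volume(self, size_gib, from_snapshot=None):  # CreateVolume
        if from_snapshot is not None and from_snapshot not in self.snapshots:
            raise KeyError(f"unknown snapshot {from_snapshot}")
        volume_id = f"vol-{next(self._ids)}"
        self.volumes[volume_id] = {"size_gib": size_gib, "node": None, "target": None}
        return volume_id

    def delete_volume(self, volume_id):  # DeleteVolume
        self.volumes.pop(volume_id, None)

    def controller_publish_volume(self, volume_id, node):  # attach to a host
        self.volumes[volume_id]["node"] = node

    def controller_unpublish_volume(self, volume_id):  # detach from a host
        self.volumes[volume_id]["node"] = None

    def node_publish_volume(self, volume_id, target_path):  # mount on the host
        if self.volumes[volume_id]["node"] is None:
            raise RuntimeError("volume must be attached before it is mounted")
        self.volumes[volume_id]["target"] = target_path

    def node_unpublish_volume(self, volume_id):  # unmount from the host
        self.volumes[volume_id]["target"] = None

    def create_snapshot(self, volume_id):  # CreateSnapshot
        if volume_id not in self.volumes:
            raise KeyError(f"unknown volume {volume_id}")
        snapshot_id = f"snap-{next(self._ids)}"
        self.snapshots[snapshot_id] = volume_id
        return snapshot_id

    def delete_snapshot(self, snapshot_id):  # DeleteSnapshot
        self.snapshots.pop(snapshot_id, None)


# Walk through the lifecycle: provision, attach, mount, snapshot, clone.
driver = ToyCsiDriver()
vol = driver.create_volume(size_gib=8)
driver.controller_publish_volume(vol, node="worker-1")
driver.node_publish_volume(vol, target_path="/var/lib/app/data")
snap = driver.create_snapshot(vol)
clone = driver.create_volume(size_gib=8, from_snapshot=snap)
```

Because the orchestrator only ever calls this fixed set of operations, any storage vendor that implements them can plug in without changes to the orchestrator itself.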

How does Dell Technologies use CSI? 

Dell Technologies is no stranger to CSI or to implementing it across its portfolio. A quick search on Dell’s GitHub shows a vast array (pun intended) of support for the container storage interface. You can find links to all drivers, as well as our Container Storage Modules enhancements to CSI, in this repository.

Dell provides CSI drivers for the following storage platforms: PowerFlex, PowerScale, PowerStore, PowerMax and Unity. This means Kubernetes can be used with these storage platforms to create, delete, attach, consume, mount and snapshot volumes for your Kubernetes application pods.
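
From the application side, consuming such a volume usually comes down to a PersistentVolumeClaim that names a StorageClass backed by the installed CSI driver. The sketch below builds one as a plain dictionary and prints it as JSON, which Kubernetes accepts alongside YAML. The StorageClass name, claim name and size are hypothetical placeholders, not values taken from any Dell driver.

```python
import json

# Hypothetical StorageClass name; an administrator would create a StorageClass
# backed by whichever CSI driver is installed in the cluster.
STORAGE_CLASS = "example-csi-storage"


def build_pvc(name, size, storage_class):
    """Return a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


pvc = build_pvc("app-data", "8Gi", STORAGE_CLASS)
print(json.dumps(pvc, indent=2))
```

When a claim like this is created, Kubernetes asks the driver behind the StorageClass to provision the volume, and the pod that references the claim gets it attached and mounted.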

CSI moves as fast as it can while carefully planning feature updates and new releases. However, there are Kubernetes data management problems that CSI does not currently tackle. In other words, CSI gets us only so far in managing storage. It does not provide APIs for operations like replication, encryption or backups, nor does it fix issues during failure scenarios such as a network partition, node failure or disk failure. That is where vendors need to specialize and add value on top of CSI, and for Dell Technologies, this is Container Storage Modules.

What are Container Storage Modules (CSM)?

Think of Container Storage Modules as an enhanced layer, delivered by Dell, that provides advanced data management capabilities on top of what CSI can provide. CSM provides the following enhancements for CSI with Dell storage:

Authorization: Provides storage and Kubernetes administrators with the ability to apply RBAC for Dell CSI Drivers. It does this by deploying a proxy between the CSI driver and the storage system to enforce role-based access and usage rules. 
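
The proxy idea can be illustrated with a small sketch. This is not how CSM Authorization is implemented; the class, the role format and the quota rule are all hypothetical, and the code only shows the general pattern of checking a role and a usage limit before a request is forwarded to the storage system.

```python
class QuotaExceeded(Exception):
    """Raised when a tenant would go over its usage limit (hypothetical rule)."""


class ToyAuthorizationProxy:
    """Sits between a driver and a storage backend and enforces simple rules."""

    def __init__(self, backend_create, roles):
        # roles maps tenant -> {"can_create": bool, "quota_gib": int}
        self._backend_create = backend_create
        self._roles = roles
        self._used_gib = {}

    def create_volume(self, tenant, size_gib):
        role = self._roles.get(tenant)
        if role is None or not role["can_create"]:
            raise PermissionError(f"{tenant} may not create volumes")
        used = self._used_gib.get(tenant, 0)
        if used + size_gib > role["quota_gib"]:
            raise QuotaExceeded(f"{tenant} would exceed {role['quota_gib']} GiB")
        # Only requests that pass both checks reach the storage system.
        volume_id = self._backend_create(size_gib)
        self._used_gib[tenant] = used + size_gib
        return volume_id


created = []


def fake_array_create(size_gib):
    """Stand-in for the storage system; records what it was asked to create."""
    created.append(size_gib)
    return f"vol-{len(created)}"


proxy = ToyAuthorizationProxy(
    fake_array_create,
    {"team-a": {"can_create": True, "quota_gib": 10}},
)
first = proxy.create_volume("team-a", 8)  # within quota, forwarded
try:
    proxy.create_volume("team-a", 8)      # 16 GiB total would exceed 10 GiB
    second_allowed = True
except QuotaExceeded:
    second_allowed = False
```

The rejected request never reaches the backend, which is the point of placing the checks in a proxy rather than in each application.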

Encryption: Provides the capability to encrypt user data residing on volumes. Volume data is encrypted on the Kubernetes worker host running the application workload, transparently for the application. 

Resiliency: Makes Kubernetes applications, including those that use persistent storage, more resilient to various failures.

Replication: Brings the replication and disaster recovery capabilities of Dell storage arrays to Kubernetes clusters. It helps you replicate groups of volumes using the native replication technology available on the storage array, and gives you a way to restart applications after both planned and unplanned migrations.

Application Mobility: Gives users the ability to move their stateful application workloads and application data offsite and to other clusters, either on-premises or in the cloud.

I hope it is a bit clearer how the evolution of CSI has enabled container orchestration systems to work more seamlessly with storage providers. The community benefits from the overall standardization of storage APIs, while vendors can provide support and release cycles separate from those of the orchestrators. Dell Technologies sees Kubernetes as a future operating plane that many organizations are standardizing on, and is continuing to invest in capabilities on top of Kubernetes and CSI, such as the ones in CSM, that our community of customers and users can take advantage of.

Come join the conversation!  
 
Community Forum 

Join Slack 

Discord Server 

Dell Technologies Developer Community 

December 8th, 2022 01:00

Nice writeup Ryan