A storage pool is an elastic collection of physical drives belonging to a group of standard x86-64 servers running ScaleIO. Storage pools are software defined and provide enormous operational and lifecycle advantages over traditional block storage. Operating systems and hypervisors need data storage. SANs abstract block storage using LUNs in a storage array. Arrays are nice because they provide reliability, performance, and capacity aggregation. LUNs are nice because they are OS agnostic and can support multiple operating systems and hypervisor clusters. ScaleIO provides a SAN alternative that abstracts data storage across a group of standard servers using what is called a storage pool.
A storage pool is a set of hard disks or flash media distributed across a group of x86-64 servers connected over Ethernet. ScaleIO takes the local storage from the servers, abstracts it, pools it, and creates a redundant, high-performance, distributed, and shared storage system. This allows the applications on all the servers to utilize the storage on all the servers. Storage pool provisioning is managed through ScaleIO volumes. Volumes are analogous to LUNs: they are OS agnostic and support all the major operating systems and hypervisors. ScaleIO storage pools provide the benefits of a shared external storage array, but with simplified lifecycle operations, superior scalability and flexibility, and no expensive or proprietary hardware and networking.
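To make that abstraction concrete, here is a minimal sketch (illustrative only; the class and field names are invented and do not reflect ScaleIO's internals) of a pool aggregating drives contributed by many servers, and of a volume being carved from the pool as a whole rather than from any single box:

```python
from dataclasses import dataclass, field

@dataclass
class Drive:
    server: str        # the x86-64 node contributing the drive
    name: str          # e.g. "ssd0"
    capacity_gb: int

@dataclass
class StoragePool:
    """Toy model: an elastic collection of drives pooled across servers."""
    name: str
    drives: list = field(default_factory=list)
    provisioned_gb: int = 0

    def total_capacity_gb(self) -> int:
        # Capacity (and performance) aggregates across every contributing server.
        return sum(d.capacity_gb for d in self.drives)

    def create_volume(self, size_gb: int) -> dict:
        # A volume is carved from the pool, not from any one server's drives,
        # so it is OS agnostic -- much like a LUN exported from an array.
        # (Protection overhead is ignored here for simplicity.)
        assert self.provisioned_gb + size_gb <= self.total_capacity_gb()
        self.provisioned_gb += size_gb
        return {"pool": self.name, "size_gb": size_gb}

# Six servers each contribute four 960 GB SSDs to one shared pool.
pool = StoragePool("pool-a")
for n in range(1, 7):
    for i in range(4):
        pool.drives.append(Drive(f"node{n}", f"ssd{i}", 960))

vol = pool.create_volume(2000)          # usable by any OS or hypervisor in the cluster
print(pool.total_capacity_gb(), vol)    # 23040 {'pool': 'pool-a', 'size_gb': 2000}
```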
In this video, we will illustrate the advantages of storage pools and show a demo on a ScaleIO system involving the addition and removal of physical storage devices from a storage pool. Let's get started. Consider six typical servers in a data center, all running ScaleIO. These nodes may be from any server vendor, or they may be custom built. This visualization shows 2U Dell ScaleIO Ready Nodes with 10-gigabit Ethernet connections and direct-attached storage. When the drives in these systems are added to a single storage pool, ScaleIO will abstract them and pool them. This creates a globally shareable pool with no silos of performance or capacity.
This example shows a three-node ScaleIO cluster where each node contains eight SSDs and 16 spinning drives. Since ScaleIO aggregates performance and capacity, this system would likely have two storage pools, one consisting of flash media and the other consisting of spinning media. As this example shows, ScaleIO nodes may contribute storage to multiple storage pools at one time. This example also illustrates how storage pools can support any media type, including hard disks, SSDs, and PCIe flash. Each pool maintains its own even data distribution, client I/O distribution, and available free space. Hardware-based read and write caches, as well as RAM read caches, are shared and aggregated along with the underlying media.
This results in a very high-performance distributed storage system. Storage pools do not need to be homogeneous across servers. For example, the third node in this illustration might contribute five SSDs to the all-flash storage pool instead of eight. Storage pools allow complete flexibility over the deployment of your hardware resources. This video, however, illustrates homogeneous storage pools for visual simplicity. Storage pools allow an administrator to dynamically move physical drives as business requirements change or when bursting capability is needed. This six-node cluster has two storage pools made up of spinning media, one dedicated to a finance group and another dedicated to a marketing group.
Finance and marketing both use the storage continuously for their applications. Finance, however, needs to burst storage performance for end-of-quarter processing. Some of the physical media that makes up the marketing storage pool can be moved to the finance storage pool. This will increase the performance of the finance storage pool. These servers contain SSDs as well as spinning media. These SSDs are available, so they can also serve as a read cache. Addition of the read cache will further accelerate the finance storage pool. Once the end-of-quarter processing is complete, the administrator can return the storage resources back to their original configuration. ScaleIO balances user data evenly across all the drives that make up a storage pool. In this three-node example, all the drives are part of a single storage pool and they are about 70% full.
A new node is added. When its drives are added to the storage pool, all the nodes and devices taking part in the storage pool work in parallel to rebalance the user data across the drives. The majority of the data remains in place on the old drives; a minimal amount of data is moved to populate the new devices added to the storage pool. This even data distribution, coupled with the ScaleIO client's built-in multipathing, eliminates bottlenecks, hotspots, and silos. This balanced, meshed design results in very high throughput and very low latency as additional nodes are added and their drives are added to the storage pool. The system can also recover from failures more rapidly: there is proportionally less data to reconstruct if a node or drive is lost, and there will be more nodes working in parallel to reestablish data protection.
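The minimal-movement aspect of that rebalance can be sketched as follows (a simplified illustration, not ScaleIO's actual algorithm; the `rebalance` function and its chunk-count bookkeeping are invented for this example): only enough chunks are shifted from the fullest drives to bring the newly added drive up to the pool's even share, and everything else stays where it is.

```python
import heapq

def rebalance(pool: dict, new_drives: list) -> list:
    """pool maps drive name -> number of data chunks it holds.
    Returns (source_drive, dest_drive, chunks_moved) triples that level the
    pool while moving as few chunks as possible."""
    for d in new_drives:
        pool.setdefault(d, 0)
    target = sum(pool.values()) // len(pool)          # even share per drive
    donors = [(-count, d) for d, count in pool.items() if count > target]
    heapq.heapify(donors)                             # most-loaded drive first
    moves = []
    for dest in sorted(pool, key=pool.get):           # fill emptiest drives first
        while pool[dest] < target and donors:
            neg, src = heapq.heappop(donors)
            give = min(-neg - target, target - pool[dest])
            pool[src] -= give
            pool[dest] += give
            moves.append((src, dest, give))
            if pool[src] > target:
                heapq.heappush(donors, (-pool[src], src))
    return moves

# Three existing drives holding ~70 of 100 chunks each; one new, empty drive joins.
pool = {"n1-ssd0": 70, "n2-ssd0": 70, "n3-ssd0": 70}
print(rebalance(pool, ["n4-ssd0"]))   # only 52 of 210 chunks move; the rest stay put
```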
When drives are manually removed from the storage pool, a similar operation migrates user data to the drives that will remain in the pool. Once the data has been migrated off the drives, the empty nodes can be removed. An unplanned node or drive failure results in a similar self-healing operation, which reestablishes data protection in the same distributed, balanced, and minimal way. Because storage pools maintain the structure and protection of user data as physical drives are added or removed, a ScaleIO system can expand or contract at any time. Storage, or storage and compute, can be added to the system quickly as demands increase. Similarly, as hardware is decommissioned or if requirements decrease, nodes can be removed.
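The self-healing path can be sketched in the same spirit (again a simplified illustration assuming two-copy protection; `reprotect` and its bookkeeping are invented here): every chunk that lost a copy on the failed drive is re-copied from its surviving replica to a lightly loaded drive, so the rebuild work spreads across the whole pool.

```python
def reprotect(replicas: dict, drive_load: dict, failed: str) -> list:
    """replicas maps chunk id -> set of drives holding a copy (two-copy protection).
    drive_load maps drive -> chunk count.  Returns (chunk, source, new_dest) copies."""
    drive_load.pop(failed, None)
    copies = []
    for chunk, drives in replicas.items():
        if failed not in drives:
            continue                                   # chunk unaffected by the failure
        drives.discard(failed)
        (survivor,) = drives                           # the remaining copy
        # Rebuild onto the least-loaded drive that does not already hold this chunk.
        candidates = [d for d in drive_load if d not in drives]
        dest = min(candidates, key=drive_load.get)
        drives.add(dest)
        drive_load[dest] += 1
        copies.append((chunk, survivor, dest))
    return copies

# Four drives, three mirrored chunks; drive "d1" fails.
replicas = {"c1": {"d1", "d2"}, "c2": {"d1", "d3"}, "c3": {"d2", "d4"}}
load = {"d1": 2, "d2": 2, "d3": 1, "d4": 1}
print(reprotect(replicas, load, "d1"))   # [('c1', 'd2', 'd3'), ('c2', 'd3', 'd4')]
```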
These streamlined lifecycle operations allow enterprises to achieve the same operational efficiencies that hyperscale cloud providers enjoy. Storage pools reduce installation and deployment timelines, eliminate large upfront investments for future usage, provide predictable performance and resiliency at scale, and reduce operational cost and risk. Storage pools can span multiple racks, and they reside inside protection domains. Protection domains are groups of servers that provide fault isolation in a ScaleIO cluster, which may consist of thousands of nodes. If you're interested in learning about protection domains, consider watching our video on them after you finish this video. This video features an eight-node ScaleIO cluster running on ESX. It has three storage pools: A, B, and C.
In this demo, we will assume that each storage pool houses data for a separate tenant, with marketing using storage pool A and finance using storage pool B. Storage pools A and B reside in Protection Domain 1; this protection domain is made up of all-flash nodes. Storage pool C, which we will not modify in this video, resides in Protection Domain 2. Protection Domain 2 is made up of nodes that house spinning media. At the start of this demo, storage pool A will be 22% full and storage pool B will be 76% full. With the system under load, we will move flash capacity from the marketing storage pool, A, to the finance storage pool, B.
First, we will remove five SSDs from storage pool A. The storage pool will then rebalance, migrating user data and data protection off the drives to be removed. The capacity of storage pool A will decrease, so its utilization will go from 22% full to 27% full. Once the SSDs have been evacuated, we will add them to storage pool B. Storage pool B will then rebalance, distributing user data and data protection evenly across the drives. The capacity of storage pool B will increase, so its utilization will go from 76% full to 38% full. Note that on a production system of this size, all the SSDs in the first five nodes of the cluster would likely be in the same storage pool, as ScaleIO supports QoS, and a volume can be used by any hypervisor, operating system, or tenant, regardless of which storage pool it resides in. We'll begin the demo on the ScaleIO dashboard. Here we see the total capacity available to the cluster, the number of protection domains, and the number of storage pools.
This system is actively servicing a workload consisting of both reads and writes. Let's take a more detailed look at the storage pools. Storage pool A is about 22% full, storage pool B is about 76% full, and storage pool C, which resides in Protection Domain 2, is about 20% full. Storage pool A is using five SSDs from each of the five nodes that make up the protection domain, for a total of 25 SSDs. Capacity, IOPS, and redundancy are evenly distributed across the flash drives in the storage pool; every physical drive is evenly utilized, as the capacity icons illustrate. Storage pool B consists of five SSDs, one on each of the five nodes that make up the protection domain.
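The utilization figures quoted in this demo follow from simple proportions, assuming all SSDs are the same size and the amount of stored data does not change while the drives are moved:

```latex
% Pool A: data held on 25 SSDs at 22% full is rebalanced onto 20 SSDs.
\[ U_A = 22\% \times \frac{25}{20} = 27.5\% \approx 27\% \]
% Pool B: data held on 5 SSDs at 76% full is rebalanced onto 10 SSDs.
\[ U_B = 76\% \times \frac{5}{10} = 38\% \]
```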
As with storage pool A, capacity, IOPS, and redundancy are evenly distributed across every flash drive in the storage pool. But unlike storage pool A, these devices are nearing their capacity limits. With ScaleIO, the physical storage devices are all under software control, so we can easily move devices from the underutilized pool to the overutilized pool, where that capacity will be needed. To do this, we're going to remove some SSDs from storage pool A. This will free up the SSDs for addition to storage pool B. As we remove the devices, the data will be rebalanced across those devices that remain in the pool. Let's remove five SSDs from the first storage pool. This warning message tells us that the physical device will still be mapped to the ScaleIO virtual machine even though we are removing it from the storage pool. That's what we want, so we click OK and continue with the remaining devices.
During the rebalance operation, all the nodes contributing storage are rebalancing to all the other nodes contributing storage, and the individual disks in each of these nodes are also working in parallel to speed up the operation. The rebalance operation is complete. Storage pool A, the marketing storage pool, now consists of four drives from each of the five servers, for a total of 20 SSDs. The system is continuing to serve I/O, and as before, load, capacity, and redundancy are distributed across all the drives in the storage pool. Storage pool A is now about 27% full, and we have five available SSDs to assign to the finance storage pool, storage pool B. We will enter the name of the device and the destination storage pool for each of the SSDs.
ScaleIO will now add these new SSDs to storage pool B and begin a rebalance operation to level out the utilization of the storage pool using both the old and new SSDs. The new SSDs have been added to storage pool B, and a rebalance operation has evenly distributed the data across the SSDs. Storage pool A is now about 27% full and storage pool B is about 38% full. The marketing department's underutilized capacity was moved to the finance department without manual rebalancing. The user data and data protection relationships remained intact while the SSDs were moved from one storage pool to another with the system under load. Similar procedures are used when adding new nodes to a ScaleIO cluster, removing nodes from a ScaleIO cluster, repurposing nodes or storage devices, or bursting a workload. Storage pools contain client volumes that are analogous to LUNs.
These volumes can be accessed by the operating systems or hypervisors contributing raw storage, or they can be accessed by other servers. As with LUNs, these volumes can be used by any group of hypervisors or bare-metal machines. Storage pools are elastic allocations of physical drives. They can grow or shrink as demands change, as hardware is acquired, or as hardware is decommissioned. Storage pools aggregate flash media or spinning media, providing enormous capacity and performance. They are media agnostic and therefore future proof: they support spinning media, SSDs, and PCIe flash, and are designed to take advantage of emerging forms of persistent storage. Storage pools can provide shared, distributed read and write caches, fully and evenly utilizing the hardware available to the system. Storage pools evenly distribute client I/O and capacity.
This provides consistent performance for the applications that use the storage, and it eliminates hotspots. Storage pools are self-balancing. This reduces the operational burdens associated with data growth and new system procurement and provides a rapid return on new hardware investment. Finally, storage pools are software defined and can be modified on the fly. They are not subject to the constraints inherent to hardware-based approaches. Storage pools are a scalable and flexible alternative to traditional storage architectures and bring the efficiency of hyperscalers to the enterprise.