In my last blog post, I explored the role of artificial intelligence (AI) in creating intelligent storage systems that learn via algorithms and make critical decisions with no human involvement. I described how the new PowerMax from Dell uses a reinforcement learning model to autonomously make resource allocation decisions at high speed while serving millions of IOPS to achieve the latency targets of mission-critical applications. Taking this a step further, I will now describe the evolution toward intelligent storage systems that are truly autonomous, akin to self-driving cars. If we can build self-driving cars, can we build “self-driving” storage systems?
An autonomous car and a storage system have fundamental similarities. Consider this:
- Both are very complex systems – dealing with a huge number of simultaneous events that unfold very quickly
- Both have a lot riding on them – human lives in one case and, in the other, mission-critical business operations that in many cases affect human lives as well
While the image most people have in their minds when they hear “self-driving car” is a vehicle that drives completely by itself, in fact there are multiple levels of automation, with Level 5 being the Holy Grail of 100% autonomy.
I have a similar gradation for storage systems. I see the journey to a fully autonomous storage system as consisting of four steps:
Level 1: Application-Centric
With the self-driving car, you tell it where you want to go, not which roads and turns to take or what speed to drive at. Similarly, the way you interact with the storage system has to be in terms of what you are trying to accomplish: the application you wish to run. You care about the application, not what the storage system needs to do to run that application. You want to tell the storage system that you wish to run a web-based transaction processing application using a relational database and have it take care of the rest.
Level 2: Policy-Driven
Next, you typically want to tell the car whether to take the fastest, most direct route or a scenic route through the backroads. Similarly, you want to set some service level objectives for the application you just told the storage system you want to run: is it a high-priority production application or a best-effort dev-test instance? Does it need additional data protection via remote copies? If so, how frequently?
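To make Levels 1 and 2 concrete, here is a minimal sketch of what "say what you want, not how to do it" could look like. Everything in it is hypothetical: `ApplicationIntent`, `ServiceLevel`, `plan`, and the latency values are names and numbers I made up for illustration, not a PowerMax interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ServiceLevel(Enum):
    PRODUCTION = "production"      # high priority, strict latency target
    BEST_EFFORT = "best_effort"    # dev-test, no latency guarantee


@dataclass(frozen=True)
class ApplicationIntent:
    """What the user states: the application and its policy, nothing else."""
    name: str
    workload: str                                   # e.g. "oltp-relational"
    service_level: ServiceLevel
    remote_copy_interval_min: Optional[int] = None  # None means no remote copies


# Hypothetical latency targets per service level (illustrative values only).
LATENCY_TARGET_MS = {
    ServiceLevel.PRODUCTION: 1.0,
    ServiceLevel.BEST_EFFORT: None,
}


def plan(intent: ApplicationIntent) -> dict:
    """Translate intent into low-level settings the user never has to see."""
    return {
        "volume_group": f"{intent.name}-vg",
        "latency_target_ms": LATENCY_TARGET_MS[intent.service_level],
        "remote_replication": intent.remote_copy_interval_min is not None,
        "replication_interval_min": intent.remote_copy_interval_min,
    }


# Usage: a production web store that wants a remote copy every 15 minutes.
web_store = ApplicationIntent(
    "web-store", "oltp-relational", ServiceLevel.PRODUCTION,
    remote_copy_interval_min=15,
)
settings = plan(web_store)
```

The point of the design is that the user only ever fills in `ApplicationIntent`; the mapping to volumes, targets, and replication schedules stays inside the system.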
Level 3: Self-Aware
Now the car has what it needs to get driving. But if it is to drive itself, it needs to be “self-aware”. For example, it needs to know whether it is in its lane or about to stray, whether it is a safe distance from the car ahead of it, whether it is running low on fuel, and so on. The storage system analog is telemetry about how the system is operating: how “close to the edge” it is. This is where the industry has been lagging. While we have a lot of telemetry, we typically haven’t been very good at analyzing the data to determine if we’re about to drive off the cliff; we still rely on humans to figure this out. And typically, the humans get involved after something has gone horribly wrong. The first step here is to make the system self-aware: instead of throwing a whole lot of data at the human, the system should be able to analyze the data and tell the user how close it is to the edge. And that sets us up for the next and final part…
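As a rough illustration of turning raw telemetry into a "how close to the edge" answer, the sketch below reduces a series of utilization samples to three numbers: the remaining headroom, the trend, and a projected number of sampling intervals until a limit is reached. The function names, the 0.9 limit, and the straight-line projection are my own assumptions for the example; a real system would use far richer models.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of values against their sample index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def assess(utilization, limit=0.9):
    """Summarize utilization samples (0..1) as distance from the edge."""
    headroom = limit - utilization[-1]
    trend = slope(utilization)
    if headroom <= 0:
        intervals_left = 0.0      # already at or past the limit
    elif trend > 0:
        intervals_left = headroom / trend
    else:
        intervals_left = None     # flat or falling: no projected crossing
    return {
        "headroom": headroom,
        "trend": trend,
        "intervals_to_limit": intervals_left,
    }


# Usage: utilization climbing by about ten points per sampling interval.
report = assess([0.5, 0.6, 0.7, 0.8])
```

Instead of handing the administrator four data points, the system hands over one statement: roughly one interval left before the limit.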
Level 4: Self-Optimizing
Once the system knows how close it is to the edge, it needs to be able to adjust its behavior to avoid going over. In the self-driving car world, a very simple example is adaptive cruise control, where the car regulates its speed to keep a safe distance from the vehicle ahead, which it senses with LIDAR. This is exactly what Dell PowerMax can do, as shown in the image below: the algorithms are designed to detect changes in the environment and change key system behaviors accordingly, both to try to meet the needs of the applications and to prevent the system from getting into catastrophic situations. In other words, you may want or need to drive at 65 MPH, but right now you can’t unless you change lanes.
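The adaptive cruise control analogy maps onto a simple feedback loop. The sketch below is not the PowerMax algorithm; it swaps in a plain proportional controller to show the idea. When production latency drifts above its target, the cap on best-effort IOPS is tightened, and when there is slack, the cap is relaxed. The function name, gain, floor, and ceiling are all assumptions chosen for illustration.

```python
def adjust_best_effort_limit(limit_iops, latency_ms, target_ms,
                             gain=0.5, floor=1_000, ceiling=100_000):
    """One control step: shrink the best-effort IOPS cap when production
    latency is above target, and relax it when latency is below target."""
    error = (latency_ms - target_ms) / target_ms   # > 0 means too slow
    new_limit = limit_iops * (1 - gain * error)
    # Clamp so the cap never starves best-effort work or grows without bound.
    return max(floor, min(ceiling, round(new_limit)))


# Usage: replay a few measured production latencies against a 1 ms target.
limit = 50_000
for measured_ms in [1.0, 2.0, 1.5, 0.8]:
    limit = adjust_best_effort_limit(limit, measured_ms, target_ms=1.0)
```

Like the car easing off the accelerator, each step is small and repeated continuously, so the system settles near the target rather than lurching between extremes.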
Dell’s solutions have had application-centric and policy-driven capabilities for years now, as well as rudimentary self-aware and self-optimizing capabilities in technologies like FAST (Fully Automated Storage Tiering). The new PowerMax takes this history of innovation in building intelligent storage systems to the next level by incorporating machine learning techniques. Just like in the general AI field, we are applying these techniques to progressively more complex scenarios inside our storage systems. As the scenarios get broader, more contextual information is needed to make the right decisions. In the car world, an example is relying on real-time traffic updates from a global information system to choose a route that incurs minimal delay. In the storage world, we have CloudIQ, our brain in the cloud. CloudIQ observes and remembers all operational information about the storage arrays in the field. It also uses machine learning techniques to learn how each system is behaving, how the workloads it is serving are changing over time, and so on. Finally, it looks across the entire population of systems in the field to learn patterns of behavior and environmental conditions, so that it can predict, and thus avoid, events that would impact a system’s health or behavior.
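To give a flavor of what "looking at the entire population" can mean, here is a small sketch that flags systems whose metric stands out from the rest of the fleet. It uses a standard robust statistic (the modified z-score based on the median absolute deviation) as a stand-in; it does not describe how CloudIQ works internally, and the function name and sample data are hypothetical.

```python
from statistics import median


def fleet_outliers(metric_by_system, threshold=3.5):
    """Return systems whose metric deviates strongly from the fleet median,
    using the modified z-score based on median absolute deviation (MAD)."""
    values = list(metric_by_system.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Simplification: with no measurable spread, report nothing.
        return []
    return [
        name for name, v in metric_by_system.items()
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Usage: average read latency (ms) reported by five arrays in the field.
fleet = {"array-a": 1.0, "array-b": 1.1, "array-c": 0.9,
         "array-d": 1.0, "array-e": 5.0}
suspects = fleet_outliers(fleet)
```

A single array cannot tell whether 5 ms is normal for its workload; the fleet-wide view is the extra context that makes the judgment possible.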
Why does all this matter?
At Dell, we are on a mission to deliver the infrastructure for the next industrial revolution, one that accelerates human progress. AI is going to be a key tool in this mission, and we expect it to change the way we live, interact, and work as fundamentally as social networking did. With a long history of storage innovation, we have a unique value proposition that our customers rely on for their own innovation journey as we go on this mission together.
So, who will get there first – a fully autonomous car or a fully autonomous storage system? My bet is on the latter!