This blog is co-authored by Dolton John, Senior Director, Database Engineering, Dell Digital and Ravi Kulkarni, Director, Database IT.
Databases are at the core of IT applications and ecosystems, and at Dell this is no exception. Each day at Dell we see an average of 10 new databases spun up to support the business. But with this tremendous growth comes some challenges. In the past, when developers needed a database, they would file a request with the IT team and it could take up to 30 days between the time the request was filed and the database was ready. Developers wanted a way to get up and running faster so they could do their work. On the IT side, as the volume of requests increased it was becoming difficult to keep up. It wasn’t a great experience for anyone.
So we came up with a solution. Earlier this year we rolled out database-as-a-service for our developers. This self-service capability, not unlike what’s possible in the public cloud, lets them independently provision a database and be up and running within 45 minutes. Going from waiting a month for a database to waiting less than an hour thrilled our developers, and on the IT side we can now take the time we spent managing database requests and focus on other critical tasks that drive the business forward.
Better (and safer) databases through automated patching
Having had great success rolling out database-as-a-service, we decided to tackle another, bigger challenge. Keeping data safe is a top priority of any business, Dell included. And when it comes to database management, few things matter more than keeping systems running with the most up-to-date security patches. Yet in most data centers, patching is done infrequently because of the staffing, scheduling and downtime it requires. Patching also has implications for the business: even if you take an important database offline in the middle of the night, it can inconvenience someone, and for databases with a 24/7 availability requirement, taking them offline is simply not an option. On the other hand, delaying patches can expose databases to unacceptable risks. We needed to find a way to patch more frequently with as little downtime as possible.
It’s these challenges that led us to our Oracle patching self-service model. This model uses automation to roll out timely patching with little to no downtime for high availability and increased security. Here’s how we did it.
More frequent patching with near zero downtime
First, we adopted cloned home patching, a method where patching is performed by creating a copy of the Oracle database home, applying the patches to the copied home and then switching services to that copied home. Because the patches are applied to the copy rather than to the active home, the database keeps running while patching is under way, which minimizes downtime or eliminates it entirely.
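The three steps described above (clone, patch the clone, switch) can be sketched in a few lines of Python. This is only an illustrative in-memory model, not our production tooling: the `OracleHome` and `Database` classes, the function names, the paths and the patch labels are all hypothetical, and no real Oracle commands are invoked.

```python
from dataclasses import dataclass, field


@dataclass
class OracleHome:
    """Hypothetical in-memory stand-in for an Oracle database home."""
    path: str
    patches: list = field(default_factory=list)


@dataclass
class Database:
    """Hypothetical database that runs out of exactly one home at a time."""
    name: str
    home: OracleHome
    running: bool = True


def clone_home(source: OracleHome, new_path: str) -> OracleHome:
    # Step 1: copy the home, including its current patch list.
    return OracleHome(path=new_path, patches=list(source.patches))


def apply_patches(home: OracleHome, new_patches: list) -> None:
    # Step 2: patch only the clone; the active home is never modified.
    for patch in new_patches:
        if patch not in home.patches:
            home.patches.append(patch)


def switch_home(db: Database, target: OracleHome) -> OracleHome:
    # Step 3: repoint the database at the patched home.
    # The previous home is returned so it could be kept for rollback.
    previous = db.home
    db.home = target
    return previous


def patch_with_cloned_home(db: Database, new_path: str, new_patches: list) -> OracleHome:
    clone = clone_home(db.home, new_path)
    apply_patches(clone, new_patches)
    # The database has been serving from the old home this whole time.
    return switch_home(db, clone)


if __name__ == "__main__":
    db = Database("sales", OracleHome("/u01/app/oracle/home_1", ["patch-A"]))
    previous = patch_with_cloned_home(db, "/u01/app/oracle/home_2", ["patch-B"])
    print(db.home.path, db.home.patches)
    print(previous.path, previous.patches)
```

The point of the model is that the only step touching the running database is the final switch; the time-consuming patch application happens entirely on the copy.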
Using Oracle Flex Automatic Storage Management (ASM) significantly improves the patching process. With the cloned home patching method, patched database homes can be maintained efficiently with little incremental storage. And in Real Application Clusters (RAC) configurations, where we run databases across multiple servers for high availability, cloned home patching means we can eliminate downtime entirely.
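To see why a multi-server RAC configuration can avoid downtime altogether, here is a small sketch that assumes nodes are switched to the patched home one at a time (the exact sequencing is an assumption for illustration, as are the dictionary keys and the `rolling_switch` name). It tracks the lowest number of nodes still serving at any point during the rollout.

```python
def rolling_switch(nodes, patched_home):
    """Switch nodes to the patched home one at a time.

    `nodes` is a list of dicts with the hypothetical keys
    "name", "home" and "serving". Returns the minimum number of
    nodes that were serving at any point during the rollout.
    """
    min_serving = sum(node["serving"] for node in nodes)
    for node in nodes:
        node["serving"] = False  # take only this node out of service
        min_serving = min(min_serving, sum(n["serving"] for n in nodes))
        node["home"] = patched_home  # switch it to the patched home
        node["serving"] = True  # bring it back before touching the next one
    return min_serving


if __name__ == "__main__":
    cluster = [
        {"name": "node1", "home": "home_1", "serving": True},
        {"name": "node2", "home": "home_1", "serving": True},
        {"name": "node3", "home": "home_1", "serving": True},
    ]
    print(rolling_switch(cluster, "home_2"))
```

With three nodes, at least two are serving throughout, so the database as a whole never goes offline. With a single node the same function returns 0, which is why a standalone database sees a brief switch-over rather than none at all.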
All of this was a great improvement, but IT efficiency and delivery improve when as many processes as possible are automated and made effortless. As a result, we automated this entire newly developed patching methodology using shell scripts with an Ansible wrapper, orchestrated through AWX.
What’s next
The results have been everything we hoped for. Before, we were patching every six months to a year, with anywhere from two to six hours of downtime each time. That is never a good thing when business-critical functions are running in your databases. Now we patch every three months with near zero downtime.
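As a rough back-of-the-envelope check on those numbers, the snippet below works out the yearly range implied by the old schedule and the number of patch cycles in the new one. It assumes the two to six hours applied per patch cycle; the function name is ours, purely for illustration.

```python
def annual_downtime_hours(cycles_per_year, hours_per_cycle):
    # Total planned downtime per year for a given patch cadence.
    return cycles_per_year * hours_per_cycle


# Old schedule: one or two cycles a year, two to six hours each.
before_low = annual_downtime_hours(1, 2)
before_high = annual_downtime_hours(2, 6)

# New schedule: one cycle every three months.
cycles_now = 12 // 3

print(before_low, before_high, cycles_now)  # prints: 2 12 4
```

In other words, the old approach cost roughly 2 to 12 hours of planned downtime per database per year for one or two patch cycles, while the new approach delivers four cycles a year with close to none.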
For our teams, both self-service database provisioning and automated patching have had a profound positive impact. Before, teams expected to wait up to a month for a database, and they had to plan schedules around two to six hours of downtime every six months to a year. Now they don’t have to worry about either of those things. Developers can have a database in their hands within 45 minutes, and patches happen, in many cases, without any downtime implications. Security updates are also happening more frequently than ever before. Our developers are thrilled that they can be more productive than ever, and our IT teams are loving all the ways they can add value to the business.
You can learn more about what we do at Dell Digital by visiting Our Digital Transformation.