From adapting energy use to maximizing infrastructure consolidation, Big Data analytics has taken the guesswork out of optimizing the modern data center.
More than ever, the modern data center is a living, changing environment, with new technologies coming in, old technologies being cycled out, and evolving energy efficiency strategies to keep it all humming. We have to make sure we have the space and power to install the latest technology while the old equipment is still in place.
Up until recently, orchestrating this shifting ecosystem was only partially data-driven; the rest was based on gauging changing needs from past experience. At EMC IT, like most IT organizations, we had long tracked metrics on our data center facilities: space, power, cooling, humidity, and temperature. We also collected IT infrastructure data: server utilization, virtual machines, growth trends. But we lacked the tools to process this vast amount of data, and we were never able to aggregate it all into one database.
Today, our new data lake implementation lets us continuously analyze thousands of data points about the data center facilities and IT infrastructure to get real-time visibility into data center operations. Since last year, we have been using a new set of dashboards that have saved millions of dollars in energy costs and shaved years off our data center expansion needs.
Optimizing on Two Fronts
EMC IT started an optimization effort for our 20,000-square-foot Massachusetts data center in 2013. We approached this on two fronts: eliminating the individual rack-mount servers by consolidating our IT infrastructure onto VCE Vblock converged infrastructure, and at the same time optimizing our power and cooling utilization.
The effort was spearheaded by a small data optimization team that grew out of the lessons we learned when we moved our main data center from Massachusetts to a brand-new 20,000-square-foot facility in Durham, NC, in 2011.
Using Big Data analytics, the team was able to produce historical and predictive trends for both the facilities infrastructure and the IT infrastructure in our data centers and merge them. We then determined which opportunities were our top priorities for optimization and created a series of dashboards in Tableau BI to provide real-time views of the data center infrastructure, the IT infrastructure, and customer footprints (the data center resources in use by each individual business unit).
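To illustrate the kind of merge and trending involved, here is a minimal sketch in Python, assuming hypothetical hourly extracts from the data lake (the file names, column names, and capacity figure are illustrative, not our production schema); the merged view is the sort of table a Tableau dashboard would then visualize.

```python
import pandas as pd
import numpy as np

# Hypothetical extracts from the data lake: facilities readings and IT utilization,
# both keyed by an hourly timestamp (column names are illustrative only).
facilities = pd.read_csv("facilities_metrics.csv", parse_dates=["timestamp"])  # power_kw, cooling_kw, temp_c
it_infra = pd.read_csv("it_metrics.csv", parse_dates=["timestamp"])            # vm_count, cpu_util_pct, storage_tb

# Merge the two feeds into one hourly view of the data center.
merged = pd.merge_asof(
    facilities.sort_values("timestamp"),
    it_infra.sort_values("timestamp"),
    on="timestamp",
    direction="nearest",
)

# Fit a simple linear trend on power draw to project when capacity would be exhausted.
days = (merged["timestamp"] - merged["timestamp"].min()).dt.total_seconds() / 86400
slope, intercept = np.polyfit(days, merged["power_kw"], 1)

POWER_CAPACITY_KW = 1200  # assumed capacity, for illustration only
if slope > 0:
    days_to_capacity = (POWER_CAPACITY_KW - intercept) / slope
    print(f"Projected days until power capacity is reached: {days_to_capacity:.0f}")
```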
With this new vantage point, we have been able to make some smart moves: further consolidating onto VCE Vblocks, reducing our footprint, and cutting power utilization.
In 2013, we determined that if we did nothing to optimize our Massachusetts data center, we would need to expand it by the end of 2014, which would have required significant added floor space at a substantial investment. Through moves guided by Big Data analytics, however, we have extended the life of our current Massachusetts data center by two to three years.
In addition to analyzing overall data center metrics, we can use these new tools to provide individual users with insights into their own data center use. For example, we can show how many racks our SAP ERP project is using and what that project is paying in chargeback costs. That lets us make recommendations to users on how they can be more efficient.
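A minimal sketch of how such a per-project view could be derived, assuming a simple rack inventory table tagged by project and purely illustrative chargeback rates (none of these figures or field names come from our actual billing model):

```python
import pandas as pd

# Hypothetical rack inventory: one row per rack, tagged with the owning project.
racks = pd.DataFrame({
    "rack_id":  ["R01", "R02", "R03", "R04", "R05"],
    "project":  ["SAP ERP", "SAP ERP", "Analytics", "SAP ERP", "Analytics"],
    "power_kw": [4.2, 3.8, 5.1, 4.0, 4.7],
})

RATE_PER_RACK = 1500.0   # illustrative monthly chargeback rate per rack
RATE_PER_KW = 120.0      # illustrative monthly rate per kW of power drawn

# Roll up racks and power by project, then compute a monthly chargeback figure.
usage = racks.groupby("project").agg(rack_count=("rack_id", "count"),
                                     power_kw=("power_kw", "sum"))
usage["monthly_chargeback"] = (usage["rack_count"] * RATE_PER_RACK
                               + usage["power_kw"] * RATE_PER_KW)
print(usage)
```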
Metrics That Matter
We are continuing to hone metrics and dashboards for our Massachusetts facility as well as other data centers in Durham, Santa Clara, and Bangalore. And we continue to introduce new technologies, including flash-optimized EMC VMAX and XtremIO flash-based scale-out storage arrays.
At the same time, we are tracking when our older servers will reach the end of their five- to seven-year lives, shifting them from running mission-critical to non-mission-critical applications in preparation for replacing them. Our data center optimization team continuously moves VMs and environments within the data center to maximize capacity and performance and minimize cost.
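A hedged sketch of this kind of end-of-life flagging, assuming a simple asset list with install dates (hostnames, dates, and thresholds below are hypothetical):

```python
from datetime import date

import pandas as pd

# Hypothetical asset inventory with install dates and the workloads each server runs.
servers = pd.DataFrame({
    "hostname":  ["app01", "db02", "web03"],
    "installed": pd.to_datetime(["2009-03-01", "2011-06-15", "2013-01-10"]),
    "workload":  ["mission-critical", "mission-critical", "non-mission-critical"],
})

today = pd.Timestamp(date.today())
servers["age_years"] = (today - servers["installed"]).dt.days / 365.25

# Flag servers approaching or past the five- to seven-year mark so their
# mission-critical workloads can be migrated before retirement.
servers["action"] = pd.cut(servers["age_years"],
                           bins=[0, 5, 7, float("inf")],
                           labels=["keep", "shift to non-critical", "retire"])
print(servers[["hostname", "age_years", "action"]])
```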
On the facilities side, we can use Big Data analytics to guide efficiency initiatives such as outside free-air cooling. We have been able to analyze five years of temperature data in our cold aisle containment areas (areas that enclose our servers to help them stay cool) to come up with a predictive analysis of when we can effectively use outside air (free air) to help cool our data centers, saving significant costs. Our Durham data center leverages this "free" air 57 percent of the time.
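At its core, this is a threshold analysis over historical outside-air temperatures; a minimal sketch, assuming hourly readings and an illustrative viability cutoff (the file name, column names, and 24 °C setpoint are assumptions, not our actual facilities parameters):

```python
import pandas as pd

# Hypothetical five years of hourly outside-air temperature readings (degrees C).
temps = pd.read_csv("outside_air_temps.csv", parse_dates=["timestamp"])

FREE_AIR_MAX_C = 24.0  # assumed maximum outside temperature at which free-air cooling is viable

temps["free_air_ok"] = temps["temp_c"] <= FREE_AIR_MAX_C

# Share of hours per calendar month when outside air alone can cool the cold aisles,
# which feeds a predictive schedule for economizer use.
monthly = temps.groupby(temps["timestamp"].dt.month)["free_air_ok"].mean() * 100
print(monthly.round(1))  # percentage of free-air hours for each month
```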
Analytics also help us track the effectiveness of circulating cooling water from our Hopkinton data center through an outdoor metal plate area to help reduce cooling costs.
Changing the Conversation
Having all this real-time data available in the data lake not only helps us optimize our data centers; it also changes the conversation we have with IT and corporate leaders about data center investments, and with our IT clients.
Traditionally, when the data center team requested incremental investment in facilities or IT infrastructure, we would need to spend time pulling data from various sources, massaging it, and putting it into spreadsheets and graphs to address executives' questions. Now we have that information in real time: the due diligence is already done, and the metrics needed to make decisions are clear.
At the same time, these new tools let us elevate our conversation with our business users by providing them with numbers about the data services, costs and options that matter to them. Previously, individual organizations would do their best to create their own sets of metrics. Now we have the ability to share vast amounts of real-time data center metrics via the data lake so everyone can make informed decisions. Some metrics will matter to the CIO, others to the cloud infrastructure team, and still others to individual teams looking to see how their platforms are running.
Overall, having a real-time view of our data center resources enables us to make the most of this changing environment by making the right decisions at the right time.