The Rationale for an Expanded Data Lake

In January, I described 2015 as the year in which data lakes come of age and move into mainstream IT. Since then, we have seen the impact of unstructured data growth on organizations across every industry. Data lakes that consolidate data and eliminate storage silos to lower costs and harness the power of data assets are more appealing than ever, but the picture is not static.

Data growth is pervasive and, for many of our customers, data is being generated continuously in every corner of their business. The cloud is an ever more important element in enterprise data storage strategies. We are also living in an “always on” world that demands access to data at any time.

Location Is Critical

In real estate, it is often said that the three most important things are location, location, location. In an environment of pervasive data growth, this mantra also applies to managing data with an enterprise data lake. As organizations expand geographically and data is generated outside of a central data center, the data lake needs to consolidate data across the entire enterprise, eliminate silos and enable data analytics no matter where the data is created. It’s not just about managing data growth. It’s also about how best to manage data based on where that data is produced.

Leveraging the Cloud

Organizations are looking to public, private and hybrid cloud strategies to lower costs and relieve some of the pressure of unrelenting data growth. Their question is how best to integrate cloud alternatives with their on-premises data storage infrastructure so that the integration is transparent to users and applications alike.

These are important considerations for an organization with a data lake environment. Just as a data lake needs to extend to all of the locations where data is generated, it must also extend to wherever data is stored in the cloud. In this way, it can enable an organization to optimize storage resources, simplify data management and harness the value of all of its enterprise data assets.

Availability Is Key

Organizations also need reliable, 24×7 data availability to support their business. They can’t afford operational downtime for software upgrades or infrastructure expansions. To address these requirements, the core data center supporting an enterprise data lake must be resilient, providing continuous operations and data access.

The Expanded Data Lake Imperative

Meeting these demands requires innovation from storage providers and enterprises alike, along with a more holistic data lake strategy. Extending the data lake beyond the core data center to enterprise edge locations is essential because that is where data is increasingly generated and captured. It is also vital that a data lake have the flexibility to embrace a variety of cloud storage options seamlessly. At the same time, the data storage infrastructure that the enterprise data lake uses at the core data center must be resilient, enabling continuous business operations.

Addressing these challenges is a key focus for EMC right now. Our goal? Keep enhancing the value of the enterprise data lake.

About the Author: CJ Desai