Hadoop customers have just received some exciting news with the launch of Dell’s Cloudera Enterprise 5.3 Reference Architecture. Built on Dell’s 13th-generation PowerEdge R730xd servers, the new architecture offers several new and improved options. These are exciting both for those of us who work with Hadoop solutions every day and for customers, who will find the architecture more user-friendly and more secure. Dell has worked with Hadoop reference architectures since 2011, and this is the sixth platform it has validated with Hadoop.
Improvements to Apache Spark
The improvements in Apache Spark integrate several frameworks into a single platform. These include Spark 1.2, Hadoop Distributed File System (HDFS) caching integration, and improved batch processing with alpha builds of Hive-on-Spark.
Release of Impala 2.0
Impala 2.0 enables highly interactive operational business intelligence and data discovery solutions. It also drives better batch processing with Spark as the processing engine.
Improved Security
This latest architecture also improves Hadoop security. Along with helping to prevent potential data breaches, these security measures make Hadoop enterprise an “all-in-one” solution for customers in all industries, especially those in more traditional IT settings. Unified authorization across all access tools is now available, and with HDFS integration added to Sentry, permissions can be set a single time and are then enforced by Sentry across Impala, Hive, Search, and HDFS. Additionally, with Cloudera Enterprise, encryption is directly integrated with Navigator Key Trustee for enterprise-grade key management.
Other security improvements include:
- Auditing and lineage through Cloudera Navigator
- End-to-end encryption at rest, providing critical separation of duties and preventing HDFS administrators from having full access to unencrypted data or sensitive material
- Sentry paired with the security automation offered by Cloudera Manager, making this the only PCI security-compliant platform
Additional Offerings
- Updates to Cloudera Search
- A first-of-its-kind self-service tool for deploying and managing Hadoop in the cloud
- The ability to deploy directly from Microsoft Azure Marketplace, and integrate with SQL Server, Power BI, and Azure Machine Learning
Dell’s 13G PowerEdge R730xd
The Dell PowerEdge R730xd is designed with a wide range of configurability to meet the needs of many different workloads. With the latest Intel® Xeon® processor E5-2600 v3 product family, 24 DIMMs of high-performance DDR4 memory, and a broad range of workload-optimized local storage options, including hybrid tiering, it is a flexible and scalable two-socket 2U rack server that delivers high-performance processing.
The new options offered by Cloudera Enterprise 5.3 are not just the next step in providing customers with tested and validated Hadoop systems running Cloudera software. They are also a more user-friendly and secure way to bring Hadoop to a greater number of users.