ECS 3.6.2 Data Access Guide

ECS HDFS Introduction

ECS HDFS is a Hadoop Compatible File System (HCFS) that enables you to run Hadoop 2.x applications on top of the ECS storage infrastructure.

NOTE: This chapter and Appendixes A, B, C, and D discuss the native HDFS implementation in ECS, also known as ViPRFS. They do not cover accessing ECS through S3A.

When using ECS HDFS, the Hadoop distribution is configured to run against the ECS HDFS instead of the built-in Hadoop file system. The following illustration shows how ECS HDFS integrates with an existing Hadoop cluster.

Figure 1. ECS HDFS integration in a Hadoop cluster

In a Hadoop environment that is configured to use ECS HDFS, each of the ECS nodes functions as a traditional Hadoop NameNode and DataNode, so that all of the ECS nodes can accept and service HDFS requests.

When you set up the Hadoop client to use ECS HDFS instead of traditional HDFS, the client configuration points to ECS HDFS, which then handles all HDFS activity. On each ECS HDFS client node, every traditional Hadoop component uses the ECS Client Library (the ViPRFS JAR file) to perform HDFS operations.
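
As a rough illustration of what "pointing the configuration at ECS HDFS" can look like, the following Python sketch generates a minimal core-site.xml fragment that sets the standard Hadoop property fs.defaultFS. The viprfs:// scheme, the bucket.namespace.site layout of the URI, and the names mybucket, mynamespace, and mysite are assumptions made for this example and are not taken from this section. A real deployment needs additional properties, which are described in the ECS HDFS configuration reference.

```python
import xml.etree.ElementTree as ET


def build_core_site(default_fs_uri: str) -> str:
    """Return a minimal core-site.xml document that sets fs.defaultFS."""
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    # fs.defaultFS is the standard Hadoop property naming the default file system.
    ET.SubElement(prop, "name").text = "fs.defaultFS"
    ET.SubElement(prop, "value").text = default_fs_uri
    return ET.tostring(configuration, encoding="unicode")


# Hypothetical placeholder URI: the scheme and the name layout are assumptions.
EXAMPLE_URI = "viprfs://mybucket.mynamespace.mysite/"

core_site_xml = build_core_site(EXAMPLE_URI)
print(core_site_xml)
```

Generating the fragment programmatically is only one option. In practice the property is usually set through the Hadoop distribution's management tooling.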

To integrate ECS HDFS with an existing Hadoop environment, you must have the following:

  • A Hadoop cluster that is already installed and configured. The following distributions are supported:
    • Hortonworks HDP 2.6.2
  • A Hadoop cluster that is installed and configured to support ECS HDFS, which requires:
    • A file system-enabled bucket for HDFS access.
      NOTE: Only one bucket is supported per Hadoop cluster and the ECS HDFS must be the default file system.
    • The ECS Client Library that is deployed to the cluster.
  • For a Hadoop cluster that uses Kerberos, or Kerberos with Active Directory:
    • Kerberos configuration files and service principal keytab files that are deployed to the ECS cluster.
    • Secure metadata that is deployed to the bucket.
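
The prerequisites above can be read as a checklist in which the Kerberos items apply only to secured clusters. The following Python sketch encodes that logic. The ClusterState class, its field names, and the missing_prerequisites function are hypothetical names invented for this illustration. They are not part of ECS or Hadoop, and the sketch does not inspect a real cluster.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterState:
    """Hypothetical summary of a Hadoop cluster's readiness for ECS HDFS."""
    fs_enabled_bucket: bool
    ecs_is_default_fs: bool
    client_library_deployed: bool
    uses_kerberos: bool = False
    keytabs_deployed_to_ecs: bool = False
    secure_metadata_in_bucket: bool = False


def missing_prerequisites(state: ClusterState) -> List[str]:
    """Return the prerequisites from the list above that are not yet met."""
    missing = []
    if not state.fs_enabled_bucket:
        missing.append("file system-enabled bucket")
    if not state.ecs_is_default_fs:
        missing.append("ECS HDFS as the default file system")
    if not state.client_library_deployed:
        missing.append("ECS Client Library deployed to the cluster")
    # The Kerberos items are required only when the cluster is secured.
    if state.uses_kerberos:
        if not state.keytabs_deployed_to_ecs:
            missing.append("Kerberos configuration and keytab files on ECS")
        if not state.secure_metadata_in_bucket:
            missing.append("secure metadata deployed to the bucket")
    return missing


# Example: a Kerberos cluster whose bucket has no secure metadata yet.
state = ClusterState(True, True, True, uses_kerberos=True,
                     keytabs_deployed_to_ecs=True)
print(missing_prerequisites(state))
```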
