We ran a series of performance benchmarks on an Isilon X410 cluster using the YCSB benchmarking suite and CDH 5.10.
The CAE POC lab environment was configured with 5x Isilon X410 nodes running OneFS 8.0.0.4 (and later 8.0.1.1). Based on NFS large-block streaming benchmarks, we should expect 5x ~700 MB/s writes (3.5 GB/s) and 5x ~1 GB/s reads (5 GB/s) as our theoretical aggregate maximums in any of these tests.
The nine compute nodes are Dell PowerEdge FC630 servers running CentOS 7.3.1611, each configured with 2x 18C/36T Intel Xeon E5-2697 v4 CPUs @ 2.30 GHz and 512 GB of RAM. Local storage is 2x SSD in RAID 1, formatted as XFS, for both the operating system and scratch space/spill files.
Three additional edge servers were used to drive the YCSB load.
The backend network between the compute nodes and Isilon is 10 Gbps, with jumbo frames (MTU=9162) set on the NICs and the switch ports.
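As a quick sanity check on the jumbo-frame path, you can confirm that large frames traverse the network unfragmented; a minimal sketch, assuming a Linux client (the interface and host names below are placeholders, not the lab's actual names):

# Placeholder interface name; confirm the NIC is actually at MTU 9162
ip link show eth0 | grep -o 'mtu [0-9]*'
# 9134-byte ICMP payload = 9162 MTU - 20 (IP header) - 8 (ICMP header);
# -M do sets the don't-fragment bit so an undersized path fails loudly
ping -M do -s 9134 -c 3 isilon.example.com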
The first series of tests was to determine the relevant parameters on the HBase side that affected overall throughput. We used the YCSB tool to generate the load against HBase. This initial test was run from a single client (edge server) using the 'load' phase of YCSB with 40 million rows. The table was deleted prior to each run.
ycsb load hbase10 -P workloads/workloada1 -p table='ycsb_40Mtable_nr' -p columnfamily=family -threads 256 -p recordcount=40000000
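The 'workloada1' file referenced above appears to be a local variant of the stock YCSB workloada definition; its exact contents weren't recorded here, but a minimal sketch, assuming it mirrors the standard workloada (50/50 read/update, zipfian request distribution), would look like:

# Sketch of workloads/workloada1, assuming it mirrors stock workloada;
# the actual file used in these tests was not captured
cat > workloads/workloada1 <<'EOF'
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=true
readproportion=0.5
updateproportion=0.5
scanproportion=0
insertproportion=0
requestdistribution=zipfian
EOF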
hbase.regionserver.maxlogs - Maximum number of Write-Ahead Log (WAL) files. This value multiplied by the HDFS block size (dfs.blocksize) is the amount of WAL data that must be replayed when a server crashes. The higher this value, the less frequently flushes to disk are forced by WAL growth.
hbase.wal.regiongrouping.numgroups - When using Multiple HDFS WAL as the WALProvider, sets how many write-ahead logs each RegionServer should run, resulting in that number of HDFS pipelines. Writes for a given region go to only a single pipeline, spreading the total RegionServer load. A sketch of both settings appears below.
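Both property names above are standard HBase settings; on CDH we would apply them through Cloudera Manager's hbase-site.xml safety valve rather than editing files by hand. Expressed as raw hbase-site.xml entries, the values used in the later runs would look like the sketch below (printed rather than installed, since the snippet must be merged inside the existing <configuration> element):

# Sketch only: real HBase property names, with the values used in these
# runs; multiwal is the provider that enables WAL region grouping
cat <<'EOF'
<property>
  <name>hbase.wal.provider</name>
  <value>multiwal</value>
</property>
<property>
  <name>hbase.regionserver.maxlogs</name>
  <value>256</value>
</property>
<property>
  <name>hbase.wal.regiongrouping.numgroups</name>
  <value>20</value>
</property>
EOF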
This next test was some more experimentation to find out what happens at scale, so I created a one-billion-row table, which took a good hour to generate, and then did a YCSB run that updated 10 million of the rows using the 'workloada' settings (50/50 read/write). This was run from a single client, and since I was also looking for the most throughput I could generate, I ran it as a function of the number of YCSB threads. One other note: we did some tuning of Isilon and moved to OneFS 8.0.1.1, which has performance tweaks for the DataNode service. You can see the bump in performance compared to the previous set of runs. For these runs, we set hbase.regionserver.maxlogs = 256 and hbase.wal.regiongrouping.numgroups = 20.
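For reference, the run phase that drives these updates follows the same pattern as the load command shown earlier; a sketch, where the table name is a placeholder and the thread count is illustrative (threads were the variable swept in these runs):

# Hypothetical run-phase invocation; table name is a placeholder and
# -threads was varied per run to sweep throughput
ycsb run hbase10 -P workloads/workloada -p table='ycsb_1Btable_nr' -p columnfamily=family -threads 128 -p operationcount=10000000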
The next test was to determine how the five Isilon nodes would fare against different numbers of RegionServers. The same update script from the previous test was run here: a one-billion-row table with 10 million rows updated using 'workloada', a single client, and 51 YCSB threads. We also kept the same settings for maxlogs and pipelines (256 and 20, respectively).
The last series of tests comes from that deep, dark place that makes you want to break the system you are testing. After all, it is a perfectly valid scientific method to ratchet a test up until things break, thereby learning what the upper limits of the parameters being tested are. In this series of tests, I had two additional servers I could use to run clients from; in addition, I ran two YCSB clients on each one, allowing me to scale up to six clients, each driving 512 threads, for 3,072 threads overall. I went back and created two different tables: one with 4 billion rows split into 600 regions and one with 400 million rows split into 90 regions.
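Pre-splitting a table into a fixed number of regions can be done from the HBase shell. A sketch of how the 600-region table might have been created, assuming the HexStringSplit algorithm (the table name and split algorithm are assumptions; 'family' matches the column family used by the YCSB commands above):

# Hypothetical: create a YCSB target table pre-split into 600 regions
echo "create 'ycsb_4Btable_nr', 'family', {NUMREGIONS => 600, SPLITALGO => 'HexStringSplit'}" | hbase shell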
As you can see, the size of the table matters little in this test. Looking at the Isilon heat charts again, you can see there is only a few percentage points' difference in the number of file operations, mostly in line with the difference between the four-billion-row table and the 400-million-row one.
HBase is a good candidate for running on Isilon, mainly because of the scale-out-on-scale-out architecture. HBase does a lot of its own caching, and by splitting the table across a good number of regions, you get HBase to scale out with your data. In other words, it does a good job of taking care of its own needs, and the file system is there for persistence. We were not able to push the load tests to the point of actually breaking things, but if you are looking at four billion rows in your HBase design and expect 800,000 operations per second with less than 3 ms of latency, this architecture supports it. You will notice that I did not mention much about the myriad other client-side tweaks you could apply to HBase itself; I would expect all those tweaks to still be valid, but they are beyond the scope of this test.