Case study for Celerra deduplication
Hi,
We are planning to enable deduplication on several file systems in our environment; the file systems range up to 4 TB in size.
We are a bit concerned about the performance impact it might cause.
I wanted to check whether any case study is available on enabling deduplication and the results.
I went through the "enabling deduplication on Celerra" PDF, but it didn't help much.
I'd appreciate any help on this...
Thanks...
Vanitha
jithin
February 19th, 2014 06:00
There is no case study as such, because the impact varies with the I/O workload on the Data Movers. You can pick a window when there are few reads/writes on the DMs, enable dedupe then, suspend the scan during peak business hours, and resume it during off-peak hours. Alternatively, try running the scan during peak hours for a short period, monitor how much impact it has on performance, and decide accordingly.
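The suspend-during-peak approach above amounts to a simple time-window check. A minimal sketch of that decision logic, assuming a hypothetical 08:00-18:00 business-hours window (the window boundaries and the `scan_should_run` helper are illustrative, not part of the Celerra CLI):

```python
from datetime import time

# Assumed peak window -- adjust to your site's business hours.
PEAK_START = time(8, 0)
PEAK_END = time(18, 0)

def scan_should_run(now: time) -> bool:
    """Return True when the dedupe scan may run, i.e. outside peak hours.

    During the peak window the scan would be suspended; off-peak it is
    resumed, matching the schedule described in the post above.
    """
    return not (PEAK_START <= now < PEAK_END)
```

In practice a cron job (or similar scheduler) on the Control Station would evaluate a check like this and suspend or resume the scan accordingly.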
Jithin
umichklewis
February 19th, 2014 06:00
If no one can supply a case study, you may have to rely on anecdotal evidence. I've used deduplication on both the older hardware on DART 6.0.42 and the newer VNX with DART 7.1. When deduplication is enabled, the NAS does not begin compressing until CPU utilization is below the threshold you specify. That is, if you specify 60% CPU, it will not start unless the CPU is 59% busy or less.
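The threshold behavior described above is a strict comparison: at a 60% setting, the scanner holds off until utilization drops below 60%. A minimal sketch of that check (the function name and default are illustrative, not the actual Data Mover implementation):

```python
def scan_may_start(cpu_busy_pct: float, threshold_pct: float = 60.0) -> bool:
    """Return True when the dedupe scanner is allowed to start.

    The scanner starts only while current CPU utilization is strictly
    below the configured threshold -- e.g. with a 60% threshold, the
    CPU must be 59% busy or less.
    """
    return cpu_busy_pct < threshold_pct
```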
When deduplication is running on the older hardware (NS80 and NS40), I've observed 15 to 20% greater CPU utilization when scanning file systems with lots of small files (perhaps 2-3 million, most under 200 KB or so). Deduplication runs for two or three hours and then it's done.
When deduplication is running on the newer hardware (VNX 5500), CPU utilization was 25 to 35% higher for similar file systems, but the deduplication completed in half an hour.
In neither case could I discern a performance difference: double-clicking files didn't produce the Windows hourglass, and operations seemed normal.
As always, your mileage will vary...
Rainer_EMC
8.6K Posts
1
February 19th, 2014 07:00
Apart from looking at the manual:
Data Mover CPU impact should be minimal, and the scan is designed to “back off” if there is other work to do.
So if your DM is busy, the dedupe scan will just take longer.
Nothing to worry about – just be patient
Try to stay with the default settings – the idea is that data that hasn’t been modified for some time is unlikely to change, so it gets compressed.
If you make that policy too aggressive, you should consider the impact of modifying a compressed file.
Due to the way file dedupe works, you want to have some free space available in the file system and the SavVol.
In simple terms, we first compress the file and then remove the uncompressed one.
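Because the compressed copy is written before the uncompressed original is removed, both copies briefly coexist, which is why the free-space headroom above matters. A rough sketch of the arithmetic, assuming a known compression ratio (the function names and the example ratio are illustrative):

```python
def transient_space_gb(file_size_gb: float, compression_ratio: float) -> float:
    """Peak extra space needed while deduplicating one file.

    The compressed copy (original size * ratio) coexists with the
    uncompressed original until the original is deleted.
    compression_ratio = compressed size / original size, e.g. 0.5.
    """
    return file_size_gb * compression_ratio

def steady_state_saving_gb(file_size_gb: float, compression_ratio: float) -> float:
    """Space reclaimed once the uncompressed original is removed."""
    return file_size_gb * (1.0 - compression_ratio)

# e.g. a 10 GB file at a 0.5 ratio needs ~5 GB of temporary headroom
# during the pass and frees ~5 GB afterwards.
```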
Vanitha13
February 20th, 2014 05:00
Hi,
Thank you for your inputs
Regards
Vanitha