Generates an estimate of possible storage savings that could be achieved by packing files with Small Files Storage Efficiency.
Usage
The isi_sfse_assess command scans a set of files and simulates the work that Small Files Storage Efficiency would do. This command generates an estimate of disk space savings without moving any data. It does not require a license and does not require that Small Files Storage Efficiency be enabled.
Use this tool before enabling Small Files Storage Efficiency to see possible storage savings. Use the tool after some file packing has occurred to identify additional possible savings given the current state of the file system.
The assessment is based on calculating the blocks saved when small files are packed into containers at the same protection level. A file is categorized as small if its size is less than the value of the max_size option. The default is 1040384 bytes, which is slightly less than 1MB.
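As a rough pre-check before running an assessment, the size test described above can be approximated with standard tools. The following is a minimal sketch, not part of isi_sfse_assess: the count_small_files helper is hypothetical, and it only applies the size comparison (non-empty regular files smaller than max_size). It does not apply the other skip rules and does not estimate savings.

```shell
# Hypothetical helper, not part of isi_sfse_assess.
# Counts non-empty regular files under a directory that are smaller than
# a byte threshold (defaults to the documented max_size default).
count_small_files() {
  dir=$1
  max_size=${2:-1040384}
  # -size +0c : larger than 0 bytes (skips empty files)
  # -size -Nc : smaller than N bytes
  # File names containing newlines would be miscounted; treat as approximate.
  find "$dir" -type f -size +0c -size -"${max_size}"c | wc -l | tr -d ' '
}

# Example call (path is illustrative):
# count_small_files /ifs/my-data
```

The count only indicates how many files are candidates by size; the assessment itself decides which of them can actually be packed.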
Many of the options in the isi_sfse_assess command mirror options available during actual packing. These are system level control options (sysctl options) with preset default values. For packing to achieve the results predicted during assessment, you must use the same settings for packing and assessment.
You can change the default settings for sysctl options used during packing with the isi_packing command.
You can change the default settings for sysctl options used during assessment with this isi_sfse_assess command.
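One way to keep assessment settings consistent with packing settings is to record them in one place and build the assessment command from those values. The sketch below assumes that approach; the variable names and the build_assess_cmd helper are hypothetical, and the values shown are examples only. The flags are the ones listed in the syntax summary of this topic. The helper only prints the command line and does not run it.

```shell
# Example values (assumptions); use the values that match your packing settings.
MAX_SIZE=1040384
AVOID_BSIN=on
SNAPS_ENABLED=off

# Hypothetical helper: prints an isi_sfse_assess command line for the
# path given in $1, using the settings recorded above.
build_assess_cmd() {
  printf 'isi_sfse_assess --path=%s --max-size=%s --avoid-bsin=%s --snaps-enabled=%s\n' \
    "$1" "$MAX_SIZE" "$AVOID_BSIN" "$SNAPS_ENABLED"
}

build_assess_cmd /ifs/my-data
# prints: isi_sfse_assess --path=/ifs/my-data --max-size=1040384 --avoid-bsin=on --snaps-enabled=off
```

Keeping the values in variables makes it easier to compare them later against the settings used for packing.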
The assessment skips the following types of files.
Non-regular files (not recorded).
Unlinked files (not recorded).
ADS files, if ads_enabled is false.
Stubbed (CloudPools) files.
Empty files (not recorded).
Zero-sized files, where all blocks have no physical content, such as shadow references, ditto, etc.
Oversized files, where the file size is greater than the max_size value.
Mirror protected files, if mirror_containers_enabled is false.
Clone/deduped files, if avoid_bsin is true.
The command reports progress as it runs by displaying the following information:
% complete.
Estimated possible space savings on the files scanned so far. This number should continually increase as the program progresses.
Estimated time remaining.
You can temporarily interrupt processing at any time using CTRL-C. The command saves its progress, allowing you to restart processing at a later time. Use the --resume (or -r) option to restart the processing. For details, see Example: Stop and restart processing below.
Syntax
Usage:
isi_sfse_assess <assess-mode> [process options] [sysctl options]
Assess Modes:
-a | --all : assess all files on OneFS
-p <path> | --path=<path> : assess <path> and sub-dirs
-r | --resume : resume previous assessment
Process Options:
-q | --quick : quick mode (less accurate)
-f <fails> | --max-fails=<fails> : max failures before aborting (default: 1000)
-v | --verbose : verbose mode
Sysctl Options:
--max-size=<bytes> : max file size to pack
--avoid-bsin[=on|off] : avoid cloned/deduped files
--mirror-translation-enabled[=on|off] : convert mirrored to FEC
--mirror-containers-enabled[=on|off] : process mirrored files
--snaps-enabled[=on|off] : process snapshots
--ads-enabled[=on|off] : process ADS files
Options
-a | --all
Scans all files across the cluster for possible storage savings. The scan includes snapshots if both of the following are true:
The --snaps-enabled option is set to on.
The default (slow) process option is selected. If slow is not the process option, the scan is adjusted for faster processing, and snapshots are not included.
-p <path> | --path=<path>
Scans files in the named directory path for possible storage savings. This option performs a tree walk across all files and subdirectories within the named path. Because snapshots are invisible to the directory tree structure, the tree walk does not process any snapshots. Both absolute and relative path names are acceptable.
-r | --resume
Users can interrupt a running assessment by pressing CTRL-C. This option resumes the assessment processing at the point where it was interrupted. The resumed process uses all of the same options that were specified on the original command.
-q | --quick
Slow mode is the default. Use this option to override the default and run in quick mode. The differences are:
Quick mode makes some assumptions during processing based on file and block size, as opposed to gathering actual data block information. If your OneFS system stores only regular files (no snapshots, cloned or deduped files, etc.), the results of quick mode can be very close to the accuracy achieved in slow mode.
Slow mode is more accurate but is very time-consuming. This mode collects actual data block information, including overhead blocks, and the results are precise.
-f <fails> | --max-fails=<fails>
The maximum number of failures allowed before aborting the assessment process. The default is 1000.
The command first collects a list of files to process and then proceeds with actual processing. A failure occurs when it attempts to process a file that was modified or deleted after being added to the list. These failures are more likely to occur on a busy cluster with a very large number of files.
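The list-then-process pattern and the failure limit can be illustrated with a short sketch. This is not the tool's implementation: the process_list helper is hypothetical, it detects only deleted files (not modified ones), and it assumes that the run aborts once the failure count exceeds the limit.

```shell
# Hypothetical illustration of a failure limit; not the isi_sfse_assess code.
# process_list FILE_LIST [MAX_FAILS]
# FILE_LIST holds one path per line, collected earlier.
process_list() {
  list=$1
  max_fails=${2:-1000}
  fails=0
  while IFS= read -r path; do
    if [ ! -f "$path" ]; then
      # The file disappeared after it was added to the list.
      fails=$((fails + 1))
      if [ "$fails" -gt "$max_fails" ]; then
        echo "aborting after $fails failures" >&2
        return 1
      fi
      continue
    fi
    # Per-file assessment work would happen here.
  done < "$list"
  # Report how many listed files could not be processed.
  echo "$fails"
}
```

On a busy cluster, more time passes between building the list and processing each entry, so a higher --max-fails value may be needed for the run to finish.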
-v | --verbose
Turns on verbose output.
--max-size=<bytes>
Sets the maximum size of files to select for processing. The default is 1040384 bytes, which is 8192 bytes less than 1MB, or 127 fs blocks. This value makes files less than 1MB available for packing.
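The arithmetic behind the default can be checked with shell arithmetic. The 8192-byte block size used here follows from the figures above (1040384 bytes is 127 blocks); the variable names are illustrative.

```shell
MIB=$((1024 * 1024))        # 1 MB as 1048576 bytes
BLOCK_SIZE=8192             # bytes per file system block, per the figures above
DEFAULT_MAX_SIZE=$((MIB - BLOCK_SIZE))

echo "$DEFAULT_MAX_SIZE"                      # prints 1040384
echo $((DEFAULT_MAX_SIZE / BLOCK_SIZE))       # prints 127
```

When choosing a different --max-size value, the same calculation shows how many whole blocks the limit corresponds to.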
--avoid-bsin[=on | off]
Controls whether to avoid cloned and deduped files.
The default is on, or true, meaning that cloned and deduped files are not processed. We recommend that you do not pack deduped files. Packing them undoes the benefits of dedupe. Also, packing deduped files may affect performance when reading the packed file.
NOTE: The dedupe functionality does not dedupe packed files.
--mirror-translation-enabled[=on | off]
Controls whether to pack mirrored files into FEC containers with equivalent protection. The default is off, or false.
The off setting ensures that a mirrored file remains a true mirror. This is an important quality for some users.
The on setting allows packing of files with mirror protection policies into containers with equivalent FEC protection policies. The on setting can increase space savings.
--mirror-containers-enabled[=on | off]
Controls whether to process mirrored files. The default is off, or false.
The off setting does not process mirrored files.
The on setting allows creation of containers with mirrored protection policies. Mirrored files remain mirrored, so there is no space saving. However, this setting can reduce the total protection group count and potentially reduce rebuild times.
--snaps-enabled[=on | off]
Controls whether to process snapshots. The default is off, or false.
The off setting does not process snapshot files. Use this setting if processing time is an issue.
The on setting processes snapshot files. This processing can significantly increase the time it takes to pack a data set if there are many snapshots with data. The advantage to using the on setting is the storage savings that may be gained. Snapshot files are often sporadically allocated, which typically results in poor storage efficiency. Packing can improve the storage efficiency.
--ads-enabled[=on | off]
Controls whether to process ADS files. The default is off, or false.
The off setting does not process ADS files. Typically, these stream files are too large to be considered for packing. In addition, processing stream files is efficient when they are grouped in directories, but inefficient when they are scattered singly across various locations.
The on setting processes ADS files. Use this setting if you have small ADS files located together in a directory.
Example: Start assessment in slow mode on all files
The following command scans all files in slow mode.
# isi_sfse_assess -a
Example: Start assessment in quick mode on a directory
The following command uses quick mode to generate space saving estimates on the /ifs/my-data directory.
# isi_sfse_assess -q -p /ifs/my-data
Example: Stop and restart processing
# isi_sfse_assess -a --snaps-enabled --mirror-containers-enabled
# <CTRL-C>
# isi_sfse_assess -r
The process resumes using all of the same options that were originally entered.