Data Domain: Scheduling Cleaning on a DDR

Summary: The filesys clean operation reclaims physical storage occupied by deleted objects in the Data Domain file system.

Instructions

Scheduling Cleaning on a Data Domain System

PURPOSE

The filesys clean operation reclaims physical storage occupied by deleted objects in the Data Domain file system.

When application software expires backup or archive images and when the images are not present in a snapshot, the images are not accessible or available for recovery from the application or from a snapshot. However, the images still occupy physical storage.

Only a filesys clean operation reclaims the physical storage used by files that are deleted and that are not present in a snapshot. The file system may never report 100% cleaned. The total space cleaned may always be a few percentage points less than 100.

APPLIES TO

  • All Data Domain Systems
  • All Software Releases
  • Cleaning

SOLUTION

Data Domain recommends running a clean operation after the first full backup to a Data Domain System. The initial local compression on a full backup is generally a factor of 1.5 to 2.5. An immediate clean operation gives additional compression by another factor of 1.15 to 1.2 and reclaims a corresponding amount of disk space.
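The arithmetic behind these figures can be sketched as follows. This is an illustration only: it assumes the two factors compose multiplicatively, which gives a combined factor of roughly 1.7 to 3.0, and the 10 TiB backup size and the function names are hypothetical.

```python
def combined_factor(local_factor, clean_factor):
    # Assumption: the clean factor multiplies the initial local factor.
    return local_factor * clean_factor


def physical_size(logical_size, factor):
    # Physical space used once data is compressed by the given factor.
    return logical_size / factor


# Ranges quoted in the article.
LOCAL_RANGE = (1.5, 2.5)
CLEAN_RANGE = (1.15, 1.2)

low = combined_factor(LOCAL_RANGE[0], CLEAN_RANGE[0])
high = combined_factor(LOCAL_RANGE[1], CLEAN_RANGE[1])

# Hypothetical first full backup of 10 TiB of logical data.
logical_tib = 10.0
for label, local, total in (("low", LOCAL_RANGE[0], low),
                            ("high", LOCAL_RANGE[1], high)):
    before = physical_size(logical_tib, local)
    after = physical_size(logical_tib, total)
    print(f"{label}: {before:.2f} TiB before clean, {after:.2f} TiB after, "
          f"{before - after:.2f} TiB reclaimed")
```

Actual results depend on the data being backed up, so treat the output as an estimate of the expected range rather than a prediction.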

A default schedule runs the clean operation every Tuesday at 6 a.m. (tue 0600) with 50% throttle.

To increase file system availability, and if the Data Domain System is not short on disk space, consider changing the schedule to clean less often.

Issues that can affect the cleaning process:

  • If the system is filling up, do not compensate by switching to more frequent or more aggressive cleaning cycles. Running cleaning every day fragments the data, which can severely impair read speed. Global compression also depends on good locality during writes, so overly frequent clean cycles reduce de-duplication ratios as well.
  • Cleaning is a file system operation that impacts overall file system performance while it is running. Raising the cleaning throttle above the default of 50 increases that impact during the active cleaning cycle, because the cleaning process consumes more resources.
  • Changing the local compression algorithm causes the following cleaning cycle to run longer as all existing data must be read, uncompressed, and compressed again.
  • Any operation that shuts down the Data Domain System file system or powers off the device (a system power-off, reboot, or file system disable command) stops the clean operation. The clean does not automatically continue when the system and file system start again.
  • Replication between Data Domain Systems can affect filesys clean operations. If a source Data Domain System receives large amounts of new or changed data while replication is disabled or disconnected, resuming replication may significantly slow down filesys clean operations.
  • If directory replication is running behind, for example due to insufficient network bandwidth between the replication pair (resulting in a replication lag), cleaning may not be able to run fully. To resolve this condition, either break replication (and resync once cleaning has run) or let replication catch up (for example, by increasing network bandwidth or writing less new data to the source directory).

A Data Domain that is full may need multiple clean operations to clean 100% of the file system, especially when one or more external shelves are attached. Depending on the type of data stored, such as when using markers for specific backup software (filesys option set marker-type ...), the file system may never report 100% cleaned. The total space cleaned may always be a few percentage points less than 100.

With collection replication, the clean operation does not run on the destination. With directory replication, the clean operation must be run on both the source and destination Data Domain.

To display the day and time currently scheduled for the clean operation, use the filesys clean show schedule operation.

# filesys clean show schedule

To display the throttle setting for cleaning operations, use the filesys clean show throttle operation. Changes to the throttle setting take effect without restarting cleaning.

# filesys clean show throttle

Changing the Scheduled Cleaning

To change the day and time when clean runs automatically, use the filesys clean set schedule operation. The default time is Tuesday at 6 a.m. (tue 0600). The operation is available only to administrative users.

  • Daily runs the operation every day at the given time (Not recommended).
  • Monthly starts on a given day or days (from 1 to 31) at the given time.
  • Never turns off the clean process, and does not take a qualifier.
  • With the day-name qualifier, the operation runs on one or more given days at the given time. A day-name is three letters (such as mon for Monday). Use a dash (-) between days for a range of days. For example: tue-fri.
  • Time is 24-hour military time. 2400 is not a valid time. mon 0000 is midnight between Sunday night and Monday morning.
  • The most recent invocation of the scheduling operation cancels the previous setting.

The command syntax is:

filesys clean set schedule daily time
filesys clean set schedule monthly day-numeric-1[,day-numeric-2,...] time
filesys clean set schedule never
filesys clean set schedule day-name-1[,day-name-2,...] time

For example, the following command runs the operation automatically every Tuesday at 4 p.m.:

# filesys clean set schedule tue 1600

To run the operation more than once in a month, set multiple days in one command. For example, to run the operation on the first and 15th of the month at 4 p.m.:

# filesys clean set schedule monthly 1,15 1600
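The qualifier rules above can be checked before a command is sent to the system. The following Python sketch is not part of the Data Domain CLI; the function names are hypothetical, and it assumes a Monday-to-Sunday week order and rejects ranges that wrap past Sunday, which the real command may treat differently.

```python
import re

# Assumption: three-letter day names, week ordered Monday to Sunday.
DAY_NAMES = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def is_valid_time(hhmm):
    """Return True for a four-digit 24-hour time from 0000 to 2359."""
    if not re.fullmatch(r"[0-9]{4}", hhmm):
        return False
    return int(hhmm[:2]) <= 23 and int(hhmm[2:]) <= 59


def expand_days(spec):
    """Expand a spec such as 'tue-fri' or 'mon,wed' into a list of day names."""
    days = []
    for part in spec.lower().split(","):
        first, dash, last = part.partition("-")
        if first not in DAY_NAMES or (dash and last not in DAY_NAMES):
            raise ValueError(f"unknown day name in {part!r}")
        start = DAY_NAMES.index(first)
        end = DAY_NAMES.index(last) if dash else start
        if end < start:
            # Assumption: wrapping ranges such as 'fri-mon' are rejected here.
            raise ValueError(f"range {part!r} wraps past the end of the week")
        days.extend(DAY_NAMES[start:end + 1])
    return days


def day_name_command(day_spec, hhmm):
    """Build a day-name schedule command after validating both qualifiers."""
    day_spec = day_spec.lower()
    expand_days(day_spec)  # raises ValueError on a bad day spec
    if not is_valid_time(hhmm):
        raise ValueError(f"invalid time {hhmm!r}")
    return f"filesys clean set schedule {day_spec} {hhmm}"


def monthly_command(day_numbers, hhmm):
    """Build a monthly schedule command for one or more days from 1 to 31."""
    if not day_numbers or any(not 1 <= d <= 31 for d in day_numbers):
        raise ValueError("monthly days must be between 1 and 31")
    if not is_valid_time(hhmm):
        raise ValueError(f"invalid time {hhmm!r}")
    days = ",".join(str(d) for d in day_numbers)
    return f"filesys clean set schedule monthly {days} {hhmm}"


print(day_name_command("tue", "1600"))
print(monthly_command([1, 15], "1600"))
```

The two print calls rebuild the example commands shown above. A time of 2400 fails validation, matching the rule that midnight is written as 0000.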

To set the clean schedule to the default of Tuesday at 6 a.m. (tue 0600), the default throttle of 50%, or both, use the filesys clean reset operation.

filesys clean reset {schedule | throttle | all}

Affected Products

Data Domain
