8 Posts
0
18958
What are the Data Domain cloud tier and the Data Domain active tier, and how do backups flow between them?
James_Ford
30 Posts
6
March 20th, 2017 07:00
So Jonathan is 100% correct but just to add some more details:
- All Data Domain systems (DDRs) have an active tier. This is the default storage tier, which exists as soon as the file system is first created. A number of the smaller models support only an active tier
- Certain models/features can support additional tiers, for example:
  - Extended Retention: Allows additional locally attached storage to be added as an 'archive tier' for long-term retention of a subset of data
  - Long Term Retention/Cloud Tier: Allows object storage from a supported cloud provider to be added for long-term retention of a subset of data in the cloud. Note that if you use cloud storage you also need some locally attached storage for metadata. Confusingly, this locally attached storage is placed in what is known as a 'cloud tier', but it doesn't actually hold data, just metadata which describes what is on object storage. The amount of locally attached storage required for each platform is detailed in the DDOS release notes/administration guides
- On all DDRs any new data sent to the DDR is initially written to the active tier
- Assuming you are using extended retention or cloud tier, then for every mtree whose data you want migrated to this alternate tier you configure a data movement policy expressed in days, for example 90 days
- Periodically a data movement process starts which looks for files in mtrees with a data movement policy set where:
  - The file is on the active tier
  - The modification time (mtime) of the file is sufficiently old, such that (current time - mtime) > data movement policy
- These files are 'candidates' for migration and are basically copied out to the alternate tier of storage (note that this copy is de-dupe aware, so only physical data which isn't already on the alternate tier is actually copied)
- Once the copy of a file is complete it is 'installed' in the alternate tier of storage, meaning that it now physically exists on that tier of storage and not the active tier
- The next time active tier clean runs (by default once a week), the space which was being used by migrated files can be reclaimed
- All of this is completely transparent to backup applications/users. For example, when listing the contents of a directory all files are still shown regardless of the tier in which they physically exist (DDFS maintains a single name space across tiers). The only major difference comes when you try to read the data:
  - Archive tier: You can read from the archive tier directly, so this really is completely transparent
  - Object storage: You cannot read directly from object storage (this will cause an I/O error). Instead, files needing to be read have to be manually recalled back to the active tier (i.e. a reverse copy) before they can be read
- This can mean that if you are using a backup application which is not 'long term retention/cloud aware' you might get failed restores when attempting to restore from files which are on object storage
- Note that, at the time of writing, only NetWorker 9.1 and Avamar 7.4 are fully cloud aware, so they can automatically recall files from object storage to the active tier before starting a restore. Long term retention aware software can also perform backup application directed data movement, where the backup application decides what goes to the cloud and when, instead of this being based on the age of files. Other applications (such as NetBackup) are tested for basic compatibility, but you should understand that until the vendor chooses to make them 'long term retention aware', things such as recalls have to be done manually
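To make the selection rule and the read behaviour above concrete, here is a minimal Python sketch. It is only a toy model, not DDOS code: the `BackupFile` class, the tier labels and the function names are all made up for illustration, and the example paths are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BackupFile:
    """Hypothetical stand-in for a file in an mtree (not a DDOS structure)."""
    path: str
    mtime: datetime
    tier: str  # illustrative labels: "active" or "cloud"


def migration_candidates(files, policy_days, now):
    """Pick files on the active tier where (now - mtime) > policy."""
    threshold = timedelta(days=policy_days)
    return [f for f in files
            if f.tier == "active" and (now - f.mtime) > threshold]


def read_file(f):
    """Toy read: data on object storage cannot be read directly."""
    if f.tier == "cloud":
        raise IOError(f"{f.path} is on object storage; recall it first")
    return f"contents of {f.path}"


def recall(f):
    """Toy recall: a reverse copy back to the active tier."""
    f.tier = "active"


now = datetime(2017, 3, 20)
files = [
    BackupFile("/data/col1/mtree1/old.bak", now - timedelta(days=120), "active"),
    BackupFile("/data/col1/mtree1/new.bak", now - timedelta(days=10), "active"),
    BackupFile("/data/col1/mtree1/gone.bak", now - timedelta(days=200), "cloud"),
]

# Only the 120-day-old file on the active tier qualifies under a 90-day policy.
for f in migration_candidates(files, policy_days=90, now=now):
    print("candidate:", f.path)
```

A backup application that is not cloud aware behaves like a caller of `read_file` that never calls `recall` first, which is why its restores from object storage fail.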
Hopefully this helps
Thanks, James
jbrooksuk
208 Posts
2
March 17th, 2017 02:00
The question is quite broad but...
Active Tier is the main backup storage location, present in all Data Domain systems - essentially the filesystem.
Cloud Tier is local metadata that holds knowledge of all data that is sent to the cloud unit (ECS, Amazon, Azure or Virtustream). The intention of this locally attached metadata is to remove the need to read from the cloud unit to look up where data resides. This knowledge allows efficient cleaning of the cloud unit, and it means that only the data you actually want is read back from the cloud unit to the Active Tier for restore to your backup application.
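As a rough illustration of why keeping the metadata local helps, the Python sketch below models a de-dupe aware copy that consults only a local index and never reads from the object store. Everything in it is an assumption for illustration: the function names are invented, a `set` and a `dict` stand in for the local metadata and the cloud unit, and SHA-256 is just a convenient stand-in fingerprint, not necessarily what DDFS uses.

```python
import hashlib


def fingerprint(segment: bytes) -> str:
    """Stand-in fingerprint (assumption: SHA-256 is used here only for illustration)."""
    return hashlib.sha256(segment).hexdigest()


def copy_to_cloud(segments, local_index, object_store):
    """Upload only segments whose fingerprint is not already in the local index.

    local_index  -- set of fingerprints, modelling the local cloud tier metadata
    object_store -- dict of fingerprint -> bytes, modelling the cloud unit
    Returns the number of segments actually uploaded.
    """
    uploaded = 0
    for seg in segments:
        fp = fingerprint(seg)
        if fp in local_index:      # local lookup only, no read from the cloud
            continue
        object_store[fp] = seg     # write the new segment to the cloud unit
        local_index.add(fp)        # record it in local metadata
        uploaded += 1
    return uploaded


local_index, object_store = set(), {}
first = copy_to_cloud([b"seg-a", b"seg-b", b"seg-a"], local_index, object_store)
second = copy_to_cloud([b"seg-b", b"seg-c"], local_index, object_store)
print(first, second, len(object_store))
```

The point of the sketch is that the duplicate check touches only `local_index`, so deciding what to send never requires a round trip to the cloud provider.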
Regards, Jonathan
rohit_kumar8
8 Posts
0
April 20th, 2017 23:00
Thanks for the valuable information.
vishal_in
1 Message
0
May 10th, 2017 00:00
Very informative and detailed answer.