PowerScale OneFS 9.6.x.x CLI Administration Guide

Datamover definitions

The following concepts are fundamental to understanding the workflows supported by the Datamover (DM) transfer engine.

Datasets

  • Datasets are self-contained, independent entities that are assigned globally unique IDs when they are created.
  • Datasets are backed by file system snapshots on PowerScale clusters.
  • Unlike snapshots, which are cluster-specific, datasets can "fan-out" to multiple clusters and cloud platforms. Datasets enable multiple topologies, such as copy, fan-out, and chaining of data transfers between systems.
  • Datasets have parent-child relationships on every system. A handshake between systems determines the exact changeset that is used for an incremental transfer, and it always selects the most recent source dataset for this changeset. For example, if cluster 1 has snapshots A, B, C, and D, and cluster 2 has only snapshot A, an incremental transfer updates cluster 2 so that it holds snapshots A and D. Datamover does not run separate jobs to bring cluster 2 up to the state of snapshot B or snapshot C. It runs one job that updates cluster 2 to the latest content available on cluster 1.
  • After a baseline is established with a dataset, incremental backups can occur between multiple Datamover (DM) systems.
  • With the dataset model, a system can avoid performing a full copy of all the data (an initial sync/base copy) if there is a failover.
    • For example, if you have Cluster A that is doing a transfer to Cluster B and to a cloud platform, Datamover transfers those unique IDs. If both clusters and the cloud have the same root tree with a dataset in common, Datamover can perform incremental backups between those systems without performing a rebaseline.
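
The changeset selection described above can be pictured with a short Python sketch. This is an illustrative model only, not Datamover's implementation: the function name `plan_transfer`, the tuple it returns, and the assumption that each system's dataset IDs are available as a list ordered from oldest to newest are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def plan_transfer(
    source: Sequence[str], target: Sequence[str]
) -> Optional[Tuple[str, Optional[str], str]]:
    """Return (kind, base, latest), or None if there is nothing to transfer.

    `source` and `target` are dataset IDs, oldest first (an assumption of
    this sketch). `base` is the newest dataset both sides already hold.
    """
    if not source:
        return None
    latest = source[-1]  # the most recent source dataset is always selected
    target_ids = set(target)
    # Walk the source from newest to oldest to find the newest common dataset.
    base = next((d for d in reversed(source) if d in target_ids), None)
    if base == latest:
        return None  # the target already holds the latest dataset
    kind = "incremental" if base is not None else "baseline"
    return (kind, base, latest)


# Cluster 1 holds A, B, C, D; cluster 2 holds only A.
print(plan_transfer(["A", "B", "C", "D"], ["A"]))  # ('incremental', 'A', 'D')
```

In this model a single incremental transfer moves the target from A directly to D, and a baseline is needed only when the two systems share no dataset at all.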

Accounts

  • Accounts define connections to the remote systems to which data is copied. They define which systems are accessible and how you access them.
  • Account types can be File or Object.
  • Accounts consist of a URI, which is ideally a SmartConnect round robin DNS name for the remote cluster in a File type account. For example, you can specify a URI in the format of dm://remotenas.yourdomain.com:7722 (specifying the port is optional) for a DM to DM transfer between DNS systems. The hostname can also be an IPv4 address.
  • For object type accounts, you must provide a URI and correct object store type for working with the object store.
  • You can specify local and remote network pools defining nodes or interfaces to use for the data transfer, such as SmartConnect, DNS zones, IP addresses, and port number (standard TCP connection types).
  • SmartConnect pools allow administrators to include or exclude nodes, interface types, and networks.
    • Local network pool is an optional SmartConnect pool on the source cluster that is used to restrict what interfaces and nodes the local system uses when communicating with the remote target system.
    • Remote network pool specifies the interfaces and nodes that the remote target system uses when communicating with the local system. Note that this option is ignored for object-scale remote connections.
  • TLS certificate authentication and file system accounts: If you are connecting PowerScale to PowerScale clusters, with Datamover installed on each cluster, you must specify client and server certificates to enable transport encryption and transport layer security (TLS) certificate authentication.
    • Datamover uses mutual TLS certificate authentication as follows: Both the source and target systems present their Identity Certificates and validate those against the Certificate Authorities (CA) configured for use with Datamover when they perform a handshake. If either the source or target system cannot validate the other side's certificates, then the handshake fails. If both sides successfully validate the other side's Identity Certificates against their saved Certificate Authorities, then both sides trust each other and the connection can continue.
  • Cloud accounts: When connecting to cloud accounts, specify a URI to connect to a cloud bucket. For example, you can specify a URI such as https://testecscluster.yourdomain.com:9002/cloudbucket. Specify credentials similarly to how you configure certificates in DNS.
    • When creating an account on a PowerScale cluster of type {AWS_S3 | ECS_S3 | AZURE}, credentials must be cloud authentication credentials. The access-id and secret-key parameters are specific to cloud accounts. The access-id parameter refers to the cloud account access identifier. The secret-key parameter refers to the cloud account secret key.
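
To make the account URI formats above concrete, the following Python sketch splits a URI into its parts using the standard library. It is not part of OneFS or the Datamover CLI: the `AccountUri` type and `parse_account_uri` function are hypothetical names, and the sketch leaves the port as `None` when it is omitted rather than assuming a default value.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class AccountUri(NamedTuple):
    scheme: str
    host: str
    port: Optional[int]  # None when the URI omits the optional port
    path: str            # for example, the bucket path of a cloud account


def parse_account_uri(uri: str) -> AccountUri:
    """Split an account URI into scheme, host, optional port, and path."""
    parts = urlsplit(uri)
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"URI needs a scheme and a host: {uri!r}")
    return AccountUri(parts.scheme, parts.hostname, parts.port, parts.path)


# The two example URIs from the text above.
file_account = parse_account_uri("dm://remotenas.yourdomain.com:7722")
cloud_account = parse_account_uri(
    "https://testecscluster.yourdomain.com:9002/cloudbucket"
)
print(file_account.host, file_account.port)
print(cloud_account.host, cloud_account.path)
```

The same function accepts an IPv4 address in place of the hostname, because `urlsplit` treats both forms as the host portion of the URI.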

Policies

Policies specify what data to transfer, at what frequency, and what accounts are used. There are four types of policies:

  • A dataset creation policy creates the dataset.
  • A dataset copy policy is used for one-time data transfers.
  • A dataset repeat copy policy is used for repeated transfers.
  • A dataset expiration policy defines how long the snapshot is stored.

Jobs

  • Datamover generates jobs, which are runtime entities created from transfer policies and policy schedules.
  • There are two major types of data transfer jobs: baseline jobs for initial transfers and incremental jobs for subsequent transfers between FILE Datamover systems.

Tasks

  • Tasks are spawned by jobs and are the individual chunks of work that a job must perform. An example of a task is opening a file and transferring it.
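
The relationship between jobs and tasks can be sketched as a small data model. This is a conceptual illustration, not Datamover's internal design: the `Job` and `Task` classes, the `spawn_job` and `run_job` functions, the one-task-per-file granularity, and the example paths are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Task:
    path: str           # one unit of work: open this file and transfer it
    done: bool = False


@dataclass
class Job:
    policy: str         # name of the policy that produced the job
    kind: str           # "baseline" or "incremental"
    tasks: List[Task] = field(default_factory=list)


def spawn_job(policy: str, kind: str, paths: Iterable[str]) -> Job:
    """Create a job and spawn one task per file path."""
    if kind not in ("baseline", "incremental"):
        raise ValueError(f"unknown job kind: {kind!r}")
    return Job(policy, kind, [Task(p) for p in paths])


def run_job(job: Job, transfer: Callable[[str], None]) -> int:
    """Run every pending task and return the number of tasks completed."""
    completed = 0
    for task in job.tasks:
        if not task.done:
            transfer(task.path)  # caller supplies the actual transfer step
            task.done = True
            completed += 1
    return completed


job = spawn_job("nightly-copy", "incremental", ["/ifs/data/a.txt", "/ifs/data/b.txt"])
sent: List[str] = []
print(run_job(job, sent.append))  # 2
```

Here the transfer step is passed in as a callable, so the example records paths in a list instead of moving any data.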
