Dell EMC Data Mover Admin Guide

Configure how processes are managed (Heavy workers and light workers)

To improve performance, you can tune DataIQ by changing the settings for heavy workers and light workers.

When a transfer job starts, its components are broken down into tasks. These tasks are managed by heavy worker and light worker processes.

Heavy workers typically handle tasks that involve moving and deleting files. By default, there are 10 heavy worker processes for the DataIQ host, and 10 more for each external worker node. In the Data Mover UI, processes usually refer to heavy worker processes.

Light workers typically manage tasks such as checking for connectivity and checking available space. In general, one light worker per node is enough to manage these types of tasks. Light worker processes do not tend to show up in the Data Mover UI.
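As a quick illustration of the defaults described above, the following sketch estimates how many worker processes a deployment starts with. It assumes the documented defaults (10 heavy workers and 1 light worker per node) and counts the DataIQ host as one node. The function name `default_worker_totals` is hypothetical and is not part of DataIQ.

```python
# Defaults quoted in this guide: 10 heavy workers and 1 light worker per node.
DEFAULT_HEAVY_WORKERS = 10
DEFAULT_LIGHT_WORKERS = 1


def default_worker_totals(external_nodes: int) -> dict:
    """Estimate default worker process counts for one deployment.

    The DataIQ host counts as one node; each external worker node adds one.
    (Hypothetical helper for illustration only.)
    """
    if external_nodes < 0:
        raise ValueError("external_nodes must be non-negative")
    nodes = 1 + external_nodes
    return {
        "heavy": DEFAULT_HEAVY_WORKERS * nodes,
        "light": DEFAULT_LIGHT_WORKERS * nodes,
    }


# A DataIQ host plus two external worker nodes.
print(default_worker_totals(2))  # {'heavy': 30, 'light': 3}
```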

When starting a job, users may be able to set the maximum number of heavy worker processes in the Transfer dialog. (The force_max_workers_default setting controls this permission for non-admin users.)

After a job is started, admins can change the maximum number of heavy worker processes on any job by using the View Transfers window. Users may also update this value when allow_throttling is set to True.

Tune performance for external worker nodes

CAUTION Work with Dell Support when tuning process settings. Misconfigured settings can cause DataIQ to stop working. These settings do not appear in the sample configuration file, but they can be added manually.

If your DataIQ external worker nodes are running on systems with more than the minimum capacity, you may be able to get better performance by increasing the number of heavy worker processes on those nodes.

Example:

...

"Global Configurations":
  "Label Data Mover Service":
    num_heavy_workers:
      value: 5

...

  "Label Data Mover External Worker Overrides":
    "external-worker-1:num_heavy_workers":
      value: 15
    "external-worker-2:num_heavy_workers":
      value: 15
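To make the override pattern in the example above concrete, the following sketch mirrors that configuration as a Python dictionary and resolves the heavy worker count for a given node. The lookup order is an assumption, not documented behavior: a per-node override wins, then the service-level value, then the documented default of 10. The function name `effective_heavy_workers` is hypothetical.

```python
GLOBAL_SECTION = "Global Configurations"
SERVICE_SECTION = "Label Data Mover Service"
OVERRIDE_SECTION = "Label Data Mover External Worker Overrides"

# The example configuration above, expressed as nested dictionaries.
config = {
    GLOBAL_SECTION: {
        SERVICE_SECTION: {"num_heavy_workers": {"value": 5}},
        OVERRIDE_SECTION: {
            "external-worker-1:num_heavy_workers": {"value": 15},
            "external-worker-2:num_heavy_workers": {"value": 15},
        },
    }
}


def effective_heavy_workers(cfg, node=None, product_default=10):
    """Resolve num_heavy_workers for a node (None means the DataIQ host).

    Assumed precedence (not confirmed by the guide):
    per-node override, then service-level value, then the default.
    """
    sections = cfg.get(GLOBAL_SECTION, {})
    if node is not None:
        key = f"{node}:num_heavy_workers"
        override = sections.get(OVERRIDE_SECTION, {}).get(key)
        if override is not None:
            return override["value"]
    service = sections.get(SERVICE_SECTION, {}).get("num_heavy_workers")
    return service["value"] if service is not None else product_default


print(effective_heavy_workers(config))                       # 5
print(effective_heavy_workers(config, "external-worker-1"))  # 15
```

With this layout, the DataIQ host runs fewer heavy workers than the default while the external worker nodes, which have spare capacity, run more.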

Increasing the number of heavy workers should make data transfers faster. In some cases, however, DataIQ performance may slow down. If that happens, try the following methods to restore performance on the DataIQ host:

  • If you are using external worker nodes, try decreasing the number of heavy workers created on the DataIQ host.
  • If you are using external worker nodes, try stopping file transfer services on the DataIQ host.
  • Decrease the number of heavy workers created, back toward the original levels.

In some cases, increasing the number of heavy workers may cause DataIQ to stop working altogether. If this happens:

  • Decrease the number of heavy workers in the DataIQ configuration file before restarting DataIQ.

    Edit the Data Mover configuration file on the DataIQ host at /opt/dataiq/maunakea/data/plugins/data-mover/.configs/ca.control. The changes to this file are read when DataIQ is restarted.

  • Restart DataIQ: dataiq init

Settings

From the web UI, select Settings > Data management configuration. In the Plugins section, select the vertical ellipsis icon next to Data Mover, and select Edit configuration. The Data Mover configuration file opens.

Modify the following values:

Setting name Description
max_taskable_jobs

This option sets the maximum number of transfer jobs to be active at a time.

After the maximum number of transfer jobs has been reached, any new transfer jobs wait in a queue.

The default value is 5.

num_heavy_workers

This option sets the number of heavy worker processes to be created on the DataIQ host or on an external worker node.

The default value is 10.

Sometimes you can gain better performance by adding heavy worker processes. It is recommended to work with Dell Support when tuning DataIQ heavy worker performance, because if these settings are misconfigured, they can cause DataIQ to stop working.

This setting is not present in the sample configuration file, but can be added manually.

num_light_workers

This option sets the number of light worker processes to be created on the DataIQ host or on an external worker node.

The default value is 1. This value is almost always sufficient.

This setting is not present in the sample configuration file, but can be added manually.

prioritize_oldest_job

This setting determines whether Data Mover dedicates the highest possible number of worker processes to the oldest active transfer job during its copy phase.

  • By default, or when set to False, Data Mover distributes workers among active transfer jobs more evenly, without favoring one active transfer job over another.

  • When set to True, Data Mover favors sending work to the oldest job first. Limiting factors include the job reaching its maximum number of read threads, write threads, or heavy workers. (See S3 endpoints: max_read_threads, S3 endpoints: max_write_threads, and max_workers_default.)

    NOTE When the oldest job is large, several active transfer jobs may sometimes end up stuck in the copy phase, waiting for the oldest active transfer job to complete.

task_split_size

Transfer jobs are often split into separate tasks and handed off to worker processes. When multiple files are transferred, Data Mover assigns groups of files to Data Mover worker processes to perform the transfer. The maximum number of bytes each worker transfers can be adjusted with the task_split_size option. If a single file is greater than the task_split_size, a single worker will transfer that file.

The default value is 2048MiB (= 2 GiB). This value is expected to work for most use cases. If you frequently transfer many small files, use a smaller value (for example, 50MiB) to allow more workers to transfer files and speed up the total transfer time.

The value has the form <integer>[MiB]. (The MiB suffix is optional; the value is in units of MiB even when no suffix is given.)
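The following sketch illustrates the value format and the effect of a smaller split size. It parses a value of the form <integer>[MiB] into bytes, then groups file sizes into tasks. The greedy grouping is an assumption made for illustration. The guide does not describe how Data Mover actually assigns files to tasks, and both function names are hypothetical.

```python
import re

MIB = 1024 * 1024


def parse_task_split_size(value) -> int:
    """Return the split size in bytes for a value of the form <integer>[MiB].

    The unit is MiB whether or not the suffix is present.
    """
    match = re.fullmatch(r"(\d+)(?:MiB)?", str(value).strip())
    if match is None:
        raise ValueError(f"expected <integer>[MiB], got {value!r}")
    return int(match.group(1)) * MIB


def group_files(sizes, split_bytes):
    """Greedily pack file sizes (in bytes) into tasks of at most split_bytes.

    A file larger than split_bytes ends up in a task of its own, so a
    single worker transfers it. (Illustrative assumption, not the
    product's actual algorithm.)
    """
    tasks, current, current_bytes = [], [], 0
    for size in sizes:
        # Close the current task if adding this file would exceed the limit.
        if current and current_bytes + size > split_bytes:
            tasks.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        tasks.append(current)
    return tasks


files = [30 * MIB] * 4  # four 30 MiB files
print(len(group_files(files, parse_task_split_size("2048MiB"))))  # 1
print(len(group_files(files, parse_task_split_size("50MiB"))))    # 4
```

Under these assumptions, the default split size places all four files in one task for one worker, while a 50MiB split size produces four tasks that separate workers can pick up.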

When you are finished editing settings, select Save.

Example

"Global Configurations":
  "Label Data Mover Service":
    prioritize_oldest_job:
      value: False
      default: False

    num_heavy_workers:
      value: 5

    max_taskable_jobs:
      value: 5

    task_split_size: 
      value: "2048MiB" 
      default: "2048MiB" 

  "Label Data Mover External Worker Overrides":
    "external-worker-1:num_heavy_workers":
      value: 15
    "external-worker-2:num_heavy_workers":
      value: 15
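The interaction between max_taskable_jobs and prioritize_oldest_job can be modeled with a small simulation. This is a simplified sketch, not Data Mover's scheduler: it assumes jobs are listed oldest first, that each job has a cap on heavy workers, and that "more evenly" means handing out one worker per job per round. The function names and job names are hypothetical.

```python
def split_active_and_queued(jobs, max_taskable_jobs=5):
    """Return (active, queued): jobs beyond the limit wait in a queue."""
    return jobs[:max_taskable_jobs], jobs[max_taskable_jobs:]


def allocate_workers(job_caps, total_workers, prioritize_oldest=False):
    """Distribute heavy workers among active jobs.

    job_caps is a list of (job_name, max_workers), ordered oldest first.
    """
    allocation = {name: 0 for name, _ in job_caps}
    remaining = total_workers
    if prioritize_oldest:
        # Fill the oldest job up to its cap before moving to the next one.
        for name, cap in job_caps:
            take = min(cap, remaining)
            allocation[name] = take
            remaining -= take
        return allocation
    # Even distribution: one worker per job per round until none remain
    # or every job has reached its cap.
    progressed = True
    while remaining > 0 and progressed:
        progressed = False
        for name, cap in job_caps:
            if remaining > 0 and allocation[name] < cap:
                allocation[name] += 1
                remaining -= 1
                progressed = True
    return allocation


jobs = [("job-a", 8), ("job-b", 8), ("job-c", 8)]
active, queued = split_active_and_queued(jobs, max_taskable_jobs=5)

print(allocate_workers(active, 10, prioritize_oldest=False))
# {'job-a': 4, 'job-b': 3, 'job-c': 3}
print(allocate_workers(active, 10, prioritize_oldest=True))
# {'job-a': 8, 'job-b': 2, 'job-c': 0}
```

In the second result, job-c receives no workers until older jobs release theirs, which mirrors the NOTE about active jobs waiting in the copy phase behind a large, older job.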
