Symptoms
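The Data Domain shows no expired snapshots. This can be verified as follows:
- On the Avamar utility node, obtain the Avamar MTree name by running the avmaint hfscreate command and prefixing the returned string with /data/col1/avamar-, as follows:
Example: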
avmaint hfscreate
1501099628
Resulting MTree name: /data/col1/avamar-1501099628
- On the Data Domain, obtain the list of snapshots associated with the Avamar MTree:
snapshot list mtree /data/col1/<avamar-mtree-name>
Example:
snapshot list mtree /data/col1/avamar-1501099628
Snapshot Information for MTree: /data/col1/avamar-1501099628
----------------------------------------------
Name                Pre-Comp (GiB)   Create Date         Retain Until   Status
-----------------   --------------   -----------------   ------------   ------
cp.20170802130330         501241.1   Aug 2 2017 09:04
cp.20170802131127         501355.0   Aug 2 2017 09:11
cp.20170803120133         503440.7   Aug 3 2017 08:02
cp.20170803120726         503554.7   Aug 3 2017 08:07
cp.20170804120142         496207.0   Aug 4 2017 08:02
cp.20170804120836         496321.0   Aug 4 2017 08:09
cp.20170805130259         523295.5   Aug 5 2017 09:03
cp.20170805130955         523409.5   Aug 5 2017 09:10
cp.20170806130127         541524.5   Aug 6 2017 09:01
cp.20170806130719         541638.5   Aug 6 2017 09:07
cp.20170807130120         438037.9   Aug 7 2017 09:01
cp.20170807130712         438151.9   Aug 7 2017 09:07
-----------------   --------------   -----------------   ------------   ------
Snapshot Summary
-------------------
Total: 12
Not expired: 12
Expired: 0
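As a rough illustration of the check above, the following Python sketch builds the MTree path from the hfscreate value and tallies pasted `snapshot list` output. The function names are hypothetical, and grouping by the first eight digits of the snapshot name assumes those digits encode the creation date, which matches the example output but is not stated in this article.

```python
import re
from collections import Counter

# Example output copied from this article (columns collapsed to single spaces).
SNAPSHOT_OUTPUT = """\
cp.20170802130330 501241.1 Aug 2 2017 09:04
cp.20170802131127 501355.0 Aug 2 2017 09:11
cp.20170803120133 503440.7 Aug 3 2017 08:02
cp.20170803120726 503554.7 Aug 3 2017 08:07
cp.20170804120142 496207.0 Aug 4 2017 08:02
cp.20170804120836 496321.0 Aug 4 2017 08:09
cp.20170805130259 523295.5 Aug 5 2017 09:03
cp.20170805130955 523409.5 Aug 5 2017 09:10
cp.20170806130127 541524.5 Aug 6 2017 09:01
cp.20170806130719 541638.5 Aug 6 2017 09:07
cp.20170807130120 438037.9 Aug 7 2017 09:01
cp.20170807130712 438151.9 Aug 7 2017 09:07
Snapshot Summary
-------------------
Total: 12
Not expired: 12
Expired: 0
"""


def mtree_path(hfscreate: str) -> str:
    """Prefix the hfscreate value with /data/col1/avamar- (hypothetical helper)."""
    return f"/data/col1/avamar-{hfscreate.strip()}"


def summarize_snapshots(output: str) -> dict:
    """Collect snapshot names, the summary counters, and a per-day tally."""
    names = re.findall(r"^(cp\.\d{14})\b", output, flags=re.MULTILINE)
    summary = {}
    for key in ("Total", "Not expired", "Expired"):
        match = re.search(rf"^{key}:\s*(\d+)\s*$", output, flags=re.MULTILINE)
        if match:
            summary[key] = int(match.group(1))
    # Assumption: characters 3..10 of "cp.YYYYMMDDhhmmss" are the date.
    per_day = Counter(name[3:11] for name in names)
    return {"names": names, "summary": summary, "per_day": dict(per_day)}


if __name__ == "__main__":
    result = summarize_snapshots(SNAPSHOT_OUTPUT)
    print(mtree_path("1501099628"))
    print(result["summary"])
    print(result["per_day"])
```

A summary with `Expired` at 0 alongside a steady two snapshots per day is the pattern described in this article.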
The following events can be seen in the Avamar DDR maintenance logs (/usr/local/avamar/var/ddrmaintlogs/ddrmaint.log*):
Aug 7 09:07:45 avamar ddrmaint.bin[122469]: Error: cp-delete::expire_checkpoint_snapshot - Failed to expire checkpoint: cp.20170806130719, ddr: dd.emc.com, ddr-index: 1, DDR result code: 5075, desc: the user has insufficient access rights
Aug 7 09:07:45 avamar ddrmaint.bin[122469]: Error: <4740>Datadomain checkpoint delete operation failed.
...
Aug 6 09:07:18 avamar ddrmaint.bin[176316]: Info: Data Domain configured in Stand-Alone mode.
Aug 6 09:07:18 avamar ddrmaint.bin[176316]: Info: cp-delete::execute_delete_cp - Deleting DDR Checkpoint for dpnid:<dpnid> on ddr:dd.emc.com cp: cp.20170805130259
Aug 6 09:07:18 avamar ddrmaint.bin[176316]: Info: Setting default storage unit to 'avamar-1501099628' for handle 1
Aug 6 09:07:18 avamar ddrmaint.bin[176316]: Warning: Calling DDR_EXPIRE_SNAPSHOT returned result code:5075 message:the user has insufficient access rights
Aug 6 09:07:18 avamar ddrmaint.bin[176316]: Error: cp-delete::expire_checkpoint_snapshot - Failed to expire checkpoint: cp.20170805130259, ddr: dd.emc.com, ddr-index: 1, DDR result code: 5075, desc: the user has insufficient access rights
Aug 6 09:07:18 avamar ddrmaint.bin[176316]: Error: <4740>Datadomain checkpoint delete operation failed.
Aug 6 09:07:18 avamar ddrmaint.bin[176316]: Info: ============================= cp-delete finished in 1 seconds
Aug 6 09:07:18 avamar ddrmaint.bin[176316]: Info: ============================= cp-delete cmd finished =============================
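To pick these failures out of a larger log, a minimal Python sketch such as the following could be used. It assumes the "Failed to expire checkpoint" lines keep the exact layout shown in the excerpt above; the function name and regular expression are illustrative and not part of any Avamar tool.

```python
import re

# Matches the "Failed to expire checkpoint" lines shown in the excerpt above.
FAILED_EXPIRE = re.compile(
    r"Failed to expire checkpoint: (?P<cp>cp\.\d+), ddr: (?P<ddr>[^,]+), "
    r"ddr-index: \d+, DDR result code: (?P<code>\d+), desc: (?P<desc>.*)$"
)

# One line copied from the excerpt in this article.
SAMPLE_LINE = (
    "Aug 7 09:07:45 avamar ddrmaint.bin[122469]: Error: "
    "cp-delete::expire_checkpoint_snapshot - Failed to expire checkpoint: "
    "cp.20170806130719, ddr: dd.emc.com, ddr-index: 1, DDR result code: 5075, "
    "desc: the user has insufficient access rights"
)


def failed_expirations(lines):
    """Return (checkpoint, ddr, result_code, description) for each failure line."""
    failures = []
    for line in lines:
        match = FAILED_EXPIRE.search(line.rstrip("\n"))
        if match:
            failures.append(
                (match["cp"], match["ddr"], int(match["code"]), match["desc"].strip())
            )
    return failures


if __name__ == "__main__":
    for failure in failed_expirations([SAMPLE_LINE]):
        print(failure)
```

In practice the `lines` argument could be an open file handle on one of the ddrmaint.log files, since iterating a file yields lines.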
The following events can be seen in the ddfs.info file (/ddr/var/log/debug/ddfs.info*) on the Data Domain:
08/07 08:01:06.322 (tid 0x7fe7037cd930): ddboost-<avamar.emc.com-37933>: ddboost_api ERROR: ddp_snapshot_expire() failed for SUName avamar-1501099628, snapshot: cp.20170802130330, retention: -1, flags: 0 Err: 5075-Update retention of snapshot [cp.20170802130330] on Storage Unit [avamar-1501099628(nfs: Operation not permitted)
10/11 11:19:55.468 (tid 0x7f27454a63f0): ddboost-<avamar.emc.com-60566>: test-avamar.dell.emc.com Local Time: Wed Oct 11 11:19:55 2017
10/11 11:19:55.471 (tid 0x7f2cd150e830): OST_FH_PERM FAIL on storage-unit=avamar-1501099628 op=NFSPROC3_DDP_LOOKUP[27] client=test-avamar.dell.emc.com uid=500:uid or gid does not match
09/26 09:17:46.762 (tid 0x7f60d7a08410): ddboost-<avamar.emc.com-37933>: ddboost_api ERROR: ddp_snapshot_list() failed, Err: 5009-Get snapshot list on Storage Unit [avamar-1501099628] failed (nfs: I/O error)
09/26 09:18:46.472 (tid 0x7f7e1ab92d50): ddboost-<avamar.emc.com-37933>: ddboost_api ERROR: ddp_snapshot_list() failed, Err: 5009-Get snapshot list on Storage Unit
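The UID reported in the OST_FH_PERM FAIL line can later be compared with the UID of the DD Boost user. The Python sketch below extracts that UID and the DD Boost error codes from lines like the ones above. The patterns are assumptions based only on this excerpt, and the function name is hypothetical.

```python
import re

PERM_FAIL = re.compile(
    r"OST_FH_PERM FAIL on storage-unit=(?P<storage_unit>\S+) "
    r"op=(?P<op>\S+) client=(?P<client>\S+) uid=(?P<uid>\d+):"
)
BOOST_ERR = re.compile(
    r"ddboost_api ERROR: (?P<call>\w+)\(\) failed.*?Err: (?P<code>\d+)-"
)

# Two lines copied from the excerpt in this article.
SAMPLE_LINES = [
    "10/11 11:19:55.471 (tid 0x7f2cd150e830): OST_FH_PERM FAIL on "
    "storage-unit=avamar-1501099628 op=NFSPROC3_DDP_LOOKUP[27] "
    "client=test-avamar.dell.emc.com uid=500:uid or gid does not match",
    "09/26 09:17:46.762 (tid 0x7f60d7a08410): ddboost-<avamar.emc.com-37933>: "
    "ddboost_api ERROR: ddp_snapshot_list() failed, Err: 5009-Get snapshot list "
    "on Storage Unit [avamar-1501099628] failed (nfs: I/O error)",
]


def scan_ddfs_info(lines):
    """Return (permission_failures, api_errors) found in ddfs.info lines."""
    perm_failures, api_errors = [], []
    for line in lines:
        match = PERM_FAIL.search(line)
        if match:
            perm_failures.append(
                (match["storage_unit"], match["client"], int(match["uid"]))
            )
        match = BOOST_ERR.search(line)
        if match:
            api_errors.append((match["call"], int(match["code"])))
    return perm_failures, api_errors


if __name__ == "__main__":
    print(scan_ddfs_info(SAMPLE_LINES))
```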
Cause
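The DD Boost user for the affected storage unit was created in an earlier version of DDOS with the operating system group users rather than admin. This happens when Avamar first connects to the Data Domain and creates the storage unit, which sets permissions similar to the following: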
drwxr-xr-x 7 <ddboostuser> users 430 Apr 26 2016 <avamar-mtree-name>
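Starting with DDOS 6.1.x, certain storage unit operations (such as deleting a storage unit or expiring snapshots) require the owner of the storage unit to be in the admin group rather than the users group.
If this is not the case, these operations fail, and every day two new Data Domain snapshots corresponding to Avamar checkpoints are left behind and never expire.
This can also be the case if the DD Boost user associated with the storage unit was changed to the admin role afterwards.
Note: So far, this issue affects only Avamar, because other backup applications do not perform operations that check user groups and permissions.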
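The ownership condition described in this section can be sketched in Python as follows. This assumes a directory listing line in the `ls -l` layout shown above. The helper names are hypothetical, and the concrete user and MTree names in the example are taken from elsewhere in this article to fill the placeholders.

```python
def owner_and_group(ls_line: str) -> tuple:
    """Return (owner, group) from an `ls -l` style line."""
    fields = ls_line.split()
    if len(fields) < 4:
        raise ValueError(f"not an ls -l style line: {ls_line!r}")
    # Fields: mode, link count, owner, group, size, date..., name
    return fields[2], fields[3]


def blocks_snapshot_expiry(ls_line: str) -> bool:
    """True when the storage unit group is not admin, the condition this
    article describes for DDOS 6.1.x and later."""
    _, group = owner_and_group(ls_line)
    return group != "admin"


# Placeholder values replaced with names used elsewhere in this article.
EXAMPLE = "drwxr-xr-x 7 ddboost-avamar users 430 Apr 26 2016 avamar-1501099628"

if __name__ == "__main__":
    print(owner_and_group(EXAMPLE))
    print(blocks_snapshot_expiry(EXAMPLE))
```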
Resolution
Check the registry settings by performing the following on the Data Domain:
- Retrieve the Avamar DD Boost user:
ddboost storage-unit show
Name                Pre-Comp (GiB)   Status   User              Report Physical
                                                                Size (MiB)
-----------------   --------------   ------   ---------------   ---------------
avamar-1501099628       10808220.0   RW       ddboost-avamar    -
d025                      457051.7   RW       ddboost-avamar    -
rman_dd                   240902.7   RW       ddboost-rman      -
mssql                     142474.8   RW       ddboost-avamar    -
-----------------   --------------   ------   ---------------   ---------------
In this output, the Avamar DD Boost user is ddboost-avamar.
- Retrieve and note the user ID (UID) of the DD Boost user:
user show list
Example:
User list from node "localhost".
Name             Uid   Role    Last Login From   Last Login Time            Status    Disable Date
--------------   ---   -----   ---------------   ------------------------   -------   ------------
sysadmin         100   admin   10.10.40.59       Tue Oct 10 14:44:52 2017   enabled   never
ddboost-avamar   500   admin   10.10.40.55       Wed Oct 11 11:07:49 2017   enabled   never
--------------   ---   -----   ---------------   ------------------------   -------   ------------
2 users found.
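In this instance, the UID is 500 (this value may differ on other systems).
- Run the following command to check the registry settings:
reg show protocol.ost
The group ID (GID) of the admin group is 50, and the GID of the users group is 100 (not to be confused with the UID of sysadmin, which is also 100).
Example of incorrect registry settings (GID of 100):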
reg show protocol.ost
...
protocol.ost.stu_user.<avamar-mtree-name>= 500:100
protocol.ost.uid500:100 = <ddboostusername>
protocol.ost.user.<ddboostusername> = 500:100
...
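If the settings are incorrect (that is, the GID is set to 100), create a service request and reference this knowledge base article.
The issue can then be reviewed, verified, and resolved.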
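The registry check above can be sketched in Python as follows. This is a minimal, read-only illustration that assumes the `reg show protocol.ost` output has been saved as text in the `key = uid:gid` or `key uid:gid = value` shapes shown in the example. The function name is hypothetical, the user and MTree names substitute the placeholders, and the ddboost-rman entry is invented purely to show a non-matching line.

```python
import re

ADMIN_GID = 50   # per this article
USERS_GID = 100  # per this article

UID_GID = re.compile(r"(\d+):(\d+)")

# Shape follows the example in this article; the last entry is hypothetical.
SAMPLE_REG_OUTPUT = """\
...
protocol.ost.stu_user.avamar-1501099628= 500:100
protocol.ost.uid500:100 = ddboost-avamar
protocol.ost.user.ddboost-avamar = 500:100
protocol.ost.user.ddboost-rman = 501:50
...
"""


def entries_with_users_gid(reg_output: str):
    """Return (key, uid, gid) for protocol.ost entries whose GID is 100."""
    flagged = []
    for line in reg_output.splitlines():
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key.startswith("protocol.ost."):
            continue
        # The uid:gid pair sits in the key for uid entries, else in the value.
        pair = UID_GID.search(key) or UID_GID.search(value)
        if pair and int(pair.group(2)) == USERS_GID:
            flagged.append((key, int(pair.group(1)), int(pair.group(2))))
    return flagged


if __name__ == "__main__":
    for entry in entries_with_users_gid(SAMPLE_REG_OUTPUT):
        print(entry)
```

The sketch only reports entries and changes nothing. Any flagged entry is a reason to open a service request as described above.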
Affected Products
Avamar
Products
Avamar, Avamar Server, Data Domain