Symptoms
Dell engineers have observed an infrequent issue in which the Dell PM1725a Express Flash NVMe PCIe SSD may go offline during system operation and remain inaccessible. The drive may become accessible again after a reboot.
Errors such as "nvme_remove_namespaces" and "nvme0n1: detected capacity change from xxxxxxxxx to 0", along with various file system errors, may be seen in /var/log/messages or other system event logs. After such an event, continued use of the SSD requires re-creating the file system. This issue has been resolved with a firmware fix.
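To check a log for the two messages quoted above, a short script along these lines may help. It is a minimal sketch, not a Dell tool: the function name `find_offline_events` is hypothetical, and the regular expressions assume that the device name has the form `nvmeXnY` and that the capacity is printed as a plain number, which may not match every kernel version.

```python
import re

# Signatures taken from the messages quoted in this article.
# Assumption: device names look like nvme0n1 and the capacity is numeric.
SIGNATURES = [
    re.compile(r"nvme_remove_namespaces"),
    re.compile(r"nvme\d+n\d+: detected capacity change from \d+ to 0"),
]


def find_offline_events(lines):
    """Return (line_number, text) pairs for lines matching a known signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(sig.search(line) for sig in SIGNATURES):
            hits.append((number, line.rstrip("\n")))
    return hits
```

The function accepts any iterable of lines, so it can be given an open file object for /var/log/messages or a list of lines exported from another event log. A match only indicates that the symptom may have occurred; it does not confirm the root cause.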
Cause
The issue is caused by a fault in cache management. When the write cache buffer is full, the controller erroneously drops an incoming WRITE operation from the host. The firmware fix instead services the incoming WRITE operation after the contents of the cache have been flushed.
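The difference between the faulty and the corrected behavior can be illustrated with a toy model. This is only a sketch of the idea described above and not the controller firmware: the `WriteCache` class, its `flush_before_accept` flag, and the `run` helper are all hypothetical names introduced for illustration.

```python
from collections import deque


class WriteCache:
    """Toy bounded write cache; an illustration, not the actual firmware."""

    def __init__(self, capacity, flush_before_accept):
        self.capacity = capacity
        # False models the faulty behavior, True models the corrected one.
        self.flush_before_accept = flush_before_accept
        self.buffer = deque()
        self.flash = []    # blocks persisted to the media
        self.dropped = []  # blocks lost on the faulty path

    def flush(self):
        """Move everything in the cache to the media."""
        self.flash.extend(self.buffer)
        self.buffer.clear()

    def write(self, block):
        """Accept a host write; return False if the write was dropped."""
        if len(self.buffer) >= self.capacity:
            if not self.flush_before_accept:
                self.dropped.append(block)  # faulty path: write is lost
                return False
            self.flush()  # corrected path: make room first
        self.buffer.append(block)
        return True


def run(flush_before_accept):
    """Send three writes to a two-entry cache and report what was persisted."""
    cache = WriteCache(capacity=2, flush_before_accept=flush_before_accept)
    for block in ["a", "b", "c"]:
        cache.write(block)
    cache.flush()
    return cache.flash, cache.dropped
```

In this model, `run(False)` persists only the first two blocks and records the third as dropped, while `run(True)` persists all three.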
Resolution
The issue has been resolved as of April 2019 by the Dell Express Flash NVMe PCIe SSD PM1725a firmware release, Version 1.1.2, A03.
NOTE: As all versions of PM1725a firmware prior to 1.1.2 are susceptible to this issue, firmware other than 1.1.2 has been removed from the Dell support website and is no longer available for download.
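When auditing several drives, the rule in the note (every version before 1.1.2 is susceptible) can be expressed as a simple comparison. The following is a minimal sketch that assumes the installed firmware version is available as a dotted numeric string such as "1.1.2". How that string is obtained, and whether your inventory tool reports it in this format, is outside the scope of this sketch. The names `parse_version` and `is_susceptible` are hypothetical.

```python
# First firmware version that contains the fix, per this article.
FIXED_VERSION = (1, 1, 2)


def parse_version(text):
    """Convert a dotted version string such as '1.1.2' to a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_susceptible(installed):
    """Return True if the installed version is earlier than the fixed one."""
    return parse_version(installed) < FIXED_VERSION
```

Comparing tuples of integers avoids the pitfalls of comparing version strings character by character, where "1.1.10" would sort before "1.1.2".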
Affected Products
Storage Spaces Direct R640 Ready Node, Storage Spaces Direct R740xd Ready Node, PowerEdge R640, PowerEdge R740, PowerEdge R740XD, PowerEdge R740XD2, PowerEdge R830, PowerEdge R840