Unsolved
NVIDIA GPU is showing as unavailable in the BIOS of Dell R640 servers
Hello Team,
We are using Dell R640 servers for a bare-metal EKS Anywhere cluster deployment, and each server has an NVIDIA Tesla T4 GPU.
Just before the EKS-A cluster deployment, all 7 servers had the NVIDIA Tesla T4 installed and it showed as active in the iDRAC console. After the EKS-A cluster deployment, all 7 nodes show the NVIDIA Tesla T4 as unavailable. We did not change any BIOS settings that could have caused this.
We need your help identifying the issue, as the GPU Operator deployment on the cluster is stuck because of this.
Version: v0.19.6
Client Version: v1.30.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.13-eks-4b9e40a
DELL-Chris H
Moderator
July 22nd, 2024 14:29
Pkumar5,
I am not very familiar with EKS Anywhere, but from the page here it looks like it is only certified for use on the VxRail and PowerStore lines. As for the Tesla T4, if you let me know the part number on the GPU itself, I can tell you whether it is compatible.
Let me know what you see.
pkumar5
1 Rookie
July 23rd, 2024 08:23
Thank you, Chris, for the update on this.
We are using the R640 for a POC cluster.
The part number of the GPU is 1EB8-895-A1.
The EKS Anywhere version details are below:
Version: v0.19.6
Client Version: v1.30.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.13-eks-4b9e40a
We need to know which driver is suitable for the Ubuntu 22.04 image used with the EKS Anywhere cluster.
Looking forward to a resolution on this.
Thank you
Pankaj Kumar
Dell-Martin S
Moderator
July 23rd, 2024 12:03
Hi,
This is not a DPN (Dell part number). Please check whether the part came from Dell and whether the number is correct.
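While the part number is being confirmed, it may help to check whether the operating system on each node still enumerates the GPU on the PCI bus, independently of what iDRAC reports. The sketch below assumes `lspci` (from pciutils) is available on the Ubuntu nodes and relies on NVIDIA's PCI vendor ID `10de`. The helper name `find_nvidia_devices` is made up for illustration and is not part of any Dell, NVIDIA, or EKS Anywhere tooling.

```shell
#!/bin/sh
# Hypothetical helper: keep only the lines of `lspci -nn` output that
# carry NVIDIA's PCI vendor ID (10de) in the [vendor:device] field.
find_nvidia_devices() {
  grep -i '\[10de:[0-9a-f]\{4\}]'
}

# Usage on a node (assumes pciutils is installed):
#   lspci -nn | find_nvidia_devices
# No output would suggest the card is not visible on the PCI bus at all,
# which points at hardware/BIOS rather than the driver or GPU Operator.
```

A Tesla T4 is normally listed with device ID `1eb8`, which matches the first field of the number quoted earlier in the thread. That suggests, but does not confirm, that 1EB8-895-A1 is an NVIDIA-side identifier rather than a Dell part number.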