Organizations are increasingly taking a human-centered approach to adopting Generative AI models, enabling their teams to innovate and increase productivity. Use cases such as chatbots, virtual assistants and code development require enterprises to train models on their own proprietary data. Public cloud solutions can create uncertainty around data sovereignty, security and unpredictable costs. On-premises deployments using open-source large language models such as Llama 2 offer customers better value over time with predictable costs and complete control over their data, reducing the risk of security breaches and IP leakage and helping ensure compliance with regulations.
Dell Generative AI Solutions, with the Dell Validated Design for Generative AI, offer customers a ready-made foundation of pre-tested software and hardware, purpose-built for GenAI efforts. And we’re now expanding our portfolio of GenAI offerings in collaboration with Meta by making it easy for Dell customers to deploy Meta’s Llama 2 AI models on-premises with Dell’s GenAI portfolio of IT infrastructure.
The Dell Validated Design for Generative AI with Meta's Llama 2 provides pre-tested and proven Dell infrastructure, software and services to streamline deployment and management of on-premises projects. Llama 2 is free for research and commercial use and is tested and verified on the Dell Validated Design for inferencing and fine-tuning. With fully documented deployment and configuration guidance, organizations can get their GenAI infrastructure up and running quickly and operate Llama 2 with more predictability. Customers can deploy Llama 2 for fine-tuning in an on-premises data center and analyze the memory utilization, latency and efficiency of a large language model (LLM) running on a Dell platform. Our goal is to be the preferred on-premises infrastructure provider for customers deploying Llama 2 and to bring best-of-breed generative AI solutions to our customers.
Deploying Llama 2 on Dell PowerEdge Servers
By running models on-premises, enterprises can guarantee uninterrupted functionality and ensure intellectual property protection. The Dell PowerEdge XE9680 is the ideal server for fine-tuning and deploying an LLM such as Llama 2. It is the industry's first server to ship with eight NVIDIA H100 GPUs and NVIDIA AI software. Llama 2 can also be deployed on a variety of Dell infrastructure, including the PowerEdge R760xa with NVIDIA L40S.
An exciting area of research in the generative AI domain is model customization. Llama 2 makes it easier to fine-tune models by publishing recipes that can easily be applied to Dell Generative AI Solutions. Dell has carried out significant technical investigation and validation to demonstrate the application of state-of-the-art customization techniques such as supervised fine-tuning (SFT), LoRA and p-tuning to the Llama 2 7B, 13B and 70B models. We deployed the Llama 2 70B model with the Ubuntu 22.04 operating system and the required NVIDIA GPUs on PowerEdge XE9680.
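To illustrate why parameter-efficient techniques such as LoRA make fine-tuning tractable on a single server, the sketch below estimates the parameter count of a LoRA adapter. It assumes the publicly documented Llama 2 7B architecture (32 decoder layers, hidden size 4096) and a common recipe of adapting only the attention query and value projections at rank 8; the figures are illustrative, not Dell's validated configuration.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter.

    LoRA replaces a frozen weight update with two low-rank factors:
    A is (d_in x rank) and B is (rank x d_out), so the adapter adds
    rank * (d_in + d_out) trainable parameters.
    """
    return rank * (d_in + d_out)

# Illustrative figures for Llama 2 7B: 32 decoder layers, hidden size 4096.
HIDDEN, LAYERS, RANK = 4096, 32, 8

# Adapting the q_proj and v_proj attention matrices (2 adapters per layer).
per_layer = 2 * lora_trainable_params(HIDDEN, HIDDEN, RANK)
total = LAYERS * per_layer

print(f"Trainable LoRA parameters: {total:,}")  # 4,194,304
print(f"Fraction of 7B parameters: {total / 7e9:.4%}")
```

At rank 8 this trains roughly 4.2 million parameters, a tiny fraction of the 7 billion frozen base weights, which is what allows fine-tuning to fit in far less GPU memory than full SFT.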
The table shows chat outputs for three Llama models:

| Model | Input prompt: What is the capital of Texas? |
| --- | --- |
| llama-2-7b-chat | Assistant: Thank you for asking! The capital of Texas is Austin. I'm glad to help! However, I must point out that the question you asked is not factually coherent as Texas does not have a capital. Texas is a state located in the southern United States and does not have a capital city. I hope this clears up any confusion. Is there anything else I can help you with? |
| llama-2-13b-chat | Assistant: Hello! I'm happy to help! The capital of Texas is Austin. It's a great city with a rich history and a vibrant cultural scene. If you have any other questions, please feel free to ask! |
| llama-2-70b-chat | Assistant: The capital of Texas is Austin. |
As the table shows, the Llama 2 70B model, running with the required GPUs on PowerEdge XE9680, generates the most concise and precise answer.
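Outputs like those in the table come from prompting the chat-tuned Llama 2 variants with Meta's instruction template. A minimal sketch of assembling such a prompt is shown below; it assumes the single-turn `[INST]` / `<<SYS>>` format published in Meta's llama reference repository (the tokenizer adds the leading `<s>` token when the text is encoded).

```python
def build_llama2_chat_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama 2 chat prompt.

    Follows the [INST] / <<SYS>> template from Meta's llama
    reference repository: a system block wrapped in <<SYS>> tags,
    then the user message, all enclosed in [INST] ... [/INST].
    """
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = build_llama2_chat_prompt(
    "You are a helpful assistant.",
    "What is the capital of Texas?",
)
print(prompt)
```

The resulting string is what gets tokenized and sent to the model; the text after `[/INST]` in the model's generation is the assistant reply shown in the table.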
More information, including process steps, reference infrastructure designs and validation results, is available in the Dell Validated Design for Generative AI design guide.
With Meta’s Llama 2 and the breadth of Dell’s Generative AI Solutions, technology and services portfolio, organizations of all sizes have access to reliable tools to deliver GenAI solutions from desktops to core data centers, edge locations and public clouds. Dell has also integrated Llama 2 models into its internal sizing tools to help guide customers to the right solution to power their Llama 2-based AI solutions.
Get Started with the Dell Accelerator Workshop for Generative AI
Dell Technologies offers guidance on GenAI target use cases, data management requirements, operational skills and processes. Our services experts work with your team to share our point of view on GenAI and help your team define the key opportunities, challenges and priorities. Contact your Dell sales rep for a free half-day facilitated workshop.
Learn more about Dell Generative AI Solutions here and check out the following resources about Llama 2 deployments on Dell Technologies infrastructure: