Open-source AI projects have democratized access to cutting-edge AI technologies, fostering collaboration and driving rapid progress in the field. Meta's release of the Llama 3 models marks a significant advancement in the open-source AI foundation model space. Llama 3 is an accessible, open large language model (LLM) designed for developers, researchers and businesses to build, experiment and responsibly scale their generative AI ideas. These latest-generation LLMs build upon the success of the Meta Llama 2 models, offering improvements in performance, accuracy and capabilities.
Key advancements in Llama 3 include enhancements in post-training procedures, aimed at improving capabilities such as reasoning, code generation and instruction following. Additionally, improvements in the model architecture, such as an increased vocabulary size and a greatly improved tokenizer, enable more efficient language encoding. The input context size has also been increased from 4K to 8K tokens, benefiting use cases with long input contexts, such as retrieval-augmented generation (RAG).
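For RAG workloads, the practical benefit of the larger window is that more retrieved passages fit into a single prompt. As a hypothetical illustration (the chunk-packing helper and token counts below are not part of Llama 3 itself), a retrieval pipeline might greedily pack ranked chunks until the 8K budget is exhausted:

```python
# Hypothetical sketch: greedily pack retrieved chunks into an 8K-token
# context window, reserving room for the question and the model's answer.
# Token counts here are illustrative; a real pipeline would measure
# lengths with the model's own tokenizer.

def pack_chunks(chunks, context_size=8192, reserved=1024):
    """Select top-ranked chunks that fit within the token budget.

    chunks: list of (text, token_count) pairs, assumed ranked by relevance.
    """
    budget = context_size - reserved
    selected, used = [], 0
    for text, token_count in chunks:
        if used + token_count <= budget:
            selected.append(text)
            used += token_count
    return selected, used

chunks = [("chunk A", 3000), ("chunk B", 2500),
          ("chunk C", 2000), ("chunk D", 1500)]
selected, used = pack_chunks(chunks)
print(selected, used)
```

With the older 4K window, only one of these chunks would fit after the reservation; the 8K window accommodates three, which is the kind of headroom that makes RAG pipelines noticeably more effective.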
Use Cases
Currently, four variants of Llama 3 models are available: 8B and 70B parameter models, each in pre-trained and instruction-tuned versions. Enterprises can leverage the open distribution and commercially permissive license of Llama models to deploy these models on-premises for a wide range of use cases, including chatbots, customer assistance, code generation and document creation.
Dell PowerEdge and Meta Llama models: A Powerhouse Solution for Generative AI
At Dell Technologies, we are very excited about our continued collaboration with Meta and the advancements in the open-source model ecosystem. Dell is committed to making it easier for customers to deploy LLMs on-premises through Dell Validated Designs for AI, providing the robust infrastructure required to deploy and use these large language models. We provide optimal end-to-end integrated infrastructure solutions to fine-tune and deploy these models within our customers' own IT environments, without sending sensitive data to the cloud or resorting to costly proprietary models and closed ecosystems.
Dell’s engineers have been actively working with Meta to deploy the Llama 3 models on Dell’s compute platforms, including the PowerEdge XE9680, XE8640 and R760XA, leveraging a mix of GPU models. Since Llama 3 models are based on a standard decoder-only transformer architecture, they can be seamlessly integrated into customers’ existing software infrastructure, including inference frameworks such as TGI, vLLM or TensorRT-LLM.
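Inference frameworks such as TGI and vLLM apply the model's chat template automatically, but when integrating at a lower level it helps to know what the instruction-tuned Llama 3 models expect. Below is a minimal sketch of the header-based chat format used by the Llama 3 instruct models (special-token strings per Meta's published prompt format; verify against the tokenizer configuration shipped with your deployment):

```python
# Sketch of the Llama 3 instruct chat format: each message is wrapped in
# role headers, and the prompt ends with an open assistant header so the
# model generates the reply. Verify against the tokenizer's chat template
# in your own deployment before relying on it.

def format_llama3_chat(messages):
    """messages: list of {"role": ..., "content": ...} dicts."""
    prompt = "<|begin_of_text|>"
    for msg in messages:
        prompt += (f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
                   f"{msg['content']}<|eot_id|>")
    # The open assistant header cues the model to respond; the model
    # emits <|eot_id|> when its turn is complete.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

prompt = format_llama3_chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this document."},
])
print(prompt)
```

In practice, frameworks expose this via their own chat APIs (for example, an OpenAI-compatible endpoint in vLLM), so this formatting is only needed when driving the raw completion interface directly.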
In the coming weeks, Dell will provide test results, performance data and deployment recipes showcasing how easy it is to deploy Llama 3 models on Dell infrastructure and how the performance compares to Llama 2 models. This ongoing collaboration between Dell and Meta underscores the commitment to advancing the open-source AI ecosystem through community-driven innovation and empowering enterprises to harness the power of AI within their own IT environments.
Get Started with the Dell Accelerator Workshop for Generative AI
Dell Technologies offers guidance on AI target use cases, data management requirements, operational skills and processes. Our services experts work with your team to share our point of view on AI and help your team define the key opportunities, challenges and priorities.
Contact your Dell sales rep for a free half-day facilitated workshop.
Learn more about Dell AI Solutions here.