In deep learning applications, FPGA accelerators offer unique advantages for certain use cases.
In artificial intelligence applications, including machine learning and deep learning, speed is everything.
Whether you’re talking about autonomous driving, real-time stock trading or online searches, faster results equate to better results.
This need for speed has led to a growing debate on the best accelerators for use in AI applications. In many cases, this debate comes down to a question of server FPGAs vs. GPUs — or field programmable gate arrays vs. graphics processing units.
To see signs of this lively debate, you need to look no further than the headlines in the tech industry. A few examples that pop up in searches:
- “Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Learning?”
- “FPGA vs GPU for Machine Learning Applications: Which One Is Better?”
- “FPGAs Challenge GPUs as a Platform for Deep Learning”
So what is this lively debate all about? Let’s start at the beginning. Physically, FPGAs and GPUs often plug into a server PCIe slot. Some, like the NVIDIA® Volta Tesla V100 SXM2, are mounted onto the server motherboard. Note that GPUs and FPGAs do not function on their own without a server, and neither FPGAs nor GPUs replace a server’s CPU(s). They are accelerators, adding a boost to the server’s CPU engine. At the same time, CPUs continue to get more powerful and capable, with integrated graphics processing. So start the engines: the race is on between servers that have been chipped, turbocharged and supercharged.
FPGAs can be programmed after manufacturing, even after the hardware is already in the field — which is where the “field programmable” comes from in the field programmable gate array (FPGA) name. FPGAs are often deployed alongside general-purpose CPUs to accelerate throughput for targeted functions in compute- and data-intensive workloads. They allow developers to offload repetitive processing functions in workloads to rev up application performance.
GPUs are designed for the types of computations used to render lightning-fast graphics, which is where the “graphics” comes from in the graphics processing unit (GPU) name. The Mythbusters demo of GPU versus CPU is still one of my favorites. It’s fun that the drive for video game screen-to-controller responsiveness impacted the entire IT industry: accelerators have since been adopted for a wide range of other applications, from AutoCAD and virtual reality to cryptocurrency mining and scientific visualization.
FPGA and GPU makers continuously compare their products against CPUs, sometimes making it sound like accelerators can take the place of CPUs. The turbo kit still cannot replace the engine of the car, at least not yet. However, the makers want to make the case that the boost makes all the difference. They want to prove that the acceleration is really cool. And it is, depending on how fast you want or need your applications to go. And just like with cars, it comes at a price. After the acquisition cost, the price includes the amount of heat generated (accelerators run hotter) and the fuel required (they need more power). And sometimes applications aren’t programmed to take full advantage of the available acceleration (see the GPU applications catalog).
So which is better for AI workloads like deep learning inferencing? The answer is: It depends on the use case and the benefits you are targeting. The ample commentary on the topic finds cases where FPGAs have a clear edge and cases where GPUs are the best route forward.
Dell distinguished engineer Bhavesh Patel addresses some of these questions in a tech note exploring reasons to use FPGAs alongside CPUs in the inferencing systems used in deep learning applications. A bit of background: When a deep learning neural network has been trained to know what to look for in datasets, the inferencing system can make predictions based on new data. Inferencing is all around us in the online world. For example, inferencing is used in recommendation engines — you choose one product and the system suggests others that you’re likely to be interested in.
In his tech note, Bhavesh explains that FPGAs offer some distinct advantages when it comes to inferencing systems. These advantages include flexibility, latency and power efficiency. Let’s look at some of the points Bhavesh makes:
Flexibility for fine tuning
FPGAs provide flexibility for AI system architects looking for competitive deep learning accelerators that also support customization. The ability to tune the underlying hardware architecture and use software-defined processing allows FPGA-based platforms to deploy state-of-the-art deep learning innovations as they emerge.
Low latency for mission-critical applications
FPGAs offer unique advantages for mission-critical applications that require very low latency, such as autonomous vehicles and manufacturing operations. The data flow pattern in these applications may be in streaming form, requiring pipeline-oriented processing. FPGAs are excellent for these kinds of use cases, given their support for fine-grained, bit-level operations in comparison to GPUs and CPUs.
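To make the streaming point a little more concrete, here is a rough back-of-the-envelope sketch in Python. It is a simplified model, not a description of any real FPGA or GPU: the function names, the stage timings and the batching scenario are all assumptions made up for illustration. It compares an item flowing straight through a pipeline of stages with an item that first has to wait for a batch to fill up.

```python
def pipelined_latency_us(stage_times_us):
    """Latency of one item passing through every pipeline stage in turn."""
    return sum(stage_times_us)


def pipelined_interval_us(stage_times_us):
    """Once the pipeline is full, a new result emerges every slowest-stage interval."""
    return max(stage_times_us)


def batched_worst_latency_us(arrival_interval_us, batch_size, batch_compute_us):
    """The first item in a batch waits for the rest to arrive, then for the batch to compute."""
    return (batch_size - 1) * arrival_interval_us + batch_compute_us


# Hypothetical numbers for illustration only; these are not measurements.
stages_us = [2, 3, 5]  # three pipeline stages, in microseconds

stream_latency = pipelined_latency_us(stages_us)
stream_interval = pipelined_interval_us(stages_us)
batch_latency = batched_worst_latency_us(
    arrival_interval_us=10, batch_size=8, batch_compute_us=40
)

print(f"pipelined latency per item: {stream_latency} us")
print(f"pipelined result interval:  {stream_interval} us")
print(f"batched worst-case latency: {batch_latency} us")
```

In this toy model the waiting time, not the compute time, dominates the batched case. Real systems will behave differently depending on the workload, so treat it as a way to think about the question rather than a benchmark.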
Power savings
Power efficiency can be another key advantage of FPGAs in inferencing systems. Bhavesh notes that since the logic in FPGAs has been tailored for specific applications and workloads, the logic is extremely efficient at executing the application. This can lead to lower power usage and increased performance per watt. By comparison, CPUs may need to execute thousands of instructions to perform the same function that an FPGA may be able to implement in just a few cycles.
All of this, of course, is part of a much larger discussion on the relative merits of FPGAs and GPUs in deep learning applications, just like with turbo kits vs. superchargers. For now, let’s keep this point in mind: When you hear someone say that deep learning applications require accelerators, it’s important to take a closer look at the use case(s). I like to think about it as if I’m chipping, turbocharging or supercharging my truck. Is it worth it for a 10-minute commute without a good stretch of highway? Would I have to use premium fuel or get a hood scoop? It might be worth it to win the competitive race, or for that muscle car sound.
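Performance per watt is simple arithmetic: throughput divided by power draw. The short Python sketch below shows the calculation. The helper name `perf_per_watt` and both sets of figures are hypothetical, chosen only to show that a slower device can still come out ahead on this metric; they are not measurements of any real accelerator.

```python
def perf_per_watt(inferences_per_second: float, watts: float) -> float:
    """Return throughput per unit of power (inferences per second per watt)."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return inferences_per_second / watts


# Hypothetical figures for illustration only; not real device measurements.
accelerator_a = perf_per_watt(inferences_per_second=3000.0, watts=250.0)
accelerator_b = perf_per_watt(inferences_per_second=1500.0, watts=50.0)

print(f"accelerator A: {accelerator_a:.1f} inferences/s per watt")
print(f"accelerator B: {accelerator_b:.1f} inferences/s per watt")
```

With these made-up numbers, accelerator B delivers half the raw throughput of accelerator A but 2.5 times the performance per watt, which is the kind of trade-off worth checking against your own use case.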
Ready to learn more? Check out Bhavesh Patel’s high-level Tech Talk on Inferencing Using FPGAs and his deeper-dive tech note on the same topic, Where the FPGA Hits the Server Road for Inference Acceleration.