By Marc Hammons, Senior Distinguished Engineer, Dell Technologies
In an age where technology evolves at a breakneck pace, the cloud has reigned supreme as the ubiquitous home of artificial intelligence (AI). You already use AI daily, often without even realizing it, as it works tirelessly in the vast expanse of the cloud. Just ask Alexa. While AI will certainly continue to flourish in the cloud, its future is on your laptop, desktop, and other personal computing devices.
Understanding On-Client AI
The evolution of AI processing from the cloud to the edge is a game-changing concept poised to redefine how you interact with and benefit from this powerful technology. This new paradigm—called on-client AI—involves the deployment and execution of AI processes directly on a user’s local computing device.
On-client AI brings massive processing capabilities closer to the end user, enabling you to harness their power without relying on cloud computing infrastructure. This right-sizing of AI will make connected devices more powerful and more useful. The advantages of local AI processing include privacy and security, speed and efficiency, and cost effectiveness.
Privacy and Security
Data used and generated by AI applications has historically been shuttled to and from the cloud. While this facilitates convenience and connectivity, it has also raised privacy and security concerns. On-client AI represents a fundamental shift toward a safer digital world. With local processing, you no longer need to worry about your personally identifiable information (PII) and intellectual property being exposed to potential threats lurking in the cloud. User consent and protection of PII are inherently local, and on-client AI promises to simplify privacy monitoring and compliance.
Speed and Efficiency
Speed matters in the digital marketplace. On-client AI delivers nearly instantaneous results. From helping you organize your files to enhancing your camera’s capabilities, on-client AI has the potential to enable rapid results—without straining your device’s processing capabilities.
Cost Effectiveness
When it comes to data, every byte has a cost associated with it. Moving data to and from the cloud is expensive. While these data transfer and storage costs may seem minuscule on an individual level, they can add up on a larger scale. Local processing minimizes the need for data transfers, reducing the cost of cloud computing resources.
Developing Smarter Connected Devices
One of the primary goals of on-client AI is to make it possible to run large language models on smartphones and other mobile devices. AI-enabled mobile devices and personal computers will be transformative for end users and developers alike. A new generation of smart PCs that will pack the processing power required to make on-client AI a reality is already on the horizon.
The technical advancements driving on-client AI have the potential to shift the deployment of massive AI models from the cloud to personal devices. This will give you the ability to run complex machine-learning algorithms on a laptop or smartphone on the go. For example, running a 7-billion-parameter large language model on a single workstation GPU can be a time-consuming and computationally expensive process. The option to run those large language models using on-client AI will revolutionize how users interact with their devices.
A number of different technologies make it possible to run large machine learning models and other AI applications on user devices. One of the most promising is known as “model distillation,” a technique for training a smaller model to mimic the behavior of a larger model. The smaller model can be deployed to the user’s device, while the larger model remains in the cloud.
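The core of model distillation is training the student to match the teacher's softened output distribution rather than hard labels. A minimal sketch of that distillation loss, using NumPy and illustrative toy logits (the temperature value and array shapes here are assumptions, not taken from any particular framework):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature yields a softer distribution."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between softened teacher and student outputs.

    Minimizing this trains the small (on-device) student model to mimic
    the output distribution of the large (cloud-hosted) teacher model.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    # T^2 scaling keeps gradient magnitudes comparable across temperatures
    return float(np.mean(kl) * temperature ** 2)

# Toy usage: identical logits give zero loss; a divergent student does not.
teacher = np.array([[5.0, 1.0, -2.0]])
student = np.array([[4.5, 1.2, -1.8]])
print(distillation_loss(teacher, teacher))       # 0.0
print(distillation_loss(student, teacher) > 0.0)  # True
```

In practice this loss is usually blended with the ordinary cross-entropy on ground-truth labels, so the student learns both the task and the teacher's "dark knowledge" about inter-class similarities.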
Of course, on-client AI will complement, not replace, cloud processing.
The cloud has its strengths, including scalability, ease of access, and performance. But AI processing on local devices will overcome some of the limitations of cloud computing, such as privacy, data transfer costs, and latency. As on-client AI devices evolve, users will be able to switch between cloud-based and local processing, depending on specific workload needs.
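The switching behavior described above amounts to a dispatch policy. A hedged, purely hypothetical sketch of what such a policy might weigh (all field names and thresholds here are illustrative assumptions, not any vendor's actual runtime):

```python
from dataclasses import dataclass

@dataclass
class Workload:
    # Hypothetical attributes a hybrid AI runtime might inspect when routing
    contains_pii: bool            # privacy-sensitive data should stay on the device
    latency_budget_ms: int        # interactive tasks need near-instant responses
    model_params_billions: float  # very large models may exceed local capacity

def route(workload, device_capacity_billions=7.0):
    """Sketch of a cloud-vs-local dispatch policy for hybrid AI.

    Privacy and latency pull work onto the device; model size beyond
    what local hardware can hold pushes it to the cloud.
    """
    if workload.contains_pii:
        return "local"
    if workload.model_params_billions > device_capacity_billions:
        return "cloud"
    if workload.latency_budget_ms < 100:
        return "local"
    return "cloud"

print(route(Workload(True, 500, 70.0)))    # local: PII never leaves the device
print(route(Workload(False, 50, 3.0)))     # local: small model, tight latency
print(route(Workload(False, 500, 70.0)))   # cloud: model too large for the device
```

Real systems would fold in battery state, connectivity, and cost, but the shape of the decision is the same: each workload goes where its constraints are best served.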
New hardware and software architectures will be required for on-client AI processing. This includes new types of GPUs and CPUs that are specifically designed for running machine learning applications. Making on-client AI a reality will also require new software frameworks and platforms that facilitate the development of AI processing on local devices.
Supporting Next Generation Solutions
AI is already moving closer to the end user. The way silicon is used to support AI on devices, like generative AI, is changing in tandem with the improvement in AI algorithms. Continual advancements in on-chip processing power will support the next generation of client platforms and device ecosystems, in which AI processing is distributed across peripherals, endpoints, edge and cloud.
Qualcomm is already pioneering the reduction of computational resource requirements for on-device machine learning. In fact, a recent proof-of-concept demonstrated how Stable Diffusion, a large and processing-intensive application, could run on a smartphone.
Stable Diffusion, one of the best-known text-to-image models, was also optimized for generative applications by a team of developers on the Hugging Face platform. They reported using Intel's OpenVINO Toolkit to reduce the latency of Stable Diffusion models running on resource-constrained hardware, such as CPU-only machines.
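One workhorse behind such optimizations is reducing numerical precision. As a hedged illustration of the idea (not OpenVINO's actual implementation), here is a minimal symmetric int8 post-training quantization sketch in NumPy:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric post-training quantization of float32 weights to int8.

    Storing int8 instead of float32 cuts memory four-fold and enables
    faster integer arithmetic on CPUs -- the kind of optimization that
    toolkits apply to run large models on resource-constrained hardware.
    """
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print(q.nbytes / w.nbytes)              # 0.25 -- one quarter of the memory
print(np.abs(w - w_hat).max() < scale)  # True -- error bounded by one quantization step
```

Production toolkits go further, using per-channel scales and calibration data to keep accuracy loss negligible, but the memory and compute savings come from exactly this precision reduction.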
The development of client platforms with silicon built to handle AI processing will unlock a wide range of new applications, supporting everything from collaborative remote meetings to intelligent security solutions. Generative AI experiences—such as adaptive user interfaces and automated support—will all be part of the future of personal computing. This convergence of the cloud with local computing will enable AI-driven personalization, system optimization, and predictive actions that deliver seamless, enhanced user experiences.
These differentiated experiences represent just the beginning of the future of personal computing. As hardware that enables on-client acceleration of machine learning algorithms is introduced, AI models will become more efficient and capable of running on conventional connected devices. Get ready for a progression of new on-client AI applications, from conceptual to mainstream, that will redefine the individual computing experience.
Learn more about sourcing, deploying, and managing infrastructure designed for the AI era.