
AI on a laptop? It's more realistic than you think

Sponsored Feature We have a lot to thank cloud computing for. It has enabled us to spin up instant workloads, given us ubiquitous access to data, and of course provided all the computing power we needed to realize the promise of AI. It's critical for modern computing, but thanks to a new generation of processors, it doesn't have to shoulder AI processing workloads alone.

There's a shift happening in personal computing thanks to the rise of on-device AI, one comparable to previous computing revolutions such as the cloud itself and mobile computing. Upendra Kulkarni, VP of product management at Qualcomm Technologies Inc., likens it to the shift he saw as PCs replaced mainframes.

"We used to program on IBM 3090 and DEC VAX PDP-11 computers tucked away on a campus far away from you," he recalls. "You had in front of you a simple terminal, a display and a keyboard, and all the compute was somewhere else that you didn't see". Then came the PC, which localized a lot of the compute. The centralized computing resource didn't go away, but the PCs complemented it. On-device AI promises the same evolution for AI computing, he says.

AI experts have spent years telling us that these algorithms chew through so much computing power that you need large-scale hardware to manage them properly. How have we reached the point where you can do it on a laptop computer? The answer lies in both hardware and software improvements.

First, the hardware. Snapdragon® X Series processors are a major catalyst for on-device AI, thanks to their neural processing units (NPUs). These handle the integer and floating-point intensive operations necessary for neural network inferencing. Integrated directly into the processor's silicon as part of a system-on-chip design, they do what discrete GPUs do on larger machines, but with far better performance per watt.

The NPUs give Snapdragon X Series processors 45 trillion operations per second (TOPS) of AI performance, comfortably clearing Microsoft's 40 TOPS requirement for Copilot+ PC certification. In doing so, they unlock sophisticated local AI workloads previously only possible in the cloud.

The NPU, which has the same specification across every processor in the Snapdragon X Series family, was designed from the outset to deliver high performance per watt. Qualcomm Technologies says its Snapdragon X Elite (X1E-80-100) processor delivers up to 20% greater NPU performance than Intel's Core Ultra 7 256V. This is especially useful for pervasive AI that must run continuously in the background.

"Our NPU is a 16k MAC engine, and that means it can do 32k operations," explains Kulkarni. MAC means multiply-accumulate, and it's a core part of the AI operations that the NPU is designed for, correlating directly with the multiplication of weights and input values that form the basis of neural networks. "It multiplies and adds in one clock cycle, and you will run that at roughly 1.4 gigahertz. So that's how you get to that 45 TOPS number," he says.

Vector transfer cache memory gets the data to the MAC engine quickly, eliminating bottlenecks in the processor's data pipeline. The processor family also prevents wasted clock cycles by using sophisticated instruction scheduling and optimizing data movement.

Running at higher efficiency on a local machine can also have a sustainability impact. AI's thirst for water when running in the cloud is well documented. Running locally reduces that problem, creating another impetus for the environmentally minded to consider a power-optimized NPU-accelerated setup.

The other advance that makes it possible to squeeze large language models (LLMs) and other AI algorithms into a laptop form factor is model optimization. AI researchers have made great strides in taking these techniques from lab-based concepts to commercial availability.

Kulkarni cites OpenAI's recently released 20 billion parameter model as an example of a sophisticated LLM that can run locally. Another is Microsoft's Phi Silica, a small language model optimized for edge deployment.

Reducing the parameter count from hundreds of billions to just a few billion is just one trick up computer scientists' collective sleeves. Another is quantization, which reduces the numerical precision of the models' weights and calculations. Relative to 16-bit floating point, INT8 and INT4 quantization cut memory requirements by 50% and 75% respectively, and INT2, coming in the near future, will save even more.
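As a rough illustration of those savings, assuming a 16-bit floating point baseline and ignoring overheads such as activations and quantization scales, the weight memory for a 20 billion parameter model works out like this:

```python
# Rough model-weight memory footprint at different precisions.
# Assumption: baseline is 16-bit floating point weights; activations, KV cache
# and quantization metadata are ignored for simplicity.
def weight_memory_gb(parameters: int, bits_per_weight: int) -> float:
    return parameters * bits_per_weight / 8 / 1e9  # bits -> bytes -> GB

params = 20_000_000_000  # e.g. a 20-billion-parameter model
for label, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4), ("INT2", 2)]:
    print(f"{label}: {weight_memory_gb(params, bits):.1f} GB")
# FP16: 40.0 GB, INT8: 20.0 GB (-50%), INT4: 10.0 GB (-75%), INT2: 5.0 GB
```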

"We play a big role in that, and we have put out quite a few sophisticated tools on our website," says Kulkarni. "We also have our own Qualcomm® AI hub where we post tools and models that are optimized for our platforms ."

Qualcomm Technologies has maintained model accuracy while reducing computational requirements by up to 90% using methods like these.

The other change is to the orchestration layer. Instead of running larger frontier-style LLMs that cover multiple functions, on-device AI follows an agentic architecture, combining multiple models that work in concert to complete collections of tasks. Each model is optimized for a specific use case, and they can talk to each other using the Agent2Agent (A2A) protocol, or to other tools and databases using the Model Context Protocol (MCP).
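A minimal sketch of that orchestration pattern might look like the following. The model names and the run_local_model() helper are hypothetical placeholders, not a real Qualcomm, A2A or MCP API:

```python
# Sketch of an agentic orchestration layer: a lightweight router dispatches
# each task to a small, specialised local model rather than sending everything
# to one large general-purpose LLM. All names here are illustrative.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    handles: set[str]              # task types this agent is tuned for
    run: Callable[[str], str]      # local inference entry point

def run_local_model(model_name: str, prompt: str) -> str:
    # Placeholder for NPU-accelerated inference on an optimized local model.
    return f"[{model_name}] response to: {prompt}"

agents = [
    Agent("summariser", {"summarise"}, lambda p: run_local_model("small-lm", p)),
    Agent("coder", {"generate_code"}, lambda p: run_local_model("code-lm", p)),
]

def orchestrate(task_type: str, prompt: str) -> str:
    # Route the task to the first agent optimised for it; in a fuller system
    # agents could also call each other (A2A) or external tools (MCP).
    for agent in agents:
        if task_type in agent.handles:
            return agent.run(prompt)
    raise ValueError(f"no local agent handles task type {task_type!r}")

print(orchestrate("summarise", "Summarise this meeting transcript..."))
```

The point of the pattern is that each agent can be a small, quantized model tuned for one job, so no single request needs a frontier-scale LLM.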

The speed advantage of local AI processing enables new categories of applications by eliminating cloud round-trip latency. This reduces response times from seconds to milliseconds, explains Kulkarni, making real-time or near real-time processing feasible at the edge.

The advantage extends beyond low latency to predictability. Depending on the connection or what's happening with cloud infrastructure, latency can fluctuate over a cloud connection, leading to a less predictable user experience. That can be problematic, especially when processing latency-sensitive data such as audio and video.

Processing at the source of data generation also prevents unnecessary data movement, preserving overall network bandwidth. That might not be significant when dealing with a single PC, but hundreds or even thousands of enterprise users processing AI data locally can have tangible positive benefits on corporate networks, especially when dealing with WAN connections.

Use cases that benefit especially from local AI processing include always-on AI assistants. While these often run in the cloud, moving some or all of their processing to local processors enables them to respond more quickly to user needs. There are dozens of applications already taking advantage of the NPU for AI processing on the Snapdragon X Series processor family, with more on the way.

Examples include DaVinci Resolve, which applies AI features to its video editing capabilities on PCs using the chips. djay Pro is AI-powered DJ software that uses the NPU for local music processing, while the open-source image editing software GIMP uses AI to accelerate its capabilities. Various applications within Adobe Creative Cloud are also optimized for NPU usage.

On-device AI doesn't have to be exclusive. A hybrid architecture balances local models like these with cloud-based AI, enabling vendors to split AI computing between edge and cloud processing based on task requirements and data sensitivity. Software running on Snapdragon X Series PCs can choose where to run specific algorithms based on factors including compute requirements and latency.
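As a minimal sketch of that routing decision, assuming a simple policy built on data sensitivity, latency needs and an estimated compute budget (the thresholds and the run_on_npu()/run_in_cloud() helpers are hypothetical):

```python
# Illustrative hybrid routing policy: sensitive or latency-critical work stays
# on the device; only large, non-sensitive jobs are handed off to the cloud.
# Threshold values and helper functions are placeholders, not a real API.
from dataclasses import dataclass

@dataclass
class Task:
    prompt: str
    sensitive: bool          # e.g. contains corporate or patient data
    latency_critical: bool   # e.g. live audio or video processing
    est_gflops: float        # rough compute estimate for the request

LOCAL_COMPUTE_BUDGET_GFLOPS = 500.0  # assumed ceiling for the on-device model

def route(task: Task) -> str:
    if task.sensitive or task.latency_critical:
        return run_on_npu(task)
    if task.est_gflops > LOCAL_COMPUTE_BUDGET_GFLOPS:
        return run_in_cloud(task)
    return run_on_npu(task)

def run_on_npu(task: Task) -> str:
    return f"local result for: {task.prompt}"

def run_in_cloud(task: Task) -> str:
    return f"cloud result for: {task.prompt}"

print(route(Task("Summarise this contract", sensitive=True,
                 latency_critical=False, est_gflops=1200.0)))
```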

Use cases benefiting from hybrid approaches include complex reasoning tasks. These can pre-process some data locally before handing off to the cloud, reducing the cloud's computational load.

While cloud service providers have honed their privacy and security, there will still be applications that need to support data sovereignty. "Most enterprises have their own private corporate data. That's their intellectual property, and nobody wants to put that in the cloud," Kulkarni says.

Financial institutions might want to keep customer data local, and healthcare providers might be similarly protective of patient data. Putting clear security boundaries in place, such as local processing, simplifies audit requirements while still giving workers the power of AI algorithms.

Privacy and data sovereignty are important factors for AI users, as regulatory frameworks like GDPR and HIPAA create legal requirements around where and how sensitive data is processed. At the same time, the cost of running cloud-based AI at scale is becoming a consideration for users.

Whether you're a developer generating code or a reporter filtering audio noise from a live interview, moving some or all AI processing to a local platform democratizes the technology and helps the environment. Instead of relying on power-hungry frontier models in the cloud, you can use power-sipping agentic AI to get jobs done faster, with fewer cycles. As more developers latch onto this concept, we'll see software increasingly taking advantage of this next stage in AI evolution.
