Session: Accelerating Generative AI on Arm CPUs, in the Cloud and in your Pocket
You don’t need power-hungry GPUs for your AI workload! Even in the new world of Large Language Models (LLMs) and Generative AI, AI models can run efficiently on modern Arm CPUs. Advances in hardware instructions (Neon, SVE/SVE2), ML libraries (Arm Compute Library, KleidiAI), and the evolution of Small Language Models and quantization methods allow you to run your generative AI workloads on your own server or embedded in your smartphone app. This talk will show you how, with an eye toward making AI effective, affordable, and ubiquitous.
This session will be recorded.