LLMs Optimized for the Edge

· 4 min read
Meet Patel
MLE @ Synaptics

In the evolving landscape of natural language processing, LLMs (Large Language Models) and SLMs (Small Language Models) have emerged as powerful tools for applications ranging from chatbots to text completion. Running llama.cpp on embedded systems with the Astra Machina development kit unlocks new potential for deploying localized, efficient AI solutions, ideal for edge computing environments. In this blog, you will learn, at a high level, how to bring llama.cpp to life on the Astra Machina development kit, enabling advanced LLM capabilities directly on-device.

llama.cpp makes it significantly easier to run Llama and other supported models locally on edge devices and a wide variety of hardware. Its lightweight, optimized design allows deployment without powerful GPUs or cloud infrastructure: by relying on CPU inference, llama.cpp runs models efficiently on modest hardware. It also supports GPU execution; on Astra Machina, developers can leverage the OpenCL backend to offload inference to the GPU.
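As a rough sketch of the workflow described above, the commands below build llama.cpp from source and run a model with CPU inference. The model filename is a placeholder, and the exact CMake flag for enabling the OpenCL backend may differ depending on your llama.cpp version, so check the project's build documentation for your target.

```shell
# Clone and build llama.cpp (CPU inference is the default backend)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release

# To target the GPU instead, enable the OpenCL backend at configure time,
# e.g. (flag name depends on the llama.cpp version):
# cmake -B build -DGGML_OPENCL=ON

# Run a quantized GGUF model locally (model path is illustrative)
./build/bin/llama-cli -m models/my-model.gguf -p "Hello from the edge" -n 64
```

For an embedded target like Astra Machina, you would typically cross-compile with a toolchain file or build natively on the board; the quantized GGUF model format keeps memory usage low enough for CPU-only inference.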