Building the `llama-cpp-python` wheel for Astra Machina boards

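This tutorial walks you through building and installing the llama-cpp-python wheel on a Synaptics Astra Machina SL-series development board, so that you can run large language model (LLM) inference locally with llama.cpp through its Python bindings.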

Prerequisites

Make sure your Astra Machina board has been flashed with the Astra SDK OOBE image. If it has not, upgrade to the OOBE image first. The latest SDK is available here.

Step-by-step instructions

1. Clone the repository and its submodules

git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python.git
cd llama-cpp-python

The --recurse-submodules flag is required: it pulls in the llama.cpp C++ backend, which lives in the repository as a git submodule.
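A quick way to confirm the submodule was actually fetched is to look for the backend's top-level CMakeLists.txt. The sketch below is only an illustration: the helper name submodule_populated is hypothetical, and it assumes the backend sits under vendor/llama.cpp, as in the directory layout shown later in this tutorial.

```python
from pathlib import Path


def submodule_populated(repo_root: Path) -> bool:
    """Return True if vendor/llama.cpp contains a CMakeLists.txt file."""
    # Hypothetical check: an empty vendor/llama.cpp directory usually means
    # the repository was cloned without --recurse-submodules.
    return (repo_root / "vendor" / "llama.cpp" / "CMakeLists.txt").is_file()


if __name__ == "__main__":
    # Run this from inside the llama-cpp-python checkout.
    ok = submodule_populated(Path.cwd())
    print("llama.cpp submodule present" if ok else "llama.cpp submodule missing")
```

If the check reports the submodule as missing, running git submodule update --init --recursive inside the checkout fetches it without cloning again.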

2. Create a Python virtual environment

python3 -m venv .venv
source .venv/bin/activate

This keeps the build isolated and avoids polluting the system Python installation.
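Before installing anything, it can help to confirm that the interpreter you are running really is the one inside .venv. The following is a minimal sketch using only the standard library; the function name in_virtualenv is an illustrative choice, not part of any tool used in this tutorial.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("venv active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

When the environment is active, the reported interpreter path should sit under the .venv directory created above.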

3. Install the Python build tools

pip install build wheel

These tools are used to build the Python wheel locally from source.

4. Build a wheel optimized for Astra

CMAKE_ARGS="-DLLAMA_NATIVE=ON" python -m build -w

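Run from inside the llama-cpp-python checkout, this command:

optimizes the llama.cpp backend for the board's own CPU (an ARM Cortex core on Astra) by building with native-CPU flags such as -march=native
writes the resulting .whl file to the dist/ directory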
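The wheel's file name encodes the Python version, ABI, and platform it was built for, which is worth checking on an aarch64 board. The sketch below locates the newest wheel in dist/ and splits its name into those fields. Both helper names (newest_wheel, wheel_tags) are hypothetical, and the parsing assumes the standard name-version-python-abi-platform wheel naming layout.

```python
from pathlib import Path
from typing import Dict, Optional


def newest_wheel(dist_dir: Path) -> Optional[Path]:
    """Return the most recently modified .whl file in dist_dir, or None."""
    wheels = sorted(dist_dir.glob("*.whl"), key=lambda p: p.stat().st_mtime)
    return wheels[-1] if wheels else None


def wheel_tags(filename: str) -> Dict[str, str]:
    """Split a wheel file name into its dash-separated fields."""
    stem = filename[: -len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    # Layout: name-version[-build]-python-abi-platform.
    # The last three fields are always the compatibility tags.
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


if __name__ == "__main__":
    wheel = newest_wheel(Path("dist"))
    if wheel is None:
        print("no wheel found in dist/")
    else:
        print(wheel.name, wheel_tags(wheel.name))
```

For a wheel built on the board, the platform field is expected to end in aarch64 and the python field to match the interpreter in your virtual environment.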

5. Install the built wheel

Once the build finishes, the wheel file is in the dist/ directory inside the llama-cpp-python folder.

Example:

ls -l dist/
# Output:
# llama_cpp_python-0.3.16-cp312-cp312-linux_aarch64.whl

pip install dist/llama_cpp_python-0.3.16-cp312-cp312-linux_aarch64.whl

This installs the locally compiled Python bindings together with the shared library.
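To confirm that the install landed in the active environment, you can query the installed distribution's metadata. This is a minimal sketch using the standard library's importlib.metadata; the helper name installed_version is an illustrative assumption, and the distribution name is taken from the wheel built above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "llama-cpp-python") -> Optional[str]:
    """Return the installed version of dist_name, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("llama-cpp-python is not installed in this environment")
    else:
        print("llama-cpp-python", version, "is installed")
```

The reported version should match the version field in the wheel's file name.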

Directory structure after the build

llama-cpp-python/
├── llama_cpp/              ← Python wrapper
├── vendor/llama.cpp/       ← submodule (C++ backend)
├── dist/                   ← built wheel files
└── .venv/                  ← Python virtual environment
