# Experimental ROCm iGPU Support

This branch adds a ROCm backend path geared toward AMD APUs that expose only a small VRAM aperture but share a large UMA pool with the CPU. The steps below outline how to reproduce the build and how to run Ollama with the staged ROCm runtime.

> **Warning**
> Upstream ROCm does not officially support these APUs yet. Expect driver updates, kernel parameters, or environment variables such as `HSA_OVERRIDE_GFX_VERSION` to change between releases.

## 1. Stage the ROCm runtime

We avoid touching the system installation by unpacking the required RPMs into `build/rocm-stage`.

```bash
mkdir -p build/rocm-stage build/rpm-tmp
cd build/rpm-tmp
dnf download \
  hipblas hipblas-devel hipblas-common-devel \
  rocblas rocblas-devel \
  rocsolver rocsolver-devel \
  rocm-hip-devel rocm-device-libs rocm-comgr rocm-comgr-devel
cd ../rocm-stage
for rpm in ../rpm-tmp/*.rpm; do
  echo "extracting ${rpm}"
  rpm2cpio "${rpm}" | bsdtar -xf -
done
```

Important staged paths after extraction:

| Purpose                    | Location                                  |
| -------------------------- | ----------------------------------------- |
| HIP/rocBLAS libraries      | `build/rocm-stage/lib64`                  |
| Tensile kernels (rocBLAS)  | `build/rocm-stage/lib64/rocblas/library`  |
| Headers (`hip`, `rocblas`) | `build/rocm-stage/include`                |

## 2. Build the ROCm backend

Configure CMake with the preset that targets ROCm 6.x, pointing it at the system HIP compiler and the staged libraries:

```bash
cmake --preset "ROCm 6" -B build/rocm \
  -DGGML_VULKAN=OFF \
  -DCMAKE_INSTALL_PREFIX=/usr/local \
  -DCMAKE_HIP_COMPILER=/usr/bin/hipcc \
  -DCMAKE_PREFIX_PATH="$PWD/build/rocm-stage"

cmake --build build/rocm --target ggml-hip -j$(nproc)
```

Artifacts land in `build/lib/ollama/rocm` (and are mirrored in `dist/lib/ollama/rocm` when packaging). These include `libggml-hip.so`, the CPU fallback variants, the Vulkan backend, and `librocsolver.so`.

## 3. Run Ollama on ROCm

The runner needs to see both the GGML plugins and the staged ROCm runtime. The following environment block works for an AMD Radeon 760M with a UMA carve-out:

```bash
export BASE=$HOME/ollama-gpu
export OLLAMA_LIBRARY_PATH=$BASE/build/lib/ollama/rocm:$BASE/build/lib/ollama
export LD_LIBRARY_PATH=$OLLAMA_LIBRARY_PATH:$BASE/build/rocm-stage/lib64:${LD_LIBRARY_PATH:-}
export ROCBLAS_TENSILE_LIBPATH=$BASE/build/rocm-stage/lib64/rocblas/library
export ROCBLAS_TENSILE_PATH=$ROCBLAS_TENSILE_LIBPATH
export HSA_OVERRIDE_GFX_VERSION=11.0.0  # spoof gfx1100 for Phoenix
export GGML_HIP_FORCE_GTT=1             # force GTT allocations for UMA memory
export OLLAMA_GPU_DRIVER=rocm
export OLLAMA_GPU=100                   # opt into GPU-only scheduling
export OLLAMA_LLM_LIBRARY=rocm          # skip CUDA/Vulkan discovery noise
export OLLAMA_VULKAN=0                  # optional: suppress Vulkan backend

$BASE/build/ollama serve
```

On launch you should see log lines similar to:

```
library=ROCm compute=gfx1100 name=ROCm0 description="AMD Radeon 760M Graphics"
ggml_hip_get_device_memory using GTT memory for 0000:0e:00.0 (total=16352354304 free=15034097664)
```

If the runner crashes before enumerating devices:

- Double-check that `ROCBLAS_TENSILE_LIBPATH` points to the staged `rocblas/library`.
- Ensure no other `LD_LIBRARY_PATH` entries override `libamdhip64.so`.
- Try unsetting `HSA_OVERRIDE_GFX_VERSION` to confirm whether the override is still needed on your system.

> Example discovery + run log: [`docs/logs/rocm-760m-run.log`](logs/rocm-760m-run.log). The matching `curl` response is saved as [`docs/logs/rocm-760m-run-response.json`](logs/rocm-760m-run-response.json).
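Once discovery succeeds, a single non-streaming request against the default listen address is a quick end-to-end check. A minimal smoke test; the model name `llama3.2:1b` is only an example and must be pulled first:

```bash
# Pull a small test model (the name here is just an example):
$BASE/build/ollama pull llama3.2:1b

# One-shot, non-streaming generate request against the default port:
curl -s http://127.0.0.1:11434/api/generate -d '{
  "model": "llama3.2:1b",
  "prompt": "Reply with one short sentence.",
  "stream": false
}'
```

Watching `radeontop` while the request runs is an easy way to confirm the iGPU, not a CPU fallback, is doing the work.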
## 4. Sharing this build

- Keep the staged RPMs alongside the branch so others can reproduce the exact runtime.
- Include `/tmp/ollama_rocm_run.log` or similar discovery logs in issues/PRs to help maintainers understand the UMA setup (a capture sketch follows this list).
- Mention any kernel parameters (e.g., a large UMA buffer set in firmware) when opening upstream tickets.
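If you need to produce `/tmp/ollama_rocm_run.log` for a report, here is a rough capture sketch, assuming the environment block from step 3 is already exported in your shell:

```bash
#!/usr/bin/env bash
# Capture device-discovery output for attaching to issues/PRs.
# Assumes the exports from step 3 (BASE, OLLAMA_*, ROCm paths) are set.
LOG=/tmp/ollama_rocm_run.log

# Run the server in the background, writing all output to the log.
"$BASE/build/ollama" serve >"$LOG" 2>&1 &
SERVER_PID=$!

sleep 5  # give device discovery a moment to finish

# Show the lines maintainers usually ask about first.
grep -E 'library=ROCm|ggml_hip_get_device_memory' "$LOG" || true

kill "$SERVER_PID" 2>/dev/null
```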