Local LLM & AI on AMD Radeon RX 9070 XT With Ollama/Windows

Ollama is an open-source platform for running local large language models (LLMs) such as gpt-oss or qwen3. It supports GPU acceleration through ROCm, AMD’s open-source software stack for GPU compute and AI. Unfortunately, getting Ollama to run AI workloads on Windows 11 with AMD Radeon RX 9000 series GPUs (like the RX 9070 XT) is not straightforward. This guide walks you through installing the necessary libraries and getting your local LLMs running with Ollama on an AMD platform.

The Missing Piece: Ollama, ROCm, and AMD Radeon RX 9000 Series GPUs for AI on AMD

Ollama version 0.16.1 does not yet support the AMD Radeon RX 9000 series because it uses ROCm version 6.1. If you have Ollama already installed, you will find the installed ROCm libraries for supported GPUs in the following directory:

C:\Users\<USER>\AppData\Local\Programs\Ollama\lib\ollama\rocm\rocblas\library.

Here you can see library files for the supported GPUs, such as those with a gfx1100 suffix for the AMD Radeon RX 7900 XTX.
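If you prefer to check this from a terminal, here is a minimal PowerShell sketch (assuming the default per-user install path shown above) that looks for RX 9000 series library files:

> Get-ChildItem "$env:LOCALAPPDATA\Programs\Ollama\lib\ollama\rocm\rocblas\library" -Filter "*gfx1100*"   # lists the RX 7900 XTX files
> Get-ChildItem "$env:LOCALAPPDATA\Programs\Ollama\lib\ollama\rocm\rocblas\library" -Filter "*gfx120*"    # returns nothing on Ollama 0.16.1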

Missing ‘gfx1200’ Libraries in Ollama ROCm Library

The gfx12xx libraries required to run AI on AMD Radeon RX 9000 series GPUs are missing, including gfx1201 for the AMD Radeon RX 9070 XT. Follow these steps to get gpt-oss or qwen3 running on your AMD GPU.

Step 1: Get the Missing Pieces - Download the Latest AMD HIP SDK with Required gfx120x Libraries

The Heterogeneous-computing Interface for Portability (HIP) SDK provides a subset of ROCm libraries for Windows platforms. Since ROCm version 6.3, the HIP SDK includes the gfx120x libraries, which allow you to run large language models like gpt-oss or qwen3 on your GPU. I tested version 7.1.1, which you can download from the AMD ROCm Hub, on my setup.

Step 2: Install AMD HIP SDK & ROCm

Now install the AMD HIP SDK on your machine. By default, you should find the gfx12xx libraries in

C:\Program Files\AMD\ROCm\7.1\bin\rocblas\library.
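After the installation, a quick PowerShell check (a sketch, assuming the default install location above) confirms that the library files for the RX 9070 XT are now available:

> Get-ChildItem "C:\Program Files\AMD\ROCm\7.1\bin\rocblas\library" -Filter "*gfx1201*"

If this lists files, the SDK is in place.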

Next, we want Ollama to use these instead of the default ones.

Step 3: Set the ROCBLAS_TENSILE_LIBPATH Environment Variable

We can use the ROCBLAS_TENSILE_LIBPATH variable to point Ollama to the newly installed libraries. All you need to do is set this environment variable for your user in Windows.
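One way to do this is from a PowerShell prompt with setx, which writes the variable into your user environment (a sketch, assuming the HIP SDK was installed to the default location from step 2):

> setx ROCBLAS_TENSILE_LIBPATH "C:\Program Files\AMD\ROCm\7.1\bin\rocblas\library"

Note that setx only affects newly started processes, so an already running Ollama instance will not see the variable until you restart it in step 4. Alternatively, set the variable through the Environment Variables dialog in Windows, as shown below.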

Add ROCBLAS_TENSILE_LIBPATH Variable to Windows User Environment Pointing to ROCm 7.1.1 Libs

Step 4: Start Ollama and Run Your Large Language Model, Such as gpt-oss or qwen3

Now install Ollama if it is not already on your machine. If it is already running, stop and restart it so it picks up the new environment variable. Then check the log at C:\Users\<USER>\AppData\Local\Ollama\server.log to make sure Ollama has found your GPU. You should now see your AMD Radeon RX 9070 XT being added as a compute device:

time=2026-02-15T12:02:34.656+01:00 level=INFO source=routes.go:1636 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:V:\\Program Files\\ollama OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES:]"
time=2026-02-15T12:02:34.684+01:00 level=INFO source=images.go:473 msg="total blobs: 10"
time=2026-02-15T12:02:34.684+01:00 level=INFO source=images.go:480 msg="total unused blobs removed: 0"
time=2026-02-15T12:02:34.685+01:00 level=INFO source=routes.go:1689 msg="Listening on 127.0.0.1:11434 (version 0.16.1)"
time=2026-02-15T12:02:34.686+01:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-02-15T12:02:34.693+01:00 level=INFO source=server.go:431 msg="starting runner" cmd="C:\\Users\\<USER>\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 54950"
time=2026-02-15T12:02:35.189+01:00 level=INFO source=runner.go:106 msg="experimental Vulkan support disabled.  To enable, set OLLAMA_VULKAN=1"
time=2026-02-15T12:02:35.190+01:00 level=INFO source=server.go:431 msg="starting runner" cmd="C:\\Users\\<USER>\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 54957"
time=2026-02-15T12:02:35.282+01:00 level=INFO source=server.go:431 msg="starting runner" cmd="C:\\Users\\<USER>\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 54964"
time=2026-02-15T12:02:35.412+01:00 level=INFO source=server.go:431 msg="starting runner" cmd="C:\\Users\\<USER>\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 54971"
### Below is what we are looking for
time=2026-02-15T12:02:36.140+01:00 level=INFO source=types.go:42 msg="inference compute" id=0 filter_id=0 library=ROCm compute=gfx1201 name=ROCm0 description="AMD Radeon RX 9070 XT" libdirs=ollama,rocm driver=60551.38 pci_id=0000:2f:00.0 type=discrete total="15.9 GiB" available="13.2 GiB"
### Above is what we are looking for
time=2026-02-15T12:02:36.140+01:00 level=INFO source=routes.go:1739 msg="vram-based default context" total_vram="15.9 GiB" default_num_ctx=4096
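If you don’t want to scan the whole log by hand, a quick PowerShell filter over the log path mentioned above pulls out just the compute-device lines:

> Select-String -Path "$env:LOCALAPPDATA\Ollama\server.log" -Pattern "inference compute"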

Now, when you run a model, ollama ps should also show that it is allocated to the GPU:

> ollama ps
NAME           ID              SIZE     PROCESSOR    CONTEXT    UNTIL
gpt-oss:20b    17052f91a42e    14 GB    100% GPU     8192       29 minutes from now
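If ollama ps shows nothing, no model is currently loaded; you can load one first, for example with one of the models mentioned in this guide, and then check again:

> ollama run gpt-oss:20b "Hello"
> ollama ps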

Conclusion

Now you know how to run large language models on Windows 11 and AMD Radeon RX 9000 series GPUs with Ollama. I hope this guide is helpful for you!