AMD Radeon RX 7600 (Navi 33)
PyTorch Setup on Ubuntu 25
A guide to Native Docker, ROCm (HIP), and GFX Masquerading.
This guide documents the successful setup of a local AI environment on Ubuntu 25 using Native Docker (not Docker Desktop) to run PyTorch with ROCm (HIP) acceleration on an RX 7600.
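Note: The prebuilt PyTorch ROCm binaries do not officially support the RX 7600 (gfx1102); they ship kernels for gfx1100 (RX 7900 series) but not for gfx1102. This setup therefore uses a masquerading technique: an override set inside the Docker container makes the ROCm runtime treat the card as gfx1100.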
1. Prerequisites & Host Setup
A. Driver Verification (Ubuntu 25)
Ubuntu 25's kernel (6.10+) includes the necessary drivers by default. Verify your kernel has loaded the compute drivers:
ls -l /dev/kfd /dev/dri
Success: You see /dev/kfd and /dev/dri/renderD128.
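Troubleshooting: If /dev/kfd is missing, run sudo modprobe amdgpu (on current kernels the amdkfd compute driver is built into the amdgpu module rather than shipped as a separate module).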
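The device check above can also be scripted, which is handy when it needs to run on both the host and inside the container. The following is a minimal sketch; the helper name missing_device_nodes is hypothetical, and it only tests that the nodes exist, not that the current user can open them.

```python
import glob
import os


def missing_device_nodes(required=("/dev/kfd",), render_glob="/dev/dri/renderD*"):
    """Return the device nodes that are expected but absent."""
    missing = [path for path in required if not os.path.exists(path)]
    # At least one render node (for example renderD128) should match the glob.
    if not glob.glob(render_glob):
        missing.append(render_glob)
    return missing


if __name__ == "__main__":
    absent = missing_device_nodes()
    if absent:
        print("Missing:", ", ".join(absent))
    else:
        print("Compute and render nodes are present.")
```

An empty list means both /dev/kfd and at least one render node were found.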
B. User Permissions
Your user must belong to the render, video, and docker groups.
sudo usermod -aG docker,video,render $USER
Log out and log back in (or reboot) to apply changes.
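To confirm that the group changes are active in the current session, the membership can be checked from Python. This is a small sketch using only the standard library; missing_groups and current_group_names are hypothetical helper names.

```python
import grp
import os


def missing_groups(required, current):
    """Return the required group names that are not in the current set."""
    return sorted(set(required) - set(current))


def current_group_names():
    """Names of the groups attached to this process (and so this login session)."""
    names = []
    for gid in os.getgroups():
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # A GID without an /etc/group entry has no name to compare against.
            continue
    return names


if __name__ == "__main__":
    absent = missing_groups(["docker", "video", "render"], current_group_names())
    print("Missing groups:", absent if absent else "none")
```

If a group is still reported as missing right after running usermod, the session predates the change and a fresh login is needed.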
C. Docker Setup (Native)
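Uninstall Docker Desktop if present (it runs containers inside a virtual machine, which prevents passing /dev/kfd and /dev/dri through to them) and install the native Docker Engine from the Ubuntu repositories.
# Remove Docker Desktop
sudo apt-get remove docker-desktop
rm -r $HOME/.docker/desktop
mv ~/.docker ~/.docker_backup

# Install Native Engine (Ubuntu packages; the buildx package in the Ubuntu archive is docker-buildx)
sudo apt update
sudo apt install docker.io docker-compose-v2 docker-buildx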
2. Project Configuration
Create a workspace:
mkdir -p ~/rocm-lab/workspace
cd ~/rocm-lab
File 1: Dockerfile
This image extends the official ROCm PyTorch image and adds the gfx1100 override.
# Base image: Official ROCm 6.0 with PyTorch 2.1.1
FROM rocm/pytorch:rocm6.0_ubuntu22.04_py3.9_pytorch_2.1.1
# --- CRITICAL OVERRIDE ---
# Masquerade RX 7600 (gfx1102) as RX 7900 (gfx1100)
ENV HSA_OVERRIDE_GFX_VERSION=11.0.0
# Install Python dependencies
RUN pip install --upgrade pip && \
    pip install transformers accelerate streamlit sentence-transformers pandas scikit-learn jupyterlab
# --- BITSANDBYTES FIX ---
# Required for loading 4-bit/8-bit LLMs on ROCm
RUN pip install https://github.com/ROCm/bitsandbytes/releases/download/rocm_enabled_beta_0.61.0/bitsandbytes-0.61.0+rocm6.0-py3-none-any.whl
WORKDIR /app
EXPOSE 8888 8501
CMD ["tail", "-f", "/dev/null"]
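The value 11.0.0 used for HSA_OVERRIDE_GFX_VERSION in the Dockerfile above is the dotted major.minor.stepping form of the gfx1100 target name. As a rough illustration of that naming convention, the sketch below converts a target name into the dotted form. The helper gfx_to_override is hypothetical, and it assumes that the last two characters of the name are hexadecimal minor and stepping digits, with the decimal major version in front of them.

```python
def gfx_to_override(target):
    """Convert a target name such as 'gfx1102' to 'major.minor.stepping'.

    Assumption: the last two characters are hex digits for minor and
    stepping, and everything before them is the decimal major version.
    """
    if not target.startswith("gfx") or len(target) < 6:
        raise ValueError(f"unexpected target name: {target!r}")
    digits = target[3:]
    major = int(digits[:-2])
    minor = int(digits[-2], 16)
    stepping = int(digits[-1], 16)
    return f"{major}.{minor}.{stepping}"


if __name__ == "__main__":
    # The card's real target versus the one it is masqueraded as.
    print("gfx1102 ->", gfx_to_override("gfx1102"))
    print("gfx1100 ->", gfx_to_override("gfx1100"))
```

Under that assumption the RX 7600 would natively report 11.0.2, and the override replaces it with 11.0.0 so the gfx1100 kernels in the PyTorch build are selected.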
File 2: docker-compose.yml
Handles hardware passthrough and security privileges.
services:
  lab:
    build: .
    container_name: rocm-rx7600
    # --- HARDWARE ACCESS ---
    privileged: true
    security_opt:
      - seccomp:unconfined
    cap_add:
      - SYS_PTRACE
    # Device Mapping
    devices:
      - "/dev/kfd:/dev/kfd"
      - "/dev/dri:/dev/dri"
    # Group Passthrough
    group_add:
      - video
      - render
    ipc: host
    volumes:
      - ./workspace:/app
    ports:
      - "8888:8888"
      - "8501:8501"
    command: tail -f /dev/null
3. Launching and Verification
Start the Lab:
docker compose up -d --build
docker exec -it rocm-rx7600 bash
Verify Hardware Visibility:
ls -l /dev/kfd
rocm-smi
The Final Test (Python):
python3 -c "import torch; print(f'Device: {torch.cuda.get_device_name(0)}'); x = torch.randn(1024, 1024).cuda(); print(x @ x)"
This should print the device name followed by the 1024 x 1024 result tensor, with no errors.
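For a more informative check than the one-liner, the same test can be written as a short script that reports each step separately. This is a sketch, not part of the original setup: gpu_report is a hypothetical name, and it relies on ROCm builds of PyTorch exposing the GPU through the torch.cuda namespace, as the one-liner above already does.

```python
import importlib.util
import os


def gpu_report():
    """Collect basic facts about the PyTorch/ROCm setup."""
    report = {
        "override": os.environ.get("HSA_OVERRIDE_GFX_VERSION"),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "gpu_available": False,
        "device_name": None,
    }
    if not report["torch_installed"]:
        return report

    import torch  # imported lazily so the script also runs outside the container

    # torch.version.hip is set on ROCm builds; fall back to None elsewhere.
    report["hip_version"] = getattr(torch.version, "hip", None)
    if torch.cuda.is_available():
        report["gpu_available"] = True
        report["device_name"] = torch.cuda.get_device_name(0)
        x = torch.randn(1024, 1024, device="cuda")
        y = x @ x
        torch.cuda.synchronize()  # wait for the kernel so errors surface here
        report["matmul_shape"] = tuple(y.shape)
    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

Saved under ./workspace, the script is visible in the container at /app. A missing override value or gpu_available: False points to the Dockerfile or the device passthrough respectively.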
4. Troubleshooting
Issue: GPU: False / No HIP GPUs
Cause: The container was started before the group changes took effect, or /dev/kfd is not accessible on the host.
Fix: Run sudo usermod -aG render $USER, then log out and back in. Ensure privileged: true is set in the compose file, and restart the container.
Issue: ls: cannot access '/dev/kfd' (Inside container)
Fix: Ensure /dev/kfd is listed under devices and that security_opt contains seccomp:unconfined in your compose file, then recreate the container.
Issue: jupyter: command not found
Fix: Rebuild with docker compose build --no-cache, then recreate the container with docker compose up -d so it runs the new image.
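Before rebuilding, it can help to see which command-line tools are actually on PATH inside the container. The sketch below uses only the standard library; missing_tools is a hypothetical helper, and the tool names are the entry points expected from the jupyterlab and streamlit packages installed in the Dockerfile.

```python
import shutil


def missing_tools(names):
    """Return the command names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    absent = missing_tools(["jupyter", "streamlit"])
    print("Missing tools:", absent if absent else "none")
```

If a tool is listed as missing even after a clean rebuild, the pip install step in the Dockerfile is the place to look.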