What's the problem?
Your system has a Strix Halo GPU (gfx1151), but your Linux kernel is missing a critical fix that exports the CWSR (Context Wave Save/Restore) properties to userspace. Without this fix, ROCm may work with incorrect VGPR counts, which can cause crashes in a range of workloads.
Technical details:
- Strix Halo optimization guide
- The fix is included in kernel 6.18.4 and later, and may also be backported to older kernels
How to fix this
Install the OEM Kernel
Ubuntu provides an OEM kernel with the fix included. Open a terminal and run:
sudo apt update && sudo apt install linux-oem-24.04
After installation, reboot your system for the new kernel to take effect.
Upgrade to a Kernel with the CWSR Fix
Upgrade your kernel to version 6.18.4 or later, or to a kernel that has the CWSR fix backported. The fix exports the cwsr_size and ctl_stack_size properties to userspace.
Check Current Kernel Version
Open a terminal and run:
uname -r
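To compare the reported version against the 6.18.4 baseline without reading it by eye, a minimal sketch follows. It assumes bash and GNU `sort -V`; the helper name `kernel_at_least` is hypothetical. A kernel below 6.18.4 can still carry a backported fix, so treat a "below" result as a hint, not a verdict.

```shell
#!/usr/bin/env bash
# Sketch: compare the running kernel release against the 6.18.4 baseline.
# Assumes GNU sort with -V (version sort). kernel_at_least is a hypothetical helper.

kernel_at_least() {
    # $1 = kernel release (e.g. output of `uname -r`), $2 = minimum version
    local current="${1%%-*}"   # drop everything from the first "-" (distro suffix)
    local minimum="$2"
    # If the minimum sorts first (or ties), the current version is >= minimum.
    [ "$(printf '%s\n%s\n' "$minimum" "$current" | sort -V | head -n 1)" = "$minimum" ]
}

if kernel_at_least "$(uname -r)" "6.18.4"; then
    echo "Kernel $(uname -r) is at or above 6.18.4"
else
    echo "Kernel $(uname -r) is below 6.18.4; the fix is present only if it was backported"
fi
```

Because backports are not visible in the version string, the property check in the Verify step remains the authoritative test.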
Update Your Kernel
Use your distribution's kernel update process. For example:
- Fedora/RHEL:
sudo dnf update kernel
- Arch Linux:
sudo pacman -Syu linux
- openSUSE:
sudo zypper update kernel-default
Reboot
Reboot your system to load the new kernel:
sudo reboot
Verify
After rebooting, verify the CWSR properties are exported:
grep -E "cwsr_size|ctl_stack_size" /sys/class/kfd/kfd/topology/nodes/*/properties
You should see both cwsr_size and ctl_stack_size listed.
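On systems with several topology nodes, the grep output can be hard to scan. The sketch below reports per node whether both properties are present. It assumes bash and that each properties file uses one "name value" pair per line; the function name `check_cwsr_props` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report, per KFD topology node, whether both CWSR properties are exported.
# Assumes each properties file holds "name value" lines; check_cwsr_props is hypothetical.

check_cwsr_props() {
    # $1 = path to a node's "properties" file; returns 0 only if both keys are present
    local props="$1" key missing=0
    for key in cwsr_size ctl_stack_size; do
        if ! grep -q "^${key}[[:space:]]" "$props"; then
            echo "missing ${key} in ${props}"
            missing=1
        fi
    done
    return "$missing"
}

for props in /sys/class/kfd/kfd/topology/nodes/*/properties; do
    [ -r "$props" ] || continue   # skip if KFD is not loaded or the file is unreadable
    check_cwsr_props "$props" && echo "ok: ${props}"
done
```

If the loop prints nothing at all, no properties files were readable, which usually means the KFD driver is not loaded.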
Need Help?
If you continue to experience issues after updating your kernel:
- Check the GitHub Issues page
- Contact us on Discord