#Pi5 with amdgpu


junior hatch

First things first: what it does not run is hass 😄 That runs on a Pi4 right beside this rpikenstein creation.
I think I will get a 16G Pi5, as the 8G is swapping a lot (I am pushing it with all the services I put on it, though 😄). I would really love to have only one Pi running all my always-on stuff, but that is probably not possible!

We can't use ROCm here, at least not to my knowledge. That is a shame, because it means any program that can't be built with Vulkan support is off the table 😢 This means no ollama, among other things, but we can use llama.cpp instead.

More detailed info can be found here: Testing AMD RX 9060XT 16G (I use an 8G, though).

Hardware

  • Pi5 HAT PoE+PCIe
  • PCIe OCuLink
  • Pi5
  • PSU
  • GPU
  • Power to the Pi5 is PoE (this increases the heat for the Pi, but even though it's overclocked it stays at ~65-70 °C)
  • USB hub (can't recall the brand atm) connected to one of the USB 3.0 ports
    • Google Coral for frigate recognition
    • Sonoff stick for zigbee
  • An SSD (Samsung something, I think) connected to the other USB 3.0 port

Docker compose

- zigbee2mqtt
- frigate (when not using the amdgpu)*
- wyoming-whisper-api-client
- OpenWeb

systemd

- llama.cpp server (when not using frigate)

Benchmarks

Command

./build/bin/llama-bench -m models/Llama-3.2-3B-Instruct-Q4_K_M.gguf -p 0 -n 128,256,512

Logs

ggml_vulkan: Found 2 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon Graphics (RADV GFX1200) (radv) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 0 | matrix cores: none
ggml_vulkan: 1 = V3D 7.1.7.0 (V3DV Mesa) | uma: 1 | fp16: 0 | bf16: 0 | warp size: 16 | shared memory: 16384 | int dot: 0 | matrix cores: none
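
The log shows two Vulkan devices: the Radeon (uma: 0) and the Pi's own V3D (uma: 1). A small sketch of how one could pick out the discrete card's index from those lines; the helper names are made up for illustration. As far as I know, llama.cpp's Vulkan backend can be restricted to one device through a `GGML_VK_VISIBLE_DEVICES` environment variable, but treat that as an assumption to verify against your build.

```python
import re

# Device lines copied from the log above.
LOG = """\
ggml_vulkan: Found 2 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon Graphics (RADV GFX1200) (radv) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 0 | matrix cores: none
ggml_vulkan: 1 = V3D 7.1.7.0 (V3DV Mesa) | uma: 1 | fp16: 0 | bf16: 0 | warp size: 16 | shared memory: 16384 | int dot: 0 | matrix cores: none
"""

# Matches "ggml_vulkan: <index> = <name> | uma: <0|1>".
DEVICE_RE = re.compile(r"^ggml_vulkan: (\d+) = (.+?) \| uma: (\d)", re.MULTILINE)


def parse_devices(log):
    """Return one dict per device line found in the log text."""
    return [
        {"index": int(index), "name": name, "uma": uma == "1"}
        for index, name, uma in DEVICE_RE.findall(log)
    ]


def discrete_indices(devices):
    """Indices of devices that do not report unified memory (uma: 0)."""
    return [d["index"] for d in devices if not d["uma"]]


devices = parse_devices(LOG)
print(discrete_indices(devices))
```

Using the uma flag as the discriminator is only a heuristic: it works here because the on-board V3D shares system memory and the Radeon does not.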

Results

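| model                  |     size | params | backend | ngl |  test |          t/s |
| ---------------------- | -------: | -----: | ------- | --: | ----: | -----------: |
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | Vulkan  |  99 | tg128 | 65.56 ± 1.17 |
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | Vulkan  |  99 | tg256 | 64.18 ± 0.42 |
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | Vulkan  |  99 | tg512 | 61.99 ± 0.15 |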
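
To put the numbers in perspective, here is a quick bit of arithmetic on the mean t/s values above (the ± spread is ignored, and the function names are just for illustration): how much slower the longer runs are relative to tg128, and roughly how long each generation takes.

```python
# Mean tokens/second copied from the results above, keyed by generated tokens.
results = {128: 65.56, 256: 64.18, 512: 61.99}


def relative_drop(baseline, value):
    """Slowdown of `value` against `baseline`, in percent."""
    return (baseline - value) / baseline * 100


def seconds_for(tokens, tps):
    """Wall-clock seconds to generate `tokens` at `tps` tokens/second."""
    return tokens / tps


base = results[128]
for n, tps in results.items():
    print(
        f"tg{n}: {tps:.2f} t/s, "
        f"{relative_drop(base, tps):.1f}% below tg128, "
        f"~{seconds_for(n, tps):.1f} s for {n} tokens"
    )
```

So going from 128 to 512 generated tokens costs roughly 5% in throughput on this setup.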

Problems I am facing (so far)

frigate

This one is a vigilante that uses the on-board GPU however I configure it, hence I need to stop it when using llama.cpp. I use a Coral for object detection and the GPU for ffmpeg hwaccel, but the preset-vaapi flag makes frigate fail to start, and -hwaccel drm uses the on-board GPU and kills the system when used at the same time as llama.cpp.
When that happens the system swaps out and dies: all cores spike with frigate services and they kill the GPU, so a reboot, and sometimes even a hard reset (pull all plugs), is needed for the system to see the GPU again.
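
Since frigate and llama.cpp must never run at the same time here, the switch between them can be scripted so the one being stopped always goes down first. A minimal sketch, assuming frigate is a compose service named `frigate` and llama.cpp runs as a systemd unit called `llama-server.service`; both names are hypothetical, so adjust them to the real setup. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Hypothetical names: adjust to the actual compose service and systemd unit.
FRIGATE_SERVICE = "frigate"
LLAMA_UNIT = "llama-server.service"


def switch_commands(target):
    """Return the commands that hand the machine to `target`.

    `target` is "llama" or "frigate". The consumer being stopped always
    comes first, so the two never overlap.
    """
    if target == "llama":
        return [
            ["docker", "compose", "stop", FRIGATE_SERVICE],
            ["systemctl", "start", LLAMA_UNIT],
        ]
    if target == "frigate":
        return [
            ["systemctl", "stop", LLAMA_UNIT],
            ["docker", "compose", "start", FRIGATE_SERVICE],
        ]
    raise ValueError(f"unknown target: {target!r}")


def switch(target, dry_run=True):
    """Print the commands (dry run) or execute them, aborting on first failure."""
    for cmd in switch_commands(target):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


switch("llama")
```

For a real run, call `switch("llama", dry_run=False)` from the directory holding the compose file, with enough privileges for `systemctl`.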

whisper.cpp

cmake -B build -DGGML_VULKAN=1 && cmake --build build -j --config Release

...
/home/***/whisper.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:5272:31: error: invalid initialization of reference of type 'const vk::DispatchLoaderDynamic&' from expression of type 'vk::detail::DispatchLoaderDynamic'
 5272 |     subctx->s->buffer.dispatch(wg0, wg1, wg2);
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
/usr/include/vulkan/vulkan_funcs.hpp:4525:113: note: in passing argument 4 of 'void vk::CommandBuffer::dispatch(uint32_t, uint32_t, uint32_t, const Dispatch&) const [with Dispatch = vk::DispatchLoaderDynamic; uint32_t = unsigned int]'
 4525 | int32_t groupCountX, uint32_t groupCountY, uint32_t groupCountZ, Dispatch const & d ) const VULKAN_HPP_NOEXCEPT
      |                                                                  ~~~~~~~~~~~~~~~~~^

gmake[2]: *** [ggml/src/ggml-vulkan/CMakeFiles/ggml-vulkan.dir/build.make:223: ggml/src/ggml-vulkan/CMakeFiles/ggml-vulkan.dir/ggml-vulkan.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:1414: ggml/src/ggml-vulkan/CMakeFiles/ggml-vulkan.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2
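
Going by the message alone, the installed Vulkan-Hpp headers seem to provide `vk::detail::DispatchLoaderDynamic` while this ggml snapshot still expects `vk::DispatchLoaderDynamic`, so it looks like a mismatch between header version and source (my reading of the error, not confirmed). To compare against a machine where the build works, it helps to know which header version is installed. A small sketch that reads it from `vulkan_core.h`; the path is assumed from the include directory shown in the log.

```python
import re
from pathlib import Path

# Assumed location, next to the vulkan_funcs.hpp path shown in the error.
HEADER = Path("/usr/include/vulkan/vulkan_core.h")


def header_version(text):
    """Extract the integer from '#define VK_HEADER_VERSION <n>', or None."""
    match = re.search(
        r"^#define\s+VK_HEADER_VERSION\s+(\d+)\s*$", text, re.MULTILINE
    )
    return int(match.group(1)) if match else None


if HEADER.exists():
    print("VK_HEADER_VERSION =", header_version(HEADER.read_text(errors="replace")))
else:
    print(f"{HEADER} not found")
```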