RepoWatch / GitHub signal

Llama.cpp b9439 release with iGPU default change

Foundational local inference updates that affect device handling in agent tooling.

Critical for local LLM inference in Hermes, OpenClaw, and similar agent systems running on GPU hardware.

What changed

  • New release tag: b9439 (published 2026-05-31)
  • Commit: “llama: only use one iGPU device by default (#23897)” (committed 2026-05-31)

Why it matters

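Llama.cpp is the core engine for many local inference setups. Going by the commit title, the change alters default device selection: llama.cpp now uses only one iGPU device by default unless configured otherwise. On multi-GPU machines this affects performance tuning and device selection in agent frameworks, since a setup that relied on the old default may now run on fewer devices than before.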
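One way to avoid depending on the default at all is to pin devices explicitly when launching llama.cpp from agent tooling. The sketch below only builds the command line and does not run anything. It assumes the `--device` flag (a comma-separated device list) available in recent llama.cpp builds. The helper name `build_llama_args` and the device names `Vulkan0` and `Vulkan1` are illustrative, not taken from the release; confirm the real names with `--list-devices` and the flag itself with `--help` on your build.

```python
from typing import List, Optional, Sequence


def build_llama_args(
    binary: str,
    model_path: str,
    devices: Optional[Sequence[str]] = None,
) -> List[str]:
    """Build an argv list for a llama.cpp binary (hypothetical helper).

    When `devices` is empty or None, no device flag is added, so the
    binary falls back to its own default device selection.
    """
    args = [binary, "-m", model_path]
    if devices:
        # Assumption: `--device` takes a comma-separated list of device names.
        args += ["--device", ",".join(devices)]
    return args


# Rely on the release's default device selection.
default_cmd = build_llama_args("llama-server", "model.gguf")

# Pin two devices explicitly (names are placeholders).
pinned_cmd = build_llama_args("llama-server", "model.gguf", ["Vulkan0", "Vulkan1"])

print(default_cmd)
print(pinned_cmd)
```

Passing the resulting list to `subprocess.run` keeps the device choice in your own configuration, so a future change in defaults does not silently alter behaviour.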

My read

A practical maintenance update rather than a headline feature. The release bundles the iGPU tweak along with other recent work. Relevant because OpenClaw and Hermes rely on llama.cpp derivatives for on-device inference.

Bottom line

Worth a spike. If you run a llama.cpp-based stack, update now, validate the new single-iGPU default, and test multi-GPU scenarios.