Tag

#llama-cpp

Every story tagged llama-cpp, newest first.

How to choose the right quantization for a local LLM

Decode the Q4, Q5, and Q8 labels on model files, understand what bits-per-weight actually costs you, and pick a quantization that fits your RAM without wrecking quality.

Signal Desk · May 24, 2026 · 4 min read