Tags
1 page
KV Cache
How to Tune llama.cpp on 8GB VRAM: Why 32K Is Safer and 64K Needs KV Cache Quantization