What hardware do I need to run local LLMs for coding?

For a usable coding experience, you need at least 16GB of RAM to run a quantized 7B-parameter model. A modern Mac with Apple Silicon (M1/M2/M3) is excellent for local inference thanks to unified memory; an M2 Pro with 32GB of RAM can run 13B models comfortably. On Windows or Linux, a discrete GPU with 8–24GB of VRAM (e.g., an RTX 3080 or 4090) speeds up inference dramatically. CPU-only inference works but is noticeably slower.
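The RAM figures above follow from simple arithmetic: memory is roughly parameter count times bytes per weight, plus some headroom for the KV cache and activations. Here is a back-of-envelope sketch; the `model_memory_gb` helper and the 20% overhead factor are illustrative assumptions, not exact numbers for any particular runtime.

```python
def model_memory_gb(params_billion: float, bits_per_weight: int,
                    overhead: float = 1.2) -> float:
    """Rough memory footprint in GB: weights plus ~20% headroom
    for KV cache and activations (overhead factor is an assumption)."""
    bytes_per_weight = bits_per_weight / 8
    return params_billion * bytes_per_weight * overhead

# 7B model at 4-bit quantization -> ~4.2GB, fits easily in 16GB RAM
print(round(model_memory_gb(7, 4), 1))

# 13B model at 8-bit -> ~15.6GB, comfortable on a 32GB machine
print(round(model_memory_gb(13, 8), 1))
```

This is why quantization matters so much for local use: the same 13B model needs roughly four times the memory at fp16 (16 bits per weight) as at 4-bit.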