Local AI
llama.cpp
A C/C++ inference engine for running LLMs efficiently on commodity hardware, including machines without a GPU. It powers many local AI tools and supports GGUF-format models, running fully on the CPU or offloading some or all layers to a GPU when one is available.
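For illustration, a minimal sketch of running a GGUF model through the llama-cpp-python bindings, one common way to use the engine from Python; the model path ./model.gguf is a placeholder for a locally downloaded quantized model file:

```python
from llama_cpp import Llama

# Load a quantized GGUF model (path is a placeholder).
# n_gpu_layers=0 keeps inference entirely on the CPU;
# raise it to offload that many transformer layers to a GPU, if present.
llm = Llama(model_path="./model.gguf", n_gpu_layers=0)

# Run a single completion and print the generated text.
out = llm("Q: What is llama.cpp? A:", max_tokens=64)
print(out["choices"][0]["text"])
```

The same engine is also exposed directly through the project's command-line binaries, so no Python layer is required to use it.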