Local AI

Context Length (Local Models)

The maximum number of tokens a locally run model can process in a single request, which is often smaller than the limit offered by hosted models. Longer context lengths require proportionally more RAM (chiefly for the attention key-value cache, which grows linearly with the number of tokens) and significantly slow inference.
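A minimal sketch of why RAM grows with context length: the key-value cache stores two tensors per layer per token. The model shape below (32 layers, 32 heads, head dimension 128, 2-byte fp16 values) is an assumption chosen to resemble a typical 7B-parameter transformer, not a measurement of any specific model.

```python
def kv_cache_bytes(n_ctx: int,
                   n_layers: int = 32,    # assumed: 7B-class model shape
                   n_heads: int = 32,
                   head_dim: int = 128,
                   dtype_bytes: int = 2):  # fp16
    """Approximate KV-cache size in bytes for a given context length.

    The factor of 2 accounts for storing both keys and values;
    each layer caches one key and one value vector per token.
    """
    return 2 * n_layers * n_ctx * n_heads * head_dim * dtype_bytes

for ctx in (2048, 4096, 8192):
    gib = kv_cache_bytes(ctx) / 2**30
    print(f"context {ctx:>5}: ~{gib:.1f} GiB KV cache")
```

With these assumed dimensions the cache roughly doubles each time the context doubles (about 1 GiB at 2,048 tokens, 2 GiB at 4,096), which is why a context length that is trivial for a hosted service can exhaust the memory of a local machine.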