Skip to content

Performance

Latency

The delay between submitting a prompt and receiving the first token of the model's response. Low latency is critical for inline code completion, where delays longer than ~100 ms disrupt developer flow.