Performance
Latency
The delay between submitting a prompt and receiving the first token of the model's response. Low latency is critical for inline code completion, where delays longer than ~100 ms disrupt developer flow.
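As a minimal sketch of how this delay could be measured, the snippet below times the gap between submitting a prompt and receiving the first streamed token. The names `time_to_first_token` and `fake_model` are hypothetical and not part of any real client library; `fake_model` only simulates a streaming model with a fixed 50 ms delay, so the numbers it produces are illustrative.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def time_to_first_token(
    stream_fn: Callable[[str], Iterable[str]], prompt: str
) -> Tuple[str, float]:
    """Return the first token and the seconds elapsed before it arrived."""
    start = time.perf_counter()
    stream = iter(stream_fn(prompt))  # submit the prompt
    first = next(stream)              # block until the first token arrives
    elapsed = time.perf_counter() - start
    return first, elapsed


def fake_model(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model client (an assumption)."""
    time.sleep(0.05)  # simulated delay before the first token
    yield "def"
    yield " main"


if __name__ == "__main__":
    token, seconds = time_to_first_token(fake_model, "write a function")
    print(f"first token {token!r} after {seconds * 1000:.0f} ms")
```

Timing only up to the first token, rather than the full response, matches the definition above: for inline completion, what matters is how soon something appears, not how long the whole completion takes.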