
AI Coding Tools Glossary

84 terms defined. An authoritative reference for AI Coding Tools.

A

Agent Mode

A workflow where an AI assistant autonomously plans and executes multi-step tasks—reading files, running tests, making edits—with minimal human intervention. Agent mode goes beyond single completions to handle complex refactors.

Agentic Coding

An AI paradigm where the tool autonomously plans, executes, and iterates on coding tasks — reading files, running commands, fixing errors, and testing results in a loop. Examples: Claude Code, Cursor Composer agent mode. Represents the evolution from suggestion-based to autonomous AI assistance.

AI Code Review

The use of an AI assistant to automatically analyze pull requests or diffs for bugs, style violations, security issues, and logic errors. AI code review accelerates feedback cycles and catches issues before human reviewers.

AI Pair Programming

A development style where an AI assistant acts as the second programmer in a pair, offering real-time suggestions, explanations, and corrections. AI pair programming increases developer velocity and reduces context-switching.

AI PR Review

An automated process where an AI assistant analyzes a pull request diff and posts comments about bugs, style, security, and test coverage. AI PR review complements human reviewers and speeds up code quality feedback loops.

AI Test Generation

Using an AI model to automatically write unit, integration, or end-to-end tests for existing code. AI test generation increases coverage and reduces the manual effort required to write thorough test suites.

Air-Gapped Deployment

An AI installation with no network connectivity to external services, used in high-security environments. Air-gapped deployments require pre-downloaded model weights and prevent any data from leaving the facility.

API Key Management

The practices for securely storing, rotating, and scoping API keys used to authenticate with AI model providers. Poor API key management is a leading cause of unexpected billing charges and data exposure in AI-powered apps.
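
A minimal sketch of the most basic practice, reading the key from the environment rather than hard-coding it; the variable name MODEL_API_KEY is illustrative, not any real provider's:

```python
import os

def get_api_key(env_var: str = "MODEL_API_KEY") -> str:
    # Read the key from the environment so it never lands in source control.
    # Real setups add rotation schedules and per-service scoping on top.
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it or use a secrets manager")
    return key
```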

Attention Mechanism

The core operation in transformer models that lets each token attend to all other tokens in the context. Attention enables an AI coding tool to relate a variable declaration on line 1 to its usage on line 200.
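
The core computation can be sketched in pure Python (one head, no batching or learned projections, purely illustrative): each query scores every key, the scores become weights via softmax, and the output mixes the values accordingly.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    # Scaled dot-product attention over lists of vectors: each query
    # attends to every key, so any position can pull information from
    # any other position in the context.
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)  # weights over all positions sum to 1
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```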


C

Chat Mode

An interaction style where developers converse with an AI assistant in a back-and-forth dialogue to ask questions, debug code, or design solutions. Chat mode complements inline completion by handling open-ended queries.

CI/CD AI Integration

Incorporating AI-powered checks—such as automated code review, test generation, or security scanning—into a continuous integration and delivery pipeline. CI/CD AI integration catches issues automatically on every pull request.

Code Completion

An AI feature that predicts and suggests the next lines of code as you type. Modern tools use large language models to suggest entire functions, not just variable names. Accuracy depends on context quality — more open files and comments improve suggestions.

Code LLM

A large language model specifically trained or fine-tuned for programming tasks. Examples: CodeLlama, DeepSeek Coder, StarCoder, Codex. These models understand syntax, APIs, and programming patterns. They power the AI features in code editors and copilot tools.

Codebase Indexing

The process of scanning, parsing, and creating searchable representations of your entire project. Enables AI to answer questions about code it hasn't directly seen. Tools index file structure, function signatures, imports, and semantic content.

Context Length (Local Models)

The maximum number of tokens a locally run model can process in one request, often smaller than the limits of hosted models. Longer contexts require proportionally more RAM and significantly slow inference.

Context Length Limit

The hard upper bound on how many tokens can be in a single model request, including the prompt and the generated output. Exceeding the limit requires truncation or summarization strategies to preserve key context.
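
One common truncation strategy, sketched below: drop the oldest messages until the conversation fits. The chars/4 token estimate is a crude stand-in; real tools count with the model's tokenizer.

```python
def truncate_to_fit(messages, max_tokens, count_tokens=lambda m: len(m) // 4):
    # Drop oldest messages first until the estimated token count fits.
    # count_tokens defaults to a rough chars/4 heuristic.
    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > max_tokens:
        kept.pop(0)
    return kept
```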

Context Window

The maximum amount of text (measured in tokens) an AI model can process in a single request. Larger windows allow the AI to see more of your codebase. GPT-4o: 128K tokens. Claude: 200K tokens. Critical for understanding cross-file dependencies and large codebases.

Copilot

A general term (popularized by GitHub Copilot) for an AI assistant embedded in an IDE that suggests code, explains errors, and answers questions. Multiple vendors now offer copilot-style tools with varying model backends.

CPU Inference

Running LLM inference on a CPU rather than a GPU, typically 5–20x slower but accessible without specialized hardware. Tools like llama.cpp make CPU inference practical for smaller models on everyday laptops.


T

Tab Completion

The action of pressing Tab (or a configured key) to accept an AI-generated inline suggestion. Tab completion workflows let developers move quickly through boilerplate while staying in their editor flow.

Telemetry

Usage data automatically collected by AI tools, such as accepted suggestions, latency metrics, and error rates. Developers should review telemetry settings to understand what data is shared with the vendor.

Temperature

A sampling parameter that controls how random the model's output is. Lower values (0.0–0.3) produce focused, near-deterministic code completions; higher values encourage more creative or varied suggestions.
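
A sketch of how temperature reshapes the sampling distribution (illustrative logits, pure Python): logits are divided by the temperature before softmax, so low temperatures sharpen the distribution toward the top token and high temperatures flatten it.

```python
import math

def apply_temperature(logits, temperature):
    # Divide logits by temperature, then softmax into probabilities.
    # T < 1 sharpens toward the top logit; T > 1 flattens toward uniform.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```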

Throughput

The total number of tokens a system can process per unit of time across all users or requests. High throughput matters for teams sharing a self-hosted AI service or enterprise deployments with many concurrent developers.

Token

The basic unit of text that AI models process. Roughly 1 token = 0.75 words or 4 characters in English. Code is less token-efficient than prose due to special characters and formatting. Pricing, context limits, and response times are all measured in tokens.
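
The 4-characters-per-token rule of thumb makes a quick back-of-envelope estimator; it is only a heuristic, and exact counts require the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    # Rough English-prose estimate: ~4 characters per token.
    # Code typically tokenizes less efficiently than this.
    return max(1, round(len(text) / 4))
```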

Token Budget

The planned allocation of tokens between system instructions, retrieved context, conversation history, and expected output within a model's context window. Managing the token budget prevents truncation and controls API costs.
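
A toy budget planner showing the arithmetic; the default allocations here are illustrative numbers, not recommendations.

```python
def plan_token_budget(context_window, system=1_000, output=2_000, history=4_000):
    # Reserve fixed allocations, leaving the remainder for retrieved
    # code context. All defaults are illustrative.
    retrieved = context_window - system - output - history
    if retrieved < 0:
        raise ValueError("budget exceeds the context window")
    return {"system": system, "history": history,
            "retrieved": retrieved, "output": output}
```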

Token Limit

The maximum number of tokens a model can process in a single request, covering both input and output. Hitting the token limit truncates context, which can cause an AI assistant to lose track of earlier code.

Tokenization

The process of splitting raw text or code into discrete units (tokens) before feeding them to a model. How a tokenizer splits code affects costs, context limits, and how well the model handles identifiers and symbols.
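
A toy tokenizer that splits on words versus individual symbols, just to show the idea; real LLM tokenizers use learned subword vocabularies (e.g. BPE), so a long identifier may become several tokens.

```python
import re

def toy_tokenize(code: str):
    # Split into runs of word characters or single non-space symbols.
    # Illustrative only; not how production tokenizers work.
    return re.findall(r"\w+|[^\w\s]", code)
```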

Tokens per Second

A benchmark measuring how many tokens an AI model generates each second during inference. Higher tokens-per-second means a more responsive coding assistant; typical consumer GPU setups achieve 30–120 t/s for 7B models.

Tool Use

The ability of an AI model to call external tools—web search, code execution, file I/O—during inference. Tool use extends an assistant beyond text generation to take real actions inside a development environment.
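
A minimal dispatch sketch for model-emitted tool calls; the TOOLS registry, tool names, and JSON shape here are all hypothetical, not any vendor's actual protocol.

```python
import json

# Hypothetical tool registry; a real assistant receives a structured
# tool call from the model and executes the matching function.
TOOLS = {
    "read_file": lambda path: open(path).read(),
    "run_tests": lambda: "all tests passed",  # placeholder action
}

def dispatch(tool_call_json: str) -> str:
    # Execute a call of the form {"name": ..., "arguments": {...}}
    # and return the result as text to feed back to the model.
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return str(fn(**call.get("arguments", {})))
```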

Top-K Sampling

A decoding method that restricts token selection to the K most likely next tokens at each step. It is often used alongside top-p to reduce nonsensical outputs in code generation.
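
A sketch of the filtering step over an already-computed probability distribution:

```python
def top_k_filter(probs, k):
    # Zero out everything but the k most likely tokens, then renormalize
    # so the surviving probabilities sum to 1.
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    filtered = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]
```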

Top-P (Nucleus) Sampling

A decoding strategy that limits token selection to the smallest set whose cumulative probability exceeds a threshold p. Top-P sampling helps AI tools balance diversity and coherence in generated code.
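
A sketch of the nucleus step: accumulate tokens from most to least likely until the threshold p is reached, then sample only from that set.

```python
def top_p_filter(probs, p):
    # Keep the smallest set of tokens whose cumulative probability
    # reaches p, then renormalize over that set.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]
```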

Training Data

The corpus of text and code used to teach a model its capabilities. The composition of training data heavily influences which languages, frameworks, and patterns an AI coding tool handles best.

Training Data Opt-Out

A provider option that prevents user prompts and completions from being used to train or improve future model versions. Many AI coding tool vendors offer opt-out settings for privacy-conscious users.

Transformer Architecture

The neural network design underpinning nearly all modern LLMs, using stacked self-attention layers and feed-forward networks. Almost every AI coding assistant—from GitHub Copilot to Claude—is built on a transformer.
