Performance

Pass@K Benchmark

An evaluation metric that measures the probability that a model generates at least one correct solution to a coding problem within K attempts. Pass@1 tests whether the first suggestion is correct; higher K values measure coverage, that is, whether any of the K sampled solutions is correct.
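As a rough illustration, Pass@K for a single problem is often estimated by sampling n solutions, counting the c that are correct, and computing 1 - C(n-c, k) / C(n, k), the chance that a random subset of k samples contains at least one correct one. The sketch below assumes that estimator; the function names `pass_at_k` and `mean_pass_at_k` and the `(n, c)` input format are hypothetical and not taken from any particular benchmark harness.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which were correct.

    Assumed estimator: 1 - C(n - c, k) / C(n, k).
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k incorrect samples: every subset of size k contains a correct one.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-problem estimate over (n, c) pairs, one pair per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For example, with 10 samples of which 3 are correct, `pass_at_k(10, 3, 1)` is about 0.3, and the estimate rises as k grows because more attempts are allowed.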
