Performance
Pass@K Benchmark
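An evaluation metric that measures the probability that a model generates at least one correct solution within K attempts at a coding problem, where correctness is typically judged by whether a solution passes the problem's test cases. Pass@1 tests whether a single suggestion is correct; higher K values measure coverage, that is, how often a correct solution appears anywhere among K sampled attempts.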
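As a rough illustration of the definition above, the sketch below estimates pass@K for one problem from n sampled solutions, of which c are correct, using the combinatorial estimator 1 - C(n-c, K) / C(n, K) that is commonly used for this metric. The function names `pass_at_k` and `mean_pass_at_k` and the (n, c) input format are assumptions made for this example, not part of any specific benchmark's tooling.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem (hypothetical helper).

    n: number of solutions sampled for the problem
    c: number of those solutions that were correct
    k: number of attempts the metric allows
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # Probability that k samples drawn without replacement are all incorrect
    # is C(n - c, k) / C(n, k); math.comb returns 0 when k > n - c.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-problem estimate over (n, c) pairs, one per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Made-up counts for three problems, 10 samples each.
    example = [(10, 3), (10, 0), (10, 10)]
    print(mean_pass_at_k(example, k=1))
    print(mean_pass_at_k(example, k=5))
```

With K = 1 the estimate reduces to c / n, the fraction of samples that are correct. As K grows the value can only rise, which is why larger K reads as a coverage measure.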