Higher is better. Evaluated on private benchmarks across 15 languages.
Δ Recall@K relative to the baseline. Values above zero indicate an improvement over the baseline.
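The source does not define the metric, so the following is only a minimal sketch of how a Δ Recall@K value could be computed. It assumes the common definition of Recall@K (the fraction of relevant items found in the top K results). The function names and the toy rankings are hypothetical and are not taken from the benchmarks.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k results.

    Assumes the common definition of Recall@K; the source does not spell it out.
    """
    if not relevant:
        return 0.0
    # Use a set so that a duplicated result is not counted twice.
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


def delta_recall_at_k(
    candidate: Sequence[str],
    baseline: Sequence[str],
    relevant: Set[str],
    k: int,
) -> float:
    """Recall@K of the candidate minus Recall@K of the baseline."""
    return recall_at_k(candidate, relevant, k) - recall_at_k(baseline, relevant, k)


# Toy data, made up for illustration only.
relevant = {"d1", "d2", "d3", "d4"}
baseline_ranking = ["d9", "d1", "d8", "d7", "d2"]
candidate_ranking = ["d1", "d2", "d9", "d3", "d8"]

delta = delta_recall_at_k(candidate_ranking, baseline_ranking, relevant, k=5)
print(f"Delta Recall@5: {delta:+.2f}")
```

With this toy data the candidate retrieves three of the four relevant items and the baseline two, so the difference is positive, which is the "above zero is better" case.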