Recall@K
Higher is better. Evaluated on private benchmarks.
Which models are worse or better than the selected baseline?
Δ Recall@K vs baseline, shown as an absolute or relative difference. Above zero is better.
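The page does not say how Recall@K or the delta is computed, so the following is only a minimal sketch under common assumptions: Recall@K is taken as the fraction of relevant items found in the top K results, the absolute delta as `model - baseline`, and the relative delta as that difference divided by the baseline score. The function names `recall_at_k` and `delta_recall` are hypothetical and do not come from the benchmark itself.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant items that appear in the top-k ranked results.

    Assumed definition; the benchmark may use a different variant.
    """
    if not relevant:
        raise ValueError("relevant set must be non-empty")
    top_k = set(ranked[:k])
    return len(top_k & relevant) / len(relevant)


def delta_recall(model: float, baseline: float, relative: bool = False) -> float:
    """Difference in Recall@K against a baseline; above zero means better.

    Absolute: model - baseline.
    Relative: (model - baseline) / baseline (assumed normalization).
    """
    diff = model - baseline
    if relative:
        if baseline == 0:
            raise ValueError("relative delta is undefined for a zero baseline")
        return diff / baseline
    return diff


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    print(delta_recall(0.75, 0.5))                 # absolute difference
    print(delta_recall(0.75, 0.5, relative=True))  # relative difference
```

Under these assumptions, a model that matches the baseline has a delta of zero in both modes, and the relative mode makes gains on low-scoring baselines look larger than the same absolute gain on a high-scoring one.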