Tournament-Based Performance Evaluation and Systematic Misallocation

Executive Summary

Using an agent-based simulation of 994 engineers on 142 teams of 7, this research demonstrates that tournament-based forced ranking produces systematic misallocation even under idealized conditions. With random team assignment, 32% of termination and promotion decisions are incorrect. Under realistic conditions where team quality varies (reflecting differential managerial capability), error rates reach 53%, meaning incorrect decisions outnumber correct ones.

These are not failures of execution. They are mathematical necessities. Forced ranking assumes that small teams are representative samples of the global talent distribution, a classic "law of small numbers" fallacy. The system punishes well-run teams most severely because talent clustering amplifies statistical error.

Proposed remedies fail:

Calibration sessions: Spread awareness of error but cannot fix underlying statistical problems
Grade inflation prevention: Mathematically equivalent to the original problem
Cross-team comparison: Creates new categories of error while eliminating old ones
Manager training: Improves within-team accuracy but cannot address between-team variance

The Statistical Problem

Forced ranking makes an implicit assumption: that small teams are representative samples of the organization's overall talent distribution. This assumption is false under conditions that universally obtain in practice.
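The sampling problem can be made concrete with a short calculation. The sketch below is illustrative only and is not taken from the paper: it assumes a quota of one bottom designation per team (so the organization's "true bottom" is its 142 weakest engineers) and purely random team assignment, and it uses the hypergeometric distribution to ask how many of those 142 land on any given team of 7. The names `TRUE_BOTTOM` and `p_true_bottom_on_team` are hypothetical.

```python
from math import comb

ENGINEERS, TEAM_SIZE = 994, 7      # figures from the summary
TEAMS = ENGINEERS // TEAM_SIZE     # 142 teams
TRUE_BOTTOM = TEAMS                # ASSUMPTION: quota of one bottom slot per team


def p_true_bottom_on_team(k):
    """Hypergeometric probability that a randomly drawn team of 7 contains
    exactly k of the organization's TRUE_BOTTOM weakest engineers."""
    return (
        comb(TRUE_BOTTOM, k)
        * comb(ENGINEERS - TRUE_BOTTOM, TEAM_SIZE - k)
        / comb(ENGINEERS, TEAM_SIZE)
    )


p_none = p_true_bottom_on_team(0)   # team must still name a bottom performer
p_one = p_true_bottom_on_team(1)    # the only case where the quota fits exactly
p_many = 1 - p_none - p_one         # team can flag only one of several

print(f"no true bottom performer on team: {p_none:.2f}")
print(f"exactly one:                      {p_one:.2f}")
print(f"two or more:                      {p_many:.2f}")
```

Under these assumptions, roughly a third of randomly formed teams contain nobody from the organization-wide bottom group yet must still designate one, and the quota matches the team's actual composition in well under half of all teams.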

When managerial capability varies (which it always does), strong managers attract and retain stronger performers. Teams are therefore not random samples; they are clustered by managerial quality. A high performer on a strong team can rank lower within that team than a mediocre performer ranks within a weak one, so the within-team ranking misorders the two when they are compared across the organization.

This is not a cultural problem or an execution problem. It is a mathematical certainty that cannot be designed away within the forced-ranking paradigm.
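The effect of clustering can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the paper's model: ability is drawn from a standard normal distribution, managers observe it perfectly, each team flags exactly one bottom performer, and a flag counts as an error when that person is not among the 142 weakest engineers overall. Clustering is modeled by adding a shared per-team offset with standard deviation `between_sd`, a hypothetical parameter chosen for illustration.

```python
import random

N_TEAMS, TEAM_SIZE = 142, 7  # sizes from the summary: 994 engineers


def forced_rank_error(teams):
    """Share of per-team bottom picks who are NOT in the global bottom 1/7."""
    everyone = sorted(a for team in teams for a in team)
    cutoff = everyone[len(teams) - 1]  # ability of the 142nd-weakest engineer
    wrong = sum(1 for team in teams if min(team) > cutoff)
    return wrong / len(teams)


def random_teams(rng):
    # Independent draws, so every team is a random sample of the population.
    return [[rng.gauss(0, 1) for _ in range(TEAM_SIZE)] for _ in range(N_TEAMS)]


def clustered_teams(rng, between_sd=1.0):
    # ASSUMPTION: each team shares an offset standing in for manager quality.
    teams = []
    for _ in range(N_TEAMS):
        team_offset = rng.gauss(0, between_sd)
        teams.append([team_offset + rng.gauss(0, 1) for _ in range(TEAM_SIZE)])
    return teams


def mean_error(make_teams, trials=200, seed=0):
    # Average over many simulated organizations to smooth out sampling noise.
    rng = random.Random(seed)
    return sum(forced_rank_error(make_teams(rng)) for _ in range(trials)) / trials


random_err = mean_error(random_teams)
clustered_err = mean_error(clustered_teams)
print(f"error rate, random assignment: {random_err:.2f}")
print(f"error rate, clustered teams:   {clustered_err:.2f}")
```

Even with perfect within-team measurement, the clustered configuration produces a noticeably higher error rate than the random one. The exact values depend on the assumed spread and are not expected to match the paper's reported 32% and 53%, which also cover promotion decisions.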

The Adverse Selection Spiral

The simulation shows that forced ranking systems, applied over multiple cycles, produce adverse selection spirals. High performers on strong teams exit first because they recognize the injustice. Their departure reduces the quality of the remaining pool, which increases the error rate in subsequent cycles, which in turn drives more high-performer exits.

The system does not equilibrate. It degrades monotonically until the quality distribution is flat enough that the ranking system's errors are no longer visible as such.

Conclusion

Forced ranking is not a tool that can be executed well or poorly. It is a structurally flawed system that produces predictable, measurable harm even under perfect conditions. The problem is mathematical, not cultural. Organizations that continue to use it after understanding this are choosing efficiency of process over accuracy of outcome.

Key References

McEntire, J. (2025). Tournament-Based Performance Evaluation and Systematic Misallocation. arXiv preprint. arxiv.org/abs/2512.06583

Lazear, E. P., & Rosen, S. (1981). Rank-Order Tournaments as Optimum Labor Contracts. Journal of Political Economy, 89(5), 841-864.

Tversky, A., & Kahneman, D. (1974). Judgment under Uncertainty: Heuristics and Biases. Science, 185(4157), 1124-1131.
