ELO vs Glicko vs TrueSkill: Which Rating System Should You Use?


ELO is the classic, but it isn’t the only skill-rating system out there. Glicko and TrueSkill were both built to fix specific weaknesses in ELO. This guide explains what each adds, in plain English, and helps you decide which fits your group.

ELO — the original

ELO assigns every player a single number and updates it after each game based on the result versus expectation. Its great strength is simplicity: you can compute it by hand, explain it in a sentence, and everyone understands it. For a full breakdown see The ELO Rating System Explained.

Its main weaknesses:

It has no concept of confidence: a brand-new player’s rating is treated as just as reliable as a 500-game veteran’s.
It assumes you keep playing; a rating doesn’t “decay” or widen if you vanish for a year.
Out of the box it only handles 1v1.
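
To make the update rule concrete, here is a minimal sketch of the standard ELO formulas in Python. The function names and the K-factor of 32 are illustrative assumptions, not taken from any particular site; many leagues pick a different K.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Score player A is expected to earn against player B, between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return (new_rating_a, new_rating_b) after one game.

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k is the K-factor (assumed 32 here): the most a rating can move in one game.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating in the pool is conserved.
    return rating_a + delta, rating_b - delta
```

Calling update_elo(1500, 1500, 1.0) returns (1516.0, 1484.0): two equal players each expect 0.5, so the winner gains half of K. Note that the same K applies whether a player has 1 game or 500 behind them, which is the first weakness listed above.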

Glicko — ELO plus a confidence interval

Glicko (and its successor Glicko-2), developed by Mark Glickman, keeps ELO’s single rating but adds a rating deviation (RD) — essentially a measure of how sure the system is about your number. A new player has a high RD (we don’t know much yet), so their rating moves fast. A regular has a low RD, so their rating is treated as reliable and moves slowly.

Why that matters:

It handles new players gracefully without you hand-tuning a separate K-factor for them.
If you stop playing, your RD grows over time, so the system openly admits it’s less certain about you.
Glicko-2 adds “volatility” to detect players whose performance is erratic.
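
As a rough illustration of how RD drives the size of an update, here is a sketch of the original Glicko (not Glicko-2) formulas as published by Glickman. The function names are my own, and the constant c = 34.6 is only the value from Glickman's worked example (RD growing from 50 back to 350 over 100 idle rating periods); a real league would tune it.

```python
import math

Q = math.log(10) / 400.0
MAX_RD = 350.0  # RD assigned to an unrated player in Glickman's description


def g(rd: float) -> float:
    """Discount factor: the less certain the opponent's rating, the less a result counts."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def expected(r: float, r_opp: float, rd_opp: float) -> float:
    """Expected score against one opponent, softened by the opponent's RD."""
    return 1.0 / (1.0 + 10 ** (-g(rd_opp) * (r - r_opp) / 400.0))


def inflate_rd(rd: float, periods_idle: int, c: float = 34.6) -> float:
    """Widen RD for time away from play, capped at the unrated value."""
    return min(math.sqrt(rd**2 + c**2 * periods_idle), MAX_RD)


def glicko_update(r: float, rd: float, results):
    """Update one player after a rating period.

    results is a list of (opp_rating, opp_rd, score) with score 1, 0.5 or 0.
    Returns (new_rating, new_rd).
    """
    if not results:
        return r, rd
    d2_inv = 0.0   # 1 / d^2 in Glickman's notation
    surprise = 0.0  # sum of g * (actual - expected)
    for r_opp, rd_opp, score in results:
        g_j = g(rd_opp)
        e_j = expected(r, r_opp, rd_opp)
        d2_inv += Q**2 * g_j**2 * e_j * (1.0 - e_j)
        surprise += g_j * (score - e_j)
    denom = 1.0 / rd**2 + d2_inv
    # A large rd makes denom small, so the rating moves further.
    return r + (Q / denom) * surprise, math.sqrt(1.0 / denom)
```

Feeding in Glickman's own example, a 1500-rated player with RD 200 who beats a 1400 (RD 30) and loses to a 1550 (RD 100) and a 1700 (RD 300), gives a new rating of about 1464 and an RD of about 151.4. Running the same single win for a player with RD 350 and one with RD 50 shows the newcomer's rating moving several times further.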

Glicko is the natural upgrade when ELO feels too crude but you still want a system people can mostly understand. Major online chess sites use it: Lichess runs Glicko-2, and Chess.com uses Glicko.

TrueSkill — built for teams and matchmaking

TrueSkill, created by Microsoft Research for Xbox Live, models each player’s skill as a bell curve with a mean (μ) and an uncertainty (σ). Its headline feature is native support for multiplayer and team games: it can take the result of an 8-player free-for-all, or a 4-vs-4 team match, and update everyone in one pass. It sees only the team result, not what each individual did during the match, so it splits the adjustment among teammates according to how uncertain each player’s rating is.

Best-in-class for matchmaking: pairing players for balanced games is its whole reason for existing.
Converges fast: in head-to-head play it often reaches a confident rating in about a dozen games, though large team formats need more.
The cost is complexity: the math is genuinely hard, and the numbers are far less intuitive to explain to casual players.
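
To give a feel for the math without the full machinery, here is a simplified sketch of the TrueSkill update for the easiest case: two teams, a clear winner, no draws. It uses the commonly cited defaults (μ = 25, σ = 25/3, β = 25/6) and deliberately leaves out the draw margin, the small "dynamics" term added before each game, and the message-passing needed for three or more teams. The function names are my own, so treat this as an approximation of the idea rather than Microsoft's implementation.

```python
import math

MU0 = 25.0          # default starting mean
SIGMA0 = MU0 / 3.0  # default starting uncertainty
BETA = MU0 / 6.0    # assumed per-game performance noise


def pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def cdf(x: float) -> float:
    """Standard normal cumulative distribution (erfc stays accurate in the tail)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def _spread(players, beta):
    """Combined standard deviation of the difference in team performances."""
    return math.sqrt(len(players) * beta**2 + sum(s**2 for _, s in players))


def win_probability(team_a, team_b, beta: float = BETA) -> float:
    """Chance that team_a beats team_b; each team is a list of (mu, sigma)."""
    c = _spread(list(team_a) + list(team_b), beta)
    return cdf((sum(m for m, _ in team_a) - sum(m for m, _ in team_b)) / c)


def update_two_teams(winners, losers, beta: float = BETA):
    """Return (new_winners, new_losers) after a decisive two-team game."""
    c = _spread(list(winners) + list(losers), beta)
    t = (sum(m for m, _ in winners) - sum(m for m, _ in losers)) / c
    # v grows when the result was an upset; w controls how much sigma shrinks.
    # (cdf(t) underflows to 0 only for absurdly lopsided matchups, t below about -38.)
    v = pdf(t) / cdf(t)
    w = v * (v + t)

    def adjust(team, sign):
        # Each player's share of the update scales with their own sigma squared.
        return [
            (mu + sign * sigma**2 / c * v,
             sigma * math.sqrt(1.0 - sigma**2 / c**2 * w))
            for mu, sigma in team
        ]

    return adjust(winners, +1), adjust(losers, -1)
```

For two brand-new players, update_two_teams([(MU0, SIGMA0)], [(MU0, SIGMA0)]) moves the winner to roughly μ = 29.2, σ = 7.19 and the loser to roughly μ = 20.8. On a team, a player with a large σ moves much more than a well-established teammate after the same result, which is why the numbers can feel opaque to casual players.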

Quick comparison

Simplicity: ELO > Glicko > TrueSkill.
Handles new players well: TrueSkill ≈ Glicko > ELO.
Handles teams / free-for-alls: TrueSkill > the others.
Easy to explain to your friends: ELO wins by a mile.

So which should you use?

A friend group or office league: use ELO. It’s transparent, everyone gets it, and at your volume the extra sophistication of the others buys you almost nothing.
A large online community with players coming and going: Glicko’s uncertainty handling earns its keep.
Team-based or many-player games where matchmaking matters: TrueSkill is purpose-built for it.

For the vast majority of leagues — anything you’d run with friends, coworkers, or a club — plain ELO is the right answer. It’s the system TrackMyElo uses, and the one you’ll be able to explain when someone disputes their ranking.