Bayeselo
Revision as of 11:39, 9 August 2011
WarLight uses an Elo rating system, similar to the one used in chess.
How Ranks and Ratings are calculated
More specifically, WarLight uses Bayesian Elo Rating (Bayeselo), which has several advantages over other Elo rating systems:
- Beating the same opponent multiple times gives you more rating than beating them once. In most Elo systems, only a win or loss is considered for each opponent.
- The system can give an advantage to the player who picks first.
- Bayeselo behaves correctly when opponents' ratings are far apart.
- Ratings are calculated from the players' final ratings, not just from what their ratings were when the game took place.
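To get a feel for how a first-pick advantage enters the picture, here is a rough sketch of the logistic win-probability model as the Bayeselo documentation describes it. It is not the rating fit itself, only the formula that turns two ratings into expected outcomes. The function name and the parameter values are illustrative assumptions; the advantage WarLight actually uses is not stated in this article.

```python
def win_probability(first_pick_elo, second_pick_elo, advantage=0.0, draw_elo=0.0):
    """Return (p_first_wins, p_draw, p_second_wins) for one game.

    Assumption: this mirrors the logistic model from the Bayeselo
    documentation, with the first picker in the role of "White".
    advantage and draw_elo are in Elo points; the defaults here are
    placeholders, not WarLight's real settings.
    """
    def f(delta):
        # Standard Elo logistic curve on a 400-point scale.
        return 1.0 / (1.0 + 10.0 ** (delta / 400.0))

    p_first = f(second_pick_elo - first_pick_elo - advantage + draw_elo)
    p_second = f(first_pick_elo - second_pick_elo + advantage + draw_elo)
    p_draw = 1.0 - p_first - p_second
    return p_first, p_draw, p_second


# Two equally rated players: a positive advantage tilts the odds
# toward whoever picks first.
even = win_probability(1875, 1875)
tilted = win_probability(1875, 1875, advantage=30.0)
print(even[0], tilted[0])
```

With draw_elo left at zero the draw probability works out to zero, which matches the 0% draws column in the ladder output.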
The exact algorithm used by this tool is documented on the Bayeselo page and is not repeated here. The source code is also available for the truly nerdy.
Run your own Ladder Simulations
You can run your own ladder simulations to help understand how the ratings are calculated. This can be used to answer questions like:
- How would the ratings be different if I had won versus X instead of lost?
- How would the ratings change if I win or lose this in-progress game?
- How would the ratings be different if first-pick advantage was higher or lower?
Here's the process to reproduce the current rankings:
- Download Bayeselo.exe from remi.coulom.free.fr (http://remi.coulom.free.fr/Bayesian-Elo/).
- Download BayeseloLog.txt from the links below. This file is automatically generated each time the ladder updates and contains all of the commands needed to reproduce the current ladder rankings.
- Run Bayeselo.exe. You’ll be left at a prompt that says:
ResultSet>
- Copy the entire contents of BayeseloLog.txt to your clipboard, and paste it into the Bayeselo application. (Note: To paste into a console app on Windows, you can right-click on the title bar, then select Edit and then Paste.)
BayeseloLog can be obtained separately for each ladder. Here are the links:
1v1 ladder: http://warlight.net/Data/BayeseloLog0.txt
2v2 ladder: http://warlight.net/Data/BayeseloLog1.txt
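If you would rather script this than copy and paste, something like the following might work. It is only a sketch: the function names are made up for illustration, the ladder numbers 0 and 1 are read off the two links above, and it assumes Bayeselo.exe is in the current folder, reads its commands from standard input, and exits when the input ends. None of that is confirmed by this article.

```python
import subprocess
import urllib.request

# URL pattern taken from the two ladder links (0 = 1v1, 1 = 2v2).
LOG_URL_TEMPLATE = "http://warlight.net/Data/BayeseloLog{ladder}.txt"


def ladder_log_url(ladder):
    """Build the log URL for a ladder number (hypothetical helper)."""
    return LOG_URL_TEMPLATE.format(ladder=ladder)


def download_log(ladder, dest="BayeseloLog.txt"):
    """Save the ladder's command log to dest and return the path."""
    urllib.request.urlretrieve(ladder_log_url(ladder), dest)
    return dest


def run_bayeselo(log_path, exe="Bayeselo.exe"):
    """Feed the log to Bayeselo on stdin and return whatever it prints.

    Assumption: the program accepts piped input the same way it
    accepts pasted text, and quits once the input runs out.
    """
    with open(log_path, encoding="utf-8", errors="replace") as fh:
        commands = fh.read()
    done = subprocess.run([exe], input=commands, capture_output=True,
                          text=True, check=True)
    return done.stdout


# Typical use (needs network access and Bayeselo.exe, so not run here):
#   print(run_bayeselo(download_log(0)))
```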
Pasting the log into Bayeselo will produce rankings like the following:
Rank Name           Elo    +    - games score oppo. draws
   1 TheImpaller   2087  196  150    20   85%  1792    0%
   2 Heyheuhei     2027  100   88    58   79%  1776    0%
   3 zaeban        1991  123  109    37   73%  1779    0%
   4 chas          1990  267  229     6   67%  1884    0%
   5 bostonfred    1966  145  122    32   78%  1713    0%
   6 Oliebol       1947  126  114    33   70%  1768    0%
   7 Troll         1945  176  158    14   64%  1842    0%
   8 gilgamesz     1910  270  233     6   67%  1796    0%
   9 TypeSomething 1880  175  152    18   72%  1690    0%
  10 Fizzer        1875  138  129    23   61%  1788    0%
  11 MathWolf      1872  119  107    38   71%  1686    0%
...
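If you save the output of two runs as text, a small script can compare them for you. The sketch below assumes every row has the nine whitespace-separated columns shown above and that player names contain no spaces; both function names are invented for this example.

```python
def parse_ratings(text):
    """Turn Bayeselo's ratings table into a list of dicts.

    Lines that do not look like data rows (the header, "...",
    prompts) are skipped. Assumes names have no spaces in them.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 9 or not parts[0].isdigit():
            continue
        rows.append({
            "rank": int(parts[0]),
            "name": parts[1],
            "elo": int(parts[2]),
            "plus": int(parts[3]),
            "minus": int(parts[4]),
            "games": int(parts[5]),
            "score": parts[6],
            "oppo": int(parts[7]),
            "draws": parts[8],
        })
    return rows


def elo_changes(before, after):
    """Map each player found in both runs to (new Elo - old Elo)."""
    old = {row["name"]: row["elo"] for row in before}
    return {row["name"]: row["elo"] - old[row["name"]]
            for row in after if row["name"] in old}
```

Call parse_ratings on the saved text of each run, then pass both lists to elo_changes to see who moved and by how much.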
Making changes
Now that you can reproduce the existing ladder rankings, you can try making changes and seeing how they affect the results. Here’s the process:
- First, ensure you are on the “ResultSet>” prompt. If you’re in “ResultSet-EloRating>”, enter a command of just “x” to go back up one.
- Enter the command “reset” to clear the previous results. This ensures you’re starting from a clean slate.
- Modify BayeseloLog.txt depending on what you want to try (see below).
- Copy/paste the modified BayeseloLog.txt back into Bayeselo.exe to see the results. Compare to your previous run to see how they changed.
In BayeseloLog.txt, you’ll find two large sections: first, a bunch of addplayer commands, then a bunch of addresult commands.
Players
Each addplayer line corresponds to a player who is participating (or has participated at one time) in the ladder. The players are also numbered, starting at zero and going up.
addplayer Fizzer ;0
addplayer Knoebber ;1
addplayer FBGDragons ;2
addplayer CuChulainn ;3
...
addplayer Perrin3088 ;7
...
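Since the results refer to players by number, it helps to have a lookup from number to name. Here is a minimal sketch, assuming each line looks exactly like the ones above and that the part after the semicolon is just a note giving the player's number. The function name is made up.

```python
def parse_players(log_text):
    """Return player names in file order, so list index == player number.

    Assumption: the ";N" suffix only repeats the player's position
    in the list, so it is dropped rather than parsed.
    """
    players = []
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("addplayer "):
            name = line[len("addplayer "):].split(";")[0].strip()
            players.append(name)
    return players


log = "addplayer Fizzer ;0\naddplayer Knoebber ;1\naddplayer FBGDragons ;2\n"
players = parse_players(log)
print(players[1])                   # name of player number 1
print(players.index("FBGDragons"))  # number of a named player
```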
Results
After the players come the addresult commands. Each addresult corresponds to one finished ladder game. The numbers tell Bayeselo which two players fought each other, who got first pick, and who won. Let’s examine this in detail.
addresult 0 7 2
addresult 1 4 2
addresult 2 7 2
addresult 3 7 0
addresult 4 18 0
...
By changing these, you can simulate new wins/losses or change existing games to see how they would affect the results.
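For example, to see what would happen if one game had gone the other way, you could flip its result with a script rather than by hand. This sketch relies on Bayeselo's usual convention for the third number (2 means the first-listed player won, 0 means the second-listed player won, 1 is a draw); treat that as an assumption and check it against the Bayeselo help before relying on it. The function name is invented for this example.

```python
def flip_result(log_text, game_index):
    """Return a copy of the log with one game's winner swapped.

    game_index counts addresult lines from zero, in file order.
    Assumption: the last number is 2 when the first-listed player
    won and 0 when the second-listed player won; a draw (1) is
    left unchanged.
    """
    out = []
    seen = -1
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "addresult":
            seen += 1
            if seen == game_index:
                flipped = {"0": "2", "2": "0"}.get(parts[3], parts[3])
                line = " ".join(parts[:3] + [flipped])
        out.append(line)
    if seen < game_index:
        raise IndexError("log has only %d addresult lines" % (seen + 1))
    return "\n".join(out) + "\n"


log = "addresult 0 7 2\naddresult 1 4 2\naddresult 2 7 2\n"
print(flip_result(log, 1))
```

Save the returned text as your modified BayeseloLog.txt, then follow the steps under Making changes to see the new ratings.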