Skewed rating results: 2011-02-23 23:50:04 |
Duke
Level 5
|
I realize that they're preliminary until you've reached 10 games and they're still somewhat skewed by insufficient volume of data, but I find it really odd that WL Fanatic has a higher rating than me. I have 6 wins and 1 loss (to WL Fanatic). WL Fanatic has 1 win (against me) and 1 loss, also against me.
It seems impossible to me that his rating result could be higher. Our wins against each other should cancel out and my other 5 wins should make me worth at least a bit more.
Yet he's on the board with an 1885 and I have an 1881.
I'm pointing this out because it looks like an error, not because I particularly care about a preliminary rating -- it'll all eventually work itself out.
Would someone be so kind as to explain how this result is possible?
|
Skewed rating results: 2011-02-23 23:59:12 |
crafty35a
Level 3
|
Duke, with the way this particular rating system works (not that I agree with this), it basically thinks Fanatic is equal to you in strength, because the only data on Fanatic is a 50% result against you (1 win, 1 loss). I believe the 4 rating point difference is because when he beat you, he had the second pick in distribution, but when you beat him, you had the first pick. So his win gets slightly more credit.
|
Skewed rating results: 2011-02-24 00:05:00 |
Fizzer
Level 64
Warzone Creator
|
The algorithm does very poorly when it has only a small amount of data. Your rating is pretty accurate, but WL Fanatic's is inaccurate.
Wins don't "cancel out" - instead, all it knows about WL Fanatic is that he's beaten and lost to you, so it would put him at the same rating as you. As yours changes, his would too. You would both be tied at 1891 rating points. The only reason he's slightly ahead is because you got first pick in both games which gave him a very small advantage which put him just above you.
This makes me think that maybe ratings should be hidden during the provisional period.
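To make the "same rating as you" point concrete, here is a rough sketch. It is not the site's actual Bayesian Elo code: it assumes a standard logistic Elo curve on a 400-point scale, treats the opponent's rating as fixed, and ignores first-pick advantage. The function names are made up for illustration. With one win and one loss against a single opponent, the likelihood p * (1 - p) peaks at p = 0.5, which is exactly the opponent's rating.

```python
import math

def expected_score(rating, opp_rating):
    """Logistic Elo-style win probability (assumed 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))

def log_likelihood(rating, wins, losses, opp_rating):
    """Log-likelihood of a candidate rating given results against one fixed opponent."""
    p = expected_score(rating, opp_rating)
    return wins * math.log(p) + losses * math.log(1.0 - p)

def best_rating(wins, losses, opp_rating, lo=1000, hi=2800):
    """Grid-search the integer rating that best explains the results."""
    return max(range(lo, hi + 1),
               key=lambda r: log_likelihood(r, wins, losses, opp_rating))

# A player whose only games are 1 win and 1 loss against an 1891-rated opponent.
print(best_rating(1, 1, 1891))
```

Because first-pick adjustments are left out, the sketch lands on a tie rather than a gap of a few points.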
|
Skewed rating results: 2011-02-24 00:06:38 |
Fizzer
Level 64
Warzone Creator
|
I verified this by running the current results without first pick advantage, and here are the results:
1 Fizzer 1949
2 Elucidar 1932
3 TheImpaller 1918
4 bostonfred 1910
5 Duke 1891
6 WLFanatic 1891
|
Skewed rating results: 2011-02-24 00:25:56 |
Duke
Level 5
|
That's not how it's supposed to work (or how USCF rankings work). If I played my first game and somehow beat a master (2000+) I wouldn't suddenly have his ranking. I would see a significant gain, but my rating would jump from 1500 to maybe 1700. You get a huge amount of flux when you first start since you haven't settled into a baseline, but the formula doesn't assign 100% of the points of each player you beat -- that just won't work.
It should be a formula that discounts the winning player's gain (or the loser's loss) by an amount calculated from the expected result, to mitigate these crazy results. So if I'm ranked 1600 and I lose to someone ranked 1500, they would go up to 1550. If I win the game I'd go up a smaller amount (since that was the expected outcome), like 20 points, so I'd go to 1620. But the result is also supposed to weight total games played by each player. So 20 points would represent the maximum I could go up with a win and would be reduced by a percentage based on how many games I've played.
Even someone rated 2200 would go up something for beating an 1600, but it would be very very small. If they had 200 games it would go up slightly more than if they had 1000 games -- because the more games you've played the more "established" your ranking.
But the starting formulas appear wrong. Please post the actual formulas and some examples -- because something is clearly not right.
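The kind of update described above can be sketched as follows. This is a generic Elo-style update, not the exact USCF formula and not what the ladder currently uses. The K schedule (`k_max`, `scale`) is a hypothetical choice picked only to show the shape: gains shrink when the win was expected and when the player has more games on record.

```python
def expected_score(rating, opp_rating):
    """Logistic Elo-style win probability (assumed 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))

def k_factor(games_played, k_max=40.0, scale=50.0):
    """Hypothetical schedule: the maximum swing shrinks as games accumulate."""
    return k_max / (1.0 + games_played / scale)

def updated_rating(rating, opp_rating, score, games_played):
    """score is 1 for a win, 0 for a loss."""
    surprise = score - expected_score(rating, opp_rating)
    return rating + k_factor(games_played) * surprise

# An expected win moves the favorite less than an upset moves the underdog.
favorite_gain = updated_rating(1600, 1500, 1, 0) - 1600
underdog_gain = updated_rating(1500, 1600, 1, 0) - 1500
print(round(favorite_gain, 1), round(underdog_gain, 1))

# A 2200 beating a 1600 gains very little, and less still with more games played.
gain_200 = updated_rating(2200, 1600, 1, 200) - 2200
gain_1000 = updated_rating(2200, 1600, 1, 1000) - 2200
print(gain_200 > gain_1000 > 0)
```

The exact point values depend entirely on the assumed constants, so they will not match the 1550 and 1620 figures in the example above.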
|
Skewed rating results: 2011-02-24 00:27:21 |
Duke
Level 5
|
I wrote that response before your posts, Fizz. I still think something's wrong with giving the winner 1005 of the loser's points, but I appreciate the prompt, thoughtful response (as always).
|
Skewed rating results: 2011-02-24 00:35:07 |
crafty35a
Level 3
|
Right, the Bayesian Elo formula currently used here is much, much different than the more typical system used by the USCF (which I agree would be more appropriate for WL).
|
Skewed rating results: 2011-02-24 00:42:35 |
Fizzer
Level 64
Warzone Creator
|
Part of the problem here is that I presented it wrong. You don't start at 1500 and move up/down. I'm going to change it in the next release so players with 0 games have a rating of 0 - having 1500 is just plain wrong.
He didn't "gain 1005" - the system guessed his rating to be about 1885 based on what it knows.
|
Skewed rating results: 2011-02-24 00:48:41 |
Duke
Level 5
|
http://math.bu.edu/people/mg/ratings/approx/approx.html
does a good job of explaining the old and new USCF system. Thanks for that explanation Fizz. It's not a ranking per se, just an estimated ranking. Ranking would be a different figure. The program would hold that guess until it got enough data to confirm it, but the actual ratings and the estimated ranking could be very different values.
|
Skewed rating results: 2011-02-24 01:01:39 |
Duke
Level 5
|
That "1005" was a typo, it's supposed to read "100% of the loser's points".
|
Skewed rating results: 2011-02-24 01:54:00 |
WL Fanatic
Level 8
|
Because you've won so many, my loss doesn't take much away and my win gets a hell of a lot. In a day or two the three games I'm doing now should finish up and things will probably balance out. Hell, if I lost to you 5 times and you had stats of 6 : 1 I'd probably gain points :P
|
Skewed rating results: 2011-02-24 03:47:08 |
The Impaller
Level 9
|
This Bayesian system is purely designed to rank players in order from best to worst based on the information it sees. How many points you have isn't meant to be like an actual ranking or rating that can be viewed in isolation. It's 100% relative to the other players around you.
It's not something where one person can be like "Ooh, a 2000 rating, that's really high! You must have had to beat 30 people to get that high" but rather the system could place you at that rating after only one game, if your one game was a win against someone who has won against everyone else. In that situation, the system may deem you to be the best player out of everyone and so it awards you as many points as are necessary so that you are on top of the point table. After 2 more games that you lose, the system would then determine that you're not the best player and you would drop down to like 5th place or something.
|
Skewed rating results: 2011-02-24 15:44:05 |
Duke
Level 5
|
I just don't like that the system should deem you the best player out of everyone if you only have one win against the best player (currently you), while the best player might have 30 wins and only that one loss.
I'm not saying I don't understand the math, I'm proposing tweaking the system to take into account volume of wins, not just your win against the highest ranked player you happen to have played.
Would beating everyone on the ladder be worth less than beating you right now? Should it be?
|
Skewed rating results: 2011-02-24 16:04:30 |
crafty35a
Level 3
|
Duke, I don't think there's any way around that with the current Bayesian system in place. That's why Fizzer implemented the provisional period (no ranking before 10 games completed). That said, I know there are some of us who would like to see the Bayesian system replaced with a more typical Elo rating system, which wouldn't have this weirdness. Here are a few reasons why I believe this should be done:
- Bayesian system rewards players improperly if their past opponents improve in the future
- Bayesian system is not fair to rapidly improving players, because all games have the same weight, regardless of when they occurred (it works this way because it was designed to rate chess playing AI programs, not people -- AI programs don't improve, so it rightly assumes that all the games have the same meaning)
- Bayesian system is confusing in that your rating can change (sometimes drastically) even if you haven't played any games recently
I'm sure there are more things that I am forgetting, but those are my major concerns.
|
Skewed rating results: 2011-02-24 17:38:03 |
The Impaller
Level 9
|
Crafty is right that the provisional period would take care of that anomaly. The system may put you at number 1 for beating someone who was currently number one, but during the provisional period that won't be shown. Then you have 9 other games to potentially lose and lose rating from, so by the time your rating is official, you won't be ahead of 29-1 guy unless you're 10-0 and even then it would come down to quality of opponents. What jumping you to the top immediately after beating 29-1 guy as your first game DOES do that is positive is it will start pairing you against the other top players, so it more quickly moves you to a place where you might be more adequately matched, than getting 40 points from a normal ELO system would do.
I don't know, I'm not sold on this algorithm yet, but I'm not opposed to it either without seeing how it settles out first.
|
Skewed rating results: 2011-02-24 17:51:34 |
Perrin3088
Level 49
|
the thing is, after the initial players get settled into their positions, new players will join and play people closer to their level, so there won't be any massive jumps anyway.. they will beat someone at 1540 when they are 1500, and thus gain a boost..
the anomalies we are getting currently are due to the fact that new players are playing against people who are actually much higher/lower than they should be. initially we were all 1500, and that meant the initial matchups were flawed: best player vs worst player. so the system only knows that the worst is worse than the best, which could make him look better than an average player that beat another average player. until further data is added, it assumes that the worst player is quite good, especially when the best player wins several games before the worst player gets another game completed
|
Skewed rating results: 2011-02-24 21:56:00 |
Math Wolf
Level 64
|
As a statistician, I do see where the deeper underlying problem lies.
Bayesian statistics basically use a formula to calculate the posterior distribution (the scores in this case) based on the prior distribution and the data.
The data are known of course so they do not pose a problem.
However, the prior distribution needs to be arbitrarily chosen. There are infinite ways to choose this prior distribution, but most of these don't make any sense.
For example: why would you give Duke a priori a higher score than WL Fanatic if neither of them played a game? (bad example, but you get my point).
So, in most cases, an uninformative distribution is chosen so that there is minimal risk that the posterior distribution is biased.
However, in many cases, this gives problems if there is only limited data.
When there is one datapoint available, this point will be the result for your posterior. When there are two, the (weighted) mean of those two will be used, and so on.
This is basically what is happening here.
How can this be fixed? Add information and make the prior informative.
For example, when a player joins the ladder, assume this player has 2 wins and 2 losses against a fictional 1500 opponent (prior distribution).
When this player has then played his first match, this result will only count for 1/5 of his rating and thus the result will be biased towards 1500.
After several games (20+ let's say), these initial games will hardly count and the ranking will be consistent.
Of course, since the absolute result doesn't count, but rather the relative strength, this small bias doesn't pose a problem.
Therefore, I'd advise to use the method described above to 'fix' the problem. Whether to add 1 win and 1 loss or 2 wins and 2 losses can be discussed, in statistics when working with a binomial, often 2 wins, 2 losses (2 successes and 2 failures for a prior probability of 0.5) are used.
Hope this helps.
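A minimal sketch of the virtual-games idea described above, under simplifying assumptions: a logistic Elo curve on a 400-point scale, opponents' ratings treated as fixed, and a maximum-likelihood fit for one player only. The function names are hypothetical and this is not the ladder's actual code. The fit finds the rating at which the expected total score equals the actual total score, counting the 2 fictional wins and 2 fictional losses against a 1500 opponent.

```python
def expected_score(rating, opp_rating):
    """Logistic Elo-style win probability (assumed 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))

def rating_with_prior(games, prior_wins=2, prior_losses=2, prior_opp=1500.0):
    """games: list of (opponent_rating, score), score 1 for a win and 0 for a loss."""
    # Add the fictional games against the 1500 opponent (the informative prior).
    all_games = (list(games)
                 + [(prior_opp, 1.0)] * prior_wins
                 + [(prior_opp, 0.0)] * prior_losses)
    total_score = sum(score for _, score in all_games)
    # Expected total score rises with rating, so bisection finds the match point.
    lo, hi = 0.0, 4000.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        expected = sum(expected_score(mid, opp) for opp, _ in all_games)
        if expected < total_score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# A newcomer whose only real game is a win over an 1891-rated player.
one_win = rating_with_prior([(1891.0, 1.0)])
print(round(one_win))
```

Without the fictional games, a single win has no finite best-fit rating in this model; with them, the newcomer lands between 1500 and the beaten opponent's rating, and the pull toward 1500 fades as real games accumulate.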
|
Skewed rating results: 2011-02-24 22:25:35 |
crafty35a
Level 3
|
Interesting stuff, MathWolf. I agree that what you propose would probably make things work a bit better with this particular system, at least for a player's first few games. Are you familiar with Elo rating systems at all? Because what you propose essentially will make this Bayesian system behave much more like a typical Elo system for a player's first few games (where everyone starts at 1500, and the ratings are not nearly as volatile as they are with Bayesian Elo).
|
Skewed rating results: 2011-02-24 22:37:41 |
Ruthless
Level 57
|
Perrin -- I don't think that example works
"
b=best player
w=worst player
c-d=average players
Assume the following are true:
b>w
c=d
b>c
b>d
then
w>c
w>d
"
You would need more information to determine that W>c and d. Just because B>W, doesn't necessarily mean W is better than c or d.
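A quick brute-force check of that point, as a sketch: it assigns each player a made-up integer strength from 0 to 3, keeps only the assignments that satisfy the stated assumptions, and records how w compares to c in each. The numeric strengths are purely illustrative.

```python
from itertools import product

def consistent(b, w, c, d):
    """The assumptions from the example: b>w, c=d, b>c, b>d."""
    return b > w and c == d and b > c and b > d

outcomes = set()
for b, w, c, d in product(range(4), repeat=4):
    if consistent(b, w, c, d):
        if w > c:
            outcomes.add("w>c")
        elif w < c:
            outcomes.add("w<c")
        else:
            outcomes.add("w=c")

# Every relation between w and c shows up among the consistent assignments.
print(sorted(outcomes))
```

All three relations are compatible with the assumptions, so b>w alone says nothing about where w stands relative to c or d.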
|