Just wanted to see what the data looked like. Now that I have it, might as well share it. :)
Basically, while things like Clan League judge a clan by its best players (on some level), I wanted to see what clans look like in terms of an average encounter between another player and them (so not in terms of their average player, because this is also going to be weighted by player activity). That's kind of hard to do in general, but it's easy to do within the context of the 1v1 ladder.
There's a bit of a tradeoff between data recency and accuracy, so I just scraped the past 1000 1v1 ladder matches (825 of which were interclan matches) and used them to construct TrueSkill ratings for each clan (by treating each clan as a single player), as well as for unclanned players in general. Then I used the TrueSkill ratings to create a 2-D matrix of clan matchups and win probabilities: the value in the matrix corresponds to the probability that the clan on the left beats the clan on the top (i.e., row clan vs. column clan).
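If you want to play with this yourself, here's a rough Python sketch of the pipeline using the trueskill package. The match list and clan names below are just placeholders standing in for the scraped games, and the win-probability formula is the standard TrueSkill one (normal CDF of the mean difference scaled by the combined uncertainty), not something specific to my spreadsheet.

```python
import itertools
import math

import trueskill  # pip install trueskill

# No draws on the 1v1 ladder, so drop the draw margin.
env = trueskill.TrueSkill(draw_probability=0.0)

# Placeholder data: one (winner, loser) pair per game, with each player mapped
# to their clan (or to an "Unclanned" pool). The real input is the scraped games.
matches = [
    ("Optimum", "ILLUMINATI"),
    ("Optimum", "Unclanned"),
    ("ILLUMINATI", "Unclanned"),
]

# Treat each clan as a single player and update its rating game by game.
ratings = {}
for winner, loser in matches:
    rw = ratings.setdefault(winner, env.create_rating())
    rl = ratings.setdefault(loser, env.create_rating())
    ratings[winner], ratings[loser] = env.rate_1vs1(rw, rl)

def win_probability(a, b):
    """P(a beats b): Phi of the mu difference over the combined sigma/beta scale."""
    denom = math.sqrt(2 * env.beta ** 2 + a.sigma ** 2 + b.sigma ** 2)
    return env.cdf((a.mu - b.mu) / denom)

# Row clan vs. column clan, same orientation as the spreadsheet.
clans = sorted(ratings)
matrix = {(row, col): win_probability(ratings[row], ratings[col])
          for row, col in itertools.product(clans, clans)}
```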
You can check this stuff out here:
https://docs.google.com/spreadsheets/d/1LwDMeuTgEWqqImFTuZ2KUl5AIJlOEJCf-o2rz602ie4/edit?usp=sharing
There are clearly some flaws in the data, since it's going to underrate (or overrate) clans that don't engage very much in the 1v1 ladder, and clans whose 1v1 ladder activity isn't representative of their member base also benefit or suffer similarly (case in point: Apex being underrated).
As far as the color-coding goes: a green row/red column means a clan that the analysis thinks is pretty competitive; a red row/green column means a clan that the analysis thinks isn't very competitive.
(Before you ask why I used TrueSkill instead of just using the matches directly: most of the matchups this assigns probabilities to haven't actually happened in the past 1000 1v1 ladder games, so I had to extrapolate somehow, and TrueSkill seemed like the best option because of the way it models skill and uncertainty.) Of course, this also brings me to the biggest flaw in this method: it doesn't actually measure the average encounter, since it doesn't take the 1v1 ladder's pairing algorithm into account. While it says the average encounter between Optimum and ILLUMINATI will result in an Optimum win around 89 times out of 100, that's not going to bring me any comfort in a matchup against an ILLUMINATI player, because they're probably going to have a rating around mine. But if I took that into account, all the values would be hovering around .500, since the pairing algorithm favors even matchups, and where's the fun in that? Instead, by ignoring the pairing algorithm, I'm using the 1v1 ladder as a proxy for player skill weighted by activity (since activity on the 1v1 ladder is a lot easier to measure than activity in general).
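To make that number concrete, here's the same win-probability calculation with hand-picked mu/sigma values chosen only so the output lands near the quoted 89%; the actual spreadsheet values come from the scraped matches, not from this snippet.

```python
import math
import trueskill

env = trueskill.TrueSkill(draw_probability=0.0)

# Hypothetical ratings (NOT the real spreadsheet values) that happen to
# reproduce the ~89% Optimum-over-ILLUMINATI figure.
optimum = env.create_rating(mu=30.7, sigma=1.5)
illuminati = env.create_rating(mu=23.0, sigma=1.5)

denom = math.sqrt(2 * env.beta ** 2 + optimum.sigma ** 2 + illuminati.sigma ** 2)
print(round(env.cdf((optimum.mu - illuminati.mu) / denom), 2))  # ~0.89
```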