It's a network graph, like a social network graph (the code snippet for generating it can be found at
https://gist.github.com/knyte/0936802177ae53d132dffbd540e9abf9#file-mdl_clustering-py). Each connection between two nodes represents how strongly player performance on those two templates is correlated. I think the small sample size is the main issue.
I also can't really explain why it does that weird stuff with regard to mechanics. I don't have a lot of confidence in these exact clusters and, due to the clustering algorithm used, some of these color-coded groups shouldn't really be together (as you can see with Unicorn Island being closer to most of the yellows than to the greens). I can't really speculate as to why performance on Georgia Army Cap seems to have a comparatively strong correlation with performance on Battle Islands V: it could be the player pool, some weird similarities in the winning strategy, or just noise, since even at this point in the MDL you can't really get accurate individual template-specific ratings for most players. Since there's so much noise from the small sample, a specific conclusion (e.g., "Georgia Army Cap performance correlates closely with Greece LD performance") is much less likely to be correct than a general one (e.g., "The templates aren't all similar"). I don't really know how to solve this problem, even if I redid this with 3x the data, so for now I think we'll have to settle for broad-strokes interpretations of the data.
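For anyone who wants to poke at the idea, here's a rough sketch of how a graph like this can be built: correlate player ratings between each pair of templates and use each correlation as an edge weight. This is not the code from the gist. The function names (pearson, correlation_edges), the min_shared cutoff, and the ratings themselves are all made up for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # no variation on one side: treat as uncorrelated
    return cov / (sx * sy)


def correlation_edges(ratings, min_shared=3):
    """Turn {template: {player: rating}} into weighted edges.

    Returns (template_a, template_b, correlation, shared_players) tuples,
    strongest correlation first. Pairs with fewer than min_shared common
    players are skipped (hypothetical cutoff, not from the original post).
    """
    edges = []
    for a, b in combinations(sorted(ratings), 2):
        shared = sorted(set(ratings[a]) & set(ratings[b]))
        if len(shared) < min_shared:
            continue
        r = pearson([ratings[a][p] for p in shared],
                    [ratings[b][p] for p in shared])
        edges.append((a, b, r, len(shared)))
    return sorted(edges, key=lambda e: e[2], reverse=True)


# Made-up ratings for three hypothetical templates and four players.
ratings = {
    "Template A": {"p1": 1.0, "p2": 2.0, "p3": 3.0, "p4": 4.0},
    "Template B": {"p1": 2.0, "p2": 4.0, "p3": 6.0, "p4": 8.0},
    "Template C": {"p1": 4.0, "p2": 3.0, "p3": 2.0, "p4": 1.0},
}

edges = correlation_edges(ratings)
for a, b, r, n in edges:
    print(f"{a} <-> {b}: r={r:+.2f} over {n} shared players")
```

Keeping the number of shared players next to each edge is deliberate: with only a handful of players behind a correlation, the edge weight is exactly the kind of noisy number I wouldn't read much into.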
You can see some of the internals of this (a visual representation of player ratings on each template, "Ratings (Visual)", and of the correlation of performance between templates, "Correlation (Visual)") at
https://bit.ly/mdl-analysis.
I'll just put the images here for convenience, though. Here are the performance correlations between templates (note that even the really green tiles are relatively weak correlations by normal standards; that's partly due to noise and partly because no one template really predicts any other close to perfectly):
And player ratings on each template:
The ratings suffer from the same inaccuracy/noise issue as the correlations, so we can't *really* be sure that (for example) Timinator is as good as he appears to be on Battle Islands V, but we can be much more sure that Timinator was, as of the time of this analysis, pretty good at the template and better than the average player. We don't have enough data to be confident in narrow conclusions, so we just have to settle for being more confident in broad-strokes ones.
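One way to see how shaky a single number is with this few players is a bootstrap: resample the players with replacement many times and look at how much the correlation moves around. The sketch below assumes a plain percentile bootstrap, which is not something the original analysis did. bootstrap_ci, its parameters, and the eight pairs of ratings are hypothetical.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation, clamped to [-1, 1] to absorb float rounding."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # degenerate resample: treat as uncorrelated
    return max(-1.0, min(1.0, cov / (sx * sy)))


def bootstrap_ci(xs, ys, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the correlation of paired data."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    n = len(xs)
    stats = []
    for _ in range(n_boot):
        # Resample players (pairs), not individual ratings.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(pearson([xs[i] for i in idx], [ys[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Made-up ratings for eight players on two hypothetical templates.
xs = [1500, 1620, 1480, 1710, 1550, 1590, 1660, 1430]
ys = [1520, 1580, 1510, 1600, 1640, 1470, 1690, 1500]

r_hat = pearson(xs, ys)
lo, hi = bootstrap_ci(xs, ys)
print(f"observed r = {r_hat:.2f}, 95% bootstrap interval = [{lo:.2f}, {hi:.2f}]")
```

If the interval comes out wide, that's the "narrow conclusions are unreliable" point in numeric form: the sign or rough size of a relationship may hold up while the exact value doesn't.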
Edited 2/26/2018 21:10:56