Breaking down The Command Zone Stats

Morganator 2.0:
Previously on Deckstats...


--- Quote from: MustaKotka on April 12, 2019, 10:43:22 am ---Looking at this from a statistical point of view: being short on spells is better than being short on lands (if you need to choose) because usually there are more spells than lands in your deck. Idk if you guys watch The Command Zone, but they actually did the math on this so it's not just my gut feeling: winning players tend to have the most land in play.

--- End quote ---

--- Quote from: Morganator 2.0 on April 13, 2019, 04:28:52 am ---I would like to see the stats that The Command Zone has.

--- End quote ---

--- Quote from: Red_Wyrm on April 14, 2019, 06:46:53 am ---They go over sample size etc. at the beginning, but they don't give all of the data they collected. By this I mean that if they ran an ANOVA, a t-test, a chi-square (absolutely no reason to run that one with this type of data), or something similar, which I assume they did since they hired a statistician to analyze the data, they didn't give us the correlation coefficient or a p-value to say whether the results were statistically significant or not. They just present the final data. For example: they state that having white in your deck decreases your chance to win by 1% (assuming you start with a 25% chance to win in a 4-player game), playing red increases it by 3%, and blue, green, and black were around 8%, I think.

So here is the link to part 1:  https://www.youtube.com/watch?v=Iwdb_kPCwNU

This is the second video:  https://www.youtube.com/watch?v=ttGjuNXWxpY

Oh and they cover the price of the decks and their win% too.

--- End quote ---

--- Quote from: Morganator 2.0 on April 14, 2019, 08:02:47 pm ---Do you know what my favorite thing about Deckstats is? The stats.

Do you know what my most hated thing about The Command Zone is? Their stats.

First off, their data set is super incomplete. There are some instances where the number of lands was just left blank. This doesn't mean that there were no lands (I checked a couple of the videos); it just wasn't recorded. There were two games where mass land destruction was involved (I included those games). I also excluded games where there was no winner, because in all cases we are comparing who won.

But this is still an amazing data set to work with, and I applaud everyone who put this together. It's a big data set, so aside from cEDH games, the sample is a good representation of the population.

Question 1: Does having more lands in a game cause you to win?
Null Hypothesis: There is no relation between the number of lands you play and whether you won (-0.7 < correlation coefficient < 0.7)
Alternate Hypothesis 1: Decks with more lands in play are more likely to win the game (correlation coefficient > 0.7)
Alternate Hypothesis 2: Decks with fewer lands in play are more likely to win the game (correlation coefficient < -0.7)

There is an expression among statisticians: if you torture the data enough, you can make it talk. Which is why you want to avoid torturing data, lest you show that green jellybeans cause acne.

Believe me, I tortured this data for a long time. I could not get it to say that the players with more lands in play were more likely to win.

First I just ran the correlation of "Mana-producing lands at end of the game" versus "Player won?". So this is comparing, across all games (n=304), whether the player who won had the most lands. Correlation coefficient = 0.204, so there is no correlation between the number of lands in play and who won.

But then I did some things I wasn't supposed to (I tortured the data). I started by averaging the number of lands within each game, to make a proxy for game length. So if a game had players with 15, 19, 14, and 16 lands, the average was 16, so the game was about 16 turns long. This is unlikely to be the actual game length (keep in mind I'm not supposed to be doing this), but it's a proxy. I then ran the correlation again, this time controlling for game length, to see if players ahead of the mana curve did better. Correlation coefficient = 0.275, so again, no correlation. Finally (really pushing it this time) I did a within-game correlation: within each game, did the winning player have the most lands? Correlation coefficient = 0.218, once more no correlation!
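
If anyone wants to poke at the spreadsheet themselves, here is a rough sketch of how these correlations could be run in Python. I didn't do it this way, the column names (game_id, lands_end, won) are placeholders for whatever the spreadsheet actually uses, and the "controlling for game length" step here is just one way of approximating it:

--- Code: ---
import pandas as pd
from scipy.stats import pointbiserialr

# One row per player per game. Placeholder columns: game_id, lands_end
# (mana-producing lands in play at the end of the game), won (1 or 0).
df = pd.read_csv("command_zone_games.csv")

# Simple version: lands at end of game vs. whether that player won.
r_all, p_all = pointbiserialr(df["won"], df["lands_end"])

# Rough "controlling for game length": compare each player's land count to
# their game's average, then correlate that difference with winning.
df["lands_vs_game_avg"] = df["lands_end"] - df.groupby("game_id")["lands_end"].transform("mean")
r_within, p_within = pointbiserialr(df["won"], df["lands_vs_game_avg"])

print(r_all, r_within)
--- End code ---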

Conclusion: I failed to reject the null hypothesis. I can say with confidence that there is no relation between the number of lands you play and whether you win.
Interpretation: I think the problem with this analysis is that it only looked at lands. As I said before, counting all mana sources would give a different result. Also, there are a lot of cEDH decks (namely Flash Hulk and Godo) that can easily win with only two lands, but with a win that early, everyone would have 2 lands.
Question 2: Does having Sol Ring or Mana Crypt within your first 3 turns cause you to win more often?
Null Hypothesis: Sol Ring and/or Mana Crypt in your first 3 turns has no effect on whether you win (-0.7 < correlation coefficient < 0.7).
Alternative 1: Players with Sol Ring and/or Mana Crypt in their first 3 turns are more likely to win (correlation coefficient > 0.7).
Alternative 2: Players with Sol Ring and/or Mana Crypt in their first 3 turns are less likely to win (correlation coefficient < -0.7).

So I should get this out of the way: this null hypothesis sucks. I just can't think of a better way to phrase it. We know that Sol Ring improves the power of your deck; that's why everyone uses it. So this is really measuring the strength of having that early-game fast mana.

Running the simple correlation of "Was there a Sol Ring/Mana Crypt?" versus "Did that player win?" gives a correlation coefficient of -0.019, so no correlation. But because Sol Ring is such a common card, I frequently saw games where 3 players all had a Sol Ring/Mana Crypt, and only one person can win. So this time around, I think it's fair to transform the data. Next I compare "Did the player that won have a Mana Crypt/Sol Ring?", and that is something for a chi-squared test to handle. A chi-squared test compares what was expected due to chance (the null hypothesis) to what actually happened. The math bit is a little complicated for me to explain, but if you're interested, this was the result.

                 Win      Loss     Total
Had a ring
   Actual         25       87       112
   Expected       27.91    84.09
No ring
   Actual        278      826      1104
   Expected      275.09   828.91
Total            303      913      1216
So instead of me describing how I got to the p-value (0.505, by the way, so not significant), we can just look at the numbers. All the numbers we expected to get are very close to what we actually got.
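
If you want to check my arithmetic, the same test is a few lines of Python. Passing correction=False skips the Yates continuity correction, which is what lines up with the 0.505 above:

--- Code: ---
from scipy.stats import chi2_contingency

# Observed counts from the table above: players who had a Sol Ring/Mana Crypt
# in their first 3 turns vs. those who didn't, split by win/loss.
observed = [[25, 87],      # had a ring:  25 wins,  87 losses
            [278, 826]]    # no ring:    278 wins, 826 losses

chi2, p, dof, expected = chi2_contingency(observed, correction=False)
print(round(p, 3))   # ~0.505, not significant
print(expected)      # ~[[27.91, 84.09], [275.09, 828.91]]
--- End code ---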

Conclusion: I failed to reject the null hypothesis. There is no relation between winning and whether you played a Sol Ring in the first 3 turns.
Interpretation: I think this question was asked the wrong way. What it actually should have been is "Do decks with Sol Ring win more often than those without?" The issue is that budget would have an effect (most of the time, people don't use Sol Ring because they just don't have one).
Question 3: Which color is the strongest?

This is the point where I really get mad at the way this data set is organised. I'll be back in a few hours to finish this post off.

--- End quote ---

Tonight on Deckstats, I present the analysis to figure out which is the best color in Commander, based on the data from The Command Zone, and I also compare which is the best color identity out of all 32.
You know, I just realized… The Command Zone paid people to do these stats, and I’m doing it for free.
Step 1: I simplified all of the data so that it made sense.
Decks containing…
White= 537
Blue= 578
Black= 594
Red= 549
Green= 584
Colorless= 7
Number of decks of each color identity
Colorless= 7
Mono-White= 41
Mono-Blue= 58
Mono-Black= 75
Mono-Red= 80
Mono-Green= 59
Azorius= 38
Dimir= 45
Rakdos= 42
Gruul= 42
Selesnya= 44
Orzhov= 34
Izzet= 29
Golgari= 46
Boros= 27
Simic= 48
Esper= 35
Grixis= 41
Jund= 25
Naya= 31
Bant= 47
Abzan= 30
Jeskai= 29
Sultai= 36
Mardu= 37
Temur= 27
Anti-Green= 14
Anti-White= 19
Anti-Blue= 18
Anti-Black= 15
Anti-Red= 24
5-color= 73
Part 1: Which color is the best?
I hate this question. If we’re defining the best by number of wins per game played, then the order goes Blue, Green, Black, Red, White. But we don’t know whether this difference is significant; these numbers are really close to each other. For that we use the chi-squared test again. I’m going to skip over most of the math bits this time (but I will show them if someone asks). The p-value of the chi-squared test was 0.162, which is not significant. Therefore, with the data presented, we cannot say which is the best color. And I think this comes down to how I had to do this: because decks can be more than one color, things get wonky. So instead, let’s look at color identity.
Part 2: Which color identity is best?
This is literally the same thing as before, just with 32 levels instead of 5. But same deal: the p-value of the chi-squared test was 0.216, so not significant.
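
For anyone curious about the mechanics, both tests are just a wins-versus-losses table with one row per color (or per color identity). Here is a sketch in Python using the deck counts above; the win counts are placeholders for illustration only, not the real numbers from the spreadsheet:

--- Code: ---
import numpy as np
from scipy.stats import chi2_contingency

# Decks containing each color (real counts from the list above).
games = {"White": 537, "Blue": 578, "Black": 594, "Red": 549, "Green": 584}

# Placeholder win counts, for illustration only; NOT the real spreadsheet values.
wins = {"White": 130, "Blue": 150, "Black": 145, "Red": 135, "Green": 145}

# One row per color: [wins, losses].
observed = np.array([[wins[c], games[c] - wins[c]] for c in games])

chi2, p, dof, expected = chi2_contingency(observed)
print(p)   # with the real counts I got p = 0.162 for colors and 0.216 for the 32 identities
--- End code ---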

In case you're interested, I've attached the graphs that show the number of wins per game. Keep in mind that because these results were non-significant, if we played another 300 games of commander, we would see different results.

I know this doesn't look like much, but this took hours (mostly just rearranging the raw data so it makes sense). I'll leave interpretation for later, because we kinda know from experience that white is the worst color.

Soren841:
Notice the order based on the data is exactly what I said. The differences may not be significant, but they reinforce the order that we all know, which I think gives it credibility. Also, most of the better-performing color combos had black, blue, green, or some combination of them. I think red and white are close to each other and the Sultai colors are close to each other, but red and black are pretty far apart, from experience.

Also..
SANS-Green= 14
SANS-White= 19
SANS-Blue= 18
SANS-Black= 15
SANS-Red= 24


:P


btw r u like some kind of statistician lmao

Morganator 2.0:

--- Quote from: Soren841 on April 16, 2019, 02:43:03 am ---btw r u like some kind of statistician lmao

--- End quote ---

Amateur statistician I guess. My field of study requires me to know about statistics, and I translate my knowledge of statistics to card games.

In other words, I'm not sexually active.

But I'm not done yet. There is still one other thing that's been bothering me.


--- Quote from: Red_Wyrm on April 14, 2019, 06:46:53 am ---For example: they state that having white in your deck decreases your chance to win by 1% (assuming you start with a 25% chance to win in a 4-player game), playing red increases it by 3%, and blue, green, and black were around 8%, I think.

--- End quote ---

I'm still not quite sure how they came to this conclusion. Once I get some proper sleep I'll lock myself in a dark room to figure this out.

WWolfe:
Love this thread and the series of posts in the other thread that led to it!

Nothing surprising here as far as the widespread perception of the best/worst colors goes. I'm curious to see what happens when you break down the CZ's numbers on increased/decreased win probability based on color inclusion.

Morganator 2.0:
So I did something new: I watched the first video.

At least, the first 20 minutes. These videos are extremely boring. Are there actually people who like this?

But anyway, I found out that the person they hired does have experience with Magic, but not with Commander. Honestly... good enough.
Here is the issue, though: the two hosts of this video did not present the stats correctly. At all. I just know that this is going to come back to bite me in the future. At some point, I am going to spend a long time explaining to someone at my local game store that they shouldn't make a Planeswalker deck just because of The Command Zone's stats.

But I digress. What really caught my attention was that the numbers were extracted from the data set with a Python script. That worries me. For something like this, you really should use proper statistical software like RStudio (my preference), SPSS, Minitab, or even Microsoft Excel (for the simple analyses). Still, this guy has a Harvard education, so he should know what he's doing (I now also understand why he got paid).

But the other thing that caught my attention: none of these graphs have error bars. While a bar graph shows where the data did land, error bars show where the data could have landed, which is important for statistical significance. As a general rule, if the error bars overlap, the difference is non-significant. Here's an example:

Continuing with the trend of people wondering what I do in my spare time, this is a graph I made showing the height of goldenrod plants, where some have been parasitized (the ones with galls). You can sort of make out that the parasitized plants are slightly shorter, but because the error bars cross, we can't conclude anything. If I did these measurements again, with the same number of plants, I could just as easily have found the parasitized plants to be slightly taller.

Now this might seem hypocritical, because the last two graphs I posted to this thread didn't have error bars. That's because it was late for me, and while Excel can put error bars on graphs, it is not good at it... like... at all. You really have to smack it around to make it work. Instead, I ran the chi-squared test for significance.

Okay, enough ranting (for now). Here are the graphs that The Command Zone showed.


So while I haven't checked, I have a hard time believing that all the people without Sol Ring landed exactly on 25%. That seems like a fudged number.





So this picture isn't actually a statistic; it's just showing how each deck was defined in terms of play style. So you are an enchantment deck if you have 20 or more enchantments.

The more I see of these graphs, the less convinced I am that these are well-tuned decks. No way does combat damage do better than combo.

Now this might be me being nitpicky, but I'm pretty sure the numbers on this graph are wrong. 18%*3 + 42% makes a total of 96%. You can't just have 4% go missing.

Without p-values or correlation coefficients, these numbers mean nothing. But I'm going to leave these graphs here. I'm going to see if I can re-create them in the future, with error bars.
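
For what it's worth, error bars only take a couple of lines in Python's matplotlib. Here is a sketch using the win counts from my Sol Ring table earlier in the thread, with plain normal-approximation 95% intervals on the win proportions:

--- Code: ---
import numpy as np
import matplotlib.pyplot as plt

# Win counts and sample sizes from the Sol Ring / Mana Crypt table above.
labels = ["No ring", "Ring by turn 3"]
wins = np.array([278, 25])
n = np.array([1104, 112])

p = wins / n                      # win rate per group
se = np.sqrt(p * (1 - p) / n)     # standard error of a proportion
ci95 = 1.96 * se                  # ~95% confidence interval half-width

plt.bar(labels, p, yerr=ci95, capsize=6)
plt.axhline(0.25, linestyle="--")   # baseline: 1-in-4 chance in a 4-player game
plt.ylabel("Win rate")
plt.show()
--- End code ---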
