I wrote an article on how to critically evaluate under-performing cards and archetypes in your cube, and the questions you should be asking yourself.
I really liked that you took the opportunity to delve into the problems of the "cut X to improve Y" philosophy. It was excruciating to watch so many cubes cutting signets and manafixing to "help aggro" instead of addressing the root of the problem.
Just got to say, you've definitely earned distinction as an MTGS hero
Quote from Stardust »
Because he's the hero MTGS deserves, and the one it needs right now. So we'll global him. Because he can take it. Because he's not just our hero. He's a silent guardian, a watchful protector. An expired rascal.
Quote from LuckNorris »
ExpiredRascals you sir are a god-like hero.
Quote from Lanxal »
ER is a masterful god who cannot be beaten in any endeavour.
I assume you're tracking win/loss in games in which the card has resolved and it can have a positive impact? Or are you simply counting the overall win/loss ratio of the deck the card is in?
For example, I can play an Elesh Norn when I'm on 1 life and get Lightning Bolted and lose the game. Is that relevant to chalk a loss in Norn's column in that case? Raw data provides interesting information, but it doesn't provide the best information we can use to evaluate how cards contribute to winning and losing games.
Likewise, maindeck percentage provides information about how versatile a card is in the cube, but it doesn't give us an idea about how good it is, or how powerful the card is when it does resolve, or how critical that card is to the success of certain archetypes.
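To make the two ways of counting concrete, here is a minimal sketch, assuming a hypothetical game log in which each game records the cards in the deck, the cards that actually resolved, and the result. The `Game` record, the function names, and the sample games are all made up for illustration; nobody in this thread has described tracking data in this form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Game:
    deck_cards: frozenset  # cards in the deck that played this game (hypothetical field)
    resolved: frozenset    # cards that actually resolved during the game (hypothetical field)
    won: bool


def _win_rate(games):
    """Fraction of games won, or None when there are no games to count."""
    games = list(games)
    if not games:
        return None
    return sum(g.won for g in games) / len(games)


def deck_win_rate(games, card):
    """Win rate over every game played by a deck that contains the card."""
    return _win_rate(g for g in games if card in g.deck_cards)


def resolved_win_rate(games, card):
    """Win rate over only the games in which the card resolved."""
    return _win_rate(g for g in games if card in g.resolved)


norn = "Elesh Norn"
# Invented sample: four games with Norn in the deck, two where she resolved.
games = [
    Game(frozenset({norn}), frozenset({norn}), True),
    Game(frozenset({norn}), frozenset({norn}), False),  # resolved at 1 life, then lost to a burn spell
    Game(frozenset({norn}), frozenset(), False),
    Game(frozenset({norn}), frozenset(), False),
]

print(deck_win_rate(games, norn))      # 0.25
print(resolved_win_rate(games, norn))  # 0.5
```

In this invented log the same four games give 25% or 50% depending on which question is asked, and even the "resolved" figure still charges Norn with the loss at 1 life.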
I feel that dissemination of the MD% data to third party cube groups, without the context of your own group who produced said data, is not useful. But keeping track of cards' maindeck % in your own group is probably of some use, for obvious reasons.
However, there is also the problem wtwlf mentioned, where a card can perform excellently yet be in a losing deck owing to luck, poor deck construction, or the weakness of the other cards in the deck. Given a sufficiently large population of data, this would average out, but it's something to consider. Also,
Quote from wtwlf »
Likewise, maindeck percentage provides information about how versatile a card is in the cube, but it doesn't give us an idea about how good it is, or how powerful the card is when it does resolve, or how critical that card is to the success of certain archetypes.
...is totally true. I would keep track of maindeck percentages, but tracking players' opinions and personal ratings on individual cards is likely to be the most constructive way of deciding what to remove. For example, Disenchant is going to be very highly maindecked, though its W/L ratio is not likely to be hugely indicative of its success/failure.
Lastly, the MD% of a card is not entirely related to its usefulness or even overall power level; even a terrible artifact removal spell would have a very high MD% in a cube where those effects were lacking.
In summary, these data are certainly of use, but quoted alone and with no context, are almost meaningless - and I would certainly not base my cube management on the MD% and W/L% alone.
That said, it is a very good article. It painfully reminded me of trying to convince you lot of how good Stoneforge Mystic was when spoiled. Safe to say I feel smug about that. Just don't bring up Steppe Lynx (which I poo-poo'd).
MD% is a better measure, I think, than how high a card is picked, but needs to be interpreted with caution. Given that you'll usually only be playing with (at best) 2/3 of the cards you draft, there will necessarily be cards that are played less. There are also narrow cards in my cube that are included maybe only in a quarter of decks where they are drafted, but are necessary for certain archetypes.
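For what it is worth, MD% can be computed from nothing more than each drafter's pool and final deck. A rough sketch, assuming MD% means "decks that maindecked the card divided by pools that drafted it" (one reading, not a definition given in the article); the function name, the placeholder card name, and the sample drafts are invented:

```python
from collections import Counter


def maindeck_pct(drafts):
    """Return {card: fraction of drafted copies that made the maindeck}.

    drafts is an iterable of (pool, deck) pairs, one per drafter, where
    pool is every card the player drafted and deck is what they maindecked.
    """
    drafted = Counter()
    maindecked = Counter()
    for pool, deck in drafts:
        pool_set = set(pool)
        drafted.update(pool_set)
        # Only count maindecked cards that really came from this pool.
        maindecked.update(pool_set & set(deck))
    return {card: maindecked[card] / drafted[card] for card in drafted}


# Invented data: a utility card played every time, and a narrow card that
# only one of the four drafters who picked it could use.
narrow = "Narrow Archetype Card"  # placeholder name, not a real card
drafts = [
    (["Disenchant", narrow], ["Disenchant", narrow]),
    (["Disenchant", narrow], ["Disenchant"]),
    (["Disenchant", narrow], ["Disenchant"]),
    (["Disenchant", narrow], ["Disenchant"]),
]

print(maindeck_pct(drafts))  # {'Disenchant': 1.0, 'Narrow Archetype Card': 0.25} (key order may vary)
```

The narrow card's 25% says nothing about whether the one deck that ran it needed it, which is the caution being argued for here.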
I still think that, alongside the issues with aggro being bad and there being too much fixing, the prevalence of 4-5 colour control decks was due to an excess of multicolour spells. Evan Erwin's cube was a particularly well-known example. If you didn't draft a load of fixing, large numbers of cards were closed off to you, whereas drafting fixing would massively increase your options. For reference, consider how, by the end of original Ravnica, guild bounce lands were considered almost automatic first picks.
"When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean - neither more nor less." -Lewis Carroll, Through the Looking Glass
I am not a huge fan of the numbers game. Usually I wouldn't expect there to be a significant number of data points for each card unless you track it for something like two years while cubing multiple times a week. Having a wildly different number of data points for each card also makes them hard to compare, which is especially relevant when talking about the viability of new additions. Cards that are perfect for a certain niche but only mediocre in others are not represented fairly either. For that matter, neither are typical unexciting utility cards like Disenchant or Force Spike, or cards that heavily depend on other cards, for example Steppe Lynx in a cube without fetchlands vs. Lynx with fetchlands.
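One way to see how little a handful of data points says is to put an interval around the observed rate. A minimal sketch using the Wilson score interval, a standard textbook choice swapped in here purely for illustration; nothing in the article or this thread uses it, and the win counts are invented:

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return centre - half, centre + half


# Same observed 60% rate, very different amounts of evidence (invented counts).
small_lo, small_hi = wilson_interval(6, 10)
large_lo, large_hi = wilson_interval(600, 1000)

print(f"10 games:   {small_lo:.2f} to {small_hi:.2f}")  # roughly 0.31 to 0.83
print(f"1000 games: {large_lo:.2f} to {large_hi:.2f}")  # roughly 0.57 to 0.63
```

With ten games, a "60% card" is compatible with anything from clearly bad to excellent, and two cards with different numbers of data points cannot be compared on the raw percentage alone.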
I think this is a typical case of trying to formalize a thing that is better judged just by subjective impressions of how cards play and how they fit into decks. Too many border cases and dependencies between the cards make finding a good way to calculate a rating or benchmark impossible.
I'd even argue that depending on these numbers will lead to making less informed decisions than just going by subjective experiences, testing and common sense. Our brain works in a much more complicated way and can resolve these dependencies much better than Excel can.
And all that is before someone starts talking about what use he could get out of other people's numbers. For example, I never felt like Nof's MD% did much for me. Our cubes and our playgroups are wildly different; those numbers simply don't apply to me.
I agree with the sentiment here, but I'm not sure I agree with the conclusion. MD% data can be useful, as long as it is taken with the same grain of salt that subjective feelings are. For example, I would be really interested to know if a card I thought was particularly narrow was being maindecked a ton, or a card I considered to be versatile and powerful was ending up in sideboards a lot.
I strongly disagree with this statement:
I'd even argue that depending on these numbers will lead to making less informed decisions than just going by subjective experiences, testing and common sense. Our brain works in a much more complicated way and can resolve these dependencies much better than Excel can.
If you follow sports at all, or know anything about advanced statistics (or psychology, for that matter), you will know that our brains are wildly terrible at objective observation and/or analysis.
Unfortunately, we don't really have the ability to collect high-sample-size, controlled data (like Wizards did with the M11 (?) win % data to disprove the narrative about Overrun). Just because the data will not be perfect doesn't mean we shouldn't collect it.
I'm a Human Resources Analyst by day, so I'm always trying to make sense of very subjective things by using objective/statistical information. I constantly need to remind people here that the data collected will never be perfect (e.g. reasons for resignations) because it is dependent on factors that are constantly changing. This is very similar to Cube in that cards' values are affected by what other cards are available, as well as by the drafting style of each player in a given group. (I have a player, a straight-up Johnny, who is more than likely to pick the fancy build-around-me cards over the 'better' cards in a pack, affecting the contents of a pack in an unconventional way.)
There will always be cards that a group finds awesome that other groups won't even bother with (e.g. Chandra, the Firebrand in our case) due to poor testing results. We have to keep in mind that the testing process itself, and therefore the results, will be different from group to group. Though of course there will be cards that are just universally awesome for everyone (my guess for M14 is Imposing Sovereign).
Are you running cards that aren't really working for your group anymore, but still running them because other Cube Managers swear by them? Most recently for me it was Scroll Rack and Blood Scrivener. Are there cards you are leaving out or not testing at all because no one else is, or not a lot of people are (even if your group is hounding you to put them in)? This is Necropotence for me. There is wisdom in considering both statistical data and subjective observations, but as Phantizle said, one should take both with a grain of salt.
As Usman points out in his article, the only cube that is likely to have enough draft data for really meaningful statistics is the Magic Online cube. I find that I have a back-to-front approach, using my gut instinct to determine whether numbers are correct for balance. The subjective decision of whether Chaos Orb is a fun card, however, I have put to the vote.
"When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean - neither more nor less." -Lewis Carroll, Through the Looking Glass
Enjoy!
I used to write cube articles on StarCityGames, now for GatheringMagic and podcast about cube (w/Antknee42.)