It's a show trick more than an actual shuffle. I included it mainly so people could understand why the imprecision of the riffle shuffle is so important to its efficacy: if you riffle shuffle perfectly, it doesn't randomize at all.
Still, in the hands of a master, different faro shuffles in sequence can stack a deck of 52 cards in the same way piling can, and while the math is slightly different for a 60-card deck, it wouldn't take a genius to figure it out. It's still something to watch out for and to avoid.
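To make the point about perfect riffles concrete, here is a minimal Python sketch. It assumes an "out" faro: the deck is cut exactly in half and interleaved one-for-one, with the original top card staying on top. The function names are made up for illustration. It counts how many perfect shuffles it takes to bring a deck back to its starting order.

```python
def out_faro(deck):
    """One perfect shuffle: cut exactly in half, interleave one-for-one,
    with the original top card staying on top (assumed 'out' faro)."""
    half = len(deck) // 2
    top, bottom = deck[:half], deck[half:]
    shuffled = []
    for a, b in zip(top, bottom):
        shuffled.extend((a, b))
    return shuffled


def faro_period(deck_size):
    """Number of perfect out-faros needed to restore the starting order.
    Assumes an even deck size."""
    start = list(range(deck_size))
    deck = out_faro(start)
    count = 1
    while deck != start:
        deck = out_faro(deck)
        count += 1
    return count


if __name__ == "__main__":
    for size in (52, 60):
        print(size, "cards:", faro_period(size), "perfect shuffles to reset")
```

Under that assumption a 52-card deck returns to its starting order after 8 perfect shuffles and a 60-card deck after 58, so a perfect shuffle is a fixed rearrangement rather than a randomization.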
I never really thought about it before, though I did know about it. I thought it would have been slightly more useful if someone had mastered it, but again, I never really put more than a few seconds thought into it.
There are way too many shuffling myths floating around.
Lycanthropy Awareness Day.
Hoping for a cure, or at least an outbreak.
Pretty interesting overall - looks relatively equal across the board. I would consider, just from looking at this, that mashing and riffling are probably about the same or at least close enough to not matter.
I'll probably just change my routine to do a quick pile to count the deck and then mash 10 and go. Seems fast and at the end of it all, you're going to be averaging 2 bad pockets per shuffle - you just have to hope your opponent doesn't cut to them.
Also did a quick test of Cockatrice and while the sample size was small, it seemed equal.
Excellent article! Apart from clearing up several myths that seem to plague these boards when it comes to shuffling, you linked to several great academic articles I had honestly never heard of - well done.
I have one question, though - in your post, you say that riffling is significantly better than mashing because with mashing that player can Faro shuffle - can't a player Faro shuffle via riffle as well (if they're good enough)? Layering 1-1 can certainly be done with a riffle if you're good enough at manipulating cards.
You didn't seem to mention the benefits of pile shuffling, for example:
- checking the number of cards in the deck to ensure none are missing
- making sure none of the cards are stuck together
~ Tim
Truth. I have saved myself from many game losses by piling out my cards to make sure I have a legal number of cards in my deck and my sleeves are not marked.
But otherwise, pile "shuffling" does not randomize your deck as confirmed by OP.
Understand, Dredge is not really a Magic: The Gathering deck. When a card is playable in it, it doesn't mean it's a tournament playable card. It means it's playable in whatever crazy fantasy world that Dredge operates in.
I have one question, though - in your post, you say that riffling is significantly better than mashing because with mashing that player can Faro shuffle - can't a player Faro shuffle via riffle as well (if they're good enough)? Layering 1-1 can certainly be done with a riffle if you're good enough at manipulating cards.
It can be done with a genuine riffle, you're right, but it's much harder. Your thumbs have to be so perfectly gripped to the edges of your cards, so perfectly elevated, that you drop only one at a time from either pile. The mechanics of the riffle lend themselves to unpredictable error (and thus randomness) more so than a mash, where you can simply square the cards up, fan out the edges and slide them into their perfect places.
It's sort of like card counting in blackjack; it can be done with a multiple-deck shoe as well as a single-deck shoe, sure, but it's significantly harder with a multiple-deck shoe, to the point that every casino uses multiple-deck shoes almost exclusively (I've seen tables that use single-deck shoes, but the payout is lower).
My assertion that riffling when possible is superior is based on a combination of the above and the fact that mashing is potentially less efficient if you're not doing it properly.
I'll concede, though, as I have many times on this thread already, that if you're mashing, you know, 15 times, and your mashes are pretty riffle-like, and you're not cheating with it, then that probably gets the job done just as well.
You didn't seem to mention the benefits of pile shuffling, for example:
- checking the number of cards in the deck to ensure none are missing
- making sure none of the cards are stuck together
~ Tim
I said it on page two, there's nothing heinous about piling up your cards to ensure they are of the proper number or quality before a round. The problem arises when people begin to think of pile shuffling as a randomization method, as an actual replacement for other methods of shuffling, as all too many people--not only casual players but even some pros and judges--already do.
I riffle all the time because I'm a riffle disciple, but if I had to pile just to quality-control my deck, I'd probably just deal them all into two piles really quickly to get that out of the way, then I'd riffle them 7 to 8 times and be done with it.
I don't see any reason, if you're just piling for quality-control, to do it more than once a game, or to make any more piles than you need to count to 60 (in the Flores article you see there's a video talking about using 7 or 11 piles [or something like that] because those are prime numbers or whatever, that's all BS).
When I mash shuffle, I split the deck in certain locations, and offset the pile of cards a bit, so that each half of each part is mashed together, and I mash them 12 times. That way I won't feel like I am mashing the cards back to their original positions if I mashed the two halves full on.
So what I'm saying is, I try not to split the deck at the mid point, and I offset the mash shuffles.
I'll probably just change my routine to do a quick pile to count the deck and then mash 10 and go. Seems fast and at the end of it all, you're going to be averaging 2 bad pockets per shuffle - you just have to hope your opponent doesn't cut to them.
When I mash shuffle, I split the deck in certain locations, and offset the pile of cards a bit, so that each half of each part is mashed together, and I mash them 12 times. That way I won't feel like I am mashing the cards back to their original positions if I mashed the two halves full on.
How tight are your mashes? That's the real question. The two key elements of riffling are:
- the deck is not split into exactly even halves
- the cards are not meshed perfectly together; there are some clumps of two and even a few clumps of three from each pile; in fact the probability of dropping a larger clump increases depending on the size of the pile because of the weight in your hand (according to the mathematical interpretation of a riffle).
If you're an ardent crusader against the card-bending forces of the riffle, then those qualities are what you should be trying to emulate in your mashes, though the second one is not the easiest to replicate.
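Those two qualities can be sketched in code. The following is a minimal Python sketch of one common mathematical model of a riffle, not a description of anyone's hands. It assumes the cut point is binomial (roughly, but rarely exactly, half) and that each card drops from a pile with probability proportional to the cards left in that pile. The function name and the seeded generator are illustrative choices.

```python
import random


def gsr_riffle(deck, rng):
    """One riffle under the assumed model: uneven binomial cut, then cards
    drop from each pile with probability proportional to its remaining size."""
    n = len(deck)
    # Uneven cut: each card independently lands in the left pile with chance 1/2.
    cut = sum(rng.random() < 0.5 for _ in range(n))
    left, right = deck[:cut], deck[cut:]
    shuffled = []
    i = j = 0
    while i < len(left) or j < len(right):
        left_remaining = len(left) - i
        right_remaining = len(right) - j
        # A bigger pile is more likely to drop the next card, which is what
        # produces occasional clumps of two or three from the same pile.
        if rng.random() * (left_remaining + right_remaining) < left_remaining:
            shuffled.append(left[i])
            i += 1
        else:
            shuffled.append(right[j])
            j += 1
    return shuffled


if __name__ == "__main__":
    rng = random.Random(2024)  # seeded only so a run can be repeated
    deck = list(range(60))
    for _ in range(7):
        deck = gsr_riffle(deck, rng)
    print(deck)
```

A mash that imitates this would cut slightly off-center and let small, irregular packets fall together rather than interleaving exactly one-for-one.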
As for offsetting the mash, signofzeta, you're really only mashing half your deck each of those times, which actually detracts from your randomness more than anything. You'd probably be better off just mashing normally those twelve times. You can even cut the deck a few times in between if it puts your mind at ease regarding the placement of the cards.
I was paying close attention the entire time - my riffles look a lot like mashes, with 1-2 cards usually alternating (very occasionally, 3).
It was really late so I'll talk more about my data:
I was interested in a few things: mashing vs. riffling, my perception of my own routine, and the effects of riffling or mashing in the long term (not resetting back to 2 piles of spells and lands).
The sample size is small at 10 per treatment, but there is variance between the treatments. You can see from the last column that riffling 7x without resetting produced the best result, which I would expect. However, 7x mashing without resetting didn't improve results over mashing and resetting, so clearly my sample size is too small, because it should have a similar effect there. My routine didn't perform any better; however, at 2.3 it is near the mean, so I am at least happy about that.
I am concerned about the column 7x riffle. I can assure you guys that I didn't try to influence my results - I riffled very carefully and got this result. 2.6 seems high to me - clearly above the others. Significant? I'm not sure (you can't do a significance test with this sample size), so I think more testing should be done.
As it stands, I think I was right in my statement of "no thank you to riffling 7 times and going". Just because 7 riffles will produce mathematically true random, doesn't mean I am rule bound to oblige. I will happily riffle that thing 3 more times if it takes me from 2.6 clumps to 2.2 or fewer. I do not believe this is cheating, but if somebody would want to correct me that'd be great.
I've wanted to do a true experiment like this for awhile and it was really good to actually see the results. I might replicate it again to increase my sample size sometime this week or if others would like to contribute, that would be great.
As a complete aside, I've had an idea floating around for a while and you can tell me what you think -
If you were to take a reset deck (divided by spells/lands) and riffle or mash three times, you wouldn't expect optimal distribution. Even at 7 riffles, I'm having my doubts - my gut is telling me I should go to 10. But if you then riffle or mash 3 more times, are you still increasing or is there a period where clumping begins to happen again?
I've imagined that the expected clumps as a graph against number of riffles might appear wavelike like a sin or cos function. Doing an experiment on this would be quite time consuming, however, as you'd need an extremely large sample size to confirm any kind of clumping behavior.
And ultimately you're probably talking about differences that are extremely small - like an altitude difference of .2 expected clumps.
It was really late so I'll talk more about my data:
I'm eager to see what you've come up with. I'd be glad to find the statistical significance of your data if I can wrap my head around it.
Just to be clear, what exactly is represented in that table? What precisely do the columns represent (ie what's the difference between "x7riffle" and "x7 no reset riffle") and what is being counted in each of the cells (the rows don't have labels)? I assume the bottom row is a total of some kind?
Each column is a treatment - I reset the deck back to lands/spells at the start of each trial. I did not reset the deck in the "no reset" treatments - I wanted to see what long term shuffling would do (theoretically it should have been improving the results but I wanted to confirm).
So for example in column 1 I started with a reset deck, mashed 7 times, counted the number of clumps (I got 3 clumps of 5 spells or lands), then reset the deck to lands/spells, and repeated. I did this 10 times and then totaled.
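The counting step could be automated. Here is a minimal Python sketch, assuming a "clump" means a run of 5 or more consecutive cards of the same type (lands or spells); that threshold is my reading of the description above, and the function name is hypothetical.

```python
from itertools import groupby


def count_clumps(deck, min_length=5):
    """Count runs of at least min_length consecutive cards of the same type.

    deck is any sequence of labels, e.g. "L" for land and "S" for spell.
    The 5-card threshold is an assumption about what counts as a clump.
    """
    clumps = 0
    for _, run in groupby(deck):
        if sum(1 for _ in run) >= min_length:
            clumps += 1
    return clumps


if __name__ == "__main__":
    reset_deck = "L" * 20 + "S" * 40  # lands on top, spells below (assumed 20/40 split)
    print(count_clumps(reset_deck))
```

Typing the deck order in as a string of L's and S's after each trial and calling count_clumps on it would replace the hand count.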
You can imagine this took a little bit of time.
You can't do a significance test with a sample size of 10. I'd have to double or triple the number of times I conducted the entire thing to get results that were valid. I'm tempted to do it - particularly on the 7x riffle column, to see if that 26 (2.6 clumps per trial) is legitimately high or just an aberration.
First and foremost, you're approaching the issue from the wrong direction. Like I said in my disclaimer and like Flores hints in his article on cheating, how random the deck "needs to be" or "should be" is up to you and your particular goals. My advice on shuffling, which is itself just a condensation into readable, jargon-less form of the work done by dozens of statisticians over the past several decades, applies if you're looking to reach the mathematical definition of randomness, true randomness, the uniform distribution.
My problem with your analysis is that I don't see that as a useful goal. You don't have to be close to true randomness for your deck to be functionally randomized. I think you significantly overestimate how random a deck actually needs to be.
(Additionally and somewhat tangentially, keep in mind that generally when shuffling your deck is already somewhat randomized.)
That said, I'll still provide a conceptual way of solving your problem just so I can put my money where my mouth is--though unfortunately I won't be able to provide an exact answer.
In order to "solve" (or, in this case, take an educated guess at, since when we're trying to extrapolate general truths from specific samples, an educated guess is the best we can do), I'd count the number of rising sequences in each of those five decks and then plug them into this equation:
Indeed, this is precisely what you would need to do to complete your argument.
My basic point with the examples was that if you can't tell the difference between a deck shuffled one way and a deck shuffled another way, there is no functional difference between the shuffling methods. And yes, manually looking is not a good way to tell the difference.
Indeed, this is precisely what you would need to do to complete your argument.
Well I'll keep digging around for a program out there that would count up rising sequences in a set of data. Having to do it by hand is extremely tedious and error-prone.
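For what it's worth, counting rising sequences is a few lines of code. Here is a minimal Python sketch, assuming each card is labelled with its position in the original order (0 for the old top card, 1 for the next, and so on); the function name is made up.

```python
def rising_sequences(order):
    """Count rising sequences in a shuffled deck.

    order lists the cards from top to bottom, each labelled by its position
    in the original deck (a permutation of 0..n-1, by assumption).
    A new rising sequence starts whenever card v+1 sits above card v.
    """
    position = {card: idx for idx, card in enumerate(order)}
    breaks = sum(
        1 for v in range(len(order) - 1) if position[v + 1] < position[v]
    )
    return 1 + breaks if order else 0


if __name__ == "__main__":
    # Two packets, 0-1-2 and 3-4-5, riffled together once.
    print(rising_sequences([0, 3, 1, 4, 2, 5]))
```

An untouched deck has one rising sequence and a fully reversed deck has as many as it has cards.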
My basic point with the examples was that if you can't tell the difference between a deck shuffled one way and a deck shuffled another way, there is no functional difference between the shuffling methods. And yes, manually looking is not a good way to tell the difference.
I understand what you're driving at. You're saying that if "mostly-random" gets you the same results during gameplay as "perfectly random," then mostly random is acceptable, because at the end of the day all we care about is the gameplay (that is, "functional") results.
And you're right: in the short run, you probably won't notice the difference. In the short run you might not even be able to calculate the difference based on the sample, just like, if I were to answer your question, in which you only gave me five hands, I could only give you a best-guess based on the "most likely" answer, and even if I'm 80% sure I'm right, I'm still wrong 20% of the time.
Statistics, though, is about the long run (summarized in the law of large numbers). As you deal more and more hands, like hundreds of hands, from a perfectly-randomized deck and a mostly-randomized deck (as your sample size increases), the difference in gameplay does in fact become statistically significant--moreover even noticeable to the untrained observer.
That's why the house edge in a casino on games like blackjack is often around only 0.20% to 0.50% (half a percent or less), but the casino still makes money, because as millions of people play millions of hands, the expected event--casino profits, players lose--becomes the actual event, or, rather, the actual events trend toward the expected event.
In the same vein, if you've been shuffling only one or two times your whole life and you played one hand on the MTGO shuffler, which is perfectly random, would you notice the difference? Probably not. What about fifty hands? Maybe not. Hundred hands? Thousand hands? Suddenly you find the MTGO shuffler is doing it differently than what you're used to.
That's what I mean when I say (with complete respect) that you're looking at it from the wrong perspective. Statistics as a field is about the long run--we even define "probability" as the "long-run relative frequency" of an event.
I understand what you're driving at. You're saying that if "mostly-random" gets you the same results during gameplay as "perfectly random," then mostly random is acceptable, because at the end of the day all we care about is the gameplay (that is, "functional") results.
And you're right: in the short run, you probably won't notice the difference. In the short run you might not even be able to calculate the difference based on the sample, just like, if I were to answer your question, in which you only gave me five hands, I could only give you a best-guess based on the "most likely" answer, and even if I'm 80% sure I'm right, I'm still wrong 20% of the time.
Statistics, though, is about the long run (summarized in the law of large numbers). As you deal more and more hands, like hundreds of hands, from a perfectly-randomized deck and a mostly-randomized deck (as your sample size increases), the difference in gameplay does in fact become statistically significant--moreover even noticeable to the untrained observer.
The key question is how many games will I need to play before I see this effect.
For example, if using a different shuffling technique makes a difference 1 in every 10 billion games, I'll likely never play a game where it does.
Even if the number is much lower, I may be okay with it making a difference occasionally if in return I gain the benefit of spending much less time shuffling.
While interesting, this is not really completely applicable to MTG. Sleeves react differently to different shuffles, and card quality means that card condition matters too.
I have destroyed more sleeves mash shuffling than I can even imagine. And I have seen decks worth thousands of dollars get noticeably bent by riffle shuffling.
I pile shuffle and then casino riffle shuffle (you riffle the edges together and then carefully merge the piles) to go the easiest I can on my deck. I care far less about the probability of randomization than damaging a $500 deck.
Also, not to get into the ethics of the whole thing, but "completely random" isn't the most desirable state a deck can be in. Not even close. It doesn't really matter how randomized a deck is, sometimes it will be less random in your favor and sometimes it will be less random not in your favor. Randomization does not guarantee you any particular outcome, so who really cares how random your deck is? "Sufficiently random to avoid cheating" is the only threshold you need to worry about, and you certainly don't need to riffle shuffle 8 times to do that.
For example, if using a different shuffling technique makes a difference 1 in every 10 billion games, I'll likely never play a game where it does.
I admire your continued skepticism!
Generally even the smallest probabilities tend to make a significant (even observable) difference once the sample size reaches several hundreds or a few thousands. I'm sure you personally draw 7 after shuffling at least several hundred times in your career, to say nothing of all the other MTG players in the world, and together it quickly adds up to hundreds of thousands of samples. The number is certainly not in the billions.
Observe, as a simple example, the results of rolling a die, for which the expected value is easy to calculate: 3.5 (add up 1 + 2 + ... + 6 and then divide the sum by 6, since in a fair die each face has an equal chance of coming up). As shown in this graph below, by around 500, 600, 700 trials the average of the die rolls trends toward the expected value of 3.5:
If we used a weighted, unfair die, such that the probability of rolling a 5 or a 6 was higher...let's say you had a 0.25 chance each to roll a 5 or a 6 and only a 0.125 chance each to roll a 1, 2, 3, or 4 (so the probabilities still add up to 1), then the expected value would turn out to be 4.0 instead of 3.5...and if you rolled this "unfair die" several hundred times, you'd clearly notice the die rolls approaching a different average (the line graph would settle near 4.0, not 3.5).
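The die example is easy to try out. Here is a minimal Python sketch, assuming a loaded die whose weights add up to one (0.25 each for 5 and 6, 0.125 each for 1 through 4, which gives an expected value of 4.0). The names and the fixed seed are illustrative.

```python
import random

FACES = [1, 2, 3, 4, 5, 6]
FAIR = [1, 1, 1, 1, 1, 1]      # every face equally likely
LOADED = [1, 1, 1, 1, 2, 2]    # assumed weights: 5 and 6 twice as likely


def expected_value(weights):
    """Weighted average of the faces."""
    return sum(f * w for f, w in zip(FACES, weights)) / sum(weights)


def sample_mean(weights, rolls, seed=0):
    """Average of `rolls` simulated die rolls (seeded so a run can be repeated)."""
    rng = random.Random(seed)
    outcomes = rng.choices(FACES, weights=weights, k=rolls)
    return sum(outcomes) / rolls


if __name__ == "__main__":
    for rolls in (10, 100, 1000, 10000):
        print(rolls, sample_mean(FAIR, rolls), sample_mean(LOADED, rolls))
    print("expected:", expected_value(FAIR), expected_value(LOADED))
```

The printed averages wander at 10 rolls and settle toward the two different expected values as the roll count grows.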
Even if the number is much lower, I may be okay with it making a difference occasionally if in return I gain the benefit of spending much less time shuffling.
A good riffle/mash can be performed in just a few seconds. The time argument is moot.
Quote from Mike Flores »
I am going to be generous and say that it takes you 30 seconds to pile shuffle your deck one time. In actuality it probably takes you 45 seconds if you are very good with your hands, but I am going to say that it only takes you 30 seconds. One pile shuffle doesn’t even distribute your mana clumps.
On balance, given 30 seconds you can riffle shuffle your deck 15 times.
I have destroyed more sleeves mash shuffling than I can even imagine. And I have seen decks worth thousands of dollars get noticeably bent by riffle shuffling.
This is a consideration most certainly, but there are ways to deal with this. Double sleeve, edge riffle, what have you. Better to find a way to make riffling/mashing work than to give up on them altogether and fall back to pile shuffling.
I pile shuffle and then casino riffle shuffle (you riffle the edges together and then carefully merge the piles) to go the easiest I can on my deck. I care far less about the probability of randomization than damaging a $500 deck.
If you're edge riffling 7 times, then you're golden. It's still riffling.
Also, not to get into the ethics of the whole thing, but "completely random" isn't the most desirable state a deck can be in. Not even close. It doesn't really matter how randomized a deck is, sometimes it will be less random in your favor and sometimes it will be less random not in your favor. Randomization does not guarantee you any particular outcome, so who really cares how random your deck is? "Sufficiently random to avoid cheating" is the only threshold you need to worry about, and you certainly don't need to riffle shuffle 8 times to do that.
Well, you're getting into the ethics of the whole thing. As I said in my disclaimer, my thread is to instruct people on what "true randomness" really means and how to achieve it. It is not the "optimal" way to prepare a deck if your only goal is winning; mana-weaving, "double nickeling" as Flores describes, etc. will always give you a competitive edge. If you don't want to achieve true randomness, or if you think you can get away without true randomness and not get caught with an "insufficient deck shuffling penalty" (see below), that's your decision, not mine.
Quote from Lee McLain, Ohio, L3 »
Insufficiently randomizing a deck is something that sends up a warning flag to other players and judges alike. In the Penalty Guidelines the penalty for insufficient deck shuffling is a Warning. That is the penalty for an unintentional infraction. If a judge determines that the infraction was intentional, it will be upgraded to Cheating, which will result in disqualification and an investigation by the DCI.
A mash shuffle should virtually never damage the cards if they are all facing the same direction. And if you're still worried about damage - double sleeve.
Piling repeatedly is unacceptable for randomization purposes compared to mashing or riffling.
If I'm playing against people and they mash/riffle at least 7 times I generally just cut once at an arbitrary spot in the middle region of the deck. If I see you pile repeatedly, I'm going to mash shuffle your deck a few times and then cut it.
Generally even the smallest probabilities tend to make a significant (even observable) difference once the sample size reaches several hundreds or a few thousands. I'm sure you personally draw 7 after shuffling at least several hundred times in your career, to say nothing of all the other MTG players in the world, and together it quickly adds up to hundreds of thousands of samples. The number is certainly not in the billions.
Observe, as a simple example, the results of rolling a die, for which the expected value is easy to calculate: 3.5 (add up 1 + 2 + ... + 6 and then divide the sum by 6, since in a fair die each face has an equal chance of coming up). As shown in this graph below, by around 500, 600, 700 trials the average of the die rolls trends toward the expected value of 3.5:
If we used a weighted, unfair die, such that the probability of rolling a 5 or a 6 was higher...let's say you had a 0.25 chance each to roll a 5 or a 6 and only a 0.125 chance each to roll a 1, 2, 3, or 4 (so the probabilities still add up to 1), then the expected value would turn out to be 4.0 instead of 3.5...and if you rolled this "unfair die" several hundred times, you'd clearly notice the die rolls approaching a different average (the line graph would settle near 4.0, not 3.5).
This isn't a valid argument. The die rolling example you give has no relevance to what we're talking about. There's nothing magical about a sample size of a few hundred. Depending on the math of the two shuffling methods we're considering, the needed sample size to expect to notice a difference could be much, much bigger.
A good riffle/mash can be performed in just a few seconds. The time argument is moot.
A few seconds multiplied by a large number of shuffles is quite a bit of time.
This isn't a valid argument. The die rolling example you give has no relevance to what we're talking about. There's nothing magical about a sample size of a few hundred. Depending on the math of the two shuffling methods we're considering, the needed sample size to expect to notice a difference could be much, much bigger.
The die-rolling example goes to show you that it doesn't take an exceedingly large sample size (like in the billions) to begin to observe the effects of the Law of Large Numbers. It usually happens relatively soon, in the hundreds or thousands, whether you're rolling a die or flipping a coin.
But here's a more concrete example: there is no mathematics to determine when you the observer would notice a difference. There is, however, mathematics to determine when the difference becomes statistically significant--that is, when an observed difference is probably a real difference and not a fluke. This is summarized in two related concepts called "statistical power" and "statistical sensitivity" (these are complicated to calculate; computers do it for us). Power increases with sample size; statisticians often endeavor to calculate how many "flukes" you'd have to see before the pattern becomes mathematically justified.
When you reach 0.80 power, it's safe to say that whatever pattern you're seeing is actually a pattern, a noticeable difference.
What these two charts are doing is calculating the sample size it would take to reach a power value of 0.80, the number widely accepted by the research world as the ideal power value.
In order to note a difference, we need to have a starting point. It doesn't really matter what that starting point is. Let's say for this example that you normally win or lose about half your games, so our null proportion is 0.5. This is what we're trying to disprove.
Let's say, because we're skeptics, that after starting your new, more random shuffling method, your win/loss increased (or decreased, doesn't really matter) by barely anything, just 1%. Thus our "test proportion" is 0.51.
The calculator says that we would reach 0.80 power after about 20,000 games. That's quite a lot of games, you say! More than you'll ever play in your life! True, but think of all Magic players everywhere.
Still, let's move further; what if, after starting your new, more random shuffling method, your win/loss increased or decreased by 2% instead of 1%? Still barely anything. Thus our "test proportion" is 0.52. As you can see, 0.80 power comes much sooner, around only 5000 games.
Whether you want to look at it visually on a graph or conceptually by the numbers, the fact remains that we're not talking about sample sizes in the hundred thousands, millions, or billions, here, even for exceedingly small differences. Moreover, 1% and 2% represent a large number of games in the long run as you play more and more.
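The sample sizes above can be approximated with a short calculation. Here is a minimal Python sketch, assuming a one-sample test of a proportion, a two-sided significance level of 0.05, and the usual normal approximation. The calculator behind the charts may use a slightly different formula, and the function name is made up.

```python
from math import ceil, sqrt
from statistics import NormalDist


def games_needed(p_null, p_test, alpha=0.05, power=0.80):
    """Approximate games needed to detect a shift from p_null to p_test.

    Assumes a two-sided one-sample proportion test with the normal
    approximation; alpha=0.05 is an assumption, not taken from the charts.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(p_null * (1 - p_null))
        + z_power * sqrt(p_test * (1 - p_test))
    )
    return ceil((numerator / abs(p_test - p_null)) ** 2)


if __name__ == "__main__":
    print(games_needed(0.50, 0.51))
    print(games_needed(0.50, 0.52))
```

With these assumptions, a shift from 0.50 to 0.51 needs a little under 20,000 games and a shift to 0.52 a little under 5,000, in line with the figures quoted above.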
Quote from fnord »
A few seconds multiplied by a large number of shuffles is quite a bit of time.
The number is 7. Seven riffles. Eight if you want to be really safe. That's all it takes.
fnord, not only is it extremely easy to tell which of those decks was inadequately shuffled, you could gain a very significant advantage by using that method in a game.
Here are your sequences, with cards number 1-20 lands and cards number 21-60 spells:
The number of runs in the sequences are respectively 37, 32, 31, 24, and 23. The expected number of runs is 2*(800/60)+1 = 27.7 and the standard deviation is 3.4. The corresponding 2-tailed p-values for the sequences are 0.006, 0.20, 0.34, 0.28, and 0.17. (See this page for information on the runs test.)
I consider this conclusive evidence that the first sequence was not produced by a random process. And I think it is self-evident that having more runs of lands and spells (i.e. less clumping) is preferable in this game. So not only did you fail to randomize the first deck, you could actually have gained an advantage by doing so.
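For anyone who wants to repeat the calculation, here is a minimal Python sketch of a runs test using the normal approximation, assuming 20 lands and 40 spells as in the sequences above. The function names are hypothetical, and the linked page may round or correct slightly differently.

```python
from itertools import groupby
from math import sqrt
from statistics import NormalDist


def count_runs(sequence):
    """Number of runs (maximal blocks of identical labels) in a sequence."""
    return sum(1 for _ in groupby(sequence))


def runs_test(runs, n_lands, n_spells):
    """Expected runs, standard deviation, and two-tailed p-value
    under the normal approximation (an assumption of this sketch)."""
    n = n_lands + n_spells
    product = 2 * n_lands * n_spells
    mean = product / n + 1
    sd = sqrt(product * (product - n) / (n * n * (n - 1)))
    z = (runs - mean) / sd
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return mean, sd, p_value


if __name__ == "__main__":
    for runs in (37, 32, 31, 24, 23):
        mean, sd, p = runs_test(runs, 20, 40)
        print(runs, round(mean, 1), round(sd, 1), round(p, 3))
```

Passing a deck typed as a string of L's and S's through count_runs gives the run count to feed into runs_test.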
fnord, not only is it extremely easy to tell which of those decks was inadequately shuffled, you could gain a very significant advantage by using that method in a game.
How did you get the runs to use in the test? I was trying to count the rising sequences by hand to use in the probability calculation but that would've taken me forever.
And I think it is self-evident that having more runs of lands and spells (i.e. less clumping) is preferable in this game. So not only did you fail to randomize the first deck, you could actually have gained an advantage by doing so.
Thank you, TheLizard, for doing what I could not. If it weren't the weekend I'd have SAS at my fingertips, and SAS has a sequencing command, but unfortunately I was left to pen and paper, and that's just ugh.
As a further note, I'd just like to point out that this is what I mean when I say that randomizing isn't necessarily "optimal." To put what TheLizard was saying into context, Fnord's randomized decks had fewer runs and therefore larger "clumps" of lands and spells, which means he'd actually run into mana screw more frequently playing a fully randomized deck than his nonrandomized one. This is why people complain about the MTGO shuffler, and this is why you can't trust your gut when something "seems random."
I am annoyed that after 4 pages no one mentioned the greatest conclusion of the study that was done around 2004-2005 and so discussed within the judging community.
MIXING SHUFFLING STYLES AND TECHNIQUES GREATLY INCREASES THE SPEED AT WHICH YOU ACHIEVE "PERFECT" RANDOMIZATION.
Very informative post. Nothing I didn't already know, but point taken.
My response.
No way in HELL do I ever riffle shuffle my cards or let anybody riffle shuffle my cards for me. I'll sooner concede the match before I allow it. Magic cards are too damn expensive and riffle shuffling takes a terrible toll on them. So it ain't happening.
Here is how I shuffle my deck and I find it is more than random enough for my purposes.
Step 1 - I mash shuffle about 3 or 4 times after doing an overhand shuffle about 6 to 8 times.
Step 2 - I then pile shuffle the cards into 8 piles.
Step 3 - I pick up the piles, 2 at a time chosen in somewhat random order, and then overhand shuffle those together, taking the next 2, shuffling those together and then combining with the previous 2 and shuffling those together, and continue this until all 8 piles are shuffled together.
Step 4 - I then take the deck and repeat step 1.
Does it take a little longer than 8 riffle shuffles? Sure. Is it just as random? I am quite certain it is, going by my results. Is it much less wear and tear on my cards? Absolutely.
And ultimately, that's all I care about even if I have to take an extra minute or 2 shuffling my cards.
I never really thought about it before, though I did know about it. I thought it would have been slightly more useful if someone had mastered it, but again, I never really put more than a few seconds thought into it.
There are way too many shuffling myths floating around.
Hoping for a cure, or at least an outbreak.
Level 1 Judge (yay)
^ experiment results
Pretty interesting overall - looks relatively equal across the board. I would consider, just from looking at this, that mashing and riffling are probably about the same or at least close enough to not matter.
I'll probably just change my routine to do a quick pile to count the deck and then mash 10 and go. Seems fast and at the end of it all, you're going to be averaging 2 bad pockets per shuffle - you just have to hope your opponent doesn't cut to them.
Also did a quick test of Cockatrice and while the sample size was small, it seemed equal.
I have one question, though - in your post, you say that riffling is significantly better than mashing because with mashing that player can Faro shuffle - can't a player Faro shuffle via riffle as well (if they're good enough)? Layering 1-1 can certainly be done with a riffle if you're good enough at manipulating cards.
GX Tron XG
UR Phoenix RU
GG Freyalise High Tide GG
UR Parun Counterspells RU
BB Yawgmoth Token Storm BB
WB Pestilence BW
Truth. I have saved myself from many game losses by piling out my cards to make sure I have a legal number of cards in my deck and my sleeves are not marked.
But otherwise, pile "shuffling" does not randomize your deck as confirmed by OP.
Modern:
Something new every week
Legacy:
Something new everyweek
It can be done with a genuine riffle, you're right, but it's much harder. Your thumbs have to grip the edges of your cards so perfectly, and be so perfectly elevated, that you drop only one card at a time from either pile. The mechanics of the riffle lend themselves to unpredictable error (and thus randomness) more so than those of a mash, where you can simply square the cards up, fan out the edges and slide them into their perfect places.
It's sort of like card counting in blackjack; it can be done with a multiple-deck shoe as well as with a single-deck shoe, sure, but it's significantly harder with a multiple-deck shoe, to the point that every casino uses multiple-deck shoes almost exclusively (I've seen tables that use single-deck shoes, but the payout is lower).
My assertion that riffling when possible is superior is based on a combination of the above and the fact that mashing is potentially less efficient if you're not doing it properly.
I'll concede, though, as I have many times on this thread already, that if you're mashing, you know, 15 times, and your mashes are pretty riffle-like, and you're not cheating with it, then that probably gets the job done just as well.
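To make the faro point concrete, here is a minimal sketch of what happens if the 1-1 layering really is perfect every time. It assumes an "out" faro (the deck is cut exactly in half and the original top card stays on top); the function names are made up for this illustration.

```python
def out_faro(deck):
    """Perfect interleave: cut exactly in half and alternate one card at a time,
    keeping the original top card on top (an "out" faro)."""
    half = len(deck) // 2
    top, bottom = deck[:half], deck[half:]
    shuffled = []
    for a, b in zip(top, bottom):
        shuffled.extend((a, b))
    return shuffled


def faros_to_restore(size):
    """Count how many perfect out-faros bring an even-sized deck back to its starting order."""
    start = list(range(size))
    deck = out_faro(start)
    count = 1
    while deck != start:
        deck = out_faro(deck)
        count += 1
    return count


print(faros_to_restore(52))  # 8 for a standard 52-card deck
print(faros_to_restore(60))  # the cycle length is different for a 60-card deck
```

Nothing in there is random: every card's destination is fully determined, so a perfect interleave only moves the deck along a fixed cycle. Any real randomness has to come from the imperfections in the cut and the drop.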
I said it on page two, there's nothing heinous about piling up your cards to ensure they are of the proper number or quality before a round. The problem arises when people begin to think of pile shuffling as a randomization method, as an actual replacement for other methods of shuffling, as all too many people--not only casual players but even some pros and judges--already do.
I riffle all the time because I'm a riffle disciple, but if I had to pile just to quality-control my deck, I'd probably just deal them all into two piles really quickly to get that out of the way, then I'd riffle them 7 to 8 times and be done with it.
I don't see any reason, if you're just piling for quality-control, to do it more than once a game, or to make any more piles than you need to count to 60 (in the Flores article you see there's a video talking about using 7 or 11 piles [or something like that] because those are prime numbers or whatever, that's all BS).
Legacy: GWR Enchantress <--That's my banner! (lol tinypic removed it)
Casual: WB [[Primer]]Clerics Tribal; BU Affinity
EDH: ...U [[Primer]]Arcum Dagsson; BG Legal Stax; B Illegal Stax
Proxy: .WX TriniStax
Other stuff: [[Official]]Shuffling, Truth + Maths
So what I'm saying is, I try not to split the deck at the mid point, and I offset the mash shuffles.
How tight are your mashes? That's the real question. The two key elements of riffling are
As for offsetting the mash, signofzeta, you're really only mashing half your deck each of those times, which actually detracts from your randomness more than anything. You'd probably be better off just mashing normally those twelve times. You can even cut the deck a few times in between if it puts your mind at ease regarding the placement of the cards.
I was paying close attention the entire time - my riffles look a lot like mashes, with 1-2 cards usually alternating (very occasionally, 3).
It was really late so I'll talk more about my data:
I was interested in a few things: mashing vs. riffling, my perception of my own routine, and the effects of riffling or mashing in the long term (not resetting back to 2 piles of spells and lands).
The sample size is small at 10 per treatment, but there is variance between the treatments. You can see from the last column that riffling 7x without resetting produced the best result, which I would expect. However, 7x mashing without resetting didn't improve results over mashing and resetting, so clearly my sample size is too small, because honestly it should have a similar effect there. My routine didn't perform any better; however, at 2.3 it is near the mean, so I am at least happy about that.
I am concerned about the column 7x riffle. I can assure you guys that I didn't try to influence my results - I riffled very carefully and got this result. 2.6 seems high to me - clearly above the others. Significant? I'm not sure (you can't do a significance test with this sample size), so I think more testing should be done.
As it stands, I think I was right in my statement of "no thank you to riffling 7 times and going". Just because 7 riffles will produce mathematically true random, doesn't mean I am rule bound to oblige. I will happily riffle that thing 3 more times if it takes me from 2.6 clumps to 2.2 or fewer. I do not believe this is cheating, but if somebody would want to correct me that'd be great.
I've wanted to do a true experiment like this for a while and it was really good to actually see the results. I might replicate it to increase my sample size sometime this week, or if others would like to contribute, that would be great.
As a complete aside, I've had an idea floating around for a while and you can tell me what you think -
If you were to take a reset deck (divided by spells/lands) and riffle or mash three times, you wouldn't expect optimal distribution. Even at 7 riffles, I'm having my doubts - my gut is telling me I should go to 10. But if you then riffle or mash 3 more times, are you still increasing or is there a period where clumping begins to happen again?
I've imagined that the expected clumps as a graph against number of riffles might appear wavelike like a sin or cos function. Doing an experiment on this would be quite time consuming, however, as you'd need an extremely large sample size to confirm any kind of clumping behavior.
And ultimately you're probably talking about differences that are extremely small - like an altitude difference of .2 expected clumps.
I'm eager to see what you've come up with. I'd be glad to find the statistical significance of your data if I can wrap my head around it.
Just to be clear, what exactly is represented in that table? What precisely do the columns represent (ie what's the difference between "x7riffle" and "x7 no reset riffle") and what is being counted in each of the cells (the rows don't have labels)? I assume the bottom row is a total of some kind?
Each column is a treatment. In the regular treatments I reset the deck back to lands/spells at the start of each trial. I did not reset the deck in the "no reset" treatments - I wanted to see what long term shuffling would do (theoretically it should have been improving the results but I wanted to confirm).
So for example in column 1 I started with a reset deck, mashed 7 times, counted the number of clumps (I got 3 clumps of 5 spells or lands), then reset the deck to lands/spells, and repeated. I did this 10 times and then totaled.
You can imagine this took a little bit of time.
You can't do a significance test with a sample size of 10. I'd have to double or triple the number of times I conducted the entire thing to get results that were valid. I'm tempted to do it - particularly on the 7x riffle column, to see if that 26 (2.6 clumps per trial) is legitimately high or just an aberration.
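For anyone who wants a bigger sample without wearing out their sleeves, the reset-and-riffle treatment can be roughly simulated. This is only a sketch under stated assumptions: the riffle uses the Gilbert-Shannon-Reeds model rather than anyone's actual hands, a "clump" is taken to mean a run of 5 or more lands or 5 or more spells in a row, and the 24 land / 36 spell split is a guess, since the deck used in the experiment isn't specified.

```python
import random


def riffle(deck, rng):
    """One imperfect riffle (Gilbert-Shannon-Reeds model): cut roughly in half,
    then drop cards from either packet with probability proportional to its size."""
    cut = sum(rng.random() < 0.5 for _ in deck)  # binomial cut point
    left, right = deck[:cut], deck[cut:]
    merged = []
    i = j = 0
    while i < len(left) or j < len(right):
        remaining_left = len(left) - i
        remaining_right = len(right) - j
        if rng.random() * (remaining_left + remaining_right) < remaining_left:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged


def count_clumps(deck, min_len=5):
    """Count maximal runs of the same card type that are at least min_len long."""
    clumps = 0
    run = 0
    previous = None
    for card in deck:
        run = run + 1 if card == previous else 1
        if run == min_len:  # count each long run once, when it first reaches min_len
            clumps += 1
        previous = card
    return clumps


def mean_clumps(riffles, trials, lands=24, spells=36, seed=0):
    """Average clump count over many trials, resetting to lands/spells before each one."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        deck = ["L"] * lands + ["S"] * spells  # reset: all lands, then all spells
        for _ in range(riffles):
            deck = riffle(deck, rng)
        total += count_clumps(deck)
    return total / trials


for k in (3, 7, 10):
    print(k, mean_clumps(riffles=k, trials=2000))
```

Looping over different riffle counts like this is also a cheap way to look at how the average clump count changes as riffles are added, with thousands of trials per point instead of 10. The simulated numbers won't necessarily match hand-shuffled ones, because real riffles and mashes aren't exactly GSR riffles.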
My problem with your analysis is that I don't see that as a useful goal. You don't have to be close to true randomness for your deck to be functionally randomized. I think you significantly overestimate how random a deck actually needs to be.
(Additionally and somewhat tangentially, keep in mind that generally when shuffling your deck is already somewhat randomized.)
Indeed, this is precisely what you would need to do to complete your argument.
My basic point with the examples was that if you can't tell the difference between a deck shuffled one way and a deck shuffled another way, there is no functional difference between the shuffling methods. And yes, manually looking is not a good way to tell the difference.
Practice for Khans of Tarkir Limited:
Draft: (#1) (#2) (#3) (#4) (#5)
Well I'll keep digging around for a program out there that would count up rising sequences in a set of data. Having to do it by hand is extremely tedious and error-prone.
I understand what you're driving at. You're saying that if "mostly-random" gets you the same results during gameplay as "perfectly random," then mostly random is acceptable, because at the end of the day all we care about is the gameplay (that is, "functional") results.
And you're right: in the short run, you probably won't notice the difference. In the short run you might not even be able to calculate the difference based on the sample, just like, if I were to answer your question, in which you only gave me five hands, I could only give you a best-guess based on the "most likely" answer, and even if I'm 80% sure I'm right, I'm still wrong 20% of the time.
Statistics, though, is about the long run (summarized in the law of large numbers). As you deal more and more hands, like hundreds of hands, from a perfectly-randomized deck and a mostly-randomized deck (as your sample size increases), the difference in gameplay does in fact become statistically significant--moreover even noticeable to the untrained observer.
That's why the house edge in a casino on games like blackjack is often around only 0.20% to 0.50% (half a percent or less), but the casino still makes money, because as millions of people play millions of hands, the expected event--casino profits, players lose--becomes the actual event, or, rather, the actual events trend toward the expected event.
In the same vein, if you've been shuffling only one or two times your whole life and you played one hand on the MTGO shuffler, which is perfectly random, would you notice the difference? Probably not. What about fifty hands? Maybe not. Hundred hands? Thousand hands? Suddenly you find the MTGO shuffler is doing it differently than what you're used to.
That's what I mean when I say (with complete respect) that you're looking at it from the wrong perspective. Statistics as a field is about the long run--we even define "probability" as the "long-run relative frequency" of an event.
The key question is how many games will I need to play before I see this effect.
For example, if using a different shuffling technique makes a difference 1 in every 10 billion games, I'll likely never play a game where it does.
Even if the number is much lower, I may be okay with it making a difference occasionally if in return I gain the benefit of spending much less time shuffling.
I have destroyed more sleeves mash shuffling than I can even imagine. And I have seen decks worth thousands of dollars get noticeably bent by riffle shuffling.
I pile shuffle and then casino riffle shuffle (you riffle the edges together and then carefully merge the piles) to go the easiest I can on my deck. I care far less about the probability of randomization than damaging a $500 deck.
Also, not to get into the ethics of the whole thing, but "completely random" isn't the most desirable state a deck can be in. Not even close. It doesn't really matter how randomized a deck is, sometimes it will be less random in your favor and sometimes it will be less random not in your favor. Randomization does not guarantee you any particular outcome, so who really cares how random your deck is? "Sufficiently random to avoid cheating" is the only threshold you need to worry about, and you certainly don't need to riffle shuffle 8 times to do that.
I admire your continued skepticism!
Generally even the smallest probabilities tend to make a significant (even observable) difference once the sample size reaches several hundred or a few thousand. I'm sure you personally draw 7 after shuffling at least several hundred times in your career, to say nothing of all the other MTG players in the world, and together it quickly adds up to hundreds of thousands of samples. The number is certainly not in the billions.
Observe, as a simple example, the results of rolling a die, for which the expected value is easy to calculate: 3.5 (add up 1 + 2 + ... + 6 and then divide the sum by 6, since in a fair die each face has an equal chance of coming up). As shown in the graph below, by around 500, 600, 700 trials the average of the die rolls trends toward the expected value of 3.5:
If we used a weighted, unfair die, such that the probability of rolling a 5 or a 6 was slightly higher...let's say you had a 0.25 chance to roll a 5 and a 0.25 chance to roll a 6, and only a 0.125 chance each to roll a 1, 2, 3, or 4 (so the six probabilities still add up to 1), then the expected value would turn out to be 4.0 instead of 3.5...and if you rolled this "unfair die" several hundred times, you'd clearly notice the rolls approaching a different average (the line graph would settle near 4.0, not 3.5).
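A small simulation makes the fair-versus-weighted comparison easy to try for yourself. The weights below are an assumption picked for illustration: 0.25 each for a 5 and a 6, and 0.125 each for 1 through 4, so that the six probabilities sum to 1. The function names are made up for this sketch.

```python
import random

FACES = [1, 2, 3, 4, 5, 6]
FAIR = [1 / 6] * 6
WEIGHTED = [0.125, 0.125, 0.125, 0.125, 0.25, 0.25]  # assumed weights, sum to 1


def expected_value(weights):
    """Long-run average of a die with the given face probabilities."""
    return sum(face * w for face, w in zip(FACES, weights))


def running_average(weights, rolls, seed=0):
    """Average of the first n rolls, for n = 1..rolls."""
    rng = random.Random(seed)
    total = 0
    averages = []
    for n, value in enumerate(rng.choices(FACES, weights=weights, k=rolls), start=1):
        total += value
        averages.append(total / n)
    return averages


print(expected_value(FAIR), expected_value(WEIGHTED))
for name, weights in (("fair", FAIR), ("weighted", WEIGHTED)):
    avgs = running_average(weights, 1000)
    print(name, [round(avgs[n - 1], 2) for n in (10, 100, 500, 1000)])
```

With these weights the expected values come out to 3.5 and 4.0, and the two running averages drift toward different levels as the rolls pile up.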
A good riffle/mash can be performed in just a few seconds. The time argument is moot.
Re: Valarin...
This is a consideration most certainly, but there are ways to deal with this. Double sleeve, edge riffle, what have you. Better to find a way to make riffling/mashing work than to give up on them altogether and fall back to pile shuffling.
If you're edge riffling 7 times, then you're golden. It's still riffling.
Well, you're getting into the ethics of the whole thing. As I said in my disclaimer, my thread is to instruct people on what "true randomness" really means and how to achieve it. It is not the "optimal" way to prepare a deck if your only goal is winning; mana-weaving, "double nickeling" as Flores describes, etc. will always give you a competitive edge. If you don't want to achieve true randomness, or if you think you can get away without true randomness and not get caught with an "insufficient deck shuffling penalty" (see below), that's your decision, not mine.
Piling repeatedly is unacceptable for randomization purposes compared to mashing or riffling.
If I'm playing against people and they mash/riffle at least 7 times I generally just cut once at an arbitrary spot in the middle region of the deck. If I see you pile repeatedly, I'm going to mash shuffle your deck a few times and then cut it.
This isn't a valid argument. The die rolling example you give has no relevance to what we're talking about. There's nothing magical about a sample size of a few hundred. Depending on the math of the two shuffling methods we're considering, the needed sample size to expect to notice a difference could be much, much bigger.
A few seconds multiplied by a large number of shuffles is quite a bit of time.
The die-rolling example goes to show that it doesn't take an exceedingly large sample size (like in the billions) to begin to observe the effects of the Law of Large Numbers. It usually happens relatively soon, in the hundreds or thousands, whether you're rolling a die or flipping a coin.
But here's a more concrete example: there is no mathematics to determine when you the observer would notice a difference. There is, however, mathematics to determine when the difference becomes statistically significant--that is, when an observed difference is very likely to be an actual difference and not a fluke. This is summarized in two related concepts called "statistical power" and "statistical sensitivity" (these are complicated to calculate; computers do it for us). Power increases with sample size; statisticians often endeavor to calculate how many "flukes" you'd have to see before the pattern becomes mathematically justified.
When you reach 0.80 power, it's safe to say that whatever pattern you're seeing is actually a pattern, a noticeable difference.
What these two charts are doing is calculating the sample size it would take to reach a power value of 0.80, the number widely accepted by the research world as the standard target for power.
In order to note a difference, we need to have a starting point. It doesn't really matter what that starting point is. Let's say for this example that you normally win or lose about half your games, so our null proportion is 0.5. This is what we're trying to disprove.
Let's say, because we're skeptics, that after starting your new, more random shuffling method, your win/loss increased (or decreased, doesn't really matter) by barely anything, just 1%. Thus our "test proportion" is 0.51.
The calculator says that we would reach 0.80 power after about 20,000 games. That's quite a lot of games, you say! More than you'll ever play in your life! True, but think of all Magic players everywhere.
Still, let's move further; what if, after starting your new, more random shuffling method, your win/loss increased or decreased by 2% instead of 1%? Still barely anything. Thus our "test proportion" is 0.52. As you can see, 0.80 power comes much sooner, around only 5000 games.
Whether you want to look at it visually on a graph or conceptually by the numbers, the fact remains that we're not talking about sample sizes in the hundred thousands, millions, or billions, here, even for exceedingly small differences. Moreover, 1% and 2% represent a large number of games in the long run as you play more and more.
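If you want to check those game counts yourself, the usual normal-approximation formula for a one-sample proportion test gets you in the same neighborhood. This is a sketch, not the calculator used above: it assumes a two-sided test at a 0.05 significance level, and `games_needed` is just a name made up for the example.

```python
from math import ceil, sqrt
from statistics import NormalDist


def games_needed(null_p, test_p, alpha=0.05, power=0.80):
    """Approximate sample size for a two-sided one-sample proportion z-test.

    alpha=0.05 is an assumption; the source only specifies the 0.80 power target.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    numerator = (z_alpha * sqrt(null_p * (1 - null_p))
                 + z_power * sqrt(test_p * (1 - test_p)))
    return ceil((numerator / (test_p - null_p)) ** 2)


print(games_needed(0.50, 0.51))  # roughly 20,000 games
print(games_needed(0.50, 0.52))  # roughly 5,000 games
```

Doubling the size of the difference cuts the required sample to about a quarter, which is why the count drops so quickly going from 1% to 2%.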
The number is 7. Seven riffles. Eight if you want to be really safe. That's all it takes.
Here are your sequences, with cards number 1-20 lands and cards number 21-60 spells: The number of runs in the sequences are respectively 37, 32, 31, 24, and 23. The expected number of runs is 2*(800/60)+1 = 27.7 and the standard deviation is 3.4. The corresponding 2-tailed p-values for the sequences are 0.006, 0.20, 0.34, 0.28, and 0.17. (See this page for information on the runs test.)
I consider this conclusive evidence that the first sequence was not produced by a random process. And I think it is self-evident that having more runs of lands and spells (i.e. less clumping) is preferable in this game. So not only did you fail to randomize the first deck, you could actually have gained an advantage by doing so.
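For anyone who wants to rerun the runs test on their own decks, here is a minimal sketch using the normal approximation, with 20 lands and 40 spells as in the sequences above. The function names are made up for this example, and an exact version of the test could give slightly different p-values.

```python
from math import sqrt
from statistics import NormalDist


def count_runs(sequence):
    """Number of maximal blocks of identical symbols, e.g. 'LLSSL' has 3 runs."""
    return sum(1 for i, item in enumerate(sequence) if i == 0 or item != sequence[i - 1])


def runs_test(runs, n_lands=20, n_spells=40):
    """Normal approximation to the runs test; returns (mean, sd, two-tailed p-value)."""
    n = n_lands + n_spells
    mean = 2 * n_lands * n_spells / n + 1
    variance = (mean - 1) * (mean - 2) / (n - 1)
    sd = sqrt(variance)
    z = (runs - mean) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return mean, sd, p


# Cards numbered 1-20 are lands, 21-60 are spells (the labeling used above).
def to_types(order):
    return ["L" if card <= 20 else "S" for card in order]


mean, sd, p = runs_test(37)
print(round(mean, 1), round(sd, 1), round(p, 3))
```

For 37 runs this gives a mean of about 27.7, a standard deviation of about 3.4, and a p-value near 0.006, in line with the figures quoted for the first sequence.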
How did you get the runs to use in the test? I was trying to count the rising sequences by hand to use in the probability calculation but that would've taken me forever.
Then I saw the light, so to count the runs, I put them into a file just like you see in the quote, and then I did: EDIT: Typo
EDIT 2: My rising sequence counts are 34,30,32,29,28.
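For anyone else who doesn't want to count by hand: the command used above didn't survive in the post, so the following is not that method, just one hedged sketch of how rising sequences can be counted. It assumes the deck is written top to bottom as the numbers 1..n, with the numbering matching the order the deck started in; the function name and the sample line are made up.

```python
def count_rising_sequences(order):
    """order: the deck from top to bottom, as a permutation of 1..n.

    A new rising sequence starts at every card k+1 that sits above card k,
    so the count is one more than the number of such breaks.
    """
    position = {card: index for index, card in enumerate(order)}
    breaks = sum(1 for card in range(1, len(order)) if position[card + 1] < position[card])
    return breaks + 1


# Hypothetical input: one line of whitespace-separated card numbers.
line = "1 5 2 6 3 7 4 8"
order = [int(x) for x in line.split()]
print(count_rising_sequences(order))
```

The sample line is what a single clean riffle of an 8-card deck might look like, and it has two rising sequences (1-4 and 5-8). That is the property the riffle analysis leans on: one riffle of an ordered deck leaves at most two, and each further riffle can at most double the count.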
Thank you, TheLizard, for doing what I could not. If it weren't the weekend I'd have SAS at my fingertips, and SAS has a sequencing command, but unfortunately I was left to pen and paper, and that's just ugh.
As a further note, I'd just like to point out that this is what I mean when I say that randomizing isn't necessarily "optimal." To put what TheLizard was saying into context, Fnord's randomized decks had fewer runs and therefore larger "clumps" of lands and spells, which means he'd actually run into mana screw more frequently playing a fully randomized deck than his nonrandomized one. This is why people complain about the MTGO shuffler, and this is why you can't trust your gut when something "seems random."
It often damages sleeves, so you can figure out for yourself whether it would damage cards or not.