Shuffling

Pop quiz, hotshot. How would you write code to shuffle a deck of cards?

I was thinking about this after reading Mike's card-shuffling algorithm woes:


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2007/12/shuffling.html

I had to randomly shuffle some data once for machine learning testing.

That’s when I discovered the STL’s random_shuffle method. Works pretty well.

In Java:

java.util.Collections.shuffle(cardList);

http://java.sun.com/javase/6/docs/api/java/util/Collections.html#shuffle(java.util.List)

No Swap built in? Don’t cry, create! :slight_smile:
Seems like an case for extension methods:

public static void SwapT(this T[] vaData, int vnJ, int vnI)
{
T lnTemp = vaData[vnJ];
vaData[vnJ] = vaData[vnI];
vaData[vnI] = lnTemp;
}

cards.Swap(i, n);

Shuffling 52 cards randomly requires quite a specialised random number generator, whether you use the Fisher-Yates method or the sorting method.

There are 52! (52 factorial) ways of arranging the cards, and 52! is a very large number - bigger than 2^225. Any random number generator used to generate fair shuffles must thus have an internal state of at least 226 bits to randomly generate shuffles. If it only has 32 bits (or even 128 bits), you can only hope to generate a minute fraction of the possible shuffles.

I’m not sure what properties Guid.NewGuid() has, but since there are only 2^122 GUIDs, it seems unlikely that it has more than 225 bits of internal state - can someone confirm or refute this?

A really good random generator has NO state.
Or does have radioactive decay any state?
Only pseudo-generators need to mantain a state.

In most implementations, however, random_shuffle uses the srand() seed, unless you provide it a RNG object of your own. And in both of these cases, your randomizer is only as good as the seed you give it. Too many people still use time() for a seed, which renders random_shuffle as useless as the randomize() example above.

“I’d assume most online casinos have hired competent cryptographers and statisticians by now”

Err … excuse me ? You “assume” that a given programming problem is solved competently by expert programmers … and “most” ? Do you have ANY actual real life experience (I know you do, just saying) ?

No matter what algorithm you use a very important thing to keep in mind is to keep that implementation abstracted away. So, no matter how you do it, the programmer should do nothing more than call a shuffle function (that doesn’t allude to a specific algorithm).

In Python:

import random
shuffled_deck = random.shuffle(range(52))

Fair point - if you have random numbers straight from radioactive decay then you’re sorted. Most applications can’t justify the hardware expense though.

I was unfamiliar with the term GUID so I looked it up on wikipedia, but even there I didn’t find any reason why it should be more random than a randomly generated number. For all I know, the system could just count upwards from 000…0001 for the GUID, or am I wrong?

Care to elaborate?

“I’d assume most online casinos have hired competent cryptographers and statisticians by now”

I assume that because there’s a lot of money at stake for online casinos, unlike most programming jobs. And not just the “company’s” money, which is somewhat disposable, but real people who get quite upset when you steal it from them.

Oh please, when has involvement of money ever made programs better ?

Same Ruby version I wrote at the original article:

cards.sort_by { rand }

Nice and simple.

I am surprised that many developers don’t know that the pseudo random number generators that come with most programming languages - especially with the popular ones are weak. For the needs of cryptography you need to roll your own. It would be a good idea to get a good source of randomness - radioactive decay or some similar method is a good source but a bit non-practical unless you have a huge budget. Some encryption software make you type random characters or move a mouse on the screen making a human source of randomness. Again not perfect but better than nothing. This is a whole long topic on which a lot of computer science papers are based.

reminds me of a project I was asked to write for a job interview. It was actually, to simulate a slot machine.

I was the only one to simulate machine behavior; everyone else just randomized.

Same sort of thing here… you’re just randomizing a list/array/whatever. it doesn’t simulate real world shuffling. That may be better than real world, or maybe not.

/just sayin’

I don’t think the sorting approach is necessarily better than the swapping approach! It’s more elegant but less performant (which matters for many applications).

You’re still depending on a pseudo-random generator (only new it’s called Guid not Random) so problem #1 is still there. Problem #2 is mitigated, but the performance is O(n log(n)) vs. O(n) with the swapping approach.

I wrote a program for a trading card game that allows the user to “simulate” their opening hand for the deck they created. While it isn’t as critical to get it purely randomized, using the seeded random generator to grab a card from the deck and put it in a new one works well.

But the really interesting thing is that when kids (and adults) play this game in real life, they never shuffle like this. Most of the time, they like to “pile shuffle”. Assuming you’re using 8 “piles” you take the top card of the deck and put it in pile 1, the second in pile 2, and so on. When you go through all your cards (minimum of 40, no maximum) you then grab your piles at random and you have your deck again.

And there are other techniques too. There’s your “rifle” shuffle where you split the deck in half and fan the cards together. There’s the “clump” shuffle where you grab a clump of cards from the middle to back of your deck and bring them to the front.

I’m still working on making various shuffle algorithms into the program, but to some extent, the players want the deck randomized, but not too randomized, because after a round they have some of their better cards clumped together, and they want to keep it that way.

I wonder if casinos would ever use various styles in their shuffler, since most of it is not done manually anymore.

Is a hand-shuffled and cut deck of actual playing cards really randomly sorted?

No swap in C++? What about std::swap in algorithm?