Lucky d20

What with talk of killer heat waves, droughts, floods, etc. etc., this blog tends to get pretty serious. When it does, we don’t deal with happy prospects, but with the danger of worldwide catastrophe. But every now and then we need to “lighten up,” so let’s have a little fun.

Recently a reader comment pointed to a website reporting the results of testing dice for fairness. Specifically, it tested the “d20” or 20-sided die. It’s a die often used in tabletop games, especially D&D (Dungeons & Dragons). That site links to yet another site which tests dice (specifically, the d20). They make enough of their data available for us to take a close look.

The first website (by Mark Fickett) looks at many dice, but the one for which I retrieved data was a d20 from Chessex, which happens to be purple. I’ll call it “Chessex Purple.” The other site (by “Awesome Dice”) compares a d20 from Chessex (which I’ll just call “Chessex”) to a d20 from Game Science (which I’ll call “Game Science” — imaginative names, eh?).

The Chessex Purple d20 was rolled 8,300 times, the Chessex and Game Science d20s 10,000 times each. They conclude that none of the dice is “fair” in the sense of having equal probability for all results (from 1 to 20), and they’re right. I did a chi-square test of the counts, and for all three dice the p-value was less than 10^-15; they’re not “fair.” But that alone doesn’t make them lucky or unlucky: if their departure from uniform probability favors big numbers (which is what you usually want in D&D), they’re lucky.
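For anyone who wants to repeat the fairness check, here is a minimal sketch of the chi-square statistic on face counts. The counts are invented for illustration (they are not the data from either site), and rather than computing a p-value it simply compares the statistic with 30.14, the 5% critical value for 19 degrees of freedom.

```python
def chi_square_stat(counts):
    """Pearson chi-square statistic against a uniform (fair-die) expectation."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# Critical value of chi-square with 19 degrees of freedom at the 5% level.
CRITICAL_19_DF_5PCT = 30.14

# Made-up counts for 10,000 rolls of a hypothetical d20 (NOT the measured data).
example_counts = [500] * 20
example_counts[5] += 120   # face "6" comes up too often
example_counts[18] -= 120  # face "19" comes up too rarely

stat = chi_square_stat(example_counts)
print(f"chi-square = {stat:.1f}, reject fairness: {stat > CRITICAL_19_DF_5PCT}")
```

A perfectly uniform set of counts gives a statistic of exactly zero; the further the counts stray from n/20 per face, the larger it gets.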

The “nominal” probability of any given result would be 0.05 (5%, or 1/20) for a fair die. The estimated probabilities based on their tests look like this:


The error bars are 95% confidence intervals (based on a Bayesian analysis). Clearly, some of the results are more or less likely than they “should be” for a fair die. For example, the Chessex Purple (plotted in purple) has a higher-than-normal chance (6.4% rather than the nominal 5%) of rolling a “17” (a pretty good roll) and lower-than-normal chance (3.8%) of a “13”, the Chessex (plotted in black) a higher-than-normal chance (6.2%) of rolling a “6” and lower-than-normal (3.8%) for a “19”, and the Game Science has a higher-than-normal chance (6.0%) of an “18” and a much-lower-than-normal chance (a mere 2.95%) of a “14”.
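One way to get error bars of this kind is sketched below. It assumes a uniform Beta(1, 1) prior on each face's probability, which may well differ from the analysis behind the plot, and it reads the 95% interval off Monte Carlo draws from the resulting Beta posterior. The count used (531 out of 8,300, about 6.4%) is hypothetical.

```python
import random

def credible_interval(hits, rolls, draws=20000, seed=0):
    """Approximate 95% credible interval for one face's probability.

    Assumes a uniform Beta(1, 1) prior, so the posterior is
    Beta(hits + 1, rolls - hits + 1); the interval is read off
    Monte Carlo draws from that posterior.
    """
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(hits + 1, rolls - hits + 1)
                     for _ in range(draws))
    lower = samples[int(0.025 * draws)]
    upper = samples[int(0.975 * draws)]
    return lower, upper

# Hypothetical count: a face seen 531 times in 8,300 rolls (about 6.4%).
low, high = credible_interval(531, 8300)
print(f"95% interval: {low:.4f} to {high:.4f}")
```

If the interval for a face excludes 0.05, that face is coming up more or less often than it would on a fair die.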

I got the impression that they rated the dice lucky or unlucky by their average rolls. For the Chessex Purple that’s 10.41 (+/- 0.13, 95% CI), for the Chessex it’s 10.49 (+/- 0.11, 95% CI), and for Game Science 10.36 (+/- 0.12, 95% CI). The mean for a fair die is 10.5, and both Chessex Purple and Chessex include that value in the error range for their averages. By that measure, they’re at least “fair” in that your expected roll is — as near as we can tell — the same as for a truly “fair” die.
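As a rough check on intervals like these, here is a sketch that computes the mean roll and a 95% half-width from face counts using the usual normal approximation. That choice is an assumption on my part, and the uniform counts are only a sanity check, not measured data.

```python
import math

def mean_with_ci(counts):
    """Mean roll and 95% half-width (normal approximation) from face counts.

    counts[i] is how many times face i + 1 came up.
    """
    n = sum(counts)
    mean = sum(face * c for face, c in enumerate(counts, start=1)) / n
    var = sum(c * (face - mean) ** 2
              for face, c in enumerate(counts, start=1)) / (n - 1)
    half_width = 1.96 * math.sqrt(var / n)
    return mean, half_width

# A perfectly uniform set of 10,000 rolls, as a sanity check.
mean, half = mean_with_ci([500] * 20)
print(f"{mean:.2f} +/- {half:.2f}")
```

For perfectly uniform counts over 10,000 rolls this gives a mean of 10.5 with a half-width of about 0.11, the same size as the interval quoted for the 10,000-roll dice.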

But the Game Science d20 looks unlucky; its average is below that of a fair die. However, the way I reckon things, if I were playing D&D that’s the die I would choose.

In D&D a d20 gets used a lot, and when it does, you almost always need to roll some fixed value or higher. When attacking a character or monster, you have an ability to hit and it has an armor class (its ability not to get hit), and your roll combined with your ability has to meet or exceed a required limit (the “to hit” value) for you to succeed. So, you want to roll the “to hit” or higher. If you need a 16 to hit that orc, you succeed on a 16, 17, 18, 19, or 20 but you miss on anything less. Likewise, for a “saving throw” you have to make your saving throw requirement or higher to succeed; anything less is failure.

There are also complications for extreme rolls. If you roll a “20” it’s a “critical hit” and you just might do extra damage. If you only roll a “1” you screwed up — you might drop your sword. Generally, 20’s are extra good, 1’s are extra bad.

I used the given data to estimate the probability, for each possibility from 1 to 20, of rolling that or higher. Actually I computed the “excess probability”, i.e. how much more-than-normal chance you get. The chance of a “1” or higher is 100%, so its excess probability is zero … rolling at least 1 is a sure thing, so you won’t do better or worse than a sure thing. But for other numbers, the excess probabilities depend on the die. I estimate them thus:


If you need a “3” or higher to hit, the Chessex is your best bet; your success probability is 2 percentage points higher than normal (92% instead of 90%). If you need a “12” or higher to hit, the Game Science d20 is your worst bet, with 2 percentage points less probability than normal (43% instead of 45%). So for low to medium “to hit” limits, the Chessex is better than the Game Science.
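The “excess probability” of rolling a given number or higher can be sketched as below: the observed fraction of rolls at or above the target, minus the fair-die value (21 - target)/20. The counts are made up to show the mechanics and are not the measured data.

```python
def excess_tail_probs(counts):
    """Excess probability of rolling each face or higher, versus a fair die.

    Returns a dict mapping target -> observed P(roll >= target)
    minus the fair-die value (sides + 1 - target) / sides.
    """
    n = sum(counts)
    sides = len(counts)
    excess = {}
    for target in range(1, sides + 1):
        observed = sum(counts[target - 1:]) / n
        nominal = (sides + 1 - target) / sides
        excess[target] = observed - nominal
    return excess

# Made-up counts (NOT the measured data): extra 18s at the expense of 14s.
counts = [500] * 20
counts[17] += 100
counts[13] -= 100

excess = excess_tail_probs(counts)
print(f"need 18+: {excess[18]:+.3f}, need 15+: {excess[15]:+.3f}")
```

In this toy example the extra 18s help any target from 15 through 18, while the missing 14s cancel that gain for targets of 14 or below. A die's average roll hides exactly this kind of detail.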

But those are the “easy” rolls. If all you need is a “3” or higher to hit, it’s probably some scurrying rat that can’t do you much damage even if it bites you; you’re going to win this one eventually. Even if you need a “12” to hit, you probably went into this knowing it needs an “18” to hit you, and you hit harder anyway; sooner or later it’s toast.

But if you need a big number to hit (if you need an “18” or above), that’s when you’re fighting the tough stuff. That’s when your chance to hit is low enough that the battle could take some time, you could sustain some serious damage, and you might even get killed. That’s when being “lucky” really counts.

And, that’s when the Game Science d20 is better. It has a better-than-normal chance of an “18” or above (16% instead of 15%, a full percentage point extra). And if you need a “19” or above, or — gods forbid — you need a “20” just to hit, it has at least the “normal” chance, while the Chessex Purple and Chessex give you a less-than-normal chance. Just when it really counts.

The Chessex die does have one virtue: its chance of rolling a “1” (not “1”-or-higher) is lower-than-normal, so you’re less likely to screw up.

Anyway, if I’m doing the D&D thing and have to choose one of those three dice, I’ll take the Game Science.

That’s just for these particular dice, mind you. Maybe other dice from Game Science aren’t so lucky, maybe some dice from Chessex are killer good. I can’t say without data … and unlike the aforementioned bloggers, I don’t have a machine to automatically roll them thousands of times and test them for me.

The upshot is that if you want to know which one is the “lucky” choice, be sure to consider, carefully, just which results are lucky and which aren’t.


10 responses to “Lucky d20”

  1. rhymeswithgoalie

    If the dice are molded imperfectly in plastic before being printed, then any bias for a particular d20 is unique to that molding+printing combination. Dr. Sheldon Cooper can labor mightily to select which d20s of a large production run have the desired bias (or none at all).

    Alternatively, the physical weight of the individual printed numbers might put in a standard bias.

    To test whether bias is spread evenly across a production run, dump (say) 10,000 d20s and see whether the probabilities of all numbers are the same.

  2. By coincidence, I played a skirmish war-game called “Frostgrave” on Friday, which uses d20s. My dice rolls were absolutely abysmal, with even my opponent agreeing the dice gods were against me. It was so bad I’ve commented I’d have done better using a d12. So I was amused to see this post today, as I head for another gaming session with a different mate, and it’s given me the perfect “reason” why I am going to lose horribly, and it’s nothing to do with my lack of tactical skill.

    Also worth noting that people often comment that AGW is loading the dice when it comes to more extreme events, so this is very much in keeping with this blog’s main purpose.

  3. Dave Werth

    Did they only roll one die of each type? Seems to me you’d need to test a number of each type to see how much variability there is within each type of die to really make a statement about which is better or worse.

  4. When I studied statistics in college, my friends and I were avid D&D players. Following the lesson on the chi-square test, my friends brought their dice to the course and managed to test them in class to check whether they were lucky or not. At some point, the professor noticed some abnormal activity in the back of the room. My friends explained they were checking their dice. The professor was so impressed by the existence of such weird dice that she did not make any further comment.

  5. You reveal your misspent youth playing D&D :-)

    rhymeswithgoalie said: “Alternatively, the physical weight of the individual printed numbers might put in a standard bias.”

    My D&D dice have embossed numbers, so there is less plastic on the ’18’ side than the ‘1’ side. I guess this does add a small (consistent for a given design) bias, as the difference is right at the edge where it will have the most effect.

  6. Sorry for the off topic post but I did not see a direct contact link.

    You may be interested in this link from the Atlantic as it contains many probability claims regarding human extinction (several of which seem wildly high to the less than highly trained eye).

    My question is, “Are the numbers being used in this article at all soundly calculated?”

    Thanks and regards.

  7. Reblogged this on Hypergeometric and commented:
    Careful consideration to really basic things like this is, for me, incredibly refreshing, and helps with the self-discipline needed to deal with real-world problems, those often being messy and having distracting entanglements.

  8. The Chessex Purple d20 was rolled 8,300 times, the Chessex and Game Science d20s 10,000 times each.

    Were the researchers who accomplished this Herculean task named Leonard, Howard and Sheldon by any chance?

    [Response: What??? No mention of Rajesh?]

  9. Nice observation. I made an interesting observation with regard to the popular board game The Settlers, where you place houses (and roads) next to hexagons which serve as resources that can be harvested based on the rolling of two 6-sided dice. Each hex tile has a number associated with it from 2 to 12, and as we all know the distribution of the sum is peaked like a bell curve, with 7 the most likely sum of two dice. Well, that’s all good and dandy, but any single game of The Settlers has so few rolls before the game has ended that the spread of 2x6D is all over the place, so it’s very easy to feel “unlucky” in this game even though your mind thinks you are doing the wise thing placing houses next to the resource hexagons with 6 and 8 on them (the 7 is actually used for moving the thief to block harvesting on any particular tile). In the long run, though, averaging over many games, you will get more resources by building by the 6 and 8.

    • When I was young and foolish, I tried my hand at “dungeon master”, building my own milieu with attendant monsters and treasure. Naturally, the tactical aspects were carefully considered–but when trying out the combat characteristics of the monsters, I found myself, somehow, without any d20. So I tried to synthesize the rolls, using (IIRC) coin flips.

      Thought I’d succeeded, until I tried things out with actual players. It was a cakewalk. Booooring!

      Moral: you may correctly synthesize the range of the data, but if the distribution is off, you blew it anyway.

      Who says D&D doesn’t teach anything practical?