Dice roller has a problem.

Posted by Rathmun
Gaffer
member, 1721 posts
Ocoee FL
45 yrs of RPGs
Mon 31 Jan 2022
at 04:26
  • msg #25

Re: Dice roller has a problem.

So what can a GM do about this?

Just use the dice roller, faults and all.
Use the dice roller and figure some way to adjust for its flaws.
GM and players use some other online dice roller and report their results.
GM and players roll real dice at home and report their results.
Find a site with a more precise dice roller and play there.

Did I miss any possibilities?
Rathmun
member, 15 posts
Mon 31 Jan 2022
at 04:32
  • msg #26

Re: Dice roller has a problem.

Gaffer:
Did I miss any possibilities?

Bring it to the dev's attention with a thread like this?  :)


evileeyore:
Rathmun:
The problem with this sort of thinking is that it's still susceptible to streaky RNG.

Granted, however as the results can be recorded, that can be watched for.

I do believe that's exactly what Jaberwok and I did.  We noticed a streak of low rolling in two different games, and have reported it.
This message was last edited by the user at 04:33, Mon 31 Jan 2022.
Skald
moderator, 957 posts
Whatever it is,
I'm against it
Mon 31 Jan 2022
at 06:11

Re: Dice roller has a problem.

Oooooh I used to hate probability at school/uni.  Give me algebra any old day.  OR Sir Terry Pratchett's "million to one shots come up nine times out of ten" ! <grins>

Rathmun - I had a quick look at the die roller for the game in question ...

Looking at your later test samples ... if my maths (aided and abetted by Excel's convert text to columns functionality) is right, then:

a) your 20 x 100D6 rolls was 2000 rolls for 664 successes, and you'd expect 667 (5 or 6 comes up on 1D6 2 in 6 = 1 in 3 times)
b) your 41 x 10D6 rolls was 410 rolls for 148 successes, and you'd expect 137

And I ran a quick test in one of my own games, just for the fun of it:

c) my 10 x 100D6 rolls was 1000 rolls for 331 successes, and you'd expect 333

Those look close enough to the expected results to me ... with your earlier mileage less (admittedly quite a bit less) than expected successes, perhaps the Dice Gods had turned their faces from you (and the rest of your team) ?   :P
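For anyone who wants to put a number on "close enough", here is a minimal sketch that turns the three samples above into z-scores (how many standard deviations each success count sits from its expectation). It assumes each die is an independent trial with a 1 in 3 chance of success, as in the post; the function name is made up for illustration.

```python
import math

def success_z_score(successes, dice, p=1/3):
    """Standard deviations between an observed success count and its expectation."""
    expected = dice * p
    sigma = math.sqrt(dice * p * (1 - p))  # binomial standard deviation
    return (successes - expected) / sigma

# Success counts and dice totals quoted in the post above.
samples = {
    "a) 20 x 100D6": (664, 2000),
    "b) 41 x 10D6": (148, 410),
    "c) 10 x 100D6": (331, 1000),
}
for label, (successes, dice) in samples.items():
    print(f"{label}: z = {success_z_score(successes, dice):+.2f}")
```

All three land within roughly 1.2 standard deviations of the expected count, which is unremarkable for a fair die.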

I look on it more as a confirmation of the laws of probability - sure if you toss a coin 100 times you should get roughly equal heads and tails, but in theory you can get 100 heads and 0 tails.   Similarly the odds of someone winning the jackpot in the lottery should be so low as to never come up during their lifetime ... but people do win it or so I'm told.    By the same token - it is possible to get a run of low, or high, random rolls.

Though I'll bow to the wisdom of those who know more about the mathematics of probability than I.   :>
NowhereMan
member, 469 posts
Mon 31 Jan 2022
at 06:16
  • msg #28

Re: Dice roller has a problem.

These are true gamers all right. Have a run of bad luck? Blame the (virtual) dice! ;)
Rathmun
member, 16 posts
Mon 31 Jan 2022
at 06:26
  • msg #29

Re: Dice roller has a problem.

Skald:
Oooooh I used to hate probability at school/uni.  Give me algebra any old day.  OR Sir Terry Pratchett's "million to one shots come up nine times out of ten" ! <grins>

Rathmun - I had a quick look at the die roller for the game in question ...

Looking at your later test samples ... if my maths (aided and abetted by Excel's convert text to columns functionality) is right, then:

a) your 20 x 100D6 rolls was 2000 rolls for 664 successes, and you'd expect 667 (5 or 6 comes up on 1D6 2 in 6 = 1 in 3 times)
b) your 41 x 10D6 rolls was 410 rolls for 148 successes, and you'd expect 137

And I ran a quick test in one of my own games, just for the fun of it:

c) my 10 x 100D6 rolls was 1000 rolls for 331 successes, and you'd expect 333

Those look close enough to the expected results to me ... with your earlier mileage less (admittedly quite a bit less) than expected successes, perhaps the Dice Gods had turned their faces from you (and the rest of your team) ?   :P

I look on it more as a confirmation of the laws of probability - sure if you toss a coin 100 times you should get roughly equal heads and tails, but in theory you can get 100 heads and 0 tails.   Similarly the odds of someone winning the jackpot in the lottery should be so low as to never come up during their lifetime ... but people do win it or so I'm told.    By the same token - it is possible to get a run of low, or high, random rolls.

Though I'll bow to the wisdom of those who know more about the mathematics of probability than I.   :>


Thank you for moving this to the correct location.

I did note that on the later rolls as well.  But that does still leave a rather streaky dice roller.  3.76 sigma low for 300 dice is a streak so bad you wouldn't expect to see it more than once in 30,000,000 rolls.  So there's a lot more dice to roll before we can show that a streak that bad is not indicative of a problem.  A couple dozen million more dice, in fact.  And we can't just clump those all together; we need them recorded so that we can take the average of a sliding window to see how far away from normal it is at any point along those 30 million rolls.
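A rough sketch of the sliding-window idea, assuming the rolls are available as a plain list of d6 faces in the order they were rolled. The window size of 300 and target of 5 come from this thread; the function name is hypothetical and the sample data is a seeded stand-in from Python's own generator, not RPoL rolls.

```python
import math
import random

def sliding_window_z(rolls, window=300, target=5):
    """Z-score of the success count in each window of consecutive d6 rolls."""
    if len(rolls) < window:
        raise ValueError("need at least one full window of rolls")
    p = sum(1 for face in range(1, 7) if face >= target) / 6
    expected = window * p
    sigma = math.sqrt(window * p * (1 - p))
    hits = [1 if r >= target else 0 for r in rolls]
    count = sum(hits[:window])
    scores = [(count - expected) / sigma]
    for i in range(window, len(hits)):
        count += hits[i] - hits[i - window]   # slide the window by one roll
        scores.append((count - expected) / sigma)
    return scores

rng = random.Random(2022)                      # stand-in data, not RPoL rolls
rolls = [rng.randint(1, 6) for _ in range(30_000)]
scores = sliding_window_z(rolls)
print(f"windows: {len(scores)}, lowest z: {min(scores):+.2f}, highest z: {max(scores):+.2f}")
```

Overlapping windows share most of their dice, so neighbouring scores are not independent; the extremes need to be read with that in mind.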

In the long term the roller might average out to normal, but as noted before, it needs to not wander wildly off into the weeds along the way either.



NowhereMan:
These are true gamers all right. Have a run of bad luck? Blame the (virtual) dice! ;)

Actually, the set of rolls that really slapped me in the face, I was rolling for both sides.  I was trying to break through a wall, and the GM had me rolling the wall's damage resistance too.  So bad luck for the wall was good luck for me.  From that perspective it's both good and bad luck taken to extremes.
This message was last edited by the user at 06:50, Mon 31 Jan 2022.
Westwind
member, 88 posts
"[Sad] is happy for deep
people" - Sally Sparrow
Thu 3 Feb 2022
at 13:08
  • msg #30

Re: Dice roller has a problem.

I'm assuming that the person doing the "fairness calculation" for any game is also the game owner / DM. Otherwise one runs into the complications of fudged and hidden rolls skewing the results.

What is more important in my mind, though, is whether the deviation changes significantly depending on who is rolling. As long as everyone is playing with roughly the same deviations, given a large enough sample set, then the dice roller is fair.

Don't sacrifice the good while searching for perfect.
SunRuanEr
subscriber, 437 posts
Thu 3 Feb 2022
at 13:13
  • msg #31

Re: Dice roller has a problem.

Westwind:
What is more important in my mind, though, is whether the deviation changes significantly depending on who is rolling. As long as everyone is playing with roughly the same deviations, given a large enough sample set, then the dice roller is fair.

Don't sacrifice the good while searching for perfect.

This is pretty much my take.

The problem with anything involving 'probability' is that it's not perfect. It can't be perfect, because there's always chance involved. Sure, you can say what a roll set *SHOULD* be, or what the average of a set of rolls *SHOULD* be, or whatever...but that doesn't mean it's what it *IS*. It doesn't even mean the dieroller isn't working right. It just means that on those rolls, chance was involved.

...which is the one *SHOULD* that we can count on. =)
Carakav
member, 693 posts
Sure-footed paragon
of forthright dude.
Thu 3 Feb 2022
at 13:28
  • msg #32

Re: Dice roller has a problem.

Or if the probability shifts by die-type, which can impact specific players depending on character builds they might favor.
Rathmun
member, 17 posts
Thu 3 Feb 2022
at 19:01
  • msg #33

Re: Dice roller has a problem.

Westwind:
then the dice roller is fair.

No.  No it is not.

Let's take the most common system, the one everyone probably knows even if they've only played it once: D&D.

In combat, a skewed dice roller can be fair only as long as there are no spellcasters, because a spellcaster often makes other people do all the rolling.  After all, you roll to save vs their spells, and you roll to hit them.  They don't roll for either of those.  If the dice roller trends high, the spellcaster is at a disadvantage, because people are more likely to hit them and more likely to make their saves.  If the dice roller trends low, they get an advantage, for exactly the inverse reason: people fail their saves and miss their attacks more often than they should.

Or take the Cypher system that someone else pointed out above.  The players do all the rolling, players roll to hit their enemies, and they roll to avoid getting hit by them.  If the dice roll high, the players have an advantage, if they roll low, they're at a disadvantage.


There are math tricks you can do with something like an unfair coin to get a fair result: paired flips, for example.  Even if you have a coin that gets heads 90% of the time, for independent pairs of flips you have heads-heads at 81%, tails-tails at 1%, heads-tails at 9%, and tails-heads at 9%.  Since heads-tails and tails-heads are equally likely, you can keep only those pairs and get a fair flip, though you do have to make sure they're independent pairs, because heads-heads-tails is going to be far more common than tails-tails-heads.  HOWEVER, you can't do that with a computerized dice roller.  There may be tricks with physical dice, but the weighting of a physical die can't change from roll to roll.  With a PRNG that's streaky, it can.
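The paired-flip trick is usually credited to von Neumann, and a minimal sketch looks something like this. The 90% coin is simulated with Python's own generator purely as a stand-in; `fair_flips` and `biased` are illustrative names.

```python
import random

def fair_flips(biased_flip, n):
    """Von Neumann's trick: turn independent biased flips into fair ones."""
    results = []
    while len(results) < n:
        first, second = biased_flip(), biased_flip()  # one independent pair
        if first != second:          # HT or TH: equally likely, keep the first
            results.append(first)
        # HH or TT: discard the whole pair and flip again
    return results

rng = random.Random(1)               # stand-in source, hypothetical
biased = lambda: "H" if rng.random() < 0.9 else "T"   # heads 90% of the time
flips = fair_flips(biased, 10_000)
print(flips.count("H") / len(flips))  # share of heads after debiasing
```

The trick only works if both flips in a pair share the same bias and are independent of each other, which is exactly the assumption a streaky generator would break.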
drewalt
subscriber, 121 posts
Thu 3 Feb 2022
at 22:49
  • msg #34

Re: Dice roller has a problem.

I deeply suspect we'd need a statistician and a software engineer (or teams thereof) to really address this topic.

I have to generate random numbers for testing purposes sometimes, and I remember a long time ago we were actually just using the "RAND" function in Excel, but the scuttlebutt came down forever ago from several professional associations/groups/entities that this was not random enough in the resulting distribution.

The industry, at great expense, began to invest in very expensive software packages to generate random numbers instead of using free random generators, functions built into common programs, etc.

Someone a lot smarter than me explained that getting a truly random seed using a computer was actually a nightmarishly complex problem using a lot of words I don't remember and don't understand.  Apparently if you type a random function into a random software you typically get something "random enough" but not random enough for people who need statistical rigor.  I can't explain why this is, it's just been pounded into my dense skull many times by people much smarter than I am.

The point being, I suspect whatever solution the site is using under the hood isn't a statistical software package with a hefty license fee.  But I also suspect the die roller is many orders of magnitude more "random" than just asking a disinterested person on the street to pick a number between 1 and 20.  After all, you have to consider the purpose of the site: something which more or less works and costs very little, or at least has no additional cost, is infinitely better than any solution that involves more cost and complexity.  No one is going to court over losing DnD.  I hope.

Although I've seen the die roller do some very wacky things, I know just enough about probability to understand that it's very, very easy to outsmart yourself: to convince yourself that, just because you were always pretty good at calculus and actually bust out De Morgan's laws sometimes to think through problems, you understand what's actually going on where probability is concerned.  There are actually court cases in multiple countries where people have been convicted based on the calculations of probabilities which were arithmetically correct, but based on invalid yet very reasonable and logical sounding assumptions applied by sharp and rational people.  People often miss assumptions they are making (and so do other people as people tend to think like people), and any assumption that isn't reduced to a sufficiently robust logical notation and tested in same is always suspect (and even then someone will argue the axioms are wrong).  Epistemology is interesting.

My point being, I think a reasonably intelligent (or in my case, significantly below average but unfunny and pretentious) person who is, alas, not a probability expert is more likely to fall into the trap of thinking they understand the die roller better than they actually do than someone who has no particular talent at all for calculation, so I'm very careful to not judge the quality of the die roller or gauge how random it truly is, no matter what the calculated probability of a certain set of events may be.  I'd suggest most of us probably fall into that bucket, unless you happen to be the PhD in statistics (there's got to be at least a few of you out there on a site like this) reading this.

Check out sometime what a lottery machine costs.  Trying to get an analog way to get a random result isn't cheap either.  Randomness is truly a fascinating topic with many rabbit holes to go down.

To wrap it up, surely there's a better die roller than the one the site has which could be had (in principle).  But, it's a question of where do you get those resources from, and also if we had them is that truly the best use of them.  Also, no matter how high quality the random number generator, people still swear it's off (I see it at work all the time) because our monkey brains are so wired to see patterns they often force them.
This message was last edited by the user at 22:51, Thu 03 Feb 2022.
facemaker329
member, 7384 posts
Gaming for over 40
years, and counting!
Fri 4 Feb 2022
at 04:13
  • msg #35

Re: Dice roller has a problem.

My favorite example of random bad luck beating the odds--

When I was in college, we went to a theater festival in Las Vegas.  And, of course, we spent one evening at a casino playing slots (everyone else went for the free drinks provided to people playing the games...since I don't drink, I just went to be part of the company.)

I went through ten dollars' worth of quarters...with no payouts.  Everyone else had also started with ten dollars, and most had gotten that much back and more, some had even doubled their money.  Them being drunk, they couldn't believe that I hadn't won ANYTHING off ten dollars and that, statistically, the machine I was playing HAD to pay out at some point...so they started feeding me their winnings to keep me playing.

Not only did I not get anything back on my ten bucks--I spent EVERYTHING the rest of the group had won (I didn't take their 'seed money'), and STILL didn't win.

So, yeah...maybe you've got an argument.  But it's not unprecedented for a random generator to deliver an incredibly long string of less-than-desirable results.  I still remember the D6 Star Wars group that I played in through college...we had one player whose character was a Mercenary.  She couldn't hit ANYTHING with a blaster...unless she was Stunned or Wounded.  It wasn't a quirk that the player came up with...it was just how the dice worked for her.  Full complement of dice to roll?  She consistently came up short of her target number.  Take away a die, and she was almost always well above her target number.

Sometimes the dice just do weird things.
Skald
moderator, 958 posts
Whatever it is,
I'm against it
Fri 4 Feb 2022
at 07:39

Re: Dice roller has a problem.

Some more info for those who understand the intricacies of such things ... have a look at the following thread I've just dug up that jase prepared earlier, cunningly entitled "Randomness and the die roller": link to a message in this forum

Specifically:

msg #1 - "the die roller utilises Perl and the /dev/urandom device on RPoL's Debian server. It is seeded once, automatically, by Perl."  Must be good.

msg #29 - jase addresses Entropy, Chi Square, Arithmetic Mean, Monte Carlo value for Pi and the Serial Correlation Coefficient in relation to the dice roller's results.   I hope those terms mean more to you than to me, but I think the upshot of his message was that it's as random as is sensibly possible !  :>
Westwind
member, 89 posts
"[Sad] is happy for deep
people" - Sally Sparrow
Fri 4 Feb 2022
at 23:20
  • msg #37

Re: Dice roller has a problem.

Rathmun:
Westwind:
then the dice roller is fair.

No.  No it is not.

It is because it affects healing rolls, damage rolls, saving throws, etc. for the players and the NPCs.

My brother has a knack for rolling natural 20s on a d20, far above average, no matter whose dice he's using.

My daughter wins ridiculously often at Yahtzee and Bingo.

Random isn't always random. Or, is randomly random.

My advice is to not worry about the fairness and simply enjoy the game.
Rathmun
member, 18 posts
Fri 4 Feb 2022
at 23:44
  • msg #38

Re: Dice roller has a problem.

Westwind:
It is because it affects healing rolls, damage rolls, saving throws, etc. for the players and the NPCs.


Did you even read the thread?  There are multiple examples provided where only one side or only the other is doing the rolling, where all low rolls favor one side, and all high rolls favor the other.  Hell, there's an entire damned system where that's the case (Cypher).




Embedded images don't work in this forum apparently, but here.
https://imgur.com/a/u6UneKs

This is a plot of the kurtosis (prevalence of outliers) for 30524d6.  This took quite a while to do all the rolling, and the data isn't perfectly clean because other people were using the roller at the same time I was, but please note the semi-periodic spikes.  There is also a lot of noise on the graph, which you may think is desirable for an RNG, but you should remember that this is a plot of how many outliers there are.

I used a target number of 5 for the rolling, because I first noticed something going weird in a Shadowrun game.


I would love to have the Perl code that takes /dev/urandom and turns it into dice.  I could set up a Debian install in a VM and just get a million dice in a row without anyone else consuming some of the numbers.  But this is the best I can do at the moment with the tools I have.
This message was last edited by the user at 23:44, Fri 04 Feb 2022.
donsr
member, 2508 posts
Fri 4 Feb 2022
at 23:50
  • msg #39

Re: Dice roller has a problem.

The bottom line here? Everyone has to remember this isn't tabletop... it's PBP. Every sport in the world has "ground rules" for stadiums and venues.

Here, we have (what I think is) one of the best dice rollers in PBP... load in your numbers, and info if you need to, and click.

IF... you think the dice roller is too stilted? Then back off making the rolls so important. In my games, yes, the dice roller does come into play, but the players' mods and RP factor in much more.

If you have to roll a half dozen times per post, you need a new system!
Rathmun
member, 19 posts
Sat 5 Feb 2022
at 00:02
  • msg #40

Re: Dice roller has a problem.

donsr:
If you have to roll a half dozen times per post, you need a new system!

I usually don't.  It was an unusual case of "how long does it take you to break a hole in the wall?" that tipped me off to the issue initially.  As noted in the first post of the thread, both I and the wall were rolling terribly.  One-in-thirty-million bad luck.  That's when I took a look at all the rolls for that entire game, and found that everyone together was rolling terribly.



Claims that "It's fair as long as everyone rolls terribly" can piss off.  Unless those people have never used any systems with fixed difficulties, they're talking out their backside.  Skewed dice are unfair unless literally everyone is rolling, including the inanimate objects.  The only system I'm aware of that does that is Donjon.  (Incidentally, neat system, look it up.)
  • If you know the dice are skewed, then you have to have casters in D&D roll to figure out what the save DC of their spells is.  You can't have a fixed base of 10, because that assumes fair dice.
  • If you know the dice are skewed, you have to make everyone in your D&D game roll their AC, because not everyone relies on attack rolls.  So you have to have an armor save instead of an AC, to bring it in line with the other saves.
  • If you know the dice are skewed, you can't play World of Darkness or Shadowrun, because rolling ones without any successes has specific consequences, even on unopposed rolls.
  • If you know the dice are skewed, you have to change literally every roll in Cypher because only one side is rolling at all.


And sure, this might be one of the best dice rollers in PBP.  It might even be the best, I don't know.  But that doesn't make it perfect, and quite a few people's reaction so far consists of "You will eat it and you will like it.  Don't even THINK about asking if it can be improved."  Sorry, but no.
This message was last edited by the user at 01:47, Sat 05 Feb 2022.
Carakav
member, 694 posts
Sure-footed paragon
of forthright dude.
Sat 5 Feb 2022
at 00:06
  • msg #41

Re: Dice roller has a problem.

I think the point here is that an obviously technically proficient person (Rathmun) has uncovered something interesting and is offering to investigate/dig deeper to make the dice roller we have work better for everyone.

Admins (and Jase) are not under any obligation to accept their input, but it's still a healthy conversation to have, and it's pretty clear to me that this issue does objectively matter to a subset of site users.
JAM2019
member, 71 posts
Wed 2 Mar 2022
at 14:17
  • msg #42

Re: Dice roller has a problem.

While you guys have used some impressive mathematical aptitudes to express your findings, theories, and concerns, why not simply ask the admin to look into this troubling phenomenon? I agree that the RNG is annoying as hell at times. RNGs have many tendencies that make them undesirable for many situations. I am sure that the moderators know this as well. So, simply ask them to look into it. No demands. No "you need to do this" lines. Just ask.

RPOL has provided a great place for play by post games. I feel that they are sincere and honest with the people they provide services to. And the price is awesome! So please lose the vim and vigor in your posts. Pat these guys on the back. And simply ask them to look into it.
Rathmun
member, 58 posts
Sun 18 Jun 2023
at 09:55
  • msg #43

Dice roller has a problem.

I'm still interested in seeing the code for the dice roller.  The unanswered question about what's going on still bubbles to the surface of my mind occasionally and I want to track down an answer.

Either something's hinky with the roller, or something about it reveals a problem with /dev/urandom.  (Probably the former.  The latter has received a lot more scrutiny from experts.)
jase
admin, 3838 posts
Cogito, ergo procuro.
Carpe stultus!
Sat 4 Nov 2023
at 07:28

Dice roller has a problem.

Ooh I love talking about this!

Perl is generating the random number.  Actually Linux really is and Perl is handling the call and giving the result, returning a number between 0 and 1 (0 inclusive, 1 not).  This number is to 15 decimal places.  Or, thinking about it another way, a number between 1 and 1,000,000,000,000,000.  I had to look that up, it's 1 quadrillion.

Extrapolating on the 1-256 example - you are right, 6 (again, using the example) doesn't divide into a quadrillion evenly, so it'll be biased.

Same math - 166,666,666,666,666 x 6 = 999,999,999,999,996.  Ergo the last 4 will produce biased data.  Or about 0.0000000000004%.

Technically it's not the "last" 4 either.  With every random generator rounding or modulus has to happen somewhere along the way, so you'll get a distribution of rolls which is something like:
  • 1: 166,666,666,666,667 (+0.0000000000006%)
  • 2: 166,666,666,666,667 (+0.0000000000006%)
  • 3: 166,666,666,666,666
  • 4: 166,666,666,666,667 (+0.0000000000006%)
  • 5: 166,666,666,666,667 (+0.0000000000006%)
  • 6: 166,666,666,666,666

Btw 8, 10, 20, 100 don't have this issue as they round nicely.
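The distribution above can be reproduced with exact integer arithmetic. This is only a sketch under two assumptions that are not confirmed anywhere in the thread: that rand() yields exactly 10^15 equally spaced values (the simplification used in the post), and that a face is picked as int(rand() * sides) + 1. The function name is hypothetical.

```python
def face_counts(sides, grid=10**15):
    """How many of `grid` equally spaced rand() values land on each face
    when the face is chosen as int(rand() * sides) + 1 (assumed mapping)."""
    # Value k/grid maps to 0-based face f when f*grid <= k*sides < (f+1)*grid,
    # so the boundaries are ceil(f * grid / sides), computed in exact integers.
    bounds = [-(-f * grid // sides) for f in range(sides + 1)]
    return [bounds[f + 1] - bounds[f] for f in range(sides)]

for sides in (6, 8, 10, 20, 100):
    counts = face_counts(sides)
    print(f"d{sides}: min {min(counts):,} max {max(counts):,}")
```

Under those assumptions the d6 comes out with faces 1, 2, 4 and 5 one value ahead of faces 3 and 6, while d8, d10, d20 and d100 split the grid evenly.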

Rathmun:
The problem with this sort of thinking is that it's still susceptible to streaky RNG.

If you have a coin that flips heads 100x in a row, followed by tails 100x in a row, and just repeats that pattern forever, I'd hardly call it a fair coin.  But if you flip it a million times, you'll still end up with 500000 heads and 500000 tails.

That's why the analyser also checks duplicate rolls (as well as walking rolls and deviations).  It's not just a pretty grid.
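To make the quoted coin concrete, here is a small sketch of two checks that catch it even though its totals are perfect. These are generic streak measures picked for illustration, not the analyser's actual code, and both function names are made up.

```python
from itertools import groupby

def longest_run(seq):
    """Length of the longest run of identical consecutive results."""
    return max(len(list(group)) for _, group in groupby(seq))

def lag1_correlation(bits):
    """Serial correlation between each 0/1 result and the one after it."""
    n = len(bits) - 1
    xs, ys = bits[:-1], bits[1:]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# The coin from the quote: 100 heads (1), then 100 tails (0), repeated.
coin = ([1] * 100 + [0] * 100) * 5000          # one million flips
print(sum(coin) / len(coin))                   # share of heads
print(longest_run(coin))                       # longest streak
print(round(lag1_correlation(coin), 3))        # near 0 for independent flips
```

The share of heads is exactly one half, but the run length and the serial correlation both give the pattern away immediately.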

I do recall many moons ago one of the many complaints about the roller.  In this particular instance there were complaints it was rolling low all the time, which is not surprising as it's a recurring theme.  The story that I always tell at this point is I changed the random generator from "rand()" to "1 - rand()", which flips the results on their head.  The next complaint was.. how the dice roller is always low.

In the past we've used base rand(), 1 - rand(), Math::Random, modulus and even a random generator to randomly choose the random generator.. but they've all resulted in the same complaints.

I've generated 2 million rolls and run it through ent and it's shown no issues outside of the pseudo-random norms.  I took considerable time to create the dice analyser and since its introduction the dice roller has generated and tracked over 8 million rolls, of which the results have by and large been dismissed.

I've flipped and changed how we generate numbers and none of it has appeased anyone.  I've analysed results, and provided said analysis, but it still doesn't help.  I gave up messing with it some time ago and just resorted to Perl's well-documented and well-tested rand().

I have been disappointed in my own rolls here but I've come to realise that us humans are far more biased than the dice roller.  For that matter so are the physical dice we use, check out this where they compare 10,000 Perl rolls vs real dice… and Perl is far more consistent (TLDR - Perl's largest skew is +7.4%, real dice - 41.0%).

I love a good conspiracy theory and am happy to make improvements as required, but at this point there's nothing new to make me doubt that Linux's and Perl's random number generation is good enough for our needs.  It's been widely audited and is considered more than adequate for anything short of cryptography.
Rathmun
member, 64 posts
Sat 4 Nov 2023
at 09:32
  • msg #45

Re: Dice roller has a problem.

The analysis I did isn't about rolling low, though that is what prompted me to start looking.  Instead I was looking at the prevalence of outliers, that's the Kurtosis plot.  With the way I did the calculation, that plot doesn't show whether the outliers were high or low, just that they existed.  So it wouldn't see any difference between rand() and 1-rand().

You mention the skew being fine, and that was something I looked at.  Indeed it looked fine, but that's not where I stopped.  Kurtosis is the standardized moment after skew.  Variance -> Skewness -> Kurtosis.  That's where I found something interesting.  There are streaks where outliers occur more than they should, by an order of magnitude.  The bell curve (I was testing with Shadowrun dice rules) is flatter than it should be, but only sometimes.  I don't think it's something the current analyzer page would catch.
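A small sketch of why a kurtosis check can't tell high outliers from low ones: mirroring the data flips the sign of the skewness but leaves the kurtosis untouched. The success counts below are made-up numbers for illustration, and the helpers use the plain population formulas, which may differ from whatever was used for the plot.

```python
def standardized_moment(xs, k):
    """k-th standardized moment (population form) of a list of numbers."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return sum((x - mean) ** k for x in xs) / n / var ** (k / 2)

def skewness(xs):
    return standardized_moment(xs, 3)

def excess_kurtosis(xs):
    return standardized_moment(xs, 4) - 3      # 0 for a normal distribution

# Hypothetical success counts from ten batches of dice (made-up numbers).
counts = [3, 4, 2, 5, 3, 0, 4, 3, 9, 3]
mirrored = [10 - c for c in counts]            # mirror image of the same data

print(skewness(counts), skewness(mirrored))                 # opposite signs
print(excess_kurtosis(counts), excess_kurtosis(mirrored))   # same value
```

So a check built on the fourth moment sees a batch of unusually high results and a batch of unusually low results as the same kind of event.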



A few other things

jase:
Technically it's not the "last" 4 either.  With every random generator rounding or modulus has to happen somewhere along the way, so you'll get a distribution of rolls which is something like:
  • 1: 166,666,666,666,667 (+0.0000000000006%)
  • 2: 166,666,666,666,667 (+0.0000000000006%)
  • 3: 166,666,666,666,666
  • 4: 166,666,666,666,667 (+0.0000000000006%)
  • 5: 166,666,666,666,667 (+0.0000000000006%)
  • 6: 166,666,666,666,666

This is solvable by throwing out the last four, though I agree they're not enough to make a difference at that scale.  What's more interesting is this.
www.fourmilab.ch/random/:
Chi-square Test

    The chi-square test is the most commonly used test for the randomness of data, and is extremely sensitive to errors in pseudorandom sequence generators. The chi-square distribution is calculated for the stream of bytes in the file and expressed as an absolute number and a percentage which indicates how frequently a truly random sequence would exceed the value calculated. We interpret the percentage as the degree to which the sequence tested is suspected of being non-random. If the percentage is greater than 99% or less than 1%, the sequence is almost certainly not random. If the percentage is between 99% and 95% or between 1% and 5%, the sequence is suspect. Percentages between 90% and 95% and 5% and 10% indicate the sequence is “almost suspect”. Note that our JPEG file, while very dense in information, is far from random as revealed by the chi-square test.

    Applying this test to the output of various pseudorandom sequence generators is interesting. The low-order 8 bits returned by the standard Unix rand() function, for example, yields:

        Chi square distribution for 500000 samples is 0.01, and randomly would exceed this value more than 99.99 percent of the times.

    While an improved generator [Park & Miller] reports:

        Chi square distribution for 500000 samples is 212.53, and randomly would exceed this value 97.53 percent of the times.

    Thus, the standard Unix generator (or at least the low-order bytes it returns) is unacceptably non-random, while the improved generator is much better but still sufficiently non-random to cause concern for demanding applications.


Using modulus puts a larger share of the burden of randomness on the low order bytes, especially for d4's and d8's, since the higher order bits don't affect those sizes at all.  Odd-sided dice would be least affected, since every bit affects the final modulus value for those, while even sided dice that aren't powers of two will be somewhere in between.

Now, I don't actually know if that failure of randomness mentioned by fourmilab is limited to the lowest byte, but if it is, there's some mileage to be gained by changing things up.  Reversing the endianness won't work though, one quadrillion only uses six and a quarter bytes, so the highest order byte only has two random bits.  (artifact of casting float to int?)
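The point about modulus and low-order bits can be checked directly. This sketch only applies if the roller reduces an integer with a modulus, which is an assumption; the example value and the helper name are arbitrary.

```python
def low_bits_needed(sides):
    """Number of low-order bits that fully determine x % sides, or None
    when no fixed number of low bits is enough (sides is not a power of two)."""
    if sides & (sides - 1) == 0:               # power of two
        return sides.bit_length() - 1
    return None

x = 0b1011_0110_1101                           # arbitrary example value
high_changed = x ^ (0b1111 << 8)               # flip only the four highest bits

for sides in (4, 8, 6, 3):
    same = (x % sides) == (high_changed % sides)
    print(f"d{sides}: low bits needed = {low_bits_needed(sides)}, "
          f"unchanged after flipping high bits: {same}")
```

For d4 and d8 the result survives any change to the high bits, so the whole burden falls on the lowest two or three bits; for d6 and d3 the high bits do change the outcome.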




Finally, just want to say thanks for everything you do, creating and running this site.  It may not be perfect, but it's pretty great.
This message was last edited by the user at 09:37, Sat 04 Nov 2023.
jase
admin, 3839 posts
Cogito, ergo procuro.
Carpe stultus!
Sat 4 Nov 2023
at 10:57

Re: Dice roller has a problem.

It's been a while since I've played with it but from memory Ent makes an assumption that the data is in single bytes (0-255).  So you can't use it to analyse die rolls unless you record the core random number and munge it into a single byte character... which I did aaaaages ago (2005, according to this post):

output:
Entropy = 7.999965 bits per byte.

Optimum compression would reduce the size
of this 5000000 byte file by 0 percent.

Chi square distribution for 5000000 samples is 242.74, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.4919 (127.5 = random).
Monte Carlo value for Pi is 3.143094057 (error 0.05 percent).
Serial correlation coefficient is 0.000065 (totally uncorrelated = 0.0).


Ent is/was interesting but it's also from 2008 so outdated.

Can't comment on your 30524d6 as I don't know enough detail about how you rolled and what you recorded, though I will muse that d6 with a target of 5; if there were a lot of running successes on that then that'd be picked up by the analyser's duplicate and walking analysis.



Edit to add: I downloaded the current binary file and analysed it.  Had to analyse it as raw data so 0-99 (not 1-100).
quote:
rolls         : 8,262,100
mean          : 49.5086358189806
median        : 50
variance      : 833.419114541513
trimmed_mean  : 49.5078691858008
harmonic_mean : (failed)
geometric_mean: 0
chi squared   : (failed)
kurtosis      : -1.20060378739476
skewness      : -0.00008133670175

Some tests failed, including chi square; the data was just too large for it.

https://metacpan.org/pod/Statistics::Descriptive was used as well as https://metacpan.org/dist/Stat...Statistics/Basic.pod as well as https://metacpan.org/pod/Statistics::ChiSquare (for all the good that did).

I'm not even sure what some of those mean but that's all the gory stats.
This message was last edited by the user at 18:05, Sat 04 Nov 2023.
Rathmun
member, 65 posts
Sat 4 Nov 2023
at 19:30
  • msg #47

Dice roller has a problem.

jase:
It's been a while since I've played with it but from memory Ent makes an assumption that the data is in single bytes (0-255).  So you can't use it to analyse die rolls unless you record the core random number and munge it into a single byte character... which I did aaaaages ago (2005, according to this post):

output:
Entropy = 7.999965 bits per byte.

Optimum compression would reduce the size
of this 5000000 byte file by 0 percent.

Chi square distribution for 5000000 samples is 242.74, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.4919 (127.5 = random).
Monte Carlo value for Pi is 3.143094057 (error 0.05 percent).
Serial correlation coefficient is 0.000065 (totally uncorrelated = 0.0).


Ent is/was interesting but it's also from 2008 so outdated.

If Ent only accepts single bytes, munging fifty bits together to fit the input will hide problems.  It's diluting the less-random into the more random.  Presumably that's why that article only tested the lowest order byte.
jase:
Can't comment on your 30524d6 as I don't know enough detail about how you rolled and what you recorded, though I will muse that with d6 against a target of 5, a lot of running successes would be picked up by the analyser's duplicate and walking analysis.

I was rolling sets of twenty, counting 5&6 as hits.  I threw that all into Google Sheets and started by calculating how many standard deviations the total hits in a sliding interval were from the expected count.  First 100 rolls wide with a step of 20, then 200 rolls, then 300 rolls, then 400 rolls.  Unsurprisingly, there were fewer substantial deviations on a window of 400 than 100.  Regression to the mean still works as expected.
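
For anyone who wants to repeat that step outside a spreadsheet, here's a minimal Python sketch.  The function names and the list-of-ints input are just for illustration; it assumes each die is an independent d6 with a 1 in 3 chance of a hit, so the hit count in a window follows a binomial distribution:

```python
import math

def hits_z_score(hits: int, dice: int, p_hit: float = 1 / 3) -> float:
    """Standard deviations between the observed hits and the binomial expectation."""
    expected = dice * p_hit
    std_dev = math.sqrt(dice * p_hit * (1 - p_hit))
    return (hits - expected) / std_dev

def sliding_z_scores(rolls, window=100, step=20, target=5):
    """Z-score of the hit count in each window: 0-100, 20-120, 40-140, ..."""
    scores = []
    for start in range(0, len(rolls) - window + 1, step):
        hits = sum(1 for roll in rolls[start:start + window] if roll >= target)
        scores.append(hits_z_score(hits, window))
    return scores

# Made-up, perfectly even data: every window lands exactly on the expected count.
print(sliding_z_scores([1, 2, 3, 4, 5, 6] * 50, window=120, step=60))
```

Widening the window only means changing window=; the step of 20 matches the sets of twenty.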

Then I started playing around with the results of those standard deviation calculations.  I set up the sheet to plot various moments, but struck gold immediately with kurtosis.  That's the plot where things got interesting.  For the 100 die sliding interval (step is still 20) it looks like noise, but that's expected.  The 200 die interval looks a bit more rhythmic, enough to be a bit suspect, but not a smoking gun.  The 300 die interval is the one I posted a link to fairly early in this thread, where it shows enormous spikes at an interval that's more regular than it should be.

Looking back at that sheet, I never did plot the Kurtosis for the 400 die interval, let me just do that real quick...  And it has even larger spikes with less noise in between.  Here's the plot.
https://i.postimg.cc/KvXFKFcZ/Kurtosis400-Window.png

jase:


Edit to add: I downloaded the current binary file and analysed it.  Had to analyse it as raw data, so 0-99 (not 1-100).
quote:
rolls         : 8,262,100
mean          : 49.5086358189806
median        : 50
variance      : 833.419114541513
trimmed_mean  : 49.5078691858008
harmonic_mean : (failed)
geometric_mean: 0
chi squared   : (failed)
kurtosis      : -1.20060378739476
skewness      : -0.00008133670175

Some tests failed, including chi square; the data was just too large for it.

https://metacpan.org/pod/Statistics::Descriptive was used as well as https://metacpan.org/dist/Stat...Statistics/Basic.pod as well as https://metacpan.org/pod/Statistics::ChiSquare (for all the good that did).

I'm not even sure what some of those mean but that's all the gory stats.

Again, you're analyzing eight million rolls as a single lump.  In the long run the roller does regress to the mean.  It doesn't have significantly more or fewer outliers than expected across eight million rolls.  The analysis I did was about whether those outliers are clumped.

Try taking that raw data file and looping over it, analysing a sliding interval.  0-400, 20-420, 40-440, etc...  And then graph the kurtosis.
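
Something along these lines would do it, as a rough Python sketch.  It assumes the data has already been read into a list of numbers, and it uses the plain population formula for excess kurtosis (fourth moment over squared variance, minus 3); spreadsheet and library functions often apply a small-sample correction, so their values may differ slightly:

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        return float("nan")  # every value identical, kurtosis undefined
    return m4 / (m2 * m2) - 3.0

def sliding_kurtosis(values, window=400, step=20):
    """(start index, kurtosis) for each window: 0-400, 20-420, 40-440, ..."""
    return [
        (start, excess_kurtosis(values[start:start + window]))
        for start in range(0, len(values) - window + 1, step)
    ]

# Stand-in data only: a flat 0-99 ramp repeated, not real roller output.
for start, kurt in sliding_kurtosis(list(range(100)) * 10)[:3]:
    print(start, round(kurt, 4))
```

The list of (start, kurtosis) pairs is what goes on the graph, and taking the max of the second column gives a single figure to compare between runs.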
jase
admin, 3840 posts
Cogito, ergo procuro.
Carpe stultus!
Sun 5 Nov 2023
at 03:57

Re: Dice roller has a problem.

I think we're straying into "is it good enough for cryptography?" territory here, and it's well established that it's not.  If all the "basic" / standard tests are coming up good and we're having to resort to kurtosis to try and find an issue, then is that really something we care about for a simulated die roller?  Something that emulates something truly biased - real dice?

An Excel graph of 400,000 data points is just a blob of blue, there's too many data points for a line graph to actually look like a line.  But to humour us all I ran the roll data through the same math module and the absolute max kurtosis was 1.38.  Changing it to 20/200 gave 1.5 but I think that's getting too small a selection set, even though the results are still good.  Being ridiculous and changing it to 1/100 gives 1.62 as well as taking an equally ridiculous amount of time.

Rathmun:
Again, you're analyzing eight million rolls as a single lump.

Rathmun:
the data isn't perfectly clean because other people were using the roller at the same time I was

Everything I'm doing analyses what the roller returns in consecutive rolls, and you glossed over this earlier when you faced the same problem gathering your rolls: players are not going to get consecutive rolls (the average roll is going to be what, about 3 dice?), so most streaks are not going to permeate down to player rolls.

Rathmun:
I was rolling sets of twenty, counting 5&6 as hits

So effectively 1d3 counting 3s.

It's not perfect but crunching the log down into rolls of 1-3 (which introduced a fair bit of bias due to rounding) still gave a max kurtosis of 1.67 (20/400 again).
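
To show where that rounding bias comes from, here's a small Python sketch.  The scaling formula is only a guess at one plausible way of crunching 0-99 down to 1-3, not necessarily what was actually run, and both function names are made up:

```python
from collections import Counter

def d100_to_d3_scaled(raw: int) -> int:
    """Scale a 0-99 value onto 1-3 (assumed mapping, for illustration only)."""
    return raw * 3 // 100 + 1

def d100_to_d3_rejecting(raw: int):
    """Alternative: throw away 99 so the remaining 99 values split evenly."""
    if raw >= 99:
        return None  # caller would skip this value
    return raw // 33 + 1

# 100 raw values cannot split evenly three ways, so one face gets an extra value.
print(sorted(Counter(d100_to_d3_scaled(r) for r in range(100)).items()))

# Rejecting one raw value leaves 33 per face.
kept = [d100_to_d3_rejecting(r) for r in range(100)]
print(sorted(Counter(k for k in kept if k is not None).items()))
```

With the scaled mapping one face comes up 34% of the time instead of 33.3%, which is small but not nothing when the point is to look for small effects.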

Additionally I got Perl to generate another 8 million rolls for us and checked the kurtosis (again, step/grouping of 20/400) and it came out to max of 1.42.  Switching to https://metacpan.org/pod/Math::Random::Secure also gave 1.42.

That was with d100 again; switching to d3 increased the kurtosis for both, giving 1.70 and 1.74, respectively.  I think that demonstrates the increased chance for any particular result when using very small dice (and we can only go one smaller w/ a coin).

If you don't think Math::Random::Secure is good enough then you've got problems with far more than just the dice roller.

Rathmun:
If Ent only accepts single bytes, munging fifty bits together to fit the input will hide problems.  It's diluting the less-random into the more random.  Presumably that's why that article only tested the lowest order byte.

That's not what I said.  Previously I got Perl to generate a data file that was directly suited for Ent and ran the analysis over it.  Basically rolled 0-255 and stored chr() of the result in the binary file.

Currently our roller log saves each roll as a 0-99 value in the binary file.  Our page then uses these values to calculate the roll results, and it's this same file I've been running analysis over.  However, I cannot use this particular file for Ent as it's a binary file of characters only ranging 0-99, whereas Ent expects 0-255.
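
As a rough illustration of why the range matters, this Python sketch computes Shannon entropy per byte, which as far as I know is the sort of figure behind a "bits per byte" number.  The function name is made up and this is not Ent's own code:

```python
import math
from collections import Counter

def bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte frequencies, in bits per byte."""
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )

# Every byte value 0-255 equally often: the ceiling of 8 bits per byte.
print(round(bits_per_byte(bytes(range(256))), 4))

# Only values 0-99, equally often: the best a perfectly fair 0-99 file can do.
print(round(bits_per_byte(bytes(range(100))), 4))
```

A flawless 0-99 file tops out at log2(100), about 6.64 bits per byte, so a byte-oriented analysis would mark it down for the unused values rather than for anything wrong with the rolls.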
Rathmun
member, 66 posts
Sun 5 Nov 2023
at 04:31
  • msg #49

Dice roller has a problem.

jase:
Something that emulates something truly biased - real dice?

This is a good point, and it's particularly galling that the best counterargument I can make is "But I don't like it."

I'm being OCD about this, I know I'm being OCD about this, but I still want to know what the heck is going on. [PullingOwnHair.gif]

jase:
An Excel graph of 400,000 data points is just a blob of blue, there's too many data points for a line graph to actually look like a line.  But to humour us all I ran the roll data through the same math module and the absolute max kurtosis was 1.38.  Changing it to 20/200 gave 1.5 but I think that's getting too small a selection set, even though the results are still good.  Being ridiculous and changing it to 1/100 gives 1.62 as well as taking an equally ridiculous amount of time.

The most telling kurtosis sweep I did was 20/400, where I got a max kurtosis of 2.9 and an average of 1.24.  This is wildly larger than your results when doing a similar calculation, and I'm wondering why.  I'm particularly curious why the kurtosis is spiking at regular intervals in my dataset.  I suppose it's possible that the spikes are an artifact of using the same PRNG as other people on the site.  If the actual rate of outliers streaks high and low at a slower pace, and people consume that randomness in surges  (Everyone in a given timezone going on lunch around the same time for example), then I might be seeing spikes due to the slow pace at which I was gathering my data set.  Effectively it just has chunks missing.


I'm completely fine with 1.42 kurtosis.  I'd be thrilled with a max of 1.42.  No problems with Math::Random::Secure.