Random number generation

Description of your first forum.

Random number generation

Postby UAirLtd » August 24th, 2012, 9:08 pm

There was recently an article on "Generating truly random sequences" on hackaday: http://hackaday.com/2012/08/24/generati ... sequences/ which generated some contention as to whether it was truly random.

In actual fact, it's not truly random, just highly chaotic, take a look at Lorentz systems. In fact such a system would hardly produce much better random numbers than regular software-based pseudo-random number generators.

I thought I'd share some information about testing random number generators. I'm not a mathematician or statistician, so I'm approaching the challenge of testing random numbers from a layman's perspective. Here are some things that I learned along the way:

The most common random number generator (pseudorandom number generator) is called the Linear Congruential Generator, this generator is found EVERYWHERE including as the default rand() function in a lot of compilers and probably also on the Arduino (I'm not 100% certain of this, just speculation, can someone confirm?).

It has this basic form:
random_number = previous_random_number * constant_A + constant_C

The value is then truncated to however many bits you're using. The linked wikipedia page has some common values used for constant_A and constant_C, there are certain values for these constants that produce significantly poorer random numbers than others, so some care is needed to select them.

As you can see, the random number will follow the exact same sequence every time it is run, it is entirely predictable. To make things marginally more difficult for people to guess the sequence, the initial value for the random number can be set to a value that is not zero, this initial value is called the "random seed". But even this has problems, since for any given seed value the sequence will still be the same. To get around this the random seed is saved, so that the next time the program or system starts, the random seed will be different from the previous run.

Some of you gamers out there might recognise this - in some games with random outcomes to events such as opening a treasure chest, attacking an enemy unit, etc. you may have found that you were able to cheat by saving the game before taking action, and then reloading the game if you didn't like the outcome. This is made possible because the game's pseudorandom number generator would have been seeded differently the second time you performed the action.

However if you were playing on an emulator, and reloaded a save state, you may well find that the supposedly "random" actions had exactly the same outcome the second time you try it after reloading. This is because the save state captures the entire state of the emulated console, including the random seed.

Of course there are exceptions, some games like Civilization titles preserve the random seed in game saves to avoid people exploiting this cheap trick to get the best outcome of an attack or hut (especially useful in PBEM games). On the other hand some console games use additional information for the random number generator such as the exact timings of your button presses, which puts it closer to a true random number generator

Going back to the LCG, if you didn't know the constants A and C, it would difficult to guess what the next random number is. However if you collected a sample of random numbers, it is relatively easy to analyse the numbers and extract the constants.

So we've establised that LCG is actually quite a poor random number generator due to the ease with which the next number in the sequence can be guessed, but an even bigger problem to the ease with which the next number in the sequence can be guessed, the actual "randomnness" of the sequence isn't great either. There are various statistical tests for "randomness" but the simplest and easiest way of demonstrating the shortfalls of the LCG. Here is the output of an LCG using constants A= 16644525 and C= 32767 with 32-bit numbers, displayed as a bitmap image with each pixel representing one bit:

Image

As you can clearly see, what should appear as pure noise, contains very obvious vertical black and white artefacts. This reveals fundamental problems with LCG.

Compare this with a more accepted "good" pseudorandom number generator, the Mersenne Twister

Image

As you can see here, this output is indistinguishable from white noise to the human eye. Mersenne Twister is still a pseudorandom number generator; if you were able to figure out what the internal state of the generator was, you could plug those values into another machine and produce exactly the same sequence of "random" numbers.

Mersenne Twister is sometimes considered an ageing algorithm, it's quite a bit more complex than LCG and many argue that much simpler pseudorandom number generators can produce numbers of a similar quality as Mersenne Twister but at much lower processing requirements such as the Linear Feedback Shift Register family of generators

In any case visually the results are indistinguishable from a true random number generator such as this data created using random radio noise from a very simple circuit using a Forebrain dev board.

Image

Many random sources exist that can be easily used in a home-built true random number generators, diode and amplifier noise are quite random since the noise originates from quantum effects, and are popular because circuits are easily built. Atmospheric (radio) noise is also popular since all that is needed is wire acting as an antenna, but care should be made to avoid accidentally tuning into a coherent radio signal. For microcontrollers and processors, the inaccuracy of internal oscillators that cause minute frequency drifts can be easily measured and exploited to produce random numbers, for example Intel CPUs include such hardware.

More exotic random sources can be used such as weather patterns, radioactive decay, webcams pointed at goldfish...

It's worth noting that true random number generators sometimes don't produce a uniform distribution of bits required for random number generators, for example, sampling diode noise with an ADC will produce values in a tight range around the diode's voltage with some noise overlaid on top. To get around this a Randomness Extrator is used, this can be as simple as using the true random number source to seed a pseudorandom number generator, the above data actually uses the LCG as a randomness extractor, and it can be seen that there are no artefacts in the image.

There are much more advanced statistical measurements for the randomness of a random number generator. I would recommend Robert G Brown's Dieharder, which is a suite of tests that can be used to test data for randomness, it also comes with a comprehensive set of pseudorandom number generators if you want to produce a lot of random numbers. Dieharder runs a battery of statistical tests on your input data to determine how likely it is that the data was produced by a true random number generator. Some pseudorandom number generators, despite passing the visual inspection test fail many of the dieharder tests.

Anyway, that's all I wanted to say, now you know a bit more about random number generation and one simple way of visually testing the data. Good luck.
User avatar
UAirLtd
 
Posts: 629
Joined: July 19th, 2011, 10:32 pm

Re: Random number generation

Postby semicolo » August 26th, 2012, 5:29 am

You win my award for the longest post I've ever read on a forum. Very instructive.
User avatar
semicolo
 
Posts: 268
Joined: December 13th, 2011, 3:32 pm

Re: Random number generation

Postby leadacid » August 26th, 2012, 9:10 am

Wow. Great article UAirLtd. Interesting, concise, and educational. Good Job!

I'll preface my talkin below by saying I watched a video about penetration testing the other day... A system is only as strong as its weakest link. :-)

Randomness is an interesting idea. While the engineer in me can certainly appreciate the goal for better "randomness", the practical systems analyst wonders about the USE of those random numbers. In your example of the random item generation for the treasure chest, as you said, a better seed would be an excellent solution. However, for something like cryptographic key generation, a more random generation would obviously be more advantageous.

However, does there come a point where things are "just random enough"?

I'm curious at what point achieving more "truly random" numbers becomes a point of diminishing returns for its intended application. What application would the webcam-at-a-lavalamp be an applicable seed generator?

Anyway, I think I've gone off my own rails here. Good article!
leadacid
 
Posts: 109
Joined: January 27th, 2011, 4:47 pm
Location: Wisconsin

Re: Random number generation

Postby UAirLtd » August 26th, 2012, 10:20 am

Yes, there definitely is a case for "just random enough", most things can get away with a simple PRNG, while others need closer to true randomness. I think the fact is that these days it's quite easy to build a good random number generator circuit based on diode/amplifier noise, such that a lot of the time instead of going for "just random enough", engineers go for "best performance that the budget allows".

I was building an 8x8 LED matrix driver board, and to test it I set it to continuously update a random pixel with a random color. You'd think that setting a RANDOM pixel with a RANDOM color was going to be random enough using any PRNG algorithm, but it turns out that using an LCG actually causes a regular pattern of apparently up-wards moving bright pixels on two of the columns of the matrix driver board, so instead of giving the appearance of random colours, there was an illusion of motion to the image. I switched to Mersenne Twister and that fixed the problem.

It may have been coincidence that the particular LCG seed and constants cause this, but the fact that such coincidences can happen means that clearly LCG is not random enough even for this simple application of displaying random colours.

I remember some time ago seeing a giant automatic dice-roller on Hackaday. Here it is: http://hackaday.com/2009/05/26/dice-o-matic/ from way back in 2009! Also http://gamesbyemail.com/News/DiceOMatic In this case the use was for online dice-based games, the author says "To generate the die rolls, I have used Math.random, Random.org and other sources, but have always received numerous complaints that the dice are not random enough. Some players have put more effort into statistical analysis of the rolls than they put into their doctoral dissertation. " so instead of disputing the statistical analysis, he builds a giant dice throwing machine. The beauty of this is that any random number generator could be disputed, but in the case of a machine that throws actual dice, even if the numbers were found to not be random enough, nobody can argue about it after seeing it in action because it's clear that the machine can be no less random than throwing the dice yourself.
User avatar
UAirLtd
 
Posts: 629
Joined: July 19th, 2011, 10:32 pm

Re: Random number generation

Postby t0m » October 23rd, 2012, 2:13 pm

Great post. The follow up on the usefulness of different degrees of randomness for different applications is also on the money.
t0m
 
Posts: 9
Joined: October 23rd, 2012, 1:46 pm


Return to General Talk

Who is online

Users browsing this forum: No registered users and 1 guest