There was recently an article on "Generating truly random sequences" on hackaday: http://hackaday.com/2012/08/24/generati ... sequences/ which generated some contention as to whether it was truly random.

In actual fact, it's not truly random, just highly chaotic, take a look at Lorentz systems. In fact such a system would hardly produce much better random numbers than regular software-based pseudo-random number generators.

I thought I'd share some information about testing random number generators. I'm not a mathematician or statistician, so I'm approaching the challenge of testing random numbers from a layman's perspective. Here are some things that I learned along the way:

The most common random number generator (pseudorandom number generator) is called the Linear Congruential Generator, this generator is found EVERYWHERE including as the default rand() function in a lot of compilers and probably also on the Arduino (I'm not 100% certain of this, just speculation, can someone confirm?).

It has this basic form:

random_number = previous_random_number * constant_A + constant_C

The value is then truncated to however many bits you're using. The linked wikipedia page has some common values used for constant_A and constant_C, there are certain values for these constants that produce significantly poorer random numbers than others, so some care is needed to select them.

As you can see, the random number will follow the exact same sequence every time it is run, it is entirely predictable. To make things marginally more difficult for people to guess the sequence, the initial value for the random number can be set to a value that is not zero, this initial value is called the "random seed". But even this has problems, since for any given seed value the sequence will still be the same. To get around this the random seed is saved, so that the next time the program or system starts, the random seed will be different from the previous run.

Some of you gamers out there might recognise this - in some games with random outcomes to events such as opening a treasure chest, attacking an enemy unit, etc. you may have found that you were able to cheat by saving the game before taking action, and then reloading the game if you didn't like the outcome. This is made possible because the game's pseudorandom number generator would have been seeded differently the second time you performed the action.

However if you were playing on an emulator, and reloaded a save state, you may well find that the supposedly "random" actions had exactly the same outcome the second time you try it after reloading. This is because the save state captures the entire state of the emulated console, including the random seed.

Of course there are exceptions, some games like Civilization titles preserve the random seed in game saves to avoid people exploiting this cheap trick to get the best outcome of an attack or hut (especially useful in PBEM games). On the other hand some console games use additional information for the random number generator such as the exact timings of your button presses, which puts it closer to a true random number generator

Going back to the LCG, if you didn't know the constants A and C, it would difficult to guess what the next random number is. However if you collected a sample of random numbers, it is relatively easy to analyse the numbers and extract the constants.

So we've establised that LCG is actually quite a poor random number generator due to the ease with which the next number in the sequence can be guessed, but an even bigger problem to the ease with which the next number in the sequence can be guessed, the actual "randomnness" of the sequence isn't great either. There are various statistical tests for "randomness" but the simplest and easiest way of demonstrating the shortfalls of the LCG. Here is the output of an LCG using constants A= 16644525 and C= 32767 with 32-bit numbers, displayed as a bitmap image with each pixel representing one bit:

As you can clearly see, what should appear as pure noise, contains very obvious vertical black and white artefacts. This reveals fundamental problems with LCG.

Compare this with a more accepted "good" pseudorandom number generator, the Mersenne Twister

As you can see here, this output is indistinguishable from white noise to the human eye. Mersenne Twister is still a pseudorandom number generator; if you were able to figure out what the internal state of the generator was, you could plug those values into another machine and produce exactly the same sequence of "random" numbers.

Mersenne Twister is sometimes considered an ageing algorithm, it's quite a bit more complex than LCG and many argue that much simpler pseudorandom number generators can produce numbers of a similar quality as Mersenne Twister but at much lower processing requirements such as the Linear Feedback Shift Register family of generators

In any case visually the results are indistinguishable from a true random number generator such as this data created using random radio noise from a very simple circuit using a Forebrain dev board.

Many random sources exist that can be easily used in a home-built true random number generators, diode and amplifier noise are quite random since the noise originates from quantum effects, and are popular because circuits are easily built. Atmospheric (radio) noise is also popular since all that is needed is wire acting as an antenna, but care should be made to avoid accidentally tuning into a coherent radio signal. For microcontrollers and processors, the inaccuracy of internal oscillators that cause minute frequency drifts can be easily measured and exploited to produce random numbers, for example Intel CPUs include such hardware.

More exotic random sources can be used such as weather patterns, radioactive decay, webcams pointed at goldfish...

It's worth noting that true random number generators sometimes don't produce a uniform distribution of bits required for random number generators, for example, sampling diode noise with an ADC will produce values in a tight range around the diode's voltage with some noise overlaid on top. To get around this a Randomness Extrator is used, this can be as simple as using the true random number source to seed a pseudorandom number generator, the above data actually uses the LCG as a randomness extractor, and it can be seen that there are no artefacts in the image.

There are much more advanced statistical measurements for the randomness of a random number generator. I would recommend Robert G Brown's Dieharder, which is a suite of tests that can be used to test data for randomness, it also comes with a comprehensive set of pseudorandom number generators if you want to produce a lot of random numbers. Dieharder runs a battery of statistical tests on your input data to determine how likely it is that the data was produced by a true random number generator. Some pseudorandom number generators, despite passing the visual inspection test fail many of the dieharder tests.

Anyway, that's all I wanted to say, now you know a bit more about random number generation and one simple way of visually testing the data. Good luck.