User:Nullplan/Random Number Generator

From OSDev Wiki
Jump to navigation Jump to search

A random number generator is a function that returns an erratic sequence of numbers. The goal is to simulate random events, such as a dice roll, for various purposes. Ideally, each time the function is called, each possible output is equally likely to be returned, but this does not always happen. Of interest in this community are particularly in-kernel RNGs.

Entropy

Computers are deterministic devices. If the program is the same and all inputs are the same, the same result will be calculated every time. How, then, can a computer generate random numbers? The computer cannot, but the physical world around it can. Many physical events are random to some degree, or more technically, have some degree of entropy. A sound recording will contain some level of noise even in the best shielded recording rooms on the planet. Humans type on keyboards with an inconsistent timing, and type quasi-random text like the article you are currently reading. Even other computers can provide entropy: While most network traffic is entirely machine generated, the exact relative timing of network packets in the end derives from the moment the power button was pressed, another human action. Therefore the timing and nature of input events can provide us with some entropy.

The state of a random number generator can be seen as an entropy buffer. Calculations based on the entropy buffer will have as many different outputs as the entropy buffer has different states. To deal with the combinatoric explosion, entropy is typically measured in bits. If one bit of entropy has been gathered, the entropy buffer can only be in one of two states. An entropy buffer is said to be full if as much entropy has been gathered as the buffer is large. At that point, any state of the buffer is possible, and therefore any calculation based on the buffer will have any possible result.

Sources of entropy

Generally the timing and nature of all input device events can be sources of entropy. As mentioned already, this includes keyboards, mice, and gamepads, but also network cards and hard disks. The reason for the latter is the behavior of air molecules on the surface of the platter, which cause random delays to hard disk commands. However, there is also dedicated hardware to generate entropy, such as avalanche diodes. And x86 contains a dedicated instruction to read a random number from an unspecified source.

Timing

When using timing as entropy source, the timestamp read should be as precise as possible. The TSC works well for this. Gauging the entropy gained from that operation requires knowledge of the timing window for the event to occur in and the tick rate of the TSC. For example, if a TSC has a tick rate of 3 GHz and an event has a 10ms window to occur, then the TSC read can have any one of 30 million values, which means the entropy gained from this is ca. 24.8 bits. Were the TSC slower, only 1 GHz, then the entropy would only be ca. 23.2 bits.

x86 randomness instructions

There is an RDRAND instruction in the x86 instruction set. Support for it is indicated by ECX bit 30 in CPUID function 1, subfunction 0. So support might be tested like this:

  xor eax, eax
  xor ecx, ecx
  inc eax
  cpuid
  bt ecx, 30 ; carry flag now indicates support for RDRAND

RDRAND reads a random number into a 16-, 32-, or 64-bit register. It might not be successful, however, and sets the carry flag to indicate success. Since no guarantees exist how quickly a new random number might become available, it is necessary to cap the number of retries, such as like this:

  mov ecx, 100
.retry:
  rdrand eax
  jc .success
  loop .retry
.failure:
  ; handle failed attempt
.success:
  ; handle successful read

A good use of this instruction might be to periodically or otherwise regularly read a random number and add to the entropy pool.

There is also an RDSEED instruction. Support is indicated by EBX bit 18 from CPUID function 7, subfunction 0. It has the same interface as RDRAND, but internally uses a different random number generator, that is supposed to make exhaustion less likely. Still, RDSEED can fail just like RDRAND. See the Intel documentation for the precise differences and assurances made.

Adversarial entropy

If an adversary can somehow observe the state of the entropy pool and contribute their own entropy into the pool, then it is possible that they would provide entropy in such a way as to force the entropy pool into a lower entropy state. A simple example would be an entropy source periodically checking if a certain bit in the pool is set, and then providing entropy until it is clear. This way, for most of the time, that given bit is clear, and the number of possible states the entropy buffer can be in is halved. A more sophisticated attack is described on DJB's blog: https://blog.cr.yp.to/20140205-entropy.html

Low-entropy PRNGs

For non-cryptographic applications, it is generally enough for a random number generator to simply produce a different sequence every time, and have it be reasonably erratic. Cryptographic security (see next section) is not required. For these, it is generally not necessary to gather entropy continuously from various sources; it is enough to seed them once from a value that is different each time the program is run (e.g. system time). The advantage of these algorithms is that they are fast and simple, and do not require kernel support. Therefore they are often implemented in language runtime libraries. The drawback is, of course, that they are not cryptographically secure. Another disadvantage is that their state is not shared across the entire system. Therefore it is possible for two different processes to accidentally happen upon the same sequence and generate the exact same sequence in lock-step.

Linear-feedback shift register

One of the earliest methods for calculating erratic sequences, a linear-feedback shift register consists of a shift register with a feedback (a line leading from the shift-out bit to the input bit) and some linear function being applied to it. In Fibonacci configuration, it is the input bit that has a linear function applied to it, while in Galois configuration, the output bit is fed directly back into the input, but triggers some XOR gates in line with the shift registers. The output of the function is a single bit, so in order to generate more, multiple calls are required. Simple example:

static uint32_t seed = 0xcafebabe; /* anything nonzero will do */
void set_seed(uint32_t s) {
  if (!s)
    seed = 0xcafebabe;
  else
    seed = s;
}

int fibonacci_lfsr(void)
{
  /* taps in positions 32, 31, 28 */
  uint32_t bitIn = (seed ^ (seed >> 1) ^ (seed >> 4)) & 1;
  int bitOut = seed & 1;
  seed = (seed >> 1) | (bitIn << 31);
  return bitOut;
}
int galois_lfsr(void)
{
  int bitOut = seed & 1;
  seed = (seed >> 1) ^ (seed & 1? 0xc8000000ul : 0);
  return bitOut;
}

Linear congruential generator

A very popular choice, the linear congruential generator has been the mainstay of C runtime libraries everywhere. For example, at the time of this writing, musl, klibc, glibc, pdclib, and uclibc all use an LCG to implement the standard rand() function by default. In fact, several use the exact same one, for example given by pdclib:

int rand( void )
{
    _PDCLIB_seed = _PDCLIB_seed * 1103515245 + 12345;
    return ( int )( _PDCLIB_seed / 65536 ) % 32768;
}

Mathematically, the seed is multiplied with a constant, then another constant is added (this would be the linear part), and the whole thing is evaluated in some remainder class ring (that would be the congruential part). The modulus of that ring is generally a power of two, and generally one more than the maximum value the type of the seed can hold. This is because that ring is implicit.

Notice how both constants mentioned are odd numbers. This is required for maximum period length. However, since both are odd, a peculiar thing is happening: If the seed on input is odd, on output it will be even, and if it is even, on output it will be odd. Therefore its lowest bit is not random at all, it will always simply toggle. This is one of the well-known properties of LCGs, that the low bits are significantly less random than the high bits. It is therefore understandable that most libraries will modify the output to only return the high bits of the state.

Wichmann-Hill

In 1982, Brian Wichmann and David Hill proposed to combine three linear congruential generators, then normalizing and summing the results to get a number uniformly distributed between 0 and 1. A common instance is:

static uint16_t seed[3]; /* seed with numbers between 1 and 30000 */
float wichmann_hill(void) {
  seed[0] = (171 * seed[0]) % 30269;
  seed[1] = (172 * seed[1]) % 30307;
  seed[2] = (170 * seed[2]) % 30323;
  return fmod(seed[0] / 30269.0 + seed[1] / 30307.0 + seed[2] / 30323.0, 1.0);
}

This version is known to have a period of just shy of of seven trillion (the least common multiple of 30268, 30306, and 30322).

Mersenne Twister

The Mersenne Twister uses a generalized feedback shift register and a huge amount of state to provide a pseudo-random function with an utterly giant period. The name comes from the fact that that period happens to be a Mersenne prime. Implementation according to Wikipedia:

#include <stddef.h>
#include <stdint.h>
#include <assert.h>

#ifdef USE32
typedef uint32_t word_t;
#define STATE_SIZE  624
#define MIDDLE      397
#define INIT_SHIFT  30
#define INIT_FACT   1812433253
#define TWIST_MASK  0x9908b0df
#define SHIFT1      11
#define MASK1       0xffffffff
#define SHIFT2      7
#define MASK2       0x9d2c5680
#define SHIFT3      15
#define MASK3       0xefc60000
#define SHIFT4      18
#else
typedef uint64_t word_t;
#define STATE_SIZE  312
#define MIDDLE      156
#define INIT_SHIFT  62
#define TWIST_MASK  0xb5026f5aa96619e9
#define INIT_FACT   6364136223846793005
#define SHIFT1      29
#define MASK1       0x5555555555555555
#define SHIFT2      17
#define MASK2       0x71d67fffeda60000
#define SHIFT3      37
#define MASK3       0xfff7eee000000000
#define SHIFT4      43
#endif

#define LOWER_MASK  0x7fffffff
#define UPPER_MASK  (~(word_t)LOWER_MASK)
static word_t state[STATE_SIZE];
static size_t index = STATE_SIZE + 1;

static void seed(word_t s)
{
    index = STATE_SIZE;
    state[0] = s;
    for (size_t i = 1; i < STATE_SIZE; i++)
        state[i] = (INIT_FACT * (state[i - 1] ^ (state[i - 1] >> INIT_SHIFT))) + i;
}

static void twist(void)
{
    for (size_t i = 0; i < STATE_SIZE; i++)
    {
        word_t x = (state[i] & UPPER_MASK) | (state[(i + 1) % STATE_SIZE] & LOWER_MASK);
        x = (x >> 1) ^ (x & 1? TWIST_MASK : 0);
        state[i] = state[(i + MIDDLE) % STATE_SIZE] ^ x;
    }
    index = 0;
}

static word_t mt_random(void)
{
    if (index >= STATE_SIZE)
    {
        assert(index == STATE_SIZE || !"Generator never seeded");
        twist();
    }

    word_t y = state[index];
    y ^= (y >> SHIFT1) & MASK1;
    y ^= (y << SHIFT2) & MASK2;
    y ^= (y << SHIFT3) & MASK3;
    y ^= y >> SHIFT4;

    index++;
    return y;
}

Cryptographic security

In order to understand the following section, we first should review what cryptographic security means with respect to random number generators. Cryptographers distinguish two types of security: Information-theoretic security and computational security. The former of these is the stronger claim. If a cryptosystem is information-theoretically secure, an attacker cannot say anything about an encrypted message without owning the key. Such cryptosystems are rare. Since the attacker must not be able to say anything, everything including the length must be hidden. How can one even hide the length of a message? The only choices are to not send the message at all, which defeats the purpose of a cryptosystem, or to send an infinitely long message. So once you start sending, you can never stop. If you ever stop sending, the attacker has at least an upper bound on the length of the information. In practice, you can probably stop when you die, since then it will no longer matter, but such a system still is not an idea for the future. And the length can matter a lot. Imagine a secret service looking for a video file. And they intercept a 64 byte encrypted message. Is that message the video they are looking for?

Because information-theoretic security is impractical, and usually too strong a claim for the needs of the users, most cryptosystems only deliver computational security. Certain data about the message is revealed to everyone: The length to some granularity, and often the metadata (when was it sent? Who sends it to whom?). And the remaining data can in theory be recovered, but it is computationally infeasible to do so. For an RNG, cryptographic security means that no attacker being able to observe the output of the function before or after the victim called it shall be able to say anything about the data the victim got from their calls. This requirement can be fulfilled by real-world randomness generators, but also by cryptographically secure pseudo-RNGs that were seeded with entropic data.

High-entropy PRNGs

Notice: The code in this section is illustrative. Actual implementations should not use it.

Imagine, if you will, a fixed-size entropy buffer. It is initialized to some value. Now some entropic data comes along. How can this entropy be captured in the entropy buffer without destroying the entropy already in there, especially without knowing how entropic the data really is? A simple idea would be to append the new data to the entropy buffer, but that would lead to an endlessly growing entropy buffer. But one of the properties of cryptographically secure hash functions is compression, right? So we could just hash the concatenation of the entropy buffer and the new data. Or, somewhat more formally:

static unsigned char entropy_pool[SHA_1_HASH_SIZE];

void mix_in_entropy(const void *data, size_t len)
{
  SHA1_context ctx;
  sha1_init(&ctx);
  sha1_add_data(&ctx, entropy_pool, sizeof entropy_pool);
  sha1_add_data(&ctx, data, len);
  sha1_finalize(&ctx, entropy_pool);
}

This fulfills the goal of having a fixed size buffer. The hashing process also distributes the entropy over the entire buffer, which is a desirable trait. In practice, cryptographic security is not required at this stage, and e.g. Linux uses a non-secure hash function for this purpose.

Now we have a constant sized buffer, with contents growing more and more unpredictable as entropy is added. But how do we actually extract random numbers from it? If we just returned the entropy buffer, we would be giving away the crown jewels. The entropy buffer must be kept secret, because knowing it can massively narrow down the possible random numbers any future consumer is getting. Which compromises cryptographic security.

However, we can hash the entropy pool. The hash of the entropy pool is as random as the entropy pool itself and allows no information about the contents of the pool itself. That is a good idea, but has multiple serious flaws: If the output becomes larger than the size of the used hash function, it will start to repeat. Also, if there are multiple calls in close succession, they will all get the same "random" data. Both of these issues can be remedied by mixing the output back into the entropy pool:

void get_entropy(void *data, size_t len)
{
  SHA1_context ctx;
  unsigned char buf[SHA_1_HASH_SIZE];
  while (len)
  {
    sha1_init(&ctx);
    sha1_add_data(&ctx, entropy_pool, sizeof entropy_pool);
    sha1_finalize(&ctx, buf);
    mix_in_entropy(buf, sizeof buf);
    size_t cpylen = min(len, sizeof buf);
    memcpy(data, buf, cpylen);
    data = (char*)data + cpylen;
    len -= cpylen;
  }
}

It is at this point that the distinction between pseudo-RNG and "real" RNG really breaks down. Here we have a pseudo-RNG being constantly reseeded with real-world entropic data. It returns a nearly endless sequence of random data and will only repeat if more bits can be read than the entropy buffer has available states before more external entropy is mixed in. It can even be seeded from a "real" RNG. So what is it?

Of course this implementation is not perfect. It lacks any kind of synchronization, for one thing. Using it in a multi-threaded environment as is would introduce data races, which are undefined behavior. And it is not a simple topic, since you would want mix_in_entropy() to be safe for interrupt handlers in most cases, so a simple big mutex will not necessarily do the trick. Also, the implementer of these lines is not a cryptographer, so I am certain I made some rookie mistakes. Cryptography can be counter-intuitive at times.


Further Reading

  • [1] - Further information about the two randomness devices in Linux, and general cryptography advice.
  • [2] - An implementation of Fortuna, a CSPRNG.