Portability And Reproducibility Of RNG Techniques

May 03, 2023 Post a Comment

I can use one of two methods to create a pseudo random number sequence that has two important characteristics - (1) it is reproducible on different machines, and (2) the sequence n

Solution 1:

Most portable version, IMO, would be LCG with period equal to natural word size of the machine. It uses register overflow for module which makes it even faster. You have to use NumPy datatypes to do that, here is simple code, constants a, c are taken from Table 4 here

import numpy as np

def LCG(seed: np.uint64, a: np.uint64, c: np.uint64) -> np.uint64:
    with np.errstate(over='ignore'):
        while True:
            seed = (a * seed + c)
            yield seed

a = np.uint64(2862933555777941757)
c = np.uint64(1)

rng64 = LCG(np.uint64(17), a, c)

print(next(rng64))
print(next(rng64))
print(next(rng64))

Both Linux x64 and Windows x64 as well as OS X VM works exactly the same.

Concerning reproducibility, the only good is to store first couple numbers and compare them with LCG output during app initialization stage - if they are ok, you proceed further.

Another feature of the LCG I like is it's ability to jump ahead in log₂(N) time, where N is number of skips. I could provide you with code to do that. Using jump ahead you could ensure non-overlapping sequences for parallel independent random streams

UPDATE

Here is translation of my C code into Python/NumPy, seems to work. It could skip forward as well as backward in logarithmic time.

import numpy as np

class LCG(object):

    UZERO: np.uint64 = np.uint64(0)
    UONE : np.uint64 = np.uint64(1)

    def __init__(self, seed: np.uint64, a: np.uint64, c: np.uint64) -> None:
        self._seed: np.uint64 = np.uint64(seed)
        self._a   : np.uint64 = np.uint64(a)
        self._c   : np.uint64 = np.uint64(c)

    def next(self) -> np.uint64:
        self._seed = self._a * self._seed + self._c
        return self._seed

    def seed(self) -> np.uint64:
        return self._seed

    def set_seed(self, seed: np.uint64) -> np.uint64:
        self._seed = seed

    def skip(self, ns: np.int64) -> None:
        """
        Signed argument - skip forward as well as backward

        The algorithm here to determine the parameters used to skip ahead is
        described in the paper F. Brown, "Random Number Generation with Arbitrary Stride,"
        Trans. Am. Nucl. Soc. (Nov. 1994). This algorithm is able to skip ahead in
        O(log2(N)) operations instead of O(N). It computes parameters
        A and C which can then be used to find x_N = A*x_0 + C mod 2^M.
        """

        nskip: np.uint64 = np.uint64(ns)

        a: np.uint64 = self._a
        c: np.uint64 = self._c

        a_next: np.uint64 = LCG.UONE
        c_next: np.uint64 = LCG.UZERO

        while nskip > LCG.UZERO:
            if (nskip & LCG.UONE) != LCG.UZERO:
                a_next = a_next * a
                c_next = c_next * a + c

            c = (a + LCG.UONE) * c
            a = a * a

            nskip = nskip >> LCG.UONE

        self._seed = a_next * self._seed + c_next    


np.seterr(over='ignore')

a = np.uint64(2862933555777941757)
c = np.uint64(1)
seed = np.uint64(1)

rng64 = LCG(seed, a, c) # initialization

print(rng64.next())
print(rng64.next())
print(rng64.next())

rng64.skip(-3) # back by 3
print(rng64.next())
print(rng64.next())
print(rng64.next())

rng64.skip(-3) # back by 3
rng64.skip(2) # forward by 2
print(rng64.next())

Anyway, summary of the LCG RNG:

With good constants (see reference to L'Ecuyer paper) it will cover whole [0...2) range without repeating itself. Basically perfect [0...2) -> [0...2) mapping, you could set 0,1,2,3,... as input and get whole range output
It is reversible, you could get previous seed back so mapping is actually bijection, [0...2) <-> [0...2). See Reversible pseudo-random sequence generator for details
It has logarithmic skip forward and backward, so there is no problem to find suitable intervals for parallel computation - start with single seed, and then next thread would be skip(seed, 2/N), next thread skip(seed, 2/N * 2) and so on and so forth. Guaranteed to not overlap
It is simple and fast, though not a very high quality RNG

Solution 2:

LCGs are nice. If you want to make LCGs more understandable, you may implement it recursively instead of iteratively to highlight the recursive formula it's based upon. Do it only if you don't worry too much about complexity.

Otherwise, I think method 2 is clear enough for a PRNG.

Solution 3:

There are many ways an algorithm (particularly a pseudorandom number generator) can have inconsistent results across computers. The most notable way is if the algorithm relies on floating-point numbers (which almost always are limited precision) and floating-point rounding.

Some programming languages are more prone to portability issues than others. For example, unlike in Python, inconsistent results can arise in C or C++ due to—

Undefined behavior of signed integer overflows, or
how the lengths of certain data types, notably int and long, are defined differently across compilers.

I am not aware of any way the Python code in method 2 can deliver inconsistent results across computers. On the other hand, whether method 1 can do so depends on whether random.sample is implemented the same way across all Python versions you care about and across all computers.

Getting Started with Python

Portability And Reproducibility Of RNG Techniques

Solution 1:

Solution 2:

Solution 3:

Post a Comment for "Portability And Reproducibility Of RNG Techniques"