Portability And Reproducibility Of RNG Techniques
Solution 1:
Most portable version, IMO, would be LCG with period equal to natural word size of the machine. It uses register overflow for module which makes it even faster. You have to use NumPy datatypes to do that, here is simple code, constants a, c are taken from Table 4 here
import numpy as np
def LCG(seed: np.uint64, a: np.uint64, c: np.uint64) -> np.uint64:
with np.errstate(over='ignore'):
while True:
seed = (a * seed + c)
yield seed
a = np.uint64(2862933555777941757)
c = np.uint64(1)
rng64 = LCG(np.uint64(17), a, c)
print(next(rng64))
print(next(rng64))
print(next(rng64))
Both Linux x64 and Windows x64 as well as OS X VM works exactly the same.
Concerning reproducibility, the only good is to store first couple numbers and compare them with LCG output during app initialization stage - if they are ok, you proceed further.
Another feature of the LCG I like is it's ability to jump ahead in log2(N) time, where N is number of skips. I could provide you with code to do that. Using jump ahead you could ensure non-overlapping sequences for parallel independent random streams
UPDATE
Here is translation of my C code into Python/NumPy, seems to work. It could skip forward as well as backward in logarithmic time.
import numpy as np
class LCG(object):
UZERO: np.uint64 = np.uint64(0)
UONE : np.uint64 = np.uint64(1)
def __init__(self, seed: np.uint64, a: np.uint64, c: np.uint64) -> None:
self._seed: np.uint64 = np.uint64(seed)
self._a : np.uint64 = np.uint64(a)
self._c : np.uint64 = np.uint64(c)
def next(self) -> np.uint64:
self._seed = self._a * self._seed + self._c
return self._seed
def seed(self) -> np.uint64:
return self._seed
def set_seed(self, seed: np.uint64) -> np.uint64:
self._seed = seed
def skip(self, ns: np.int64) -> None:
"""
Signed argument - skip forward as well as backward
The algorithm here to determine the parameters used to skip ahead is
described in the paper F. Brown, "Random Number Generation with Arbitrary Stride,"
Trans. Am. Nucl. Soc. (Nov. 1994). This algorithm is able to skip ahead in
O(log2(N)) operations instead of O(N). It computes parameters
A and C which can then be used to find x_N = A*x_0 + C mod 2^M.
"""
nskip: np.uint64 = np.uint64(ns)
a: np.uint64 = self._a
c: np.uint64 = self._c
a_next: np.uint64 = LCG.UONE
c_next: np.uint64 = LCG.UZERO
while nskip > LCG.UZERO:
if (nskip & LCG.UONE) != LCG.UZERO:
a_next = a_next * a
c_next = c_next * a + c
c = (a + LCG.UONE) * c
a = a * a
nskip = nskip >> LCG.UONE
self._seed = a_next * self._seed + c_next
np.seterr(over='ignore')
a = np.uint64(2862933555777941757)
c = np.uint64(1)
seed = np.uint64(1)
rng64 = LCG(seed, a, c) # initialization
print(rng64.next())
print(rng64.next())
print(rng64.next())
rng64.skip(-3) # back by 3
print(rng64.next())
print(rng64.next())
print(rng64.next())
rng64.skip(-3) # back by 3
rng64.skip(2) # forward by 2
print(rng64.next())
Anyway, summary of the LCG RNG:
- With good constants (see reference to L'Ecuyer paper) it will cover whole [0...2) range without repeating itself. Basically perfect [0...2) -> [0...2) mapping, you could set 0,1,2,3,... as input and get whole range output
- It is reversible, you could get previous seed back so mapping is actually bijection, [0...2) <-> [0...2). See Reversible pseudo-random sequence generator for details
- It has logarithmic skip forward and backward, so there is no problem to find suitable intervals for parallel computation - start with single seed, and then next thread would be skip(seed, 2/N), next thread skip(seed, 2/N * 2) and so on and so forth. Guaranteed to not overlap
- It is simple and fast, though not a very high quality RNG
Solution 2:
LCGs are nice. If you want to make LCGs more understandable, you may implement it recursively instead of iteratively to highlight the recursive formula it's based upon. Do it only if you don't worry too much about complexity.
Otherwise, I think method 2 is clear enough for a PRNG.
Solution 3:
There are many ways an algorithm (particularly a pseudorandom number generator) can have inconsistent results across computers. The most notable way is if the algorithm relies on floating-point numbers (which almost always are limited precision) and floating-point rounding.
Some programming languages are more prone to portability issues than others. For example, unlike in Python, inconsistent results can arise in C or C++ due to—
- Undefined behavior of signed integer overflows, or
- how the lengths of certain data types, notably
int
andlong
, are defined differently across compilers.
I am not aware of any way the Python code in method 2 can deliver inconsistent results across computers. On the other hand, whether method 1 can do so depends on whether random.sample
is implemented the same way across all Python versions you care about and across all computers.
Post a Comment for "Portability And Reproducibility Of RNG Techniques"