How To Un-shuffle Data?

Is there a way to undo the shuffle function from sklearn.utils? To explain my problem better: I use the shuffle function to randomize the rows of two matrices: A

Solution 1:

Whether this is possible depends on your choice of f. If f is invertible and you keep track of how the rows were shuffled, it is possible, though not necessarily efficient. The sklearn.utils shuffle function does NOT keep track of how the matrix was shuffled, so you may want to roll your own. To generate a random shuffle, generate a random permutation of range(len(A)), then iteratively swap the rows in that order. To retrieve the original matrices, run the same procedure with the permutation reversed. This would allow you to recover C for certain choices of f (e.g. matrix addition).
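
As a rough illustration of why the choice of f matters, the sketch below records a permutation, shuffles by indexing with it, and undoes the shuffle with the inverse permutation from np.argsort (an index-based variant, not the row-swapping procedure described above). The functions f_add and f_cumsum are hypothetical stand-ins for f, since the original f is not shown: elementwise addition treats each row independently, so unshuffling the result recovers C, while a running sum down the rows mixes rows together and cannot be recovered this way.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
B = np.array([[10, 20], [30, 40], [50, 60], [70, 80]])

# Fixed permutation so the example is deterministic; in practice this
# could come from np.random.permutation(len(A)).
perm = np.array([2, 0, 3, 1])
inverse = np.argsort(perm)  # satisfies perm[inverse[i]] == i

# Shuffle both matrices with the SAME permutation.
A_s, B_s = A[perm], B[perm]

def f_add(X, Y):
    # Hypothetical f: acts on each row independently.
    return X + Y

def f_cumsum(X, Y):
    # Hypothetical f: each output row depends on all rows above it.
    return np.cumsum(X + Y, axis=0)

recovered_add = f_add(A_s, B_s)[inverse]
recovered_cumsum = f_cumsum(A_s, B_s)[inverse]

print(np.array_equal(recovered_add, f_add(A, B)))        # True
print(np.array_equal(recovered_cumsum, f_cumsum(A, B)))  # False
```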

(EDIT, OP requested additional info)

This works for me, but there's probably a more efficient way to do it:

import numpy as np

def shuffle(A, axis=0, permutation=None):
    # Shuffle A in place along `axis` by cycling its slices through the
    # positions listed in `permutation`; return the array and the permutation.
    A = np.swapaxes(A, 0, axis)
    if permutation is None:
        permutation = np.random.permutation(len(A))
    temp = np.copy(A[permutation[0]])
    for i in range(len(A) - 1):
        A[permutation[i]] = A[permutation[i + 1]]
    A[permutation[-1]] = temp
    A = np.swapaxes(A, 0, axis)
    return A, permutation

A = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
print(A)
B, p = shuffle(A)  # NOTE: shuffle works in place, so B shares its data with A!
print("shuffle A")
print(B)
D, _ = shuffle(B, permutation=p[::-1])  # the reversed permutation undoes the cycle
print("unshuffle B to get A")
print(D)

B = np.copy(B)
C = A + B
print("A+B")
print(C)

A_s, p = shuffle(A)
B_s, _ = shuffle(B, permutation=p)
C_s = A_s + B_s

print("shuffle A and B, then add")
print(C_s)

print("unshuffle that to get the original sum")
CC, _ = shuffle(C_s, permutation=p[::-1])
print(CC)
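
One possibly more efficient variant, offered as a sketch rather than a drop-in replacement, indexes with the permutation instead of looping over rows in Python. The helper names shuffle_rows and unshuffle_rows are hypothetical. Unlike the function above, they return new arrays instead of modifying the input in place, and for the same permutation they produce a different row order than the cycle-based shuffle.

```python
import numpy as np

def shuffle_rows(X, permutation=None, axis=0):
    """Return a shuffled copy of X along `axis`, plus the permutation used."""
    if permutation is None:
        permutation = np.random.permutation(X.shape[axis])
    return np.take(X, permutation, axis=axis), permutation

def unshuffle_rows(X_shuffled, permutation, axis=0):
    """Undo shuffle_rows by scattering each slice back to its original index."""
    moved = np.moveaxis(X_shuffled, axis, 0)
    restored = np.empty_like(moved)
    restored[permutation] = moved  # slice i goes back to position permutation[i]
    return np.moveaxis(restored, 0, axis)

# Usage: shuffle two matrices with the same permutation, add them,
# then unshuffle the sum.
A = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
B = 10 * A
A_s, p = shuffle_rows(A)
B_s, _ = shuffle_rows(B, permutation=p)
C = unshuffle_rows(A_s + B_s, p)
print(np.array_equal(C, A + B))  # True
```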
