
Reshape Array On Xaxis And Fill With Mean Value In Python?

I'm trying to reshape an array in Python and fill it with mean values. Example:

Given array: [2, 3, -20, 10, 4]
Desired array: [2, 2.5, 3, -8.5, -20, -5, 10, 7, 4]

More advanced

Solution 1:

It depends on what you mean by well-distributed values. Assuming your values lie on an evenly spaced grid, the following interpolation-based solution could make sense:

>>> import numpy as np
>>> a = np.array([2, 3, -20, 10, 4])  # input array from the question
>>> new_length = 9
>>> b = np.interp(np.linspace(0, len(a) - 1, new_length), range(len(a)), a)
>>> b
array([  2. ,   2.5,   3. ,  -8.5, -20. ,  -5. ,  10. ,   7. ,   4. ])

This will also work if len(a)=1000 and new_length=1300.
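
To illustrate the general case, here is a minimal sketch that wraps the same np.interp call in a helper. The name resample_linear is hypothetical and not part of NumPy. Note that when the new length is not 2*len(a)-1, the new sample positions no longer fall exactly on midpoints, so the result is a linear interpolation rather than a list of pairwise means.

```python
import numpy as np

def resample_linear(a, new_length):
    """Linearly resample 1D data onto new_length evenly spaced positions.

    Hypothetical helper for illustration; it only wraps np.interp.
    """
    a = np.asarray(a, dtype=float)
    old_positions = np.arange(len(a))                       # 0, 1, ..., len(a)-1
    new_positions = np.linspace(0, len(a) - 1, new_length)  # same span, new count
    return np.interp(new_positions, old_positions, a)

a = [2, 3, -20, 10, 4]
print(resample_linear(a, 9))  # new positions are spaced 0.5 apart (midpoints)
print(resample_linear(a, 7))  # new positions are spaced 4/6 apart
```

The first and last values are always preserved, because np.linspace includes both endpoints.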

Solution 2:

You can use a differentiation trick here with np.diff. Thus, assuming A as the input array, you can do -

out = np.empty(2*A.size - 1)
out[0::2] = A
out[1::2] = (np.diff(A) + 2*A[:-1]).astype(float)/2  # Interpolated values

The trick here is that the difference between two consecutive elements, added to twice the previous element, equals the sum of those two elements: (A[i+1] - A[i]) + 2*A[i] = A[i] + A[i+1]. Dividing that sum by 2 gives their mean. We apply this across the whole 1D input array and write the means into the odd positions of the output, while the original values fill the even positions.
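
As a quick sanity check, the diff-based expression can be compared with the direct mean of neighbouring elements. This is only an illustrative sketch, and the variable names via_diff and pairwise_mean are my own.

```python
import numpy as np

A = np.array([2, 3, -20, 10, 4])

via_diff = (np.diff(A) + 2 * A[:-1]) / 2  # expression used in the answer
pairwise_mean = (A[:-1] + A[1:]) / 2      # direct mean of each neighbouring pair

print(via_diff)
print(np.array_equal(via_diff, pairwise_mean))
```

Both expressions produce the same four values, so the direct form could be used interchangeably if it reads more clearly.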

Sample run -

In [34]: A
Out[34]: array([  2,   3, -20,  10,   4])

In [35]: out = np.empty(2*A.size-1)
    ...: out[0::2] = A
    ...: out[1::2] = (np.diff(A) +2*A[:-1]).astype(float)/2
    ...: 

In [36]: out
Out[36]: array([  2. ,   2.5,   3. ,  -8.5, -20. ,  -5. ,  10. ,   7. ,   4. ])

I think @thomas's solution is the go-to approach here, since we are basically doing interpolation for one specific case. But since I am mostly interested in performance, here's a runtime test comparing the two solutions -

In [62]: def interp_based(A):   # @thomas's solution
    ...:    new_length = 2*A.size-1
    ...:    return np.interp(np.linspace(0,len(A)-1,new_length),range(len(A)),A)
    ...: 
    ...: def diff_based(A):
    ...:    out = np.empty(2*A.size-1)
    ...:    out[0::2] = A
    ...:    out[1::2] = (np.diff(A) + 2*A[:-1]).astype(float)/2
    ...:    return out
    ...: 

In [63]: A = np.random.randint(0,10000,(10000))

In [64]: %timeit interp_based(A)
1000 loops, best of 3: 932 µs per loop

In [65]: %timeit diff_based(A)
10000 loops, best of 3: 148 µs per loop
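
Timings depend on hardware and NumPy version, so before comparing speed it is worth confirming that both functions return the same values. A minimal sketch, reusing the two functions from the runtime test and assuming a NumPy version recent enough to provide np.random.default_rng:

```python
import numpy as np

def interp_based(A):
    # Interpolate onto a grid with one extra point between each pair of samples.
    new_length = 2 * A.size - 1
    return np.interp(np.linspace(0, len(A) - 1, new_length), range(len(A)), A)

def diff_based(A):
    # Original values at even positions, neighbour means at odd positions.
    out = np.empty(2 * A.size - 1)
    out[0::2] = A
    out[1::2] = (np.diff(A) + 2 * A[:-1]).astype(float) / 2
    return out

rng = np.random.default_rng(0)     # seeded so the check is repeatable
A = rng.integers(0, 10000, 10000)  # same size and value range as the test above

same = np.allclose(interp_based(A), diff_based(A))
print(same)
```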

Solution 3:

I've written a solution that works even better for me. I had some problems with floating-point errors on large arrays, which left a few values missing; to correct for that, I insert the missing ones at random positions. Maybe someone knows how to avoid this. I'm sure the code can be optimized, so feel free to improve it.

import numpy as np
def resizeArray(data, newLength):

    datalength = len(data)
    if (datalength == newLength): return data

    appendIndices = []
    appendNow = 0
    step = newLength / datalength
    increase = step % 1
    for i in np.arange(0, datalength-2, step):
        appendNow += increase
        if appendNow >= 1:
            appendIndices.append(int(round(i)))  # indices must be integers
            appendNow = appendNow % 1

    # still missing values due to floating-point errors?
    diff = newLength - datalength - len(appendIndices)
    if diff > 0:
        for i in range(0, diff):
            appendIndices.append(np.random.randint(1, datalength - 2))

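    # insert the averages at the specified indices
    appendVals = [(data[i] + data[i+1]) / 2 for i in appendIndices]
    # np.insert places each value before the given index, so use i + 1; the float
    # cast keeps means such as 2.5 from being truncated for integer data
    a = np.insert(np.asarray(data, dtype=float), [i + 1 for i in appendIndices], appendVals)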

    return a
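
One possible way to avoid both the floating-point drift and the random fill-in is to choose the gaps with integer arithmetic. The sketch below is an alternative technique, not the code above; the function name resize_with_means is hypothetical, and it assumes the new length lies between len(data) and 2*len(data)-1 so that each gap receives at most one mean.

```python
import numpy as np

def resize_with_means(data, new_length):
    """Insert neighbour means into evenly chosen gaps (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    extra = new_length - n  # number of values to insert
    if extra == 0:
        return data
    if not 0 < extra <= n - 1:
        raise ValueError("new_length must be between len(data) and 2*len(data)-1")
    # Integer arithmetic picks `extra` distinct gaps out of the n-1 available,
    # so there is no floating-point accumulation and no random fill-in.
    gaps = (np.arange(extra) * (n - 1)) // extra
    means = (data[gaps] + data[gaps + 1]) / 2
    # np.insert places each value before the given index, hence gaps + 1.
    return np.insert(data, gaps + 1, means)

print(resize_with_means([2, 3, -20, 10, 4], 9))
print(resize_with_means([2, 3, -20, 10, 4], 7))
```

Because the gap indices are computed with floor division, the result always has exactly new_length elements, at the cost of rounding the chosen gaps toward the start of the array.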
