TensorFlow: Bincount With Axis Option

June 13, 2023

In TensorFlow, I can get the count of each element in an array with tf.bincount:

x = tf.placeholder(tf.int32, [None])
freq = tf.bincount(x)
tf.Session().run(freq, feed_dict = {x:[2

Solution 1:

A simple way I found to do this is to take advantage of broadcasting to compare all values in the tensor against the pattern [0, 1, ..., length - 1], and then count the number of "hits" along the desired axis.

Namely:

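import tensorflow as tf

def bincount(arr, length, axis=-1):
  """Count the number of occurrences of each value along an axis."""
  # Compare every element against [0, 1, ..., length - 1]; mask has shape arr.shape + [length].
  mask = tf.equal(arr[..., tf.newaxis], tf.range(length))
  # The new trailing axis shifts negative axis indices by one.
  return tf.math.count_nonzero(mask, axis=axis - 1 if axis < 0 else axis)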

x = tf.convert_to_tensor([[2,3,1,3,7],[1,1,2,2,3]])
bincount(x, tf.reduce_max(x) + 1, axis=1)

returns:

<tf.Tensor: id=406, shape=(2, 8), dtype=int64, numpy=
array([[0, 1, 1, 2, 0, 0, 0, 1],
       [0, 2, 2, 1, 0, 0, 0, 0]])>
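
The same broadcast-and-count idea can be checked in plain NumPy, where the intermediate shapes are easy to inspect. This is only an illustrative sketch: the name bincount_axis is hypothetical and not part of either library, and it assumes non-negative integer input.

```python
import numpy as np

def bincount_axis(arr, length, axis=-1):
    """NumPy analogue of the broadcast-and-count approach (illustrative only)."""
    arr = np.asarray(arr)
    # mask[..., v] is True where arr == v; its shape is arr.shape + (length,)
    mask = arr[..., np.newaxis] == np.arange(length)
    # Negative axes shift by one because of the new trailing axis.
    return np.count_nonzero(mask, axis=axis - 1 if axis < 0 else axis)

x = np.array([[2, 3, 1, 3, 7], [1, 1, 2, 2, 3]])
per_row = bincount_axis(x, x.max() + 1, axis=1)   # counts within each row, shape (2, 8)
per_col = bincount_axis(x, x.max() + 1, axis=0)   # counts within each column, shape (5, 8)
print(per_row)
print(per_col.shape)
```

Note that the mask holds arr.size * length booleans, so memory use grows with the number of bins.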

Solution 2:

A solution to this for NumPy arrays is given in: Apply bincount to each row of a 2D numpy array. Make the values of every row unique by adding row_id * (max + 1) to each row, take the bincount of the flattened 1D array, and then reshape the result so that each input row gets its own row of counts.
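
A minimal NumPy sketch of that offset trick, written here for illustration rather than copied from the linked answer. The function name bincount_rows is hypothetical, and the input is assumed to be a 2D array of non-negative integers.

```python
import numpy as np

def bincount_rows(a):
    """Row-wise bincount of a 2D array of non-negative integers."""
    a = np.asarray(a)
    n_bins = a.max() + 1                      # one bin per value 0..max
    # Shift row i by i * n_bins so that no two rows share a value.
    offset = a + n_bins * np.arange(a.shape[0])[:, None]
    # One global bincount now holds every row's counts in its own block.
    counts = np.bincount(offset.ravel(), minlength=n_bins * a.shape[0])
    return counts.reshape(a.shape[0], n_bins)

x = np.array([[2, 3, 1, 3, 7], [1, 1, 2, 2, 3]])
result = bincount_rows(x)
print(result)
```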

For TF make the following changes:

x = tf.placeholder(tf.int32, [None, None])

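max_x_plus_1 = tf.reduce_max(x) + 1
# Offset row i by i * (max + 1) so values from different rows never collide.
ids = x + max_x_plus_1 * tf.range(tf.shape(x)[0])[:, None]
# Count over the flattened ids, then fold back into one row of counts per input row.
out = tf.reshape(tf.bincount(tf.reshape(ids, [-1]),
                 minlength=max_x_plus_1 * tf.shape(x)[0]), [-1, max_x_plus_1])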

tf.Session().run(out, feed_dict = {x:[[2,3,1,3,7],[1,1,2,2,3]]})
#[[0, 1, 1, 2, 0, 0, 0, 1],
#[0, 2, 2, 1, 0, 0, 0, 0]]

Solution 3:

I needed this myself and wrote a little function for it, since there isn't an official implementation.

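import numpy as np
import tensorflow as tf

def bincount(tensor, minlength=None, axis=None):
    if axis is None:
        return tf.bincount(tensor, minlength=minlength)
    else:
        # Accept a single axis as well as a list of axes.
        if not hasattr(axis, "__len__"):
            axis = [axis]

        other_axis = [x for x in range(0, len(tensor.shape)) if x not in axis]
        # Move the axes to count over to the end.
        swap = tf.transpose(tensor, [*other_axis, *axis])
        # Collapse all remaining axes into one leading dimension.
        flat = tf.reshape(swap, [-1, *np.take(tensor.shape.as_list(), axis)])
        # Run bincount on each entry of the leading dimension.
        count = tf.map_fn(lambda x: tf.bincount(x, minlength=minlength), flat)
        # Restore the other axes; an unknown (None) dimension becomes -1.
        res = tf.reshape(count, [*np.take([-1 if a is None else a for a in tensor.shape.as_list()], other_axis), minlength])
        return res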

It contains a lot of handling for different edge cases.

The gist of this solution is the following part:

swap = tf.transpose(tensor, [*other_axis, *axis])
flat = tf.reshape(swap, [-1, *np.take(tensor.shape.as_list(), axis)])
count = tf.map_fn(lambda x: tf.bincount(x, minlength=minlength), flat)

The transpose operation moves all axes that you want to bincount to the end of the tensor. For example, if you have a tensor of shape [100, 50, 20] (axes [0, 1, 2]) and you want the bincount along axis 1, this operation moves axis 1 to the end and you get a tensor of shape [100, 20, 50].
The reshape operation flattens all other axes, the ones you don't want a bincount for, into a single dimension.
The map_fn operation maps a bincount onto every entry of that flattened dimension.
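
To make the shapes concrete, here is a hedged NumPy walk-through of the same three steps for the [100, 50, 20] example. The random data and the value range are assumptions made up for the illustration, and a Python loop with np.stack stands in for tf.map_fn.

```python
import numpy as np

# Hypothetical data: integers in [0, 5) with shape [100, 50, 20].
rng = np.random.default_rng(0)
tensor = rng.integers(0, 5, size=(100, 50, 20))
axis = [1]
minlength = 5

other_axis = [a for a in range(tensor.ndim) if a not in axis]   # [0, 2]
# Step 1: move the counted axis to the end.
swap = np.transpose(tensor, [*other_axis, *axis])               # (100, 20, 50)
# Step 2: flatten the remaining axes into one leading dimension.
flat = swap.reshape(-1, *np.take(tensor.shape, axis))           # (2000, 50)
# Step 3: bincount every entry of the leading dimension.
count = np.stack([np.bincount(row, minlength=minlength) for row in flat])  # (2000, 5)
# Finally, restore the other axes.
res = count.reshape(*np.take(tensor.shape, other_axis), minlength)         # (100, 20, 5)

print(swap.shape, flat.shape, count.shape, res.shape)
```

Each entry res[i, k] holds the counts of the 50 values in tensor[i, :, k].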

You have to specify the minlength parameter. This is needed so that all bincount results have the same length (otherwise the result wouldn't have a valid shape). It should be at least the maximum value in your tensor plus one. For me it was better to pass it as a parameter, since I already had this value and didn't need to retrieve it, but you could also calculate it with tf.reduce_max(tensor) + 1.
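
A small NumPy sketch of why a common minlength matters. np.bincount is used here as a stand-in on the assumption that its minlength argument pads the result in the same way, and the two rows are made-up data.

```python
import numpy as np

rows = np.array([[0, 1, 1], [3, 3, 0]])

# Without minlength, each result is only as long as that row's own maximum + 1,
# so the per-row results cannot be stacked into one array.
ragged = [np.bincount(r) for r in rows]
lengths = [len(c) for c in ragged]

# With a shared minlength of (global maximum + 1), every result has the same length.
minlength = rows.max() + 1
counts = np.stack([np.bincount(r, minlength=minlength) for r in rows])
print(lengths)
print(counts)
```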

The full solution additionally reshapes it back to restore the other axes. It also supports multiple axes and a single None axis in the tensor (for batching).

Solution 4:

tf.bincount() accepts an array as its argument, but it aggregates the counts across the whole array and, at the moment, cannot work along a given axis. For example:

In [27]: arr
Out[27]: 
array([[2, 3, 1, 3, 7],
       [1, 1, 2, 2, 3]], dtype=int32)

In [28]: x = tf.placeholder(tf.int32, [None, None])
    ...: freq = tf.bincount(x)
    ...: tf.Session().run(freq, feed_dict = {x:arr})

# aggregates the count across the whole array
Out[28]: array([0, 3, 3, 3, 0, 0, 0, 1], dtype=int32)
# 0 occurs 0 times
# 1 occurs 3 times
# 2 occurs 3 times
# 3 occurs 3 times and so on..

So, at least as of now, there is no way to pass the axis information to tf.bincount().

However, a slightly inefficient way would be to pass one row at a time to tf.bincount(), collect the results, and finally combine the resulting 1D arrays into an array of the desired dimensionality.

I'm unsure whether this is the most efficient way, but here is one way to loop over the tensor (along axis 0):

In [3]: arr = np.array([[2, 3, 1, 3, 7], [1, 1, 2, 2, 3]], dtype=np.int32)
In [4]: sess = tf.InteractiveSession()

In [5]: for idx, row in enumerate(tf.unstack(arr)):
   ...:     freq = tf.bincount(row)
   ...:     print(freq.eval())
   ...:     
[0 1 1 2 0 0 0 1]
[0 2 2 1]
