Tensorflow: Bincount With Axis Option
Solution 1:
A simple way I found to do this is to take advantage of broadcasting to compare all values in the tensor against the pattern [0, 1, ..., length - 1], and then count the number of "hits" along the desired axis. Namely:
import tensorflow as tf

def bincount(arr, length, axis=-1):
    """Count the number of occurrences of each value along an axis."""
    # mask[..., k] is True wherever arr == k; the bin axis is appended last.
    mask = tf.equal(arr[..., tf.newaxis], tf.range(length))
    return tf.math.count_nonzero(mask, axis=axis - 1 if axis < 0 else axis)
x = tf.convert_to_tensor([[2,3,1,3,7],[1,1,2,2,3]])
bincount(x, tf.reduce_max(x) + 1, axis=1)
returns:
<tf.Tensor: id=406, shape=(2, 8), dtype=int64, numpy=
array([[0, 1, 1, 2, 0, 0, 0, 1],
       [0, 2, 2, 1, 0, 0, 0, 0]])>
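To see what the broadcasting step does to the shapes, here is a rough NumPy analogue of the same idea. It is only a sketch for intuition, not the TensorFlow code itself, and the name bincount_along_axis is made up for this example.

```python
import numpy as np

def bincount_along_axis(arr, length, axis=-1):
    """NumPy analogue of the broadcasting approach (illustrative name)."""
    arr = np.asarray(arr)
    # Append a bin axis: mask[..., k] is True where arr == k.
    mask = arr[..., np.newaxis] == np.arange(length)
    # The bin axis is now last, so a negative axis has to shift by one.
    reduce_axis = axis - 1 if axis < 0 else axis
    return np.count_nonzero(mask, axis=reduce_axis)

x = np.array([[2, 3, 1, 3, 7], [1, 1, 2, 2, 3]])
num_bins = x.max() + 1
per_row = bincount_along_axis(x, num_bins, axis=1)     # shape (2, 8)
per_column = bincount_along_axis(x, num_bins, axis=0)  # shape (5, 8)
```

Note that the mask holds one entry per element per bin, so memory use grows with length. For large value ranges one of the other approaches may be a better fit.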
Solution 2:
A solution for this is given for NumPy arrays in "Apply bincount to each row of a 2D numpy array". Make every row unique by adding row_id * (max + 1) to each row, then compute bincount on the flattened 1D array and reshape the result appropriately.
For TF, make the following changes:
import tensorflow as tf  # TF 1.x API (placeholder / Session)

x = tf.placeholder(tf.int32, [None, None])
max_x_plus_1 = tf.reduce_max(x) + 1
# Shift row i by i * (max + 1) so that no two rows share a bin.
ids = x + max_x_plus_1 * tf.range(tf.shape(x)[0])[:, None]
out = tf.reshape(tf.bincount(tf.layers.flatten(ids),
                             minlength=max_x_plus_1 * tf.shape(x)[0]),
                 [-1, max_x_plus_1])
tf.Session().run(out, feed_dict={x: [[2, 3, 1, 3, 7], [1, 1, 2, 2, 3]]})
# [[0, 1, 1, 2, 0, 0, 0, 1],
#  [0, 2, 2, 1, 0, 0, 0, 0]]
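The row-offset trick may be easier to follow in plain NumPy, where no session is needed. The following is a minimal sketch assuming a 2D array of non-negative integers. The function name bincount_rows is hypothetical.

```python
import numpy as np

def bincount_rows(x):
    """Per-row bincount via row offsets (illustrative NumPy sketch)."""
    x = np.asarray(x)
    num_bins = int(x.max()) + 1
    # Row i is shifted by i * num_bins, so its values land in their own
    # block of bins: [i * num_bins, (i + 1) * num_bins).
    row_offsets = num_bins * np.arange(x.shape[0])[:, None]
    ids = x + row_offsets
    # One flat bincount over all rows, then one block of bins per row.
    counts = np.bincount(ids.ravel(), minlength=num_bins * x.shape[0])
    return counts.reshape(-1, num_bins)

result = bincount_rows([[2, 3, 1, 3, 7], [1, 1, 2, 2, 3]])
```

Here result has one row of counts per input row, matching the output shown above for the TF version.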
Solution 3:
I needed this myself and wrote a little function for it, since there isn't an official implementation.
import numpy as np
import tensorflow as tf

def bincount(tensor, minlength=None, axis=None):
    if axis is None:
        return tf.bincount(tensor, minlength=minlength)
    else:
        if not hasattr(axis, "__len__"):
            axis = [axis]
        other_axis = [x for x in range(0, len(tensor.shape)) if x not in axis]
        # Move the counted axes to the end, then flatten all remaining axes.
        swap = tf.transpose(tensor, [*other_axis, *axis])
        flat = tf.reshape(swap, [-1, *np.take(tensor.shape.as_list(), axis)])
        count = tf.map_fn(lambda x: tf.bincount(x, minlength=minlength), flat)
        # Restore the remaining axes; an unknown (None) dimension becomes -1.
        res = tf.reshape(count, [*np.take([-1 if a is None else a for a in tensor.shape.as_list()], other_axis), minlength])
        return res
It contains a lot of handling for different edge cases.
The gist of this solution is the following part:
swap = tf.transpose(tensor, [*other_axis, *axis])
flat = tf.reshape(swap, [-1, *np.take(tensor.shape.as_list(), axis)])
count = tf.map_fn(lambda x: tf.bincount(x, minlength=minlength), flat)
- The transpose operation moves all axes that you want to bincount to the end of the tensor. For example, if you have a tensor of shape [100, 50, 20] with axes [0, 1, 2] and you want the bincount for axis 1, this operation moves axis 1 to the end and you get a tensor of shape [100, 20, 50].
- The reshape operation flattens all other axes, for which you don't want a bincount, into a single dimension / axis.
- The map_fn operation maps a bincount onto every entry of the flattened dimension / axis.
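The three steps above can be traced shape by shape. This is a NumPy sketch under my own assumptions: random values in 0..3, minlength set to 4, and a list comprehension standing in for map_fn. It uses the [100, 50, 20] example and counts along axis 1.

```python
import numpy as np

rng = np.random.default_rng(0)
tensor = rng.integers(0, 4, size=(100, 50, 20))  # assumed values in 0..3
axis = [1]
minlength = 4  # assumed: max value + 1

other_axis = [d for d in range(tensor.ndim) if d not in axis]  # [0, 2]

# 1. transpose: move the counted axis to the end -> (100, 20, 50)
swap = np.transpose(tensor, other_axis + axis)

# 2. reshape: flatten all other axes into one -> (2000, 50)
flat = swap.reshape(-1, *[tensor.shape[a] for a in axis])

# 3. map: one bincount per flattened entry -> (2000, 4)
count = np.stack([np.bincount(row, minlength=minlength) for row in flat])

# Restore the other axes -> (100, 20, 4)
res = count.reshape(*[tensor.shape[a] for a in other_axis], minlength)
```

With this layout, res[i, k] holds the counts of tensor[i, :, k].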
You have to specify the minlength parameter. This is needed so that all bincount results have the same length (otherwise the result would not have a valid shape). It should be the maximum value in your tensor plus one. For me it was better to pass it as a parameter since I already had this value and didn't need to retrieve it, but you could also calculate it with tf.reduce_max(tensor) + 1.
The full solution additionally reshapes it back to restore the other axes.
It also supports multiple axes and a single None axis in the tensor (for batching).
Solution 4:
tf.bincount() accepts an array as its argument, but it aggregates the count across the whole array and does not work along a given axis, at the moment. For example:
In [27]: arr
Out[27]:
array([[2, 3, 1, 3, 7],
       [1, 1, 2, 2, 3]], dtype=int32)
In [28]: x = tf.placeholder(tf.int32, [None, None])
    ...: freq = tf.bincount(x)
    ...: tf.Session().run(freq, feed_dict = {x:arr})
# aggregates the count across the whole array
Out[28]: array([0, 3, 3, 3, 0, 0, 0, 1], dtype=int32)
# 0 occurs 0 times
# 1 occurs 3 times
# 2 occurs 3 times
# 3 occurs 3 times and so on..
So, at least as of now, there is no way to pass the axis information to tf.bincount().
However, a slightly inefficient way would be to pass one row at a time to tf.bincount() and get the results, and then finally combine the resulting 1D arrays into an array of the desired dimensionality.
I'm unsure whether this is the most efficient way, but here is one way to loop over the tensor (along axis 0):
In [3]: arr = np.array([[2, 3, 1, 3, 7], [1, 1, 2, 2, 3]], dtype=np.int32)
In [4]: sess = tf.InteractiveSession()
In [5]: for idx, row in enumerate(tf.unstack(arr)):
   ...:     freq = tf.bincount(row)
   ...:     print(freq.eval())
   ...:
[0 1 1 2 0 0 0 1]
[0 2 2 1]
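Note that the two per-row results above have different lengths (8 and 4), so they cannot be combined into a 2D array directly. One possible way to handle the combining step is to pad every row to a shared number of bins via minlength before stacking. The sketch below shows the idea in NumPy rather than TF, and the variable names are my own.

```python
import numpy as np

arr = np.array([[2, 3, 1, 3, 7], [1, 1, 2, 2, 3]], dtype=np.int32)

# Without minlength, each row's result only reaches up to that row's max.
ragged = [np.bincount(row) for row in arr]
lengths = [len(r) for r in ragged]  # [8, 4]

# With a shared minlength every row yields the same number of bins,
# so the 1D results can be stacked into a 2D array.
num_bins = int(arr.max()) + 1
combined = np.stack([np.bincount(row, minlength=num_bins) for row in arr])
```

The same idea should carry over to the TF loop by passing minlength to tf.bincount() and stacking the per-row tensors.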