Boolean Indexing But Turns Out To Be Some Other Operation

I was trying to do boolean indexing but.. np.random.randn(8).reshape((4,2)) Out[11]: array([[-1.13058416, 1.08397186], [-1.2730122 , 0.78306498], [-0.05370502, -1.

Solution 1:

It's easy to think of ndarrays as buffed-up lists. Arithmetic and broadcasting extend automatically to any lists involved in an operation, so you can add an array and a list of broadcast-compatible shapes, and numpy will add them elementwise rather than concatenate them (as it would with two lists).
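
To make that concrete, here is a minimal sketch of an array and a list mixing in arithmetic. The names arr, lst and col are made up for illustration.

```python
import numpy as np

arr = np.array([1.0, 2.0, 3.0])
lst = [10, 20, 30]

# array + list: the list is converted to an array and added elementwise
summed = arr + lst

# list + list: plain Python concatenation, for comparison
joined = lst + lst

# broadcasting also applies: shape (3, 1) plus a 2-element list gives (3, 2)
col = np.arange(3).reshape(3, 1)
grid = col + [0, 10]

print(summed)
print(joined)
print(grid.shape)
```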

One huge (and, for me, confusing) exception is fancy indexing. Fancy indexing itself is already confusing to me (as someone coming from MATLAB), since it's odd that the following two give a different result:

import numpy as np
A = np.random.rand(3,3)
A[0:2,0:2]            # slicing: rows 0-1 and columns 0-1
A[range(2),range(2)]  # fancy indexing: elements (0,0) and (1,1)

The former is a slicing operation, and returns a 2-by-2 submatrix. The latter is a case of fancy indexing, and returns only a 2-element array, containing A[0,0] and A[1,1].
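
A small sketch of the difference, using a deterministic array so the results are easy to follow. The np.ix_ line is my own addition: as far as I know it is the usual way to get the submatrix behaviour out of index lists, which is probably what someone coming from MATLAB expects.

```python
import numpy as np

A = np.arange(9).reshape(3, 3)

# slicing: every combination of rows 0-1 and columns 0-1
block = A[0:2, 0:2]

# fancy indexing: index lists are paired up, giving A[0, 0] and A[1, 1]
pairs = A[[0, 1], [0, 1]]

# np.ix_ builds an open mesh from the index lists, recovering the submatrix
block_again = A[np.ix_([0, 1], [0, 1])]

print(block.shape, pairs.shape, block_again.shape)
```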

Your question is related to something equally odd: lists and arrays of boolean values behave differently when used in fancy indexing. Consider the following two examples, along the lines of your question:

A = np.random.rand(4,2)
bool_index_list = [False, True, True, False]
bool_index_array = np.array(bool_index_list)
A[bool_index_list].shape
A[bool_index_array].shape

With numpy 1.10.1, the former returns (4,2), the latter (2,2).

In the former case, since the index is a list, the boolean values are converted to corresponding integers, and the resulting values of [0,1,1,0] are used as actual indices in the matrix, returning the [first,second,second,first] row, respectively.

In the latter case, the index array of dtype=bool is used as you would expect it to be: it is used as a mask to ignore those rows of A for which the index is False.
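
The two interpretations can be reproduced independently of how a given numpy version treats lists, by spelling out both index arrays by hand. This is only a sketch; int_index and mask are names I picked for illustration.

```python
import numpy as np

A = np.arange(8).reshape(4, 2)

# integer interpretation: pick rows 0, 1, 1, 0 (repeats allowed)
int_index = np.array([0, 1, 1, 0])
picked = A[int_index]

# boolean interpretation: keep only the rows where the mask is True
mask = np.array([False, True, True, False])
masked = A[mask]

# a boolean mask is equivalent to indexing with the positions of its True entries
same_as_masked = A[np.flatnonzero(mask)]

print(picked.shape, masked.shape)
```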

The numpy release notes, among other things, indicate that

In the future Boolean array-likes (such as lists of python bools) will always be treated as Boolean indexes and Boolean scalars (including python True) will be a legal boolean index.

Correspondingly, the list-based indexing cases above give me the following warning in numpy 1.10.1:

FutureWarning: in the future, boolean array-likes will be handled as a boolean array index

So the short answer to your question is that it's legit, but not for long. Stick to ndarray-based fancy indexing, and you should experience no bumps along the way.
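
If the boolean values arrive as a list, one way to stay on the safe side is to convert them explicitly before indexing. The helper below is hypothetical (select_rows is not a numpy function), just a sketch of the idea with a length check added.

```python
import numpy as np

def select_rows(a, flags):
    """Hypothetical helper: keep the rows of `a` whose flag is True."""
    # explicit conversion, so the index is always an ndarray of dtype bool
    mask = np.asarray(flags, dtype=bool)
    if mask.shape != (a.shape[0],):
        raise ValueError("expected exactly one flag per row")
    return a[mask]

A = np.arange(8).reshape(4, 2)
rows = select_rows(A, [False, True, True, False])
print(rows.shape)
```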
