Why Is Numpy Subtraction Slower On One Large Matrix $m$ Than When Dividing $m$ Into Smaller Matrices And Then Subtracting?
Solution 1:
The dtype of the variable pre_allocated is float64 (the default for np.empty), while the input matrices are int, so every write incurs an implicit conversion. Change the pre-allocation to:
pre_allocated = np.empty_like(large_matrix)
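A minimal sketch of the fix (the names large_matrix, vector, and pre_allocated follow the answer; the array shapes and values here are arbitrary illustrations, not the asker's data):

```python
import numpy as np

# Hypothetical setup mirroring the question.
large_matrix = np.arange(12, dtype=np.int64).reshape(4, 3)
vector = np.arange(3, dtype=np.int64)

# np.empty(shape) would default to float64, forcing an implicit
# int -> float conversion on every write. empty_like matches the
# input dtype, so np.subtract can write its result directly.
pre_allocated = np.empty_like(large_matrix)
np.subtract(large_matrix, vector, out=pre_allocated)
print(pre_allocated.dtype)  # same dtype as the input: int64
```

With matching dtypes, the ufunc writes straight into the output buffer instead of converting element by element.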
Before the change, the execution times on my machine were:
0.6756095182868318
1.2262537249271794
1.250292605883855
After the change:
0.6776479894965846
0.6468182835551346
0.6538956945388001
The performance is now similar in all cases, and the variance of the measurements is large; one may even observe that the first variant is the fastest. It seems there is no gain from pre-allocation as such.
Note that the allocation itself is very fast because it only reserves address space; physical RAM is committed only when the pages are first touched. The buffer is 20 MiB, larger than the CPU's L3 cache, so the execution time is dominated by page faults and refilling of the caches. Moreover, in the first case the memory is re-allocated just after being freed, so the buffer is likely still "hot" for the memory allocator. Therefore you cannot directly compare solution A with the others.
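A rough sketch of the lazy-allocation effect described above (the 20 MiB size matches the answer; actual timings are machine- and allocator-dependent, so none are shown):

```python
import time

import numpy as np

# ~20 MiB of float64, as in the answer.
n = 20 * 1024 * 1024 // 8

t0 = time.perf_counter()
buf = np.empty(n)   # fast: only address space is reserved
t1 = time.perf_counter()
buf.fill(0.0)       # first touch: page faults commit the RAM
t2 = time.perf_counter()

print(f"alloc: {t1 - t0:.6f}s, first touch: {t2 - t1:.6f}s")
```

On a cold buffer the first touch typically dominates, which is why timing the allocation alone says little about the cost of using the memory.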
Modify the "action" line in the first case to keep the actual result:
np.subtract(list_of_matrices[j], vector, out=pre_allocated[m*j:m*(j+1)])
Then the gain from vectorized operations becomes more observable:
0.8738251849091547
0.678185239557866
0.6830777283598941
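The corrected "action" line can be sketched as follows (the names m, list_of_matrices, vector, and pre_allocated follow the answer; the chunk count and shapes are arbitrary choices for illustration):

```python
import numpy as np

# Split a large matrix into chunks of m rows each, then subtract a
# vector from every chunk, writing each result into the matching
# slice of one pre-allocated output buffer.
m, n_chunks, cols = 4, 3, 5
large_matrix = np.arange(m * n_chunks * cols).reshape(m * n_chunks, cols)
vector = np.arange(cols)
list_of_matrices = [large_matrix[m * j:m * (j + 1)] for j in range(n_chunks)]

pre_allocated = np.empty_like(large_matrix)
for j in range(n_chunks):
    np.subtract(list_of_matrices[j], vector,
                out=pre_allocated[m * j:m * (j + 1)])

# The chunked result matches the whole-matrix subtraction.
assert np.array_equal(pre_allocated, large_matrix - vector)
```

Writing through out= into a slice keeps the result instead of discarding it, so the benchmark measures the same work in every variant.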