In a large code base, I am using np.broadcast_to to broadcast arrays (just using simple examples here):
In [1]: x = np.array([1,2,3]) In [2]: y = np.broadcast_to(x, (2,1,3)) In [3]: y.shape Out[3]: (2, 1, 3)Elsewhere in the code, I use third-party functions that can operate in a vectorized way on Numpy arrays but that are not ufuncs. These functions don't understand broadcasting, which means that calling such a function on arrays like y is inefficient. Solutions such as Numpy's vectorize aren't good either because while they understand broadcasting, they introduce a for loop over the array elements which is then very inefficient.
Ideally, what I'd like to be able to do is to have a function, which we can call e.g. unbroadcast , that returns an array with a minimal shape that can be broadcasted back to the full size if needed. So e.g.:
In [4]: z = unbroadcast(y) In [5]: z.shape Out[5]: (1, 1, 3)I can then run the third-party functions on z , then broadcast the result back to y.shape .
Is there a way to implement unbroadcast that relies on Numpy's public API? If not, are there any hacks that would produce the desired result?
This is probably equivalent to your own solution, only a bit more built-in. It uses as_strided in numpy.lib.stride_tricks :
import numpy as np from numpy.lib.stride_tricks import as_strided x = np.arange(16).reshape(2,1,8,1) # shape (2,1,8,1) y = np.broadcast_to(x,(2,3,8,5)) # shape (2,3,8,5) broadcast def unbroadcast(arr): #determine unbroadcast shape newshape = np.where(np.array(arr.strides) == 0,1,arr.shape) # [2,1,8,1], thanks to @Divakar return as_strided(arr,shape=newshape) # strides are automatically set here z = unbroadcast(x) np.all(z==x) # is TrueNote that in my original answer I didn't define a function, and the resulting z array had (64,0,8,0) as strides , whereas the input has (64,64,8,8) . In the current version the returned z array has identical strides to x , I guess passing and returning the array forces a creation of a copy. Anyway, we could always set the strides manually in as_strided to get identical arrays under all circumstances, but this doesn't seem necessary in the above setup.