Optimizing the construction of a 4D NumPy array

Joe Flip

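I have a 4D array data of shape (50,8,2048,256): 50 groups, each containing 8 images of 2048x256 pixels. times is an array of shape (50,8) giving the time at which each image was taken.

For each group, I compute a first-order polynomial fit at every pixel across all 8 images, which gives an array of shape (50,2048,256,2). This is effectively a vector map for each of the 50 groups. The code I use to store the polynomials is: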

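import numpy as np

# data has shape (50,8,2048,256); times has shape (50,8)
fits = np.ones((50,2048,256,2))
# Expand times to the full (50,8,2048,256) shape of data
times = times.reshape(50,8,1).repeat(2048,2).reshape(50,8,2048,1).repeat(256,3)
for group in range(50):
    for xpos in range(2048):
        for ypos in range(256):
            # Degree-1 fit over the 8 images at this pixel: [slope, intercept]
            fits[group,xpos,ypos,:] = np.polyfit(times[group,:,xpos,ypos],data[group,:,xpos,ypos],1)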

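The challenge now is that I want to generate an array new_data of shape (50,12,2048,256), in which I use the polynomial coefficients from fits and the times from new_time to generate 12 images for each of the 50 groups.

I figure I can use something like np.polyval(fits, new_time) to generate the images, but I'm stumped on how to phrase it. It should be something like: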

new_data = np.ones((50,12,2048,256))
for i,(times,fit) in enumerate(zip(new_times,fits)):
    new_data[i] = np.polyval(fit,times)

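But I get a broadcasting error. Any help would be greatly appreciated!

Update: OK, so I changed the code a bit so that it works and does exactly what I want, but with all these loops it is terribly slow (about 1 minute per group, which means it takes nearly an hour to run!). Can anyone suggest a way to optimize this and speed it up?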

# Generate the polynomials for each pixel in each group
fits = np.ones((50,2048,256,2))
times = np.arange(0,50*8*grptme,grptme).reshape(50,8)
times = times.reshape(50,8,1).repeat(2048,2).reshape(50,8,2048,1).repeat(256,3)
for group in range(50):
    for xpos in range(2048):
        for ypos in range(256):
            fits[group,xpos,ypos] = np.polyfit(times[group,:,xpos,ypos],data[group,:,xpos,ypos],1)

# Create new array of 12 images per group using the polynomials for each pixel
new_data = np.ones((50,12,2048,256))
times = np.arange(0,50*12*grptme,grptme).reshape(50,12)
times = times.reshape(50,12,1).repeat(2048,2).reshape(50,12,2048,1).repeat(256,3)
for group in range(50):
    for img in range(12):
        for xpos in range(2048):
            for ypos in range(256):
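                # np.polyfit stores [slope, intercept]; polynomial.polyval expects lowest degree first, hence [::-1]
                new_data[group,img,xpos,ypos] = np.polynomial.polynomial.polyval(times[group,img,xpos,ypos],fits[group,xpos,ypos][::-1])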

那

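Regarding speed: I see a lot of loops, which, thanks to the beauty of numpy, should and often can be avoided. If I understand your problem correctly, for each of the 50 groups you need to fit a first-order polynomial to 8 data points, 2048*256 times. For the fit itself, the shape of the image plays no role. So my suggestion is to flatten the images, because np.polyfit can fit one set of x values against several sets of y values at the same time.

From the docstring: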

x : array_like, shape (M,)
    x-coordinates of the M sample points ``(x[i], y[i])``.
y : array_like, shape (M,) or (M, K)
    y-coordinates of the sample points. Several data sets of sample
    points sharing the same x-coordinates can be fitted at once by
    passing in a 2D-array that contains one dataset per column.
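
To make the docstring concrete, here is a minimal sketch on toy data. The sizes (8 sample times, 3 "pixels") and the names true_slope and true_intercept are made up for illustration and are not from the question:

```python
import numpy as np

# Hypothetical toy data: 8 sample times shared by 3 "pixels".
x = np.arange(8, dtype=float)                    # shape (8,)
true_slope = np.array([1.0, -2.0, 0.5])
true_intercept = np.array([0.0, 3.0, 10.0])

# One dataset per column, as the docstring describes.
y = x[:, None] * true_slope + true_intercept     # shape (8, 3)

# A single call fits all columns at once.
coeffs = np.polyfit(x, y, 1)                     # shape (2, 3)
slopes, intercepts = coeffs                      # highest degree first
```

The result has the coefficients along the first axis and the datasets along the second, which is why a transpose shows up when the parameters should live in the last index.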

So I would go for:

# Generate the polynomials for each pixel in each group
fits = np.ones((50,2048*256,2))
times = np.arange(0,50*8*grptme,grptme).reshape(50,8)
data_fit = data.reshape((50,8,2048*256))
for group in range(50):
    fits[group] = np.polyfit(times[group],data_fit[group],1).T
fits_original_shape = fits.reshape((50,2048,256,2))

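The transpose is necessary because you want the parameters in the last index, whereas np.polyfit returns them in the first index, followed by the different datasets.

Evaluating the fits then uses basically the same trick: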

# Create new array of 12 images per group using the polynomials for each pixel
new_data = np.zeros((50,12,2048*256))
times = np.arange(0,50*12*grptme,grptme).reshape(50,12)
for group in range(50):
    # np.polyfit returns [slope, intercept], but polynomial.polyval expects the
    # lowest-degree coefficient first, so reverse the coefficient axis with [::-1]
    new_data[group] = np.polynomial.polynomial.polyval(times[group],fits[group].T[::-1]).T
new_data_original_shape = new_data.reshape((50,12,2048,256))

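Because of how the parameters and the different datasets are ordered, two transposes are again needed to match the shapes of the arrays.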
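
One detail worth double-checking when mixing the two APIs: np.polyfit returns coefficients with the highest degree first, while np.polynomial.polynomial.polyval expects the lowest degree first. A small sketch of the shapes and the ordering, on made-up toy data (2 "pixels", names chosen only for this example):

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical toy data: 8 sample times, 2 "pixels" with known lines.
x_fit = np.arange(8, dtype=float)
y_fit = np.stack([2.0 * x_fit + 1.0, -1.0 * x_fit + 4.0], axis=1)  # (8, 2)

high_first = np.polyfit(x_fit, y_fit, 1)   # (2, 2): rows are [slopes, intercepts]
low_first = high_first[::-1]               # rows are [intercepts, slopes]

# Evaluate every pixel's line at 12 new times.
x_new = np.arange(12, dtype=float)
evaluated = P.polyval(x_new, low_first)    # shape (2, 12): pixels first
images = evaluated.T                       # shape (12, 2): times first
```

Passing high_first directly would silently swap slope and intercept rather than raise an error, so a quick check against a pixel with a known line is cheap insurance.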

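It may also be possible to avoid the loop over the groups with some advanced numpy magic, but even as it stands the code should already run much faster.

Hope this helps!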
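
For what it's worth, one possible way to drop the loop over groups, for the degree-1 case only, is to skip np.polyfit and use the closed-form least-squares formulas for a straight line. This swaps in a different technique from the one above; the function names linear_fit_all and evaluate_all are hypothetical, and the sketch assumes times has shape (groups, images) and data has shape (groups, images, height, width):

```python
import numpy as np

def linear_fit_all(times, data):
    """Closed-form straight-line fit for every pixel of every group.

    times: (G, N), data: (G, N, H, W).
    Returns (G, H, W, 2) holding [slope, intercept], like np.polyfit order.
    """
    t_mean = times.mean(axis=1, keepdims=True)          # (G, 1)
    t_centered = times - t_mean                         # (G, N)
    d_mean = data.mean(axis=1)                          # (G, H, W)
    # slope = sum(t_c * d) / sum(t_c**2); centering d is unnecessary
    # because t_centered sums to zero within each group.
    numerator = np.einsum('gn,gnhw->ghw', t_centered, data)
    denominator = (t_centered ** 2).sum(axis=1)[:, None, None]
    slope = numerator / denominator
    intercept = d_mean - slope * t_mean[:, :, None]
    return np.stack([slope, intercept], axis=-1)

def evaluate_all(fits, new_times):
    """fits: (G, H, W, 2), new_times: (G, M). Returns (G, M, H, W)."""
    slope = fits[..., 0][:, None]                       # (G, 1, H, W)
    intercept = fits[..., 1][:, None]                   # (G, 1, H, W)
    return slope * new_times[:, :, None, None] + intercept

# Toy-sized usage (the real arrays would be (50, 8, 2048, 256)).
rng = np.random.default_rng(0)
times = np.arange(3 * 8, dtype=float).reshape(3, 8)
data = rng.normal(size=(3, 8, 4, 5))
fits = linear_fit_all(times, data)
new_times = np.arange(3 * 12, dtype=float).reshape(3, 12)
new_data = evaluate_all(fits, new_times)
```

Whether this is actually faster than the per-group np.polyfit loop depends on the machine and memory available, so it is worth timing both on a subset of the data first.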
