No, the 1D case mostly gains from transferring half as much data from VRAM to the chip. The up-to-2x performance increase applies mainly to the 2D and 3D cases, where only 1/4 or 1/8 of the data is nonzero. In 2D, when doing the 1D FFTs along the x-axis, we skip the sequences beyond Ny/2, because we know they contain only zeros and so their result is zero as well. So we do 0.5Ny x-axis FFTs and the full Nx y-axis FFTs. For a square system this is a drop from 2N to 1.5N sequences. In 3D the drop is even bigger, from 3N^2 to (1/4+1/2+1)N^2 = 1.75N^2 sequences (about 1.7x fewer, so approaching 2x).
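To make the counting concrete, here is a rough sketch in plain Python that only tallies how many 1D sequences get transformed; it does not run any FFTs. The function name `sequence_counts` and its parameters are made up for illustration, and it assumes a cubic grid of even side `n` with the nonzero data sitting in the first half of every axis.

```python
def sequence_counts(n, dims):
    """Count 1D FFT sequences for an n**dims grid (n even).

    Assumes the nonzero data occupies the first n/2 entries along every
    axis, i.e. 1/2, 1/4 or 1/8 of the grid in 1D, 2D or 3D.
    Returns (full, pruned): the count without and with skipping
    sequences that are known to be all zeros.
    """
    per_axis = n ** (dims - 1)   # sequences along one axis of the grid
    full = dims * per_axis       # every axis transforms all its sequences

    pruned = 0
    for axis in range(dims):
        # Axes that have not been transformed yet are still half zeros,
        # so each of them halves the number of sequences worth doing.
        untransformed = dims - 1 - axis
        pruned += per_axis // (2 ** untransformed)
    return full, pruned


if __name__ == "__main__":
    for dims in (1, 2, 3):
        full, pruned = sequence_counts(8, dims)
        print(f"{dims}D: {full} -> {pruned} sequences")
```

For `n = 8` this gives 16 -> 12 sequences in 2D (2N -> 1.5N) and 192 -> 112 in 3D (3N^2 -> 1.75N^2). In 1D the count does not change at all, which is why the gain there has to come from the reduced memory transfer rather than from skipped sequences.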