
celeritascelery 5y ago

Aren’t all reads aligned to the width of the SIMD register? If I issue an AVX-512 instruction, it will read 512 bits, right?

jsheard 5y ago

It's about where you read data from, not how much data gets read. For example, a 256-bit AVX read is aligned if the address being read from is a multiple of 32 bytes; otherwise it's unaligned and runs slightly slower, and slower still if it happens to straddle two cache lines. The same applies to write instructions.

It's less of an issue than it used to be: newer CPU architectures have steadily reduced the penalty for unaligned access, but it's still there.
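To make the aligned/unaligned distinction concrete, here is a minimal C sketch of the two address checks. The helper names `is_aligned` and `straddles_cache_line` are made up for illustration, and the 64-byte cache line is an assumption (typical on x86-64, not guaranteed everywhere).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; 64 bytes is typical on x86-64 but not universal. */
#define CACHE_LINE_BYTES 64u

/* True if addr is a multiple of width. width must be a power of two,
   e.g. 16 for SSE, 32 for 256-bit AVX, 64 for AVX-512. */
static bool is_aligned(uintptr_t addr, size_t width) {
    assert(width != 0 && (width & (width - 1)) == 0);
    return (addr & (width - 1)) == 0;
}

/* True if a width-byte access starting at addr touches two cache lines. */
static bool straddles_cache_line(uintptr_t addr, size_t width) {
    assert(width != 0 && width <= CACHE_LINE_BYTES);
    uintptr_t first_line = addr / CACHE_LINE_BYTES;
    uintptr_t last_line = (addr + width - 1) / CACHE_LINE_BYTES;
    return first_line != last_line;
}
```

With these, a 32-byte read at address 0x1008 is unaligned but stays inside one cache line, while the same read at 0x1030 is unaligned and also crosses the line boundary at 0x1040, which is the slower of the two cases described above.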

celeritascelery (OP) 5y ago

Ahh, so it does come back to cache line alignment. Reading aligned data doesn't give any benefit in and of itself[1], at least not on modern hardware. I guess the performance improvement would make sense, since the cache line size is a multiple of the SIMD register width, so an aligned access never straddles two cache lines.

[1] https://lemire.me/blog/2012/05/31/data-alignment-for-speed-m...
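A quick way to see how register-width alignment ties back to cache lines is to count, for each starting offset inside one line, whether the access spills into the next line. This is only a sketch: `count_straddling_offsets` is a hypothetical helper, and the 64-byte line size is assumed rather than queried from the hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size; 64 bytes is typical on x86-64 but not universal. */
#define CACHE_LINE_BYTES 64u

/* Walk the start offsets 0, step, 2*step, ... inside one cache line and
   count how many of them make a width-byte access spill into the next line.
   step == width models aligned accesses; step == 1 models arbitrary ones. */
static unsigned count_straddling_offsets(size_t width, size_t step) {
    assert(step != 0); /* a zero step would never advance the loop */
    unsigned count = 0;
    for (size_t offset = 0; offset < CACHE_LINE_BYTES; offset += step) {
        if (offset + width > CACHE_LINE_BYTES) {
            count++;
        }
    }
    return count;
}
```

For a 32-byte access, `count_straddling_offsets(32, 32)` is 0, because an aligned access can only start at offset 0 or 32, while `count_straddling_offsets(32, 1)` is 31, the offsets 33 through 63. Alignment to the register width rules out the straddling case as long as the width divides the line size.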
