This is a question about SIMD instructions on AArch64 on an M1.
I am working on a routine that works entirely inside the registers. All the memory reads and writes occur outside of the main loop. The first routine loads pseudo-random bits into registers x14-x22 (excluding x18).
I cannot figure out how to move that series of bits into the v5-v8 vector registers without writing them to memory first, and I do not want to do that. Asking me why won't be particularly helpful.
I'm sure there is a simple way to do this, but I cannot find it in any of my resources. Here is what I have tried:
fmov d5, x14
rev64 v5.2d, v5.2d   // <--- error!
ror q5, q5, #8       // <--- error!
fmov d6, x16
fmov d6, x17
fmov d7, x19
fmov d7, x20
fmov d8, x21
fmov d8, x22
In the above code, I'm able to load the lower 64 bits of each vector register with what I want, but I cannot figure out how to rotate those bits over into the upper 64 bits.
In 32-bit ARM you can stack these directly.
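To illustrate what I mean by stacking, here is a rough sketch of the 32-bit (ARMv7 NEON) version, written from memory. The register numbers are arbitrary and only for illustration. There, q0 is just d0 and d1 side by side, so filling both halves of the 128-bit register takes two plain moves:

```asm
vmov d0, r0, r1    @ d0 = r1:r0, which is the low 64 bits of q0
vmov d1, r2, r3    @ d1 = r3:r2, which is the high 64 bits of q0
```

I am looking for the AArch64 equivalent of that second move, where the upper half of v5 is filled from a general-purpose register without disturbing the lower half.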