I am quite used to Intel-format inline assembly. Does anyone know how to convert the two AT&T lines in the code below into Intel format? They simply load a local variable's address into a register.
int main(int argc, const char *argv[]){
    float x1[256];
    float x2[256];
    for(int x=0; x<256; ++x){
        x1[x] = x;
        x2[x] = 0.5f;
    }
    asm("movq %0, %%rax"::"r"(&x1[0])); // how to convert to Intel format?
    asm("movq %0, %%rbx"::"r"(&x2[0])); // how to convert to Intel format?
    asm(".intel_syntax noprefix\n"
        "mov rcx, 32\n"
        "re:\n"
        "vmovups ymm0, [rax]\n"
        "vmovups ymm1, [rbx]\n"
        "vaddps ymm0, ymm0, ymm1\n"
        "vmovups [rax], ymm0\n"
        "add rax, 32\n"
        "add rbx, 32\n"
        "loopnz re"
    );
}
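For comparison, here is a minimal sketch of one way the loop above might be written so that no AT&T lines are needed at all: the "a", "b" and "c" constraints ask the compiler to place the two pointers and the counter in rax, rbx and rcx before the Intel-syntax body runs. The function name add_blocks and the label .Ladd_loop are made up for this illustration, dec/jnz is swapped in for loopnz, and the plain C++ fallback is only there so the snippet still builds when AVX is not enabled. Treat it as a sketch under those assumptions, not as the definitive form.

```cpp
// Hypothetical helper (not part of the original code): adds src[i] to dst[i]
// for n_blocks * 8 floats. n_blocks must be greater than zero.
static void add_blocks(float *dst, const float *src, long n_blocks) {
#if defined(__x86_64__) && defined(__AVX__)
    asm volatile(
        // Emitted only when the compiler is in AT&T mode (no -masm=intel):
        "{.intel_syntax noprefix|}\n\t"
        ".Ladd_loop%=:\n\t"              // %= makes the label unique per asm
        "vmovups ymm0, [rax]\n\t"
        "vmovups ymm1, [rbx]\n\t"
        "vaddps  ymm0, ymm0, ymm1\n\t"
        "vmovups [rax], ymm0\n\t"
        "add rax, 32\n\t"
        "add rbx, 32\n\t"
        "dec rcx\n\t"
        "jnz .Ladd_loop%=\n\t"
        // Switch back so the compiler's own AT&T output still assembles:
        "{.att_syntax prefix|}"
        : "+a"(dst), "+b"(src), "+c"(n_blocks)  // rax, rbx, rcx are modified
        :
        : "xmm0", "xmm1", "memory", "cc");      // xmm0/xmm1 cover ymm0/ymm1
#else
    // Fallback when not building for x86-64 with AVX enabled (e.g. no -mavx).
    for (long i = 0; i < n_blocks * 8; ++i) dst[i] += src[i];
#endif
}
```

With the arrays from the code above, the call would be add_blocks(x1, x2, 32). Keeping everything in a single asm statement matters here, because the compiler gives no guarantee that rax and rbx keep their values between separate asm statements.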
Specifically, referring to an on-stack local variable with mov eax, [var_a] is accepted when compiling in 32-bit mode. For example:
// a32.cpp
#include <stdint.h>
extern "C" void f(){
    int32_t a=123;
    asm(".intel_syntax noprefix\n"
        "mov eax, [a]"
    );
}
It compiles without errors:
xuancong@ubuntu:~$ rm -f a32.so && g++-7 -mavx -fPIC -masm=intel -shared -o a32.so -m32 a32.cpp && ls -al a32.so
-rwxr-xr-x 1 501 dialout 6580 Aug 28 09:26 a32.so
However, the same syntax fails when compiling in 64-bit mode:
// a64.cpp
#include <stdint.h>
extern "C" void f(){
    int64_t a=123;
    asm(".intel_syntax noprefix\n"
        "mov rax, [a]"
    );
}
It fails at link time:
xuancong@ubuntu:~$ rm -f a64.so && g++-7 -mavx -fPIC -masm=intel -shared -o a64.so -m64 a64.cpp && ls -al a64.so
/usr/bin/ld: /tmp/cclPNMoq.o: relocation R_X86_64_32S against undefined symbol `a' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
So is there a way to make this work without the input : output : clobber operand lists? After all, simple local variables and function arguments can be accessed directly via mov rax, [rsp+##] or mov rax, [rbp+##] without clobbering other registers.
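For reference, the linker message above complains about an undefined symbol `a', which suggests that [a] was assembled as a reference to a global symbol rather than to the stack slot of the local. The operand-based form I am trying to avoid would look roughly like the sketch below: an "m" constraint lets the compiler substitute the [rsp+##] or [rbp+##] expression itself. The function name load_local is hypothetical, and the {att|intel} braces are only there so the same line builds with or without -masm=intel.

```cpp
#include <cstdint>

// Hypothetical helper (not from the examples above): reads a stack local
// through an "m" operand, so the compiler supplies its stack address.
int64_t load_local() {
    int64_t a = 123;
    int64_t out;
#if defined(__x86_64__)
    // AT&T alternative first, Intel alternative second; with -masm=intel the
    // memory operand expands to something like QWORD PTR [rsp-8].
    asm("{movq %1, %0|mov %0, %1}" : "=r"(out) : "m"(a));
#else
    out = a;  // fallback for non-x86-64 targets
#endif
    return out;
}
```

This form clobbers nothing beyond the output register that the compiler picks for %0.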