I am quite used to Intel-format inline assembly. Does anyone know how to convert the two AT&T lines in the code below into Intel format? They basically load a local variable's address into a register.
int main(int argc, const char *argv[]){
float x1[256];
float x2[256];
for(int x=0; x<256; ++x){
x1[x] = x;
x2[x] = 0.5f;
}
asm("movq %0, %%rax"::"r"(&x1[0])); // how to convert to Intel format?
asm("movq %0, %%rbx"::"r"(&x2[0])); // how to convert to Intel format?
asm(".intel_syntax noprefix\n"
"mov rcx, 32\n"
"re:\n"
"vmovups ymm0, [rax]\n"
"vmovups ymm1, [rbx]\n"
"vaddps ymm0, ymm0, ymm1\n"
"vmovups [rax], ymm0\n"
"add rax, 32\n"
"add rbx, 32\n"
"loopnz re"
);
}
Specifically, loading on-stack local variables using mov eax, [var_a] is allowed when compiled in 32-bit mode. For example,
// a32.cpp
#include <stdint.h>
extern "C" void f(){
int32_t a=123;
asm(".intel_syntax noprefix\n"
"mov eax, [a]"
);
}
It compiles well:
xuancong@ubuntu:~$ rm -f a32.so && g++-7 -mavx -fPIC -masm=intel -shared -o a32.so -m32 a32.cpp && ls -al a32.so
-rwxr-xr-x 1 501 dialout 6580 Aug 28 09:26 a32.so
However, the same syntax is not allowed when compiled in 64-bit mode:
// a64.cpp
#include <stdint.h>
extern "C" void f(){
int64_t a=123;
asm(".intel_syntax noprefix\n"
"mov rax, [a]"
);
}
It does not compile:
xuancong@ubuntu:~$ rm -f a64.so && g++-7 -mavx -fPIC -masm=intel -shared -o a64.so -m64 a64.cpp && ls -al a64.so
/usr/bin/ld: /tmp/cclPNMoq.o: relocation R_X86_64_32S against undefined symbol `a' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
So is there some way to make this work without using the input:output:clobber lists? After all, simple local variables or function arguments can be accessed directly via mov rax, [rsp+##] or mov rax, [rbp+##] without clobbering other registers.
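For comparison, here is a minimal sketch of the operand-based form, assuming GCC or clang on x86-64. The function name load_local and the variable result are made up for illustration. The "m" constraint asks the compiler to substitute the variable's own stack address (something like [rbp-8] or [rsp+8]) into the instruction, so no extra register is spent on the address. The {att|intel} braces are GCC's dialect alternatives, so the same string should assemble with or without -masm=intel. On other targets the sketch falls back to a plain copy.

```cpp
#include <cstdint>

// Hypothetical helper: reads a stack local through an "m" (memory) operand.
std::int64_t load_local() {
    std::int64_t a = 123;
    std::int64_t result;
#if defined(__x86_64__) && defined(__GNUC__)
    // AT&T dialect emits:  movq <address of a>, <reg>
    // Intel dialect emits: mov  <reg>, QWORD PTR [<address of a>]
    // The compiler picks the register and the rbp/rsp-relative address.
    asm("mov{q %1, %0| %0, %1}"
        : "=r"(result)   // output: any general-purpose register
        : "m"(a));       // input: a, addressed directly in memory
#else
    result = a;          // fallback for non-x86-64 targets (assumption)
#endif
    return result;
}
```

Calling load_local() should return 123. Because the output is declared as an operand, no clobber list is needed here.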
loopnz instruction?!! Especially when it makes no sense to test for add rbx, 32 having set ZF, just dec ecx / jnz would be the sane option. If this is the kind of asm you're writing by hand, you should really just switch to intrinsics. As well as being inefficient, this is super broken because you don't declare clobbers on registers you modify, or tell the compiler about the memory you read and write. Expect this to break, especially if compiled with optimization enabled.
"mov eax, [a]" doesn't generate the code you think it does. If you were to review the generated code you would discover that it generated a mov from a memory operand that wasn't on the stack. GCC inline assembly doesn't support accessing variables directly. The GCC manual has a warning that it isn't supported even for global variables (which are not on the stack). I assume from your question that you may have been developing on MSVC using Microsoft's inline assembly?
With clang++ and the -fms-extensions option you would be able to generate accesses to variables outside the inline assembly. You'd code it like MSVC: asm { mov rax, [a] }. GCC doesn't support MS extensions.
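To illustrate the "switch to intrinsics" suggestion, here is a rough sketch of the question's add loop written with AVX intrinsics. The function name add_arrays and its parameters are made up for illustration, and the length is assumed to be a multiple of 8 floats, as with the 256-element arrays in the question. The AVX path is only compiled when the compiler defines __AVX__ (for example with -mavx); otherwise the sketch falls back to a scalar loop.

```cpp
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical helper: dst[i] += src[i] for n floats.
// Assumes n is a multiple of 8 when the AVX path is used.
void add_arrays(float* dst, const float* src, std::size_t n) {
#if defined(__AVX__)
    for (std::size_t i = 0; i < n; i += 8) {
        __m256 a = _mm256_loadu_ps(dst + i);             // unaligned load, like vmovups ymm0, [rax]
        __m256 b = _mm256_loadu_ps(src + i);             // unaligned load, like vmovups ymm1, [rbx]
        _mm256_storeu_ps(dst + i, _mm256_add_ps(a, b));  // vaddps, then unaligned store
    }
#else
    // Scalar fallback when AVX is not enabled at compile time.
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] += src[i];
    }
#endif
}
```

With the arrays from the question this would be called as add_arrays(x1, x2, 256). The compiler does the register allocation and knows which memory is read and written, so there are no clobbers to declare.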