0

I have the below inline assembly code:

int get_year(int a, int *b, char * c)
{
    int ret, t1, t2;

    asm (
        "addl %3, %[a]                  \n\t"
        "movl %[a], %[t1]               \n\t"
        "movl $58, %%edx                \n\t"
        "movb %%dl, 0x04(%1)            \n\t"
        : [t1] "=r" (t1), "=&D" (t2)
        : [a] "r" (a), "rm" (*b), "1" (c)
        : "edx", "memory"
    );

    ret = t1;

    return ret;
}

When I compile this with llvm, I get the following error:

error: unsupported inline asm: input with type 'char *' matching output with type 'int'
                : [a] "r" (a), "rm" (*b), "1" (c)
                                               ^

However, the memcpy function in the Linux kernel uses the same form of inline assembly:

void *memcpy(void *dest, const void *src, size_t n)
{
    int d0, d1, d2;
    asm volatile(
        "rep ; movsl\n\t"
        "movl %4,%%ecx\n\t"
        "rep ; movsb\n\t"
        : "=&c" (d0), "=&D" (d1), "=&S" (d2)
        : "0" (n >> 2), "g" (n & 3), "1" (dest), "2" (src)
        : "memory");

    return dest;
}

and this works properly without any compile error.

6
  • Presumably the gcc compiler targeted by the kernel is not so strict. You can easily fix it by changing t2 to be char* Commented Dec 24, 2015 at 3:47
  • Compiling with the lldb debugger? Commented Dec 24, 2015 at 3:55
  • @MichaelPetch sorry, it's a clerical mistake. I use llvm .... Commented Dec 24, 2015 at 3:56
  • Lol that is a bit better, I was going to say you probably would have more significant problems than the error you gave LOL ;-) Commented Dec 24, 2015 at 3:57
  • Any chance you are trying to compile this code as 64-bit code instead of 32-bit code? (In 64-bit code the sizes of an int (4 bytes) and a pointer (8 bytes) are different, which might trigger that error.) Commented Dec 24, 2015 at 4:02

2 Answers

4

First of all, if you're trying to get your feet wet learning asm, GNU C inline asm is one of the hardest ways to use asm. Not only do you have to write correct asm, you have to spend a lot of time using esoteric syntax to inform the compiler of exactly what your code needs for input and output operands, or you will have a bad time. Writing whole functions in ASM is much easier. They can't be inlined, but it's a learning exercise anyway. The normal function ABI is much simpler than the boundary between C and inline ASM with constraints. See the wiki...


Besides that compile error, you have a bug: you clobber %[a], even though you told gcc it's an input-only operand.

I assume this is still a "work in progress", since you could get the same result with better code. (e.g. using %edx as a scratch reg is totally unnecessary.) Of course, in the general case where this is inlined into code where a might be a compile-time constant, or known to be related to something else, you'd get better code from just doing it in C (unless you spent a lot of time making inline-asm variants for various cases.)

int get_year(int a, int *b, char * c)
{
    int ret, t1, t2;

    asm (
        "addl %[bval], %[a] \n\t"
        "movb $58, 4 + %[cval]\n\t"  // c is an "offsetable" memory operand

        : [t1] "=&r" (t1), [cval] "=o" (*c)
        : [a] "0" (a), [bval] "erm" (*b)
        : // no longer clobbers memory, because we use an output memory operand.
    );

    ret = t1;  // silly redundancy here, could have just used a as an input/output operand and returned it, since you apparently want the value
    return ret;
}

This now compiles and assembles (using godbolt's "binary" option to actually assemble). The 4 + (%rdx) produces a warning, but does assemble to 4(%rdx). IDK how to write the offset in a way that doesn't error if there's already an offset. (e.g. if the operand is *(c+4), so the generated asm is 4 + 4(%rdx), it wouldn't work to leave out the +.)

This is still using the matching-output-operand trick, but I changed to using memory or general constraints to allow compile-time constants to end up doing an addl $constant, %edi.

This allows the compiler as much flexibility as possible when inlining. e.g. if a caller ran get_year(10, &arr[10], &some_struct.char_member), it could use whatever addressing mode it wanted for the load and store, instead of having to generate c in a single register. So the inlined output could end up being movb $58, 4+16(%rbp, %rbx) for example, instead of forcing it to use 4(%reg).
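A minimal sketch of one way to sidestep the 4 + offset problem mentioned above, assuming you are free to change which lvalue is passed as the operand: make the memory operand the byte c[4] itself, so the compiler emits the complete address and the template never has to add an offset. The helper name put_colon is hypothetical, and the #else branch is only a plain-C fallback so the sketch also builds on non-x86 targets.

```c
#include <assert.h>

/* Hypothetical helper (not from the question): store ':' (ASCII 58) at c[4]. */
static void put_colon(char *c)
{
    assert(c != 0);
#if defined(__x86_64__) || defined(__i386__)
    /* The operand is the byte c[4] itself, so the compiler substitutes the
       complete effective address and no manual "4 +" is needed in the
       template. Because the written byte is an output operand, no "memory"
       clobber is required. */
    __asm__ ("movb $58, %[dst]"
             : [dst] "=m" (c[4]));
#else
    c[4] = 58;  /* plain-C fallback for non-x86 targets */
#endif
}
```

The compiler is still free to pick whatever addressing mode it likes when this is inlined, because the constraint only says "some memory location", not "base register plus 4".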

6 Comments

Can you explain more about the matching-output-operand trick? I don't fully understand it. BTW, I compiled your code on my MacBook using Apple's llvm compiler, and I got <inline asm>:2:17: note: instantiated into assembly here movb $58, 4 + (%rdx)...
I have several questions: 1) In [cval] "=o" (*c), why is it (*c) and not (c)? I think c is an address and *c is a char variable. 2) In [bval] "erm" (*b), what does e represent? I can't find it in the gcc manual...
re: the matching-output-operand trick. I got it from the code you posted... IDK if it has any advantages over Ross Ridge's suggestion of using an input/output operand "+r" for inputs you want to clobber inside the asm block. I just kept that trick since I hadn't seen it before, and wanted to see what happened. I can't think of anything right away where I'd expect the matching-output-operand method to make better code, when the output operand is just a throwaway tmp variable. Maybe if it has a different size, it makes it easier to avoid prefixes to get %eax vs. %rcx?
@DouglasSu: 1: Yes, *c is a single character, not a pointer. But I'm using a memory constraint, so gcc will substitute in an effective address for %[cval], like (%rdi), with the parentheses. This way you're telling gcc where in memory the byte gets written. Without that, you'd need to list "memory" as clobbered, forcing a reload of registers that were caching memory values. (i.e. a compiler memory barrier). 2: e is a compile-time constant that fits in a signed 32bit integer. It's in the x86 machine constraints section of the manual. This matters for 64bit code.
How about: "movb $58, 4 + 0%[cval]\n\t". In the absence of an offset in the memory operand it would just be a 4+0 offset = 4. If there is an offset already, then the offset simply has a leading 0 digit added (which doesn't change the value of the offset present) and then 4 is added to it.
2

I can reproduce the problem if I compile your code with clang only when generating 64-bit code. When targeting 32-bit code there's no error. As Michael Petch said, this suggests the problem is the different sizes of the two operands.
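As a sketch of the size-mismatch explanation (and of the fix suggested in the comments, declaring the tied output as char *), here is a hypothetical variant of the function. The name get_year_ptr and the restructured operands are my own assumptions, not the asker's code. The point is only that the "1" matching constraint now ties two pointer-sized operands, so the tied pair has the same size in both 32-bit and 64-bit builds. The #else branch is a plain-C fallback for non-x86 targets.

```c
#include <assert.h>

/* Hypothetical variant of get_year, for illustration only. */
static int get_year_ptr(int a, int *b, char *c)
{
#if defined(__x86_64__) || defined(__i386__)
    char *t2;  /* pointer-typed, so it has the same size as c */

    assert(sizeof t2 == sizeof c);  /* the tied operands must agree in size */

    __asm__ ("addl %[bval], %[a]\n\t"
             "movb $58, 4(%[p])"
             : [a] "+r" (a),         /* operand 0: read and written */
               [p] "=&r" (t2)        /* operand 1: scratch copy of c */
             : [bval] "m" (*b),
               "1" (c)               /* tied to operand 1, same type */
             : "memory", "cc");      /* c[4] is written through a pointer */
    return a;
#else
    a += *b;    /* plain-C fallback for non-x86 targets */
    c[4] = 58;
    return a;
#endif
}
```

For example, with int y = 2000 and an 8-byte buffer buf, get_year_ptr(15, &y, buf) returns 2015 and leaves 58 (':') in buf[4].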

It's not entirely clear what the best fix would be, as your asm statement doesn't make much sense. It's the equivalent of:

int get_year(int a, int *b, char *c) {
    a += *b;
    c[4] = 58;
    return a;
}        

There's no advantage to using an assembly statement to do what can be done more clearly and more efficiently using the C code above. So the best solution would be to completely replace your code with the equivalent C code.

If you're just playing around with inline assembly, then equivalent inline assembly would be:

int get_year2(int a, int *b, char * c)
{
        asm("addl %[b], %[a]"
            : [a] "+r" (a)
            : [b] "m" (*b)
            : "cc");
        asm("movb $58, %[c4]"
            : [c4] "=rm" (c[4]));
        return a;
}

I've used two asm statements because the two parts are unrelated. Keeping them separated provides for more opportunities for optimization. For example if you call this function but don't use the return value, the compiler can eliminate the first asm statement because its result isn't used and it has no side effects.

Instead of using a matching constraint, the "1" constraint that was giving you problems, I've used the "+" constraint modifier to mark the operand as both an input and an output. I find this works better. The constraint for the [b] operand should really be "rm" but unfortunately clang doesn't handle rm constraints well.
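To make the comparison concrete, here is a small sketch of the two ways to say "this register is read and then overwritten". The helper names add_plus, add_matching and check_forms_agree are hypothetical, and the #else branches are plain-C fallbacks so the sketch builds on non-x86 targets too.

```c
#include <assert.h>

/* Read-write operand: a single "+r" operand is both input and output. */
static int add_plus(int a, int b)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("addl %[b], %[a]"
             : [a] "+r" (a)
             : [b] "r" (b)
             : "cc");
    return a;
#else
    return a + b;
#endif
}

/* Matching constraint: a write-only output plus an input tied to it with
   "0". The tied pair should have the same type (here both are int). */
static int add_matching(int a, int b)
{
#if defined(__x86_64__) || defined(__i386__)
    int out;
    __asm__ ("addl %[b], %[out]"
             : [out] "=r" (out)
             : "0" (a), [b] "r" (b)
             : "cc");
    return out;
#else
    return a + b;
#endif
}

/* Both forms should compute the same sum. */
static void check_forms_agree(void)
{
    assert(add_plus(2, 3) == add_matching(2, 3));
}
```

With "+r" there is only one C expression involved, so there is no second type that can disagree in size, which is the situation the original error complained about.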

You probably noticed that I've used only two assembly instructions where your example used four. The MOVL instruction isn't necessary; the compiler can handle moving the result into the return value register, if necessary. Your last two instructions can be collapsed into a single instruction that moves the constant directly into memory without clobbering a register. Speaking of which, your asm statement clobbers EFLAGS, the condition codes, so "cc" should be listed as clobbered. As Peter Cordes notes, though, that isn't necessary on x86 targets, because the compiler assumes the flags are clobbered anyway.

2 Comments

Fun fact which was a surprise to me, too: x86 / x86-64 inline asm implicitly clobbers the flags. stackoverflow.com/questions/6659414/…. So few instructions don't affect flags that it would cause way more bugs than real useful optimization opportunities, so they decided to define it that way.
@PeterCordes Ah, I was wondering why so many apparently broken asm statements worked in practice. I just assumed there were few opportunities to make use of the flags across an asm statement, so it didn't come up.
