Assembly x86_64: Getting input integer from user and print it

Question

I'm trying to write a program that works in the following way:

Get a number input by the user -> divide it by, say, 2 -> print the result (quotient).

The divide-by-2 part doesn't seem too difficult, so as a first step I wrote a program that gets an integer from the user and prints it back.

In other words, I tried to write a program that converts the user's input string to an actual integer, then converts it back to a string and prints it.

But when I run it, I end up in an infinite loop (after hitting Enter, nothing happens).

I compile using the following commands:

nasm -f elf64 ascii.asm -o ascii.o

ld ascii.o -o ascii

./ascii

In the code below, the subroutine _toInteger is intended to convert the string to an integer, and the subroutines _appendEOL and _loopDigit handle the whole conversion from integer back to string.

section .bss
        ascii resb 16           ; holds user input
        intMemory resb 100      ; will hold the endline feed 
        intAddress resb 8       ; hold offset address from the intMemory

section .data
        text db "It's not an integer", 10
        len equ $-text

section .text

        global _start

_start:

        call _getText
        call _toInteger
        call _appendEOL

        mov rax, 60
        mov rdi, 0
        syscall

_getText:
        mov rax, 0
        mov rdi, 0
        mov rsi, ascii
        mov rdx, 16
        syscall
        ret

_toInteger:
        mov rbx,10      ; for decimal scaling
        xor rax, rax    ; initializing result
        mov rcx, ascii  ; preparing for working with input
        movzx rdx, byte [rcx]   ; getting first byte (digit)
        inc rcx         ; for the next digit

        cmp rdx, '0'    ; if it's less than '0' is not a digit
        jb _invalid

        cmp rdx, '9'    ; if it's greater than '9' is not a digit
        ja _invalid

        sub rdx, '0'    ; getting decimal value
        mul rbx         ; rax = rax*10
        add rax, rdx    ; rax = rax + rdx
        jmp _toInteger  ; repeat
        ret

_invalid:
        mov rax, 1
        mov rdi, 1
        mov rsi, text
        mov rdx, len
        syscall
        ret

_appendEOL:
        ; getting EOL
        mov rcx, intMemory
        mov rbx, 10 ; EOL
        mov [rcx], rbx
        inc rcx
        mov [intAddress], rcx

_loopDigit:
        xor rdx, rdx
        mov rbx, 10
        div rbx
        push rax
        add rdx, '0'
        mov rcx, [intAddress]
        mov [rcx], dl
        inc rcx
        mov [intAddress], rcx
        pop rax
        cmp rax, 0
        jne _loopDigit

_printDigit:
        mov rcx, [intAddress]

        mov rax, 1
        mov rdi, 1
        mov rsi, rcx
        mov rdx, 1
        syscall
        mov rcx, [intAddress]
        dec rcx
        mov [intAddress], rcx
        cmp rcx, intMemory
        jge _printDigit

        ret

Sounds like the time has come to learn about GDB (or other debuggers).
– Tommylee2k, Commented May 9, 2017 at 14:27
@Tommylee2k What does it mean (: ? Did I get something wrong in the code above? I've spent approx. two weeks with Assembly, so I know almost nothing or too little. I didn't quite get what you mean ):
– asd, Commented May 9, 2017 at 14:44
It means that whether the source compiles is almost completely insignificant if you are trying to prove the code is correct (except that any compile-time failure rules that out immediately). Actually, even running the executable and getting the expected result is still miles away from proving correctness. To get at least a bit closer to that elusive utopian goal, you should use a debugger to step through your code instruction by instruction, checking the machine state after each executed instruction against your expectations/design, and testing different inputs, like null/empty/large/...
– Ped7g, Commented May 9, 2017 at 15:48

rkhb · Accepted Answer · 2017-05-09 16:12:06Z

3

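_toInteger is an endless loop that keeps checking the first digit forever, because jmp _toInteger re-runs the initialization (including mov rcx, ascii) on every pass. You need a separate loop entry point and a break condition.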

The next issue is mul rbx. This instruction also overwrites RDX (the product goes to RDX:RAX), and RDX holds the digit that is added to RAX one line below. If you don't want to use IMUL rax,rax,10, you can use the arithmetic ability of LEA:

add rax, rax                ; RAX = RAX * 2
lea rax, [rax + rax * 4]    ; RAX = (former RAX * 2) + (former RAX * 8)
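
For comparison, here is a minimal sketch of the IMUL rax,rax,10 alternative mentioned above, written as a standalone program. The file name mul10.asm and the sample values 12 and 3 are illustrative assumptions, not taken from the question. The result is returned as the process exit status so it can be checked with echo $?.

```nasm
; Build (file name is an assumption):
;   nasm -f elf64 mul10.asm -o mul10.o && ld mul10.o -o mul10
section .text
        global _start

_start:
        mov rax, 12             ; running value, as in _toInteger
        mov rdx, 3              ; next digit, already converted from ASCII

        ; "mul rbx" here would write the high half of the product to RDX,
        ; replacing the digit 3 with 0. The three-operand IMUL only writes RAX.
        imul rax, rax, 10       ; RAX = 120, RDX is still 3
        add rax, rdx            ; RAX = 123

        mov rdi, rax            ; exit status = 123
        mov rax, 60             ; SYS_EXIT
        syscall
```

Running ./mul10 and then echo $? should show 123.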

Another issue is the tricky behaviour of the SYS_READ syscall in _getText. You won't get a C-style string with a null terminator. SYS_READ stores the trailing \n in the buffer only if there is enough room for it within the byte limit given in RDX. Since the \n is sometimes there and sometimes not, it is not a useful break condition for _toInteger. A workaround is to overwrite the last byte read by SYS_READ with a null, no matter whether it is \n or a digit. This shortens the usable buffer by 1.

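_getText:
    mov rax, 0                  ; SYS_READ
    mov rdi, 0                  ; fd 0 = stdin
    mov rsi, ascii              ; destination buffer
    mov rdx, 16                 ; read at most 16 bytes
    syscall                     ; RAX = number of bytes read
    mov byte [ascii-1+rax], 0   ; null out the last byte read (assumes RAX >= 1)
    ret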
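
The workaround above assumes SYS_READ returned at least one byte. As a hedged variant (not the code from this answer), the sketch below checks the return value, drops a trailing \n only when one is actually present, and reads at most 15 bytes so the terminator never overwrites a digit. The names buf and readline.asm are assumptions made for illustration.

```nasm
; Build (file name is an assumption):
;   nasm -f elf64 readline.asm -o readline.o && ld readline.o -o readline
section .bss
        buf resb 16             ; 15 data bytes + 1 terminator

section .text
        global _start

_start:
        mov rax, 0              ; SYS_READ
        mov rdi, 0              ; fd 0 = stdin
        mov rsi, buf
        mov rdx, 15             ; leave one byte free for the terminator
        syscall                 ; RAX = bytes read, 0 on EOF, negative on error

        test rax, rax
        jle .empty              ; nothing usable was read

        cmp byte [buf + rax - 1], 10
        jne .terminate          ; last byte is not '\n': keep it
        dec rax                 ; drop the trailing newline
.terminate:
        mov byte [buf + rax], 0 ; null-terminate right after the data
        jmp .exit
.empty:
        xor rax, rax            ; treat EOF/error as an empty string
        mov byte [buf], 0
.exit:
        mov rdi, rax            ; exit status = string length, for checking
        mov rax, 60             ; SYS_EXIT
        syscall
```

With the input 12 followed by Enter, SYS_READ returns 3, the newline is dropped, and the exit status is 2.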

Be prepared for further surprises given to you by SYS_READ. The break condition is now a null. Let's do it:

_toInteger:
    mov rbx, 10     ; decimal scaling factor (unused now that mul is gone)
    xor rax, rax    ; initialize the result
    mov rcx, ascii  ; pointer to the input buffer

.LL1:               ; loop over the bytes
    movzx rdx, byte [rcx]   ; get the current byte (digit)

    test rdx, rdx   ; RDX == 0?
    jz .done        ; yes: break

    inc rcx         ; advance to the next digit

    cmp rdx, '0'    ; below '0': not a digit
    jb _invalid

    cmp rdx, '9'    ; above '9': not a digit
    ja _invalid

    sub rdx, '0'    ; convert ASCII to its decimal value

    ; mul rbx       ; rax = rax*10 (would clobber RDX)
    add rax, rax                ; RAX = RAX * 2
    lea rax, [rax + rax * 4]    ; RAX = RAX * 10 in total

    add rax, rdx    ; rax = rax + digit

    ;jmp _toInteger ; the original jump restarted the whole routine
    jmp .LL1        ; repeat

.done:
    ret

Just a caveat: _toInteger returns the integer in RAX, but you don't save this value. The next write to RAX will destroy it.
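
To tie the pieces together, here is a minimal sketch of the whole flow from the question (read, convert, divide by 2, print) that keeps the parsed value in RAX until it is consumed. It is a simplified assumption-laden version, not the asker's routines: the names inbuf, outbuf and half.asm are made up, parsing silently stops at the first non-digit instead of printing an error, and the digits are written backwards into the buffer so a single SYS_WRITE prints them.

```nasm
; Build (file name is an assumption):
;   nasm -f elf64 half.asm -o half.o && ld half.o -o half
section .bss
        inbuf   resb 16         ; raw bytes from SYS_READ
        outbuf  resb 21         ; up to 20 digits + newline

section .text
        global _start

_start:
        mov rax, 0              ; SYS_READ
        mov rdi, 0              ; stdin
        mov rsi, inbuf
        mov rdx, 16
        syscall                 ; RAX = bytes read (<= 0 on EOF or error)

        mov rcx, inbuf          ; RCX = read pointer (set after syscall, which clobbers RCX)
        lea rsi, [rcx + rax]    ; RSI = one past the last byte read
        xor rax, rax            ; RAX = parsed value
.parse:
        cmp rcx, rsi
        jae .parsed             ; consumed every byte that was read
        movzx rdx, byte [rcx]
        sub rdx, '0'            ; digit value; wraps to a huge number below '0'
        cmp rdx, 9
        ja .parsed              ; stop at the first non-digit, e.g. '\n'
        imul rax, rax, 10       ; RAX = RAX * 10, RDX untouched
        add rax, rdx
        inc rcx
        jmp .parse
.parsed:
        shr rax, 1              ; unsigned divide by 2, quotient stays in RAX

        mov rdi, outbuf + 20    ; RDI = last byte of outbuf
        mov byte [rdi], 10      ; trailing newline
        mov rbx, 10
.digit:
        xor rdx, rdx            ; DIV divides RDX:RAX, so clear the high half
        div rbx                 ; RAX = quotient, RDX = remainder (next digit)
        add dl, '0'
        dec rdi
        mov [rdi], dl           ; store digits from right to left
        test rax, rax
        jnz .digit

        mov rsi, rdi            ; start of the digit string
        mov rdx, outbuf + 21
        sub rdx, rsi            ; length = digits + newline
        mov rax, 1              ; SYS_WRITE
        mov rdi, 1              ; stdout
        syscall

        mov rax, 60             ; SYS_EXIT
        xor rdi, rdi
        syscall
```

For example, echo 12 | ./half prints 6. An empty input prints 0, since no digits are parsed.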


5 Comments

asd Over a year ago

Sorry for the late reply, but I was getting a bit familiar with GDB (a pretty neat thing, really). I got 4 questions, so I appreciate your guidance: 1) In this intended program, in which way could mul rbx change the edx register other than due to an overflow after a multiplication? (maybe there is something about the bytes I haven't considered in this question). 2) I understand that lea rax, [rax + rax * 4] after each loop is shifting the value of rax 2 bits to the left in order to get some space for the next digit. Is that correct?

asd Over a year ago

3) The issue with SYS_READ that you've pointed out is due to the fact that I arbitrarily set the length of the string input by the user, right? 4) I read that imul clears overflow (if I'm not mistaken, that would mean that edx doesn't get changed), but why should the three-operand form of imul be used rather than the one-operand (or two-operand) form?

rkhb Over a year ago

@Jazz: This is a worldwide site. Even a delay of 24 hours is not impolite. 1) mul rbx produces a 128-bit result in RDX:RAX, so RDX is always overwritten, not only on overflow. 2) That LEA line together with the preceding line is just a way to perform RAX = RAX * 10. Neither instruction changes RDX, and LEA doesn't change the flags either. 3) The issue with SYS_READ is an issue with SYS_READ. You are absolutely innocent. I bumped three questions about SYS_READ to the top. Click on the nasm tag and then on "active" and you will see those questions directly below your question (modified by me).

rkhb Over a year ago

@Jazz: 4) The one-operand form of IMUL changes RDX as well. You can of course also use the two-operand form. You are in the world of assembly. There is no standard, ideology or religion here: if it works, it's fine. For a detailed reference you need the Intel manual.

asd Over a year ago

Wow. There is a lot of cool stuff to learn. Thanks for answering the questions. And I'm carefully reading those about SYS_READ. I hope to fix the issues during these days. If not, I'll be creating another question xD (but hopefully with a better ground). Again, thanks for taking the time.

InfinitelyManic · Accepted Answer · 2017-05-09 15:50:34Z

1

Your "infinite loop" is in your _toInteger function.

RDX will always be 0 or the value of the first byte of your ASCII input, because jumping back to the label _toInteger resets your pointer to the first element. Therefore, you can never leave or jump out of the loop.

We can't stress this enough; you should always use a debugger.

 mov rcx, ascii  ; preparing for working with input

But even if you resolve that issue, there appear to be other problems with the _toInteger function.

$ ./jazz_001
12
It's not an integer
20

