2

I need to multiply an integer ranging from 0-1023 by 1023 and divide the result by a number ranging from 1-1023 in hardware (Verilog/FPGA implementation). The multiplication is straightforward, since I can probably get away with just shifting by 10 bits and subtracting the original value (1023*x = (x << 10) - x). The division is a little more interesting, though. Area and power aren't really critical to me (I'm in an FPGA, so the resources are already there). Latency (within reason) isn't a big deal so long as I can pipeline the design. There are obviously several choices with different trade-offs, but I'm wondering if there's an "obvious" or "no-brainer" algorithm for a situation like this, given the limited range of operands and the abundance of resources that I have (BRAM etc.).

2
  • 1
    Binary long division may be your best bet Commented Aug 7, 2013 at 18:47
  • 2
    If you have enough resources, maybe build a LUT (Look up table) for it. Commented Aug 7, 2013 at 19:10

2 Answers

2

If you can pre-compute everything, and you've got a spare 20x20 multiplier, and some way to store your pre-computed number, then go for Morgan's suggestion. You need to precompute a 20-bit multiplicand (10b quotient, 10b remainder), and multiply by your first 10b number, and take the bottom 30b of the 40b result.
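
A minimal software model of the precomputed-coefficient approach, assuming each 20-bit coefficient is 1023/d stored with 10 integer and 10 fractional bits and truncated. The names `build_lut` and `scaled_divide` are illustrative and do not come from the thread; in hardware the table would sit in a block RAM and the product would come from a built-in multiplier.

```python
FRAC_BITS = 10  # assumed split: 10 integer bits + 10 fractional bits per coefficient


def build_lut(frac_bits=FRAC_BITS):
    """Table of floor(1023 * 2**frac_bits / d) for d = 1..1023.

    Index 0 is unused padding so the divisor can address the table directly.
    """
    return [0] + [(1023 << frac_bits) // d for d in range(1, 1024)]


LUT = build_lut()


def scaled_divide(x, d, lut=LUT):
    """Approximate x * 1023 / d as a fixed-point value with FRAC_BITS fractional bits.

    x is the 10-bit input, d the 10-bit divisor. The product of a 10-bit and a
    20-bit operand fits in 30 bits.
    """
    assert 0 <= x <= 1023 and 1 <= d <= 1023
    return x * lut[d]


def integer_part(value, frac_bits=FRAC_BITS):
    """Drop the fractional bits of a fixed-point result."""
    return value >> frac_bits
```

For example, `integer_part(scaled_divide(512, 1023))` gives 512 and `integer_part(scaled_divide(5, 3))` gives 1705. Because the coefficients are truncated, results for divisors that do not divide 1023 evenly can come out slightly low.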

Otherwise, the no-brainer is non-restoring division, since you say that latency isn't important (there's lots of material on the web, most of it incomprehensible). You have a 20-bit numerator (the result of your 1023 x multiplication) and a 10-bit denominator. This gives a 20b quotient and a 10b remainder (i.e. 20 bits for the integer part of the answer, and 10 bits for the fractional part, giving a 30b answer).

The actual hardware is pretty trivial: an 11b adder/subtractor, a 31b shift register, and a 10b or 11b register to store the divisor. You also need a small FSM to control it (2b). You have to do a compare, add or subtract, and shift in every clock cycle, and you get the answer out in 21 cycles. I think. :)
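
The add-or-subtract-then-shift loop described above can be sketched in software as follows. This is a behavioural model under assumptions, not the answer's actual hardware: the function name `nonrestoring_divide` and the `n_bits` parameter are hypothetical, one loop iteration stands in for one clock cycle, and Python's unbounded signed integers stand in for the sign-extended partial-remainder register.

```python
def nonrestoring_divide(numerator, divisor, n_bits=20):
    """Unsigned non-restoring division, one quotient bit per iteration.

    Returns (quotient, remainder) with numerator == quotient * divisor + remainder.
    """
    assert 0 <= numerator < (1 << n_bits) and divisor > 0
    rem = 0  # signed partial remainder
    quo = 0
    for i in range(n_bits - 1, -1, -1):
        bit = (numerator >> i) & 1
        if rem >= 0:
            # Previous step did not overshoot: shift in the next bit and subtract.
            rem = 2 * rem + bit - divisor
        else:
            # Previous step overshot: shift in the next bit and add back instead
            # of restoring first.
            rem = 2 * rem + bit + divisor
        # The new quotient bit is 1 when the partial remainder is non-negative.
        quo = (quo << 1) | (1 if rem >= 0 else 0)
    if rem < 0:
        # Single correction step at the end so the remainder is non-negative.
        rem += divisor
    return quo, rem
```

Calling `nonrestoring_divide(x * 1023, d)` returns the integer quotient and remainder. To obtain fractional quotient bits from the same loop, one option is to pre-shift the numerator, for example `nonrestoring_divide((x * 1023) << 10, d, n_bits=30)`, at the cost of ten more iterations.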


1 Comment

Thanks for the reply @EML. I think I'm going to go with Morgan's approach (and of course with your detail). I can indeed precompute everything: I wrote a MATLAB script to get the coefficients and make the LUT. BTW num2fixpt is a nice function that I didn't know existed till I wrote my own version. I'm actually on a Spartan-3A DSP, which has several multipliers built in. That stated, I think the latency of the multiplication is nicer than that of the division. I'd give you the answer vote, but Morgan was first :)
1

If you can work with fixed-point precision rather than integers, it may be possible to change:

divide the result by a number ranging from 1-1023

to multiplication by a number ranging from 1 down to 1/1023, i.e. pre-compute the reciprocal and store it as the coefficient for the multiply.

2 Comments

I think I might do something similar. The total operation is (integer ranging from 0-1023) * 1023 / (integer ranging from 1-1023). Given the second half, I could just reduce this to a fixed-point multiplication of (integer ranging from 0-1023) * (fixed-point number ranging from 1-1023). The latter half of the equation can be precomputed, getting rid of the division altogether. Thoughts?
@Doov that is how I would approach it.
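
One thing worth checking before committing to a coefficient width is how far the fixed-point product can drift from exact integer division. The sketch below is a brute-force check over the whole operand range; the function name `worst_case_error`, the `round_up` option and the choice of comparing integer parts are assumptions made for illustration, not something specified in the thread.

```python
def worst_case_error(frac_bits, round_up=False):
    """Largest difference between floor(x * 1023 / d) and the integer part of
    x * coeff, where coeff is 1023/d stored with frac_bits fractional bits.

    round_up=False truncates the stored coefficient; round_up=True rounds it up.
    """
    worst = 0
    for d in range(1, 1024):
        scaled = 1023 << frac_bits
        # -(-a // b) is ceiling division for positive integers.
        coeff = -(-scaled // d) if round_up else scaled // d
        for x in range(1024):
            exact = (x * 1023) // d
            approx = (x * coeff) >> frac_bits
            worst = max(worst, abs(exact - approx))
    return worst
```

In this model, truncated coefficients with 10 fractional bits can land one below the exact integer quotient (x = 7, d = 7 yields 1022 rather than 1023), and adding fractional bits alone does not remove that case. Rounding the coefficient up and keeping 20 fractional bits matches floor division over the whole range, at the cost of a wider multiplier input.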
