The Reduced Instruction Set of all chips in the ARM family - from the ARM2 to the StrongARM - includes weird and wonderful instructions like MLA (Multiply with Accumulate: multiply two registers and add the contents of a third to the result) and ASL (Arithmetic Shift Left: absolutely identical to the Logical Shift Left instruction). Later chips contain a few additional instructions. However, so far as I am aware, room has never been found in the instruction set for something that would have been very useful - a DIV instruction.

12 - 4 = 8 8 - 4 = 4 4 - 4 = 0

When I was very little, I was allowed to play with an old-fashioned mechanical calculating machine. The front of the machine had an array of vertical dials, like the ones in a combination lock, on which you set up the numbers you wanted to calculate, and there was a handle on one side which was wound away from you in order to add the current number to the total on the display or towards you in order to subtract it. In order to carry out division, it was necessary to set up the first number and then to subtract the second number from it repeatedly (counting how many times you turned the handle!), just as we did in the sum above. Obviously however, this could be extremely slow if you were doing a sum like 128÷4 since the answer is 32, this is the number of times you would have to turn the handle!

It would be quite possible to carry out division in ARM code using this simple technique, by constructing a loop like this:

MOV R1,#128 ;divide R1 MOV R2,#4 ;by R2 MOV R0,#0 ;initialise counter .subtract SUBS R1,R1,R2 ;subtract R2 from ;R1 and store ;result back in ;R1 setting flags ADD R0,R0,#1 ;add 1 to counter, ;NOT setting flags BHI subtract ;branch to start of ;loop on condition ;Higher, i.e. R1 is ;still greater than ;R2. Answer now in R0

Since even the slowest ARM processor is far faster than a child turning a handle attached to a stiff set of gears, this might even be an acceptable solution. But there is a trick you normally use to save work on a calculating machine - and you can use it on a computer, too.

Using our example of 128÷4, you would end up doing this:

- Shift 4 one place to the left - it becomes 40.
- Shift 4 another place to the left - it becomes 400. This is bigger than 128, so we have gone too far. We shall start with the previous result, which was 40. So:
- 128 - 40 = 88
*(turning the handle once)* - 88 - 40 = 48
*(turning it twice)* - 48 - 40 = 8
*(turning it three times)* - We can no longer subtract 40, so we have now found the first digit of the answer to be 3.
- Now we shift our number back to the right so that it becomes 4 again, and once more start counting turns of the handle.
- 8 - 4 = 4
*(turning the handle once)* - 4 - 4 = 0
*(turning it twice)*

We can no longer subtract 4 (our original number), so we have found the last digit of the answer to be 2. In other words, the answer is 32, the result we got before - but we have had to wind the handle round only five times in order to obtain it, not thirty-two times!

You will have noticed that this is almost exactly the method one uses when doing division using pen and paper... "four into one won't go... four into twelve goes three times... four into eight goes twice... answer is thirty-two." The difference is that we know, through experience, that 12 is 4 × 3 and 8 is 4 × 2. Machines, even electronic ones like computers, don't have this advantage - they don't contain a reference set of multiplication tables, so they have to do it the hard way.

Every time we shift our number one place to the right to obtain the next digit of the answer, we know that we will be able to subtract it either exactly once or not at all. We only have to carry out one subtraction per shift; the disadvantage is that since binary numbers have many more digits than their decimal equivalents we have to carry out many more shifts.

MOV R1,#128 MOV R0,R1,LSR#2 ;shift R1 2 places ;to the right & ;store in R0 ;answer now in R0

Since 4 is 2 × 2, all we have to do to divide by 4 in binary is to shift the register two places to the right, just as all we have to do to divide by 100 (10 × 10) in decimal is to shift two places to the right - e.g. from 600 pence we get 6 pounds.

**To divide 50 (%110010) by 10 (%1010) in binary:**

- We shift 10 as far as we can (two places) to the left until it becomes %101000 and subtract this from %110010 to get the first digit.
- %110010- %101000 = %1010
*(First digit is 1)* - Now we shift %101000 back one place to the right and try to subtract %10100 from what remains of the 25 we started off with.
- %1010 - %10100
*(Next digit is 0 - it 'won't go'!)* - Shift right, get %1010 and try again:
- %1010 - %1010 = %0
*(Successful subtraction - next digit is 1)*

Our '10' has now been shifted back two places to the right, returning it to its original value, which is our signal to stop and count up the digits in our answer - %101 in binary or '5' in decimal, w ich is of course the correct answer.

CMP R2, #0 BEQ divide_end ;check for divide by zero!

MOV R0,#0 ;clear R0 to accumulate result MOV R3,#1 ;set bit 0 in R3, which will be ;shifted left then right .start CMP R2,R1 MOVLS R2,R2,LSL#1 MOVLS R3,R3,LSL#1 BLS start ;shift R2 left until it is about to ;be bigger than R1 ;shift R3 left in parallel in order ;to flag how far we have to go

R0 will be used to hold the result. The role of R3 is more complicated.

In effect, we are using R3 to mark where the right-hand end of R2 has got to - if we shift R2 three places left, this will be indicated by a value of %1000 in R3. However, we also add it to R0 every time we manage a successful subtraction, since it marks the position of the digit currently being calculated in the answer. In the binary example (50 ÷ 10) above, we shifted the '10' two places left, so at the time of the first subtraction, R3 would have been %100, at the time of the second (which failed) it would have been %10, and at the time of the third %1. Adding it to R0 after each successful subtraction would have given us, once again, the answer of %101!

ARM code doesn't have the specific shift and rotate instructions present in non-RISC
instruction sets. Instead it has the concept of the 'barrel shifter' which can be used to
modify the value specified by the right-hand register for almost *any* instruction,
without altering that register itself. For example, the instruction ADD R0, R1, R2, LSL#2 will
add together R1 and (R2<<2) and load the result into R0, without affecting the value of
R2 in any way. This can be very useful, but it means that if we actually want to change the
value of R2 by shifting it, as we do here, we have to resort to moving it into itself via the
shifter: MOV R2,R2,LSL#1.

.next CMP R1,R2 ;carry set if R1>R2 (don't ask why) SUBCS R1,R1,R2 ;subtract R2 from R1 if this would ;give a positive answer ADDCS R0,R0,R3 ;and add the current bit in R3 to ;the accumulating answer in R0

In ARM code subtraction (a CMP instruction simulates a subtraction in order to set the flags),
if R1 - R2 gives a positive answer and no 'borrow' is required, the carry flag is *set*.
This is required in order to make SBC (Subtract with Carry) work properly when used to carry
out a 64-bit subtraction, but it is confusing!

In this case, we are turning it to our advantage. The carry flag is set to indicate that a
successful subtraction is possible, i.e. one that doesn't generate a negative result, and the
two following instructions are carried out only when the condition Carry Set applies. Note that
the 'S' on the end of these instructions is part of the 'CS' condition code and does *not*
mean that they set the flags!

MOVS R3,R3,LSR#1 ;Shift R3 right into carry flag MOVCC R2,R2,LSR#1 ;and if bit 0 of R3 was zero, also ;shift R2 right BCC next ;If carry not clear, R3 has shifted ;back to where it started, and we ;can end .divide_end MOV R25, R24 ;exit routine

The next two instructions shift right R3, the 'counter' register, and R2, which holds the number we are dividing by. We specifically set the flags by using the 'S' suffix when shifting R3 since we want to know when the bit held in this register reaches the right-hand side. During a shift to the right, bit 0 is transferred to the carry flag while the rest of the bits move along. Since only one bit of R3 is set (it was originally loaded with %1 before being shifted left and then right), when the carry flag is set it indicates that, before the shift, the value of R3 was %1, i.e. we have shifted back to where we started and R0 should now hold the right answer.

As with the results of integer division in Basic, the value in R0 will always be rounded to the
next *lowest* whole number rather than to the nearest number. For example, 1156 ÷
19 gives a result of '60 remainder 16' which is actually closer to 61 than 60.

Source: |
Archive Magazine 14.2 |

Publication: |
Archive Magazine |

Contributor: |
Harriet Bazley |