Wednesday, April 1, 2009

Modulo Operator Considered Harmful

The way I think of the modulo operator is as a remainder. 10 mod 3 = 1, because 3 * 3 = 9 and 10 - 9 = 1. But what if one of the operands is negative? What is -10 mod 3?

My naive answer was that this is obvious. -10/3 = -3. -3 * 3 = -9. To get from -9 to -10, you need to add -1. Therefore the remainder is -1. Let's see what C thinks of that:

#include 

int main(void) {
  printf(" 10 mod  3 = %d\n", 10%3);
  printf("-10 mod  3 = %d\n", -10%3);
  printf(" 10 mod -3 = %d\n", 10%-3);
  printf("-10 mod -3 = %d\n", -10%-3);

  return 0;
}

$ gcc -o mod mod.c
$ ./mod
 10 mod  3 = 1
-10 mod  3 = -1
 10 mod -3 = 1
-10 mod -3 = -1
C agrees! And I'm sure Python must follow, right?
#!/usr/bin/env python

print " 10 mod  3 = ", 10 % 3
print "-10 mod  3 = ", -10 % 3
print " 10 mod -3 = ", 10 % -3
print "-10 mod -3 = ", -10 % -3

$ ./mod.py
 10 mod  3 =  1
-10 mod  3 =  2
 10 mod -3 =  -2
-10 mod -3 =  -1
2?!?!

There's a couple ways to look at this. First, what really is the answer of -10/3 in integer division? The "real" answer (seewhatididthere) is -3.333.... When I round that to an integer, do I round "down" meaning "to the left on the number line" or "down" meaning "towards zero"? If I mean the latter, -10/3 = -3. But if I mean the first meaning, then -10/3 = -4. Then 3*-4 = -12 and so the remainder is actually 2. (If you ask Python what -10/3 is, you do in fact get -4, btw.)

Alternatively, I could revert to doing modular arithmetic under it's old name of "clock arithmetic". For 10 mod 3, the face has 3 numbers and I'm going to take 10 steps around it, starting at 0 and moving clockwise. I end up on 1. For -10 mod 3, I'm going to go counterclockwise. I end up on 2.

To me, this second interpretation is better. For one thing, rounding leftwards seems to be as counterintuitive as the original problem. For another, this clock method explains what a 10 % -3 should be. The clock face contains the space of possible answers. Doing mod 3, the answers can only be 0, 1 or 2. Doing mod -3, the answers can only be 0, -1 and -2.

Wikipedia agrees with Python, which seems to leave C out in the cold. Its folk understanding of modulo as remainder is common sense but wrong. And it isn't alone. (Some languages have mod vs rem, with the latter being a remainder. That's a good idea. Naturally FORTRAN has called the two operators mod and modulo. Stay classy, FORTRAN.)

However, it is still true that if you are working only with positive numbers, mod is a remainder. Because of this intuitive understanding, and because of the obvious confusion among language designers, it seems a good idea to me to always frame mod operations in positive space.

1 comment: