think about this:
the arm cpu in the gba (the arm7tdmi) has no divide instruction, right? so if someone were to throw:
u32 doDiv(u32 x, u32 y)
{
    return x / y;
}
at the cpu, what would it do? short answer: since there's no instruction for it, the division gets done in software, which is what you're seeing in that routine, with a lot of wasted cycles. it's also in thumb, which means it takes even more instructions to do the same job. same thing for the modulus routine right under it. they're huge.
compare that to the divmod in dppt, hgss, and bw: it's written in arm, it's very fast, and it returns both the quotient and the remainder in one call.
http://pastebin.com/u9CG4VCZ
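to make the "one call, two results" idea concrete, here's a rough c sketch of a combined unsigned divmod using plain shift-and-subtract. this is not the routine from the pastebin and not gamefreak's code. the names (DivModResult, uDivMod) are made up for illustration, and the divide-by-zero behaviour is just an assumption i picked.

```c
#include <stdint.h>

typedef uint32_t u32;

/* hypothetical result type: quotient and remainder from one call */
typedef struct {
    u32 quot;
    u32 rem;
} DivModResult;

/* restoring (shift-and-subtract) unsigned division, one bit per step.
 * assumption: on y == 0 we return quot = 0xFFFFFFFF and rem = x. */
DivModResult uDivMod(u32 x, u32 y)
{
    DivModResult r = { 0, 0 };
    int i;

    if (y == 0) {
        r.quot = 0xFFFFFFFFu;
        r.rem = x;
        return r;
    }

    for (i = 31; i >= 0; i--) {
        /* remember the bit that falls off the top of rem when shifting */
        u32 carry = r.rem >> 31;

        /* bring down the next bit of the dividend */
        r.rem = (r.rem << 1) | ((x >> i) & 1u);

        /* if the (33-bit) partial remainder is >= y, subtract and set the bit */
        if (carry || r.rem >= y) {
            r.rem -= y;
            r.quot |= (u32)1 << i;
        }
    }
    return r;
}
```

so uDivMod(100, 7) hands back quot 14 and rem 2 from a single pass, instead of running a division routine and then a separate modulus routine over the same inputs.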
basically, gamefreak could've done a lot better in that regard. the size and speed of the division and modulus routines in emerald are both good examples of why you shouldn't just reach for / and % on arm and should actively look for another way.
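as for what "another way" can look like: when the divisor is a known constant you often don't need a division routine at all. a small sketch in c, with made-up helper names (div8, mod8, div10, mod10). a power of two is just a shift and a mask, and other constants can be done with a multiply by a precomputed reciprocal. optimizing compilers will usually do this for you when the divisor is a compile-time constant, so treat this as an illustration of the trick rather than something you must hand-write.

```c
#include <stdint.h>

typedef uint32_t u32;

/* unsigned divide / modulus by 8: a shift and a mask, no division */
u32 div8(u32 x) { return x >> 3; }
u32 mod8(u32 x) { return x & 7u; }

/* unsigned divide by 10 via reciprocal multiply.
 * 0xCCCCCCCD is ceil(2^35 / 10); the 64-bit product shifted right by 35
 * gives the exact quotient for every 32-bit x. */
u32 div10(u32 x)
{
    return (u32)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* remainder falls out of the quotient with one multiply and one subtract */
u32 mod10(u32 x)
{
    return x - div10(x) * 10u;
}
```

this only helps when the divisor is fixed. for a divisor that changes at runtime you still need a real division routine, which is where a fast combined divmod earns its keep.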
e: dunno how much you know about pokemon generation in gen3, but stuff like this is what causes methods 2 and 4 to happen.