Pasted on September 28 2012 04:43:25
By anonymous
// saving one 64-bit unsigned integer with little endian byte order
// how hard can that be? we don't need any compiler-specific intrinsics or libraries, right?

#include <stddef.h> // size_t
#include <stdint.h> // uint64_t, uint8_t

void store(uint64_t in, uint8_t * out)
{
    for(size_t i = 0; i < 8; i++)
    {
        out[i] = uint8_t(in);
        in >>= 8;
        // or out[i] = uint8_t(in >> (i * 8)); - makes no difference
    }
}

// compiled for amd64:

gcc 4.7.1 -O3:
    movq %rdi, %rax     # copy input into RAX
    movb %dil, (%rsi)   # and save the first input byte
    shrq $8, %rax       # shift RAX 8 bits (so we can use AL instead of AH - because we can)
    movb %al, 1(%rsi)   # and save the next input byte
                        # for the next byte let's just shift RAX 8 more bits - or not
    movq %rdi, %rax     # lolwut, why not copy the input again
    shrq $16, %rax      # so we can use up two instructions instead of one
    ...                 # continue this madness for the remaining bytes
                        # total instruction count: 21

clang 3.1 -O3:
    movb %dil, (%rsi)   # and save the first input byte
    movq %rdi, %rax     # copy input into RAX
    movb %ah, 1(%rsi)   # and save the next input byte
    movq %rdi, %rax     # copy input into RAX again, just for the lulz
    shrq $16, %rax      # well, at least it's one instruction less than gcc
    ...                 # continue this madness for the remaining bytes
                        # total instruction count: 20

# now, would one (possibly unaligned) 64-bit store have been that bad?
    movq %rdi, (%rsi)
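
One way to ask for that single store without intrinsics is a fixed-size memcpy. This is only a sketch, not something from the compilers' output above: store_memcpy and check_store_memcpy are made-up names, and the code assumes a little-endian host (true for amd64), since it copies the bytes in whatever order the machine keeps them. Optimizing compilers commonly turn an eight-byte memcpy like this into one 64-bit store, but nothing guarantees it, so check the generated assembly for your compiler and flags.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical alternative to store(): copy all eight bytes in one call.
// ASSUMPTION: little-endian host. On a big-endian host the bytes would
// land in reverse order and the value would need a byte swap first.
void store_memcpy(std::uint64_t in, std::uint8_t* out)
{
    // memcpy has no alignment requirement on out, so this is safe even
    // where a direct uint64_t* store through out would not be.
    std::memcpy(out, &in, sizeof in);
}

// Small self-check (only valid on a little-endian host).
void check_store_memcpy()
{
    std::uint8_t buf[8] = {};
    store_memcpy(0x0102030405060708ULL, buf);
    assert(buf[0] == 0x08); // lowest byte first
    assert(buf[7] == 0x01); // highest byte last
}
```

The byte-by-byte loop stays the portable version: it writes little-endian output regardless of host byte order, while the memcpy variant trades that away for simplicity.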