You can do it branch-free by means of conditional moves and such
(e.g., do two bit scans and select between them based on whether the
lowest word is zero or not; the other operations work similarly),
and there is some support from the compiler (__uint128_t
on GCC-compatible compilers), but in the end, going to 128 bits was just
not enough to end up net positive.
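
For illustration, here is a minimal sketch of the two-bit-scan trick described above. It is not the code I actually benchmarked; the names ctz128 and ctz_u128 are made up for this example, and it assumes GCC or Clang (for __builtin_ctzll and __uint128_t). Whether the final select really becomes a conditional move is up to the compiler, so check the generated assembly.

```c
#include <stdint.h>

/* Hypothetical helper: index of the lowest set bit of a 128-bit value
   given as two 64-bit words. Returns 128 if both words are zero. */
static inline unsigned ctz128(uint64_t lo, uint64_t hi)
{
    unsigned lo_zero = (lo == 0);
    unsigned hi_zero = (hi == 0);

    /* __builtin_ctzll(0) is undefined, so OR in a 1 bit when the word
       is zero. The result for that word is then discarded or corrected. */
    unsigned scan_lo = (unsigned)__builtin_ctzll(lo | lo_zero);
    unsigned scan_hi = 64u + (unsigned)__builtin_ctzll(hi | hi_zero)
                     + 64u * hi_zero;   /* yields 128 when hi == 0 */

    /* Both scans are always computed; only the select depends on lo.
       Compilers typically turn this into a conditional move. */
    return lo_zero ? scan_hi : scan_lo;
}

/* Same thing on the compiler-provided 128-bit type (GCC/Clang only). */
static inline unsigned ctz_u128(__uint128_t x)
{
    return ctz128((uint64_t)x, (uint64_t)(x >> 64));
}
```

Note that you pay for both scans on every call, plus the select, which is part of why the wider type has a hard time coming out ahead.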