After speaking with one of my professors this afternoon and re-reading the one-way hash chapter in Bruce Schneier’s excellent book Applied Cryptography, I’ve concluded that the only reasonable way to write my hashing algorithm is to stick to chunks that are register-sized, or at most the combined size of the available registers (four 32-bit registers on IA32, counting only EAX, EBX, ECX, and EDX). Implementing the algorithm this way limits me to 128-bit operations on an x86 processor, so I’ll have to compute the hash as two 128-bit halves and join them to build the full 256-bit signature. While this isn’t ideal, it’s substantially faster than the methods I wrote about yesterday, which amounted to building a virtual machine for my algorithm to run inside.
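To make the two-halves idea concrete, here is a minimal C sketch of a 256-bit digest assembled from two 128-bit states, each held as four 32-bit words so that one half could in principle sit in four IA32 registers. Everything in it is hypothetical: the names `half128`, `mix_half`, and `toy_hash256`, the starting constants, and the mixing round are placeholders chosen for illustration, not my actual algorithm, and the round is not cryptographically secure. Padding is left out, so the input length is assumed to be a multiple of 16 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One 128-bit half of the state: four 32-bit words, small enough to be
   kept in four 32-bit registers (an assumption about register allocation). */
typedef struct { uint32_t w[4]; } half128;

/* Rotate a 32-bit word left by n bits; n must be in 1..31. */
static uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << n) | (x >> (32 - n));
}

/* Placeholder round: XOR in a message word, rotate, add the neighbouring
   state word. Illustrative only, with no security claim. */
static void mix_half(half128 *h, const uint32_t m[4]) {
    for (int i = 0; i < 4; i++) {
        h->w[i] = rotl32(h->w[i] ^ m[i], 7) + h->w[(i + 1) & 3];
    }
}

/* Hash len bytes (len must be a multiple of 16) into a 32-byte digest. */
static void toy_hash256(const uint8_t *msg, size_t len, uint8_t out[32]) {
    assert(len % 16 == 0);

    /* Arbitrary starting values, one set per half. */
    half128 lo = {{0x01234567u, 0x89abcdefu, 0xfedcba98u, 0x76543210u}};
    half128 hi = {{0x0f1e2d3cu, 0x4b5a6978u, 0x8796a5b4u, 0xc3d2e1f0u}};

    for (size_t off = 0; off < len; off += 16) {
        uint32_t m[4];
        /* Native byte order; a real specification would fix endianness. */
        memcpy(m, msg + off, sizeof m);

        /* The second half sees the same block with its words rotated by one
           position, so the two halves do not compute the same value. */
        uint32_t m2[4] = { m[1], m[2], m[3], m[0] };

        mix_half(&lo, m);
        mix_half(&hi, m2);
    }

    /* Join the two 128-bit halves into the 256-bit result. */
    memcpy(out, lo.w, 16);
    memcpy(out + 16, hi.w, 16);
}
```

Calling `toy_hash256(buf, 16, digest)` on a 16-byte buffer fills `digest` with 32 bytes: the first 16 come from one half and the last 16 from the other. The two halves never interact here, which keeps the sketch short; that independence is a simplification for illustration and not a design recommendation.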
I am now reworking my algorithm a bit to comply with these new restrictions. Until somebody builds a general-purpose processor like the one I describe in my second Connexions paper (though IBM’s new Cell architecture comes close), I’ll need to work within these constraints. Over the next several days I’m going to write reference implementations of the algorithm for the x86, x86-64, and PowerPC architectures, and publish them here on my website along with the design criteria and algorithm specification.
“Science is facts; just as houses are made of stones, so is science made of facts; but a pile of stones is not a house and a collection of facts is not necessarily science.” –Henri Poincaré