by kaffeemonster » 09 Dec 2010 00:38
I'm not so sure about this one.
It's a lot of hype.
The problem with GPU processing is the overhead. When you only have to transfer a little bit of data and then all those 1000 shader ALUs do their magic on it IN PARALLEL (compute-intensive work, floating-point foo, or simply something massively parallel), GPUs are great.
But when you have to shovel around a lot of data (the file data; and bringing it to the GPU may mean the driver does some malloc/mmap/memcpy behind your back to satisfy DMA restrictions, e.g. data has to be aligned at a page boundary and whatnot), and then do something inherently NON-PARALLEL (hashes have the nasty habit that their calculation is serial: earlier bytes influence the calculation of later bytes...) with lots of basic ops (and/or/xor/shift), GPUs with their 500 MHz are a loss. At another important crypto op, table lookups (S-boxes), they suck badly (GPUs are made for high throughput, not low latency).
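To make the "serial" point concrete, here is a small Python sketch using the standard hashlib module. The helper names (sha1_sequential, sha1_blocks_independent) and the sample data are made up for illustration. It shows that feeding SHA-1 block by block in order gives the normal digest, while hashing the blocks independently (which is what you would want for a parallel split) gives a different value, so it is no longer SHA-1 of the file:

```python
import hashlib

BLOCK = 64  # SHA-1 consumes its input in 64-byte blocks


def sha1_sequential(data: bytes) -> str:
    """Feed the blocks in order; each update continues from the previous state."""
    h = hashlib.sha1()
    for off in range(0, len(data), BLOCK):
        h.update(data[off:off + BLOCK])
    return h.hexdigest()


def sha1_blocks_independent(data: bytes) -> str:
    """NOT SHA-1 of data: hash every block on its own, then hash the digests."""
    parts = [
        hashlib.sha1(data[off:off + BLOCK]).digest()
        for off in range(0, len(data), BLOCK)
    ]
    return hashlib.sha1(b"".join(parts)).hexdigest()


data = bytes(range(256)) * 4               # 1 KiB of sample input
flipped = bytes([data[0] ^ 1]) + data[1:]  # same data, first bit changed

reference = hashlib.sha1(data).hexdigest()
print(sha1_sequential(data) == reference)          # True
print(sha1_blocks_independent(data) == reference)  # False
print(sha1_sequential(flipped) == reference)       # False: byte 0 affects the end result
```

The second scheme is parallel-friendly, but it is a different hash function, so it would not match the SHA1 values everybody else publishes for the same file.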
Yes, they have 1000 shader ALUs, but only one shader can work on one hash at a time, so you need to do several hashes (let's say 64 KB chunks of different files) at the same time.
Because a lot of shaders run the same program at the same time (workgroups), doing a different hash on each of them is not the best route. In effect you are left with 64 additional individual compute units, and that is on a top-of-the-line card, not the integrated/mid-range stuff. At that point it consumes 160 W and makes a lot of noise.
Maybe someone should write an OpenCL app which takes the SHA1 of n files at once and measures it, so we would have the speed for 1, 2, 4, 8 files.
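For comparison, a CPU baseline of the same experiment could look roughly like the sketch below. Note this is NOT OpenCL: it is plain Python with threads (hashlib releases the GIL while hashing larger buffers), and it hashes in-memory buffers instead of real files so that disk speed does not get mixed in. The names measure, run and sha1_buffer, as well as the buffer sizes, are assumptions for illustration:

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK = 64 * 1024  # feed the hash in 64 KiB pieces


def sha1_buffer(buf: bytes) -> str:
    """SHA-1 of one buffer, fed chunk by chunk like a file read loop."""
    h = hashlib.sha1()
    view = memoryview(buf)
    for off in range(0, len(view), CHUNK):
        h.update(view[off:off + CHUNK])
    return h.hexdigest()


def measure(buffers, n):
    """Hash all buffers with n at once; return (digests, throughput in MB/s)."""
    total = sum(len(b) for b in buffers)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        digests = list(pool.map(sha1_buffer, buffers))
    elapsed = time.perf_counter() - start
    return digests, total / elapsed / 1e6


def run(sizes=(1, 2, 4, 8), n_buffers=8, buf_size=1 << 20):
    """Measure throughput for each degree of concurrency in sizes."""
    # Stand-in "files": n_buffers buffers of buf_size bytes each.
    buffers = [bytes([i]) * buf_size for i in range(n_buffers)]
    return {n: measure(buffers, n)[1] for n in sizes}


if __name__ == "__main__":
    for n, rate in run().items():
        print(f"{n} at once: {rate:.0f} MB/s")
```

An OpenCL version would have to beat these numbers including the time to copy the data to the card, otherwise the whole exercise is pointless.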
Greetings
Jan