by ailurophobe » 08 Nov 2010 07:01
Right, that seems bad. But it is not really related to my suggestion: a bad hub would not obey any hub-to-hub connection limits anyway. It would simply connect to as many hubs as possible, as fast as possible, without bothering to report correct neighbour counts or to route real queries correctly. There is nothing we can do to help or hurt there without a protocol change.
The easiest "fix" would probably be to stop forwarding queries to neighbours altogether and simply return QA/D for hubs without QHT hits and QA/S for everyone else. But there is a reason I put the "fix" in quotation marks... It might work if we also made a major change to network topology, like increasing neighbour counts...
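A minimal sketch of that no-forwarding idea, just to make the split concrete. Everything here is an assumption for illustration: the function name, the dict of neighbour tables, and the choice to treat a neighbour with no known QHT as "everyone else". It is not how any existing hub does it.

```python
def classify_neighbours(query_key, neighbour_tables):
    """Split neighbouring hubs into (done, search) for a query acknowledgement.

    neighbour_tables is a hypothetical dict mapping a hub address to the set
    of QHT keys that hub advertised, or None if no table is known yet.
    """
    done = []    # would be listed as QA/D: the QHT says the hub has no hit
    search = []  # would be listed as QA/S: the searcher should ask directly
    for hub, table in neighbour_tables.items():
        if table is not None and query_key not in table:
            done.append(hub)
        else:
            # A QHT hit, or no table at all (assumption: unknown counts as
            # "everyone else").
            search.append(hub)
    return done, search
```

With this split the hub never forwards anything itself; the searcher ends up contacting every hub returned in the second list, which is why neighbour counts would matter so much more.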
And making QHTs 32 bit, maybe? The extra 12 bits should reduce false hits, and the current way of encoding and processing QHTs is pretty inefficient for hubs. I mean, why deal with a 128 kB array and gzip when the actual information contained, with no compression at all, would at 2% full be 20 bits per entry (the index of a set bit in the bit array) times 10 000 entries, for a total of 200 000 bits or about 25 kB? An utterly trivial compression, sending Huffman-coded offsets from the last value sent, would halve that. Then you could just use a multimap to store hash key / source pairs (sources being library entries, hubs, or neighbours) and a map to store hash key / reference count pairs. It would be roughly as compact on the network as the current gzip-compressed system, lookups would be much faster (especially at higher leaf/neighbour counts), and the processing overhead would be trivial. And while the memory footprint of a 2% full 20 bit QHT would be roughly the same as now, it would be much lower for nearly empty QHTs (most leaves) and would not be affected at all by moving to a 32 bit QHT.
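A rough sketch of the pieces described above, under stated assumptions: it sends plain offsets rather than Huffman-coded ones (the Huffman step is left out), the "multimap" is a dict of sets, and every name (`to_offsets`, `QhtIndex`, and so on) is hypothetical rather than taken from any G2 implementation.

```python
from collections import defaultdict

TABLE_BITS = 20  # table size from the post: 2**20 bits


def bitmap_bytes(table_bits=TABLE_BITS):
    """Size of the full bit array, in bytes."""
    return (1 << table_bits) // 8


def index_list_bytes(entries, table_bits=TABLE_BITS):
    """Size of a plain list of set-bit indices, table_bits bits each."""
    return (entries * table_bits + 7) // 8


def to_offsets(indices):
    """Turn set-bit indices into gaps from the previous value sent."""
    offsets, prev = [], 0
    for idx in sorted(set(indices)):
        offsets.append(idx - prev)
        prev = idx
    return offsets


def from_offsets(offsets):
    """Rebuild the sorted indices from the gaps."""
    indices, total = [], 0
    for gap in offsets:
        total += gap
        indices.append(total)
    return indices


class QhtIndex:
    """Hash key -> sources, plus hash key -> reference count."""

    def __init__(self):
        self.sources = defaultdict(set)  # stands in for the multimap
        self.refcount = {}

    def add(self, key, source):
        if source not in self.sources[key]:
            self.sources[key].add(source)
            self.refcount[key] = self.refcount.get(key, 0) + 1

    def remove(self, key, source):
        holders = self.sources.get(key)
        if not holders or source not in holders:
            return
        holders.discard(source)
        self.refcount[key] -= 1
        if self.refcount[key] == 0:
            # Last source gone: the key drops out of the aggregate table.
            del self.refcount[key]
            del self.sources[key]

    def lookup(self, key):
        """Sources that advertised this key (empty set if none)."""
        return set(self.sources.get(key, ()))

    def aggregate_keys(self):
        """Keys the hub would advertise to its own neighbours."""
        return sorted(self.refcount)
```

Nothing in `QhtIndex` depends on the key width, so the same structure would hold 32 bit keys unchanged; only the two size helpers take the table size as a parameter.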
The above is something I have been thinking about for a while. I haven't bothered to mention it or do anything about it since I haven't figured out a way to do it sensibly without breaking backwards compatibility on hubs. I mention it now since we were discussing protocol "fixes" and somebody here might be smarter than I am. Not like I am any kind of a G2 expert...