Page 1 of 2
		
			
				Does this actually work?
				
Posted: 
01 Mar 2011 09:51by ailurophobe
				I got curious if it would be possible to boost security performance easy and transparent by adding a miss cache that records addresses that were looked up and not found on the security list. I added some code that I hope does that, compiled it, ran... No errors, no crashes, everything seems to work, but... How can I tell if the code works? I mean it is supposed to make Shareaza "do nothing faster" by saving a useless iterations of the security list. I mean if it doesn't work Shareaza works as it always did, if it does work Shareaza works as it always did but a little bit faster... How exactly can I tell the difference? Please help.
PS. Getting this to work would do wonders to UDP performance... Now every UDP packet must be checked separately (TCP only checks on connect), with a miss cache the address would be checked once and if no hit was found further checks would happen from the cache.
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
01 Mar 2011 22:32by old_death
				You're right about that it would do wonders if it were to work. However, more work is certainly required at a later time... Efficiency could be greatly improved by separating the filters and optimizing each of them for the type of content they have to deal with...
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
02 Mar 2011 02:48by old_death
				I have started rewriting parts of the security filters. My version (once finished) will threat single IP bans differently than the rest (those are what most rules are and those are what can be improved on most).
As this may take a day or two, I'd like to request to commit no changes on the Security.h and Security.cpp files until I have completed my work. 

mfg,
Old
PS.: I will most probably require help on testing as I have never debugged code that complex before (not the filtering code itself, but Shareaza around it is what is bothering me 

 )
 
			
		
			
				This does this actually work.
				
Posted: 
02 Mar 2011 05:42by skinvista
				
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
02 Mar 2011 08:00by old_death
				So... This is what I have produced in a night of coding.
It certainly needs a lot of testing (and most probably also some bug fixing).  
 
 It adds a map for single IP blocking rules which should be a bit faster than what we do now... As for the miss cache, I have kept that code, however, I do not know whether it would be more efficient to remove it again, now that the code can look up the blocked IPs much faster...
And now I am going to bed. Comments while I am asleep (and afterwards) are welcome. 

mfg,
Old
PS.: I wasn't sure how this method that fixed host masks worked, so it might not be necessary to include it everywhere I did. Also, I wasn't sure whether extracting all IPs from an IP range ban and adding them to the map or processing the IP group bans in linear order is more efficient... For now, they are treated as they always have been.
 
			
		
			
				Re: Does this actually work?
				
Posted: 
02 Mar 2011 13:51by cyko_01
				committed in rev. 8920
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
02 Mar 2011 15:30by old_death
				Wow cool! Thanks Cyko.  
 
 The next thing I will do is continue optimizations by implementing a similar special treatment for blocked hashes. Also, I will continue my previous work by creating several small security rule classes that will replace the current CSecureRule class by inheriting from it and splitting its functionality in a logical way.
BTW, comments are still welcome, as this is my first contribution to something as complex as Shareaza.
 
			
		
			
				Re: Does this actually work?
				
Posted: 
02 Mar 2011 19:17by cyko_01
				
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
02 Mar 2011 20:01by old_death
				I decided that this is not the final form. I will change it to make it to operate like this:
1 list will contain all rules. This makes it possible to iterate threw that list and to update for example the security window in an easy fashion.
2 maps will allow easy (and fast) access to all rules for single IPs and hashes.
1 list will contain all rules that do filter IP ranges (as iterating threw such a list is faster than adding all IPs of these rules to the previously named maps and filling them up very fast...
1 list will contain everything that does not match into the previously named structures. This allows to treat all rules for content filters completely separated from the filters that can be dealt with in a fast way.
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
02 Mar 2011 22:47by ailurophobe
				If you want to do the work I won't complain but... You know that miss performance dominates the performance of any type filter so that you'd probably get almost the same performance by having miss caches for addresses and hashes for fraction of the work? (And a special rule type for hash only filters that can be skipped if hash hits are not possible.)
@skinvista:
Thanks, that's really all I needed to know. It works as it should.
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
03 Mar 2011 00:00by old_death
				
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
03 Mar 2011 00:36by old_death
				I wonder if anybody could pass me a security filter XML file that does contain all different rule types (including filters by hash, regex, keyword (all/any), IP & IP range)?
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
03 Mar 2011 05:08by ailurophobe
				Like I said I won't complain if you do the work.  
 
 Miss caches are good enough if you only want the performance boost, which is why after lots of dreaming about Boost::MultiIndexes I finally went ahead with one. But if you are willing to do the extra work separating the different rule types and optimizing the lookups for the each type 
does improve the performance 
and allow improving the security manager UI separating it to have a separate panel for each type and that's where it starts getting worth the trouble. But too much work for me, so I just took the low hanging fruit. A cache doesn't really need to synchronize with the actual rules, you just clear it if it might have an error in it... So much easier.  

I think we should go with the ODs code unless we need a stable release before he can get it ready and properly tested. In that case my miss cache approach could be used instead. All it misses (it wasn't really ready for commit you know) is adding a limit on the cache size to prevent it from growing big enough to cause problems and miss caches for SHA1 and ED2K hashes. All of which someone like ryo could do in few minutes I am sure.
EDIT: And the cache and the rules really should have separate locks so that you can miss fast even when checking the list for something else. I didn't see the point doing that for the test, but not having connections lag because of a sanity check really is kind of the whole point.
 
			
		
			
				Re: Does this actually work?
				
Posted: 
03 Mar 2011 16:20by raspopov
				cyko_01, don't commit unfinished patches again, please! I spent a half of day fixing broken Security window and bugs in new code.
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
03 Mar 2011 21:30by old_death
				
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
04 Mar 2011 04:51by cyko_01
				
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
05 Mar 2011 22:16by cyko_01
				
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
06 Mar 2011 22:17by old_death
				I think, your proposal will bloat the size of the xml file a bit too much without really bringing additional  benefit...
I will finish my rewrite of the security code somewhere next week. As it will include some slight changes to the xml format, too, maybe my solution should be taken into consideration as an alternative before starting the related discussion at this point. 
However no matter how the format changes (if it changes at all), I think we should present users a way to convert from one format into the other to provide security filter support for older versions of Shareaza.
In any way, changing the xml output/input in the way you proposed should be doable within less than 1 hour of work, I think.
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
07 Mar 2011 02:22by cyko_01
				
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
12 Mar 2011 00:18by old_death
				
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
12 Mar 2011 16:33by cyko_01
				
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
13 Mar 2011 15:17by old_death
				Oh, I had forgotten those completely. 

 Seems that my work will take a bit longer than I originally thought as under these circumstances, I have to also find the places where to call the security filters within the code.
However it would be cool if somebody else could do the searching on where to call the security filter, as I simply don't know enough about Shareaza to complete this task in a timely manner.
 
			
		
			
				Re: Does this actually work?
				
Posted: 
22 Mar 2011 23:40by old_death
				There you go.
Still testing required - should compile though.
Still lacks country blocking rules (which I will implement and upload once the already existing part has been checked, approved and committed) and GUI integration for the new features, as I am not good enough in the MFC based GUI stuff to implement everything that's needed.
Necessary are mainly changes to the rule editing dialog, though, so it shouldn't be a really big task. 

Note plz that this code needs to be tested and reviewed before it is uploaded to the repository as there have been too many changes to commit this without testing it first (which I can't as I can't compile Shareaza completely; Linker issue).
 
			
		
			
				Re: Does this actually work?
				
Posted: 
23 Mar 2011 18:52by old_death
				Is there nobody willing to at least compile a build with this code to be able to test it?
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
23 Mar 2011 20:02by old_death
				Thanks to cedric for helping me fix some warnings, a linker error I couldn't get rid of and some minor problem that seems to have been born while merging the code with the most recent version of the repository.
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
27 Mar 2011 22:16by old_death
				Well, as nobody seems to be willing to test it, I'd like to request this code to be committed into the official svn repository. Here are the necessary files (merged with the latest changes as of today):
mfg,
Old
PS.: The archive now also includes an update to the English translation file I had forgotten previously.
PPS.: Yes, indeed, 
this is the final version.
 
			
		
			
				Re: Does this actually work?
				
Posted: 
29 Mar 2011 23:14by old_death
				Nobody willing to commit...?
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
30 Mar 2011 03:46by raspopov
				Too many "TODO:" in the code, and why you removed "Complain"-function?
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
30 Mar 2011 11:01by ailurophobe
				You don't actually need both a miss cache and an address map.
The way I was thinking about handling the address rules before I opted for the miss cache, I would have changed to the x.x.x.x/n format and used an array of 32 IP to rule multi-maps one for each possible value of n. Rules would then be added to the map corresponding to their n (sorry don't remember the correct word but it is the size of the range in masked bits), and a look up would happen by iterating thru the maps (most of which would be empty and could be skipped). All applicable ranges would be "hit" but the result returned would be the one of the most specific (smallest) range. This is the only way I have figured out so far that handles overlapping address ranges correctly. (Hits all rules and returns the most specific result.)
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
30 Mar 2011 17:26by raspopov
				Complain is not an address map its a "lite ban" i.e. ban applied only after several violations. Primary used for hammering detection.
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
31 Mar 2011 10:15by ailurophobe
				Oops, was commenting on ODs code, not your post, my bad. Although I was curious about what the complains actually do... (Saw them in the code but had no reason to check what they do.)
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
31 Mar 2011 13:37by old_death
				Ryo, most of these todos could actually be removed (seems that I had forgotten some during my cleanup). The only ones that remain is the one to place the constant for the rule expiry interval into the settings and the 4 todos with a question mark (should the expiry check be called on a different place?). Also, the todo about the livelist maintenance shall stay there (it might be more efficient if we were to maintain the livelist along with the storage containers for the other rules instead of generating a fresh copy of it each time it is requested).
As for the complain function, there have been only 2 calls to it. One of them could be easily replaced with a "nowmal" ban and the other one of those used a 10s validity period (or something smaller, I don't remember), where the chance of a second hit within that time was so small that I thought we could drop the complains alltogether... However I have to apologize if my assumptions weren't correct at this place...
I have attached a new copy of the security.h and security.cpp files with the useless todos removed:
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
01 Apr 2011 02:02by old_death
				Ryo, if that is what you want, I can remove the comments around that method and the related content. Just say so, OK? My opinion is, however, that this method is currently not needed...
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
03 Apr 2011 01:15by old_death
				OK, I managed to execute my code by now and I am testing it right now. Meanwhile, I have detected some more spots that do require optimizations, so it will probably take a while until this is as it is supposed to...  

 (I have detected cases where the amount of required work can be reduced by than 50%.)
Plz do not commit my code until I have finished that work.  

 
			
		
			
				Re: Does this actually work?
				
Posted: 
03 Apr 2011 07:42by ailurophobe
				Great news! To be honest, I took a quick look at your previous code and I wouldn't have committed it, nothing specific ( it was a very quick look), it just somehow looked "not ready". If that makes any sense...
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
05 Apr 2011 14:04by old_death
				Still testing and improving the code...
(I just wanted to let you know that I am still working on the code.)
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
07 Apr 2011 15:54by old_death
				
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
07 Apr 2011 17:40by ailurophobe
				Will need lots of clean up before ryo will commit it, I randomly guess. For example you have separate "if it is an address rule" clauses for the miss cache and your own address maps, these should be merged so that all code handling the same case is in the same place. Also I am not sure if you misunderstood what the miss cache was supposed to do or if you just changed it into something entirely different but kept the name. Probably both? ("Fixed" it until it did what you expected it to do?) Anyway, the "miss" did not mean "not blocked", it meant "not in the security list" and this was because a) no need to synchronize for rule expiry or removal b) allow rules still need to be looked up for the hit counters and the cache can't do that. So it is not "return false" it is "return m_bDenyPolicy". I think you "fixed" it so it works anyway, but that required adding extra code. More to the point a map is about as fast as the set used for the cache so there is no real point having the cache if you use maps for IP look-ups.
Also what are the changes in other files? This would have been much easier commit if you just boosted the security manager performance first and did changes to other parts of Shareaza separately. (Two smaller commits instead of one complex one.)
As always, I probably have no idea what I am talking about. Just thought you'd appreciate having some feedback to prove somebody actually looked at your code...
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
07 Apr 2011 18:11by raspopov
				Im strictly against "country blocking rules".
			 
			
		
			
				Re: Does this actually work?
				
Posted: 
07 Apr 2011 19:10by old_death