Page 1 of 2

Does this actually work?

PostPosted: 01 Mar 2011 09:51
by ailurophobe
I got curious if it would be possible to boost security performance easy and transparent by adding a miss cache that records addresses that were looked up and not found on the security list. I added some code that I hope does that, compiled it, ran... No errors, no crashes, everything seems to work, but... How can I tell if the code works? I mean it is supposed to make Shareaza "do nothing faster" by saving a useless iterations of the security list. I mean if it doesn't work Shareaza works as it always did, if it does work Shareaza works as it always did but a little bit faster... How exactly can I tell the difference? Please help.

PS. Getting this to work would do wonders to UDP performance... Now every UDP packet must be checked separately (TCP only checks on connect), with a miss cache the address would be checked once and if no hit was found further checks would happen from the cache.

Re: Does this actually work?

PostPosted: 01 Mar 2011 22:32
by old_death
You're right about that it would do wonders if it were to work. However, more work is certainly required at a later time... Efficiency could be greatly improved by separating the filters and optimizing each of them for the type of content they have to deal with...

Re: Does this actually work?

PostPosted: 02 Mar 2011 02:48
by old_death
I have started rewriting parts of the security filters. My version (once finished) will threat single IP bans differently than the rest (those are what most rules are and those are what can be improved on most).

As this may take a day or two, I'd like to request to commit no changes on the Security.h and Security.cpp files until I have completed my work. ;)


mfg,
Old

PS.: I will most probably require help on testing as I have never debugged code that complex before (not the filtering code itself, but Shareaza around it is what is bothering me :) )

This does this actually work.

PostPosted: 02 Mar 2011 05:42
by skinvista

Re: Does this actually work?

PostPosted: 02 Mar 2011 08:00
by old_death
So... This is what I have produced in a night of coding.

It certainly needs a lot of testing (and most probably also some bug fixing). :mrgreen:

It adds a map for single IP blocking rules which should be a bit faster than what we do now... As for the miss cache, I have kept that code, however, I do not know whether it would be more efficient to remove it again, now that the code can look up the blocked IPs much faster...


And now I am going to bed. Comments while I am asleep (and afterwards) are welcome. :D


mfg,
Old


PS.: I wasn't sure how this method that fixed host masks worked, so it might not be necessary to include it everywhere I did. Also, I wasn't sure whether extracting all IPs from an IP range ban and adding them to the map or processing the IP group bans in linear order is more efficient... For now, they are treated as they always have been.

Re: Does this actually work?

PostPosted: 02 Mar 2011 13:51
by cyko_01
committed in rev. 8920

Re: Does this actually work?

PostPosted: 02 Mar 2011 15:30
by old_death
Wow cool! Thanks Cyko. :mrgreen:

The next thing I will do is continue optimizations by implementing a similar special treatment for blocked hashes. Also, I will continue my previous work by creating several small security rule classes that will replace the current CSecureRule class by inheriting from it and splitting its functionality in a logical way.

BTW, comments are still welcome, as this is my first contribution to something as complex as Shareaza.

Re: Does this actually work?

PostPosted: 02 Mar 2011 19:17
by cyko_01

Re: Does this actually work?

PostPosted: 02 Mar 2011 20:01
by old_death
I decided that this is not the final form. I will change it to make it to operate like this:

1 list will contain all rules. This makes it possible to iterate threw that list and to update for example the security window in an easy fashion.
2 maps will allow easy (and fast) access to all rules for single IPs and hashes.
1 list will contain all rules that do filter IP ranges (as iterating threw such a list is faster than adding all IPs of these rules to the previously named maps and filling them up very fast...
1 list will contain everything that does not match into the previously named structures. This allows to treat all rules for content filters completely separated from the filters that can be dealt with in a fast way.

Re: Does this actually work?

PostPosted: 02 Mar 2011 22:47
by ailurophobe
If you want to do the work I won't complain but... You know that miss performance dominates the performance of any type filter so that you'd probably get almost the same performance by having miss caches for addresses and hashes for fraction of the work? (And a special rule type for hash only filters that can be skipped if hash hits are not possible.)

@skinvista:

Thanks, that's really all I needed to know. It works as it should.

Re: Does this actually work?

PostPosted: 03 Mar 2011 00:00
by old_death

Re: Does this actually work?

PostPosted: 03 Mar 2011 00:36
by old_death
I wonder if anybody could pass me a security filter XML file that does contain all different rule types (including filters by hash, regex, keyword (all/any), IP & IP range)?

Re: Does this actually work?

PostPosted: 03 Mar 2011 05:08
by ailurophobe
Like I said I won't complain if you do the work. :D

Miss caches are good enough if you only want the performance boost, which is why after lots of dreaming about Boost::MultiIndexes I finally went ahead with one. But if you are willing to do the extra work separating the different rule types and optimizing the lookups for the each type does improve the performance and allow improving the security manager UI separating it to have a separate panel for each type and that's where it starts getting worth the trouble. But too much work for me, so I just took the low hanging fruit. A cache doesn't really need to synchronize with the actual rules, you just clear it if it might have an error in it... So much easier. :)

I think we should go with the ODs code unless we need a stable release before he can get it ready and properly tested. In that case my miss cache approach could be used instead. All it misses (it wasn't really ready for commit you know) is adding a limit on the cache size to prevent it from growing big enough to cause problems and miss caches for SHA1 and ED2K hashes. All of which someone like ryo could do in few minutes I am sure.

EDIT: And the cache and the rules really should have separate locks so that you can miss fast even when checking the list for something else. I didn't see the point doing that for the test, but not having connections lag because of a sanity check really is kind of the whole point.

Re: Does this actually work?

PostPosted: 03 Mar 2011 16:20
by raspopov
cyko_01, don't commit unfinished patches again, please! I spent a half of day fixing broken Security window and bugs in new code.

Re: Does this actually work?

PostPosted: 03 Mar 2011 21:30
by old_death

Re: Does this actually work?

PostPosted: 04 Mar 2011 04:51
by cyko_01

Re: Does this actually work?

PostPosted: 05 Mar 2011 22:16
by cyko_01

Re: Does this actually work?

PostPosted: 06 Mar 2011 22:17
by old_death
I think, your proposal will bloat the size of the xml file a bit too much without really bringing additional benefit...

I will finish my rewrite of the security code somewhere next week. As it will include some slight changes to the xml format, too, maybe my solution should be taken into consideration as an alternative before starting the related discussion at this point.
However no matter how the format changes (if it changes at all), I think we should present users a way to convert from one format into the other to provide security filter support for older versions of Shareaza.

In any way, changing the xml output/input in the way you proposed should be doable within less than 1 hour of work, I think.

Re: Does this actually work?

PostPosted: 07 Mar 2011 02:22
by cyko_01

Re: Does this actually work?

PostPosted: 12 Mar 2011 00:18
by old_death

Re: Does this actually work?

PostPosted: 12 Mar 2011 16:33
by cyko_01
don't forget to fix these bugs if it has not already been done :D :
https://sourceforge.net/apps/phpbb/shareaza/viewtopic.php?f=5&t=903

Re: Does this actually work?

PostPosted: 13 Mar 2011 15:17
by old_death
Oh, I had forgotten those completely. :) Seems that my work will take a bit longer than I originally thought as under these circumstances, I have to also find the places where to call the security filters within the code.

However it would be cool if somebody else could do the searching on where to call the security filter, as I simply don't know enough about Shareaza to complete this task in a timely manner.

Re: Does this actually work?

PostPosted: 22 Mar 2011 23:40
by old_death
There you go.

Still testing required - should compile though.

Still lacks country blocking rules (which I will implement and upload once the already existing part has been checked, approved and committed) and GUI integration for the new features, as I am not good enough in the MFC based GUI stuff to implement everything that's needed.

Necessary are mainly changes to the rule editing dialog, though, so it shouldn't be a really big task. ;)


Note plz that this code needs to be tested and reviewed before it is uploaded to the repository as there have been too many changes to commit this without testing it first (which I can't as I can't compile Shareaza completely; Linker issue).

Re: Does this actually work?

PostPosted: 23 Mar 2011 18:52
by old_death
Is there nobody willing to at least compile a build with this code to be able to test it?

Re: Does this actually work?

PostPosted: 23 Mar 2011 20:02
by old_death
Thanks to cedric for helping me fix some warnings, a linker error I couldn't get rid of and some minor problem that seems to have been born while merging the code with the most recent version of the repository.

Re: Does this actually work?

PostPosted: 27 Mar 2011 22:16
by old_death
Well, as nobody seems to be willing to test it, I'd like to request this code to be committed into the official svn repository. Here are the necessary files (merged with the latest changes as of today):

security_2011-03-27.7z
(140.07 KiB) Downloaded 408 times

mfg,
Old


PS.: The archive now also includes an update to the English translation file I had forgotten previously.

PPS.: Yes, indeed, this is the final version.

Re: Does this actually work?

PostPosted: 29 Mar 2011 23:14
by old_death
Nobody willing to commit...?

Re: Does this actually work?

PostPosted: 30 Mar 2011 03:46
by raspopov
Too many "TODO:" in the code, and why you removed "Complain"-function?

Re: Does this actually work?

PostPosted: 30 Mar 2011 11:01
by ailurophobe
You don't actually need both a miss cache and an address map.

The way I was thinking about handling the address rules before I opted for the miss cache, I would have changed to the x.x.x.x/n format and used an array of 32 IP to rule multi-maps one for each possible value of n. Rules would then be added to the map corresponding to their n (sorry don't remember the correct word but it is the size of the range in masked bits), and a look up would happen by iterating thru the maps (most of which would be empty and could be skipped). All applicable ranges would be "hit" but the result returned would be the one of the most specific (smallest) range. This is the only way I have figured out so far that handles overlapping address ranges correctly. (Hits all rules and returns the most specific result.)

Re: Does this actually work?

PostPosted: 30 Mar 2011 17:26
by raspopov
Complain is not an address map its a "lite ban" i.e. ban applied only after several violations. Primary used for hammering detection.

Re: Does this actually work?

PostPosted: 31 Mar 2011 10:15
by ailurophobe
Oops, was commenting on ODs code, not your post, my bad. Although I was curious about what the complains actually do... (Saw them in the code but had no reason to check what they do.)

Re: Does this actually work?

PostPosted: 31 Mar 2011 13:37
by old_death
Ryo, most of these todos could actually be removed (seems that I had forgotten some during my cleanup). The only ones that remain is the one to place the constant for the rule expiry interval into the settings and the 4 todos with a question mark (should the expiry check be called on a different place?). Also, the todo about the livelist maintenance shall stay there (it might be more efficient if we were to maintain the livelist along with the storage containers for the other rules instead of generating a fresh copy of it each time it is requested).

As for the complain function, there have been only 2 calls to it. One of them could be easily replaced with a "nowmal" ban and the other one of those used a 10s validity period (or something smaller, I don't remember), where the chance of a second hit within that time was so small that I thought we could drop the complains alltogether... However I have to apologize if my assumptions weren't correct at this place...

I have attached a new copy of the security.h and security.cpp files with the useless todos removed:

Re: Does this actually work?

PostPosted: 01 Apr 2011 02:02
by old_death
Ryo, if that is what you want, I can remove the comments around that method and the related content. Just say so, OK? My opinion is, however, that this method is currently not needed...

Re: Does this actually work?

PostPosted: 03 Apr 2011 01:15
by old_death
OK, I managed to execute my code by now and I am testing it right now. Meanwhile, I have detected some more spots that do require optimizations, so it will probably take a while until this is as it is supposed to... :mrgreen: (I have detected cases where the amount of required work can be reduced by than 50%.)

Plz do not commit my code until I have finished that work. ;)

Re: Does this actually work?

PostPosted: 03 Apr 2011 07:42
by ailurophobe
Great news! To be honest, I took a quick look at your previous code and I wouldn't have committed it, nothing specific ( it was a very quick look), it just somehow looked "not ready". If that makes any sense...

Re: Does this actually work?

PostPosted: 05 Apr 2011 14:04
by old_death
Still testing and improving the code...

(I just wanted to let you know that I am still working on the code.)

Re: Does this actually work?

PostPosted: 07 Apr 2011 15:54
by old_death

Re: Does this actually work?

PostPosted: 07 Apr 2011 17:40
by ailurophobe
Will need lots of clean up before ryo will commit it, I randomly guess. For example you have separate "if it is an address rule" clauses for the miss cache and your own address maps, these should be merged so that all code handling the same case is in the same place. Also I am not sure if you misunderstood what the miss cache was supposed to do or if you just changed it into something entirely different but kept the name. Probably both? ("Fixed" it until it did what you expected it to do?) Anyway, the "miss" did not mean "not blocked", it meant "not in the security list" and this was because a) no need to synchronize for rule expiry or removal b) allow rules still need to be looked up for the hit counters and the cache can't do that. So it is not "return false" it is "return m_bDenyPolicy". I think you "fixed" it so it works anyway, but that required adding extra code. More to the point a map is about as fast as the set used for the cache so there is no real point having the cache if you use maps for IP look-ups.

Also what are the changes in other files? This would have been much easier commit if you just boosted the security manager performance first and did changes to other parts of Shareaza separately. (Two smaller commits instead of one complex one.)

As always, I probably have no idea what I am talking about. Just thought you'd appreciate having some feedback to prove somebody actually looked at your code...

Re: Does this actually work?

PostPosted: 07 Apr 2011 18:11
by raspopov
Im strictly against "country blocking rules".

Re: Does this actually work?

PostPosted: 07 Apr 2011 19:10
by old_death