Regexp filter doesn't work.

Post by tharrison1 » 11 Sep 2010 08:16

I did an audio search and got a lot of obviously bogus results of the form

"01artist firstwordoftitle-192kb.mp3"

with no metadata. (I used only the artist and the first word of the title in the search. Any legitimate result should have had the full title and there shouldn't have been dozens of them with slightly different file sizes.)

So I add a security rule:

^(01).*(-192kb)$

patterned after some of the pre-existing ones.

Shareaza hangs for what seems like half an hour and then spontaneously recovers, after which a) the new rule is in the security tab and b) the bogus results are still not being filtered.

OK, maybe the hyphen needs to be escaped.

Right click, edit rule ... eh? Where's "edit rule" in the right click menu? It's missing! But there's a button way at the bottom of the screen. I'll use that instead.

Click it, change the rule to this:

^(01).*(\-192kb)$

and hit OK. Shareaza promptly locks up again. Does it do this EVERY TIME you do ANYTHING to a security rule? That makes modifying the security rules pretty much ergonomically unusable, even if it weren't the case that the general population wouldn't know a regexp from a rectilinear grid.

After Shareaza eventually recovers again (I timed it more precisely this time: about 10 minutes, and yes it saturated the CPU for a grand total of 1.5 TRILLION processor cycles blown on God alone knows what -- even extensive use of bubble sort can't explain that kind of ridiculous cycle crunchage) the bogus results are STILL not filtered.
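One possible explanation for the rule never matching, offered as a sketch rather than a diagnosis: if the pattern is tested against the whole file name including its extension (an assumption; nothing in this thread confirms how Shareaza applies the rule), the trailing "$" can never match, because the name ends in ".mp3" and not in "-192kb". Escaping the hyphen makes no difference outside a character class. The Python below uses Python's own re module, which is not necessarily the engine Shareaza uses, and rule_with_ext is a hypothetical variant, not a rule taken from Shareaza.

```python
import re

# The bogus file name quoted in the post.
name = "01artist firstwordoftitle-192kb.mp3"

# The two rules tried in the post.
rule_posted = re.compile(r"^(01).*(-192kb)$")
rule_escaped = re.compile(r"^(01).*(\-192kb)$")

# Hypothetical variant that accounts for the ".mp3" extension.
rule_with_ext = re.compile(r"^01.*-192kb\.mp3$")


def matches(rule, filename):
    """Return True if the rule matches somewhere in the file name."""
    return rule.search(filename) is not None


for label, rule in [("posted", rule_posted),
                    ("escaped", rule_escaped),
                    ("with extension", rule_with_ext)]:
    print(label, matches(rule, name))
```

If the filter is instead applied to the name without its extension, the posted rule would match and the cause lies elsewhere.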

That's at least half an hour of my life wasted on trying to use a feature that 1. is unusable by non-technical users, 2. is unusable by anyone who doesn't have a spare hour or ten to kill if there are a lot of rules they want to add, and 3. doesn't frelling work anyway. Thanks a lot.

Any suggestions? Is there ANY easier way to filter out all this crud? It pollutes every audio search, often to the point that Shareaza stops searching quite quickly and before finding what I'm actually looking for.
tharrison1
 
Posts: 79
Joined: 04 Sep 2010 22:47

Re: Regexp filter doesn't work.

Post by branko-r » 11 Sep 2010 18:50

Unfortunately, adding rules is dog slow. This issue has been raised a number of times.

I don't like to criticize the developers, because they are doing a great job for very little in return. I also don't like to criticize the application because, being a developer myself, I'm fully aware that in real life resources are always limited, hence software is not perfect.

However... What I really dislike is defensive attitude and denial. There's something wrong with the rule evaluation, period. One cannot just say that it's slow because there's "a lot of search results to go through, and it can't be faster". I don't believe that my PC, capable of executing billions of instructions per second, needs 4 or 5 seconds to evaluate a simple IP rule across a couple of thousand search results.
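To put rough numbers on that claim, here is a back-of-the-envelope calculation in Python. All three inputs are assumed round values picked from the ranges mentioned above (1 billion instructions per second as a conservative floor for "billions", 4 seconds, 2,000 results); none of them is a measurement.

```python
# Assumed round figures, taken from the ranges quoted in the post.
instructions_per_second = 1_000_000_000  # conservative floor for "billions"
delay_seconds = 4                        # low end of "4 or 5 seconds"
search_results = 2_000                   # "a couple of thousand"

# Total instruction budget consumed during the delay.
instructions_available = instructions_per_second * delay_seconds

# Budget spent per search result for a single rule.
instructions_per_result = instructions_available // search_results

print(instructions_available, instructions_per_result)
```

Under these assumptions, each search result consumes about two million instructions for one simple IP comparison, which is the mismatch the post is pointing at.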

Maybe this is not easy to fix, but it should be easy to at least acknowledge. Personally, I'd be satisfied with that.

My apologies if this sounded negative.
branko-r
 
Posts: 44
Joined: 08 Jul 2009 09:47
Location: Zagreb, Croatia

Re: Regexp filter doesn't work.

Post by old_death » 11 Sep 2010 21:32

The problem is that Shareaza has very suboptimal search algorithms in place (some kind of linear search), which is also why filtering on file searches is so damn slow. Right now, no one is willing to fix this because of the large amount of time the task would take...
old_death
 
Posts: 1950
Joined: 13 Jun 2009 16:19

Re: Regexp filter doesn't work.

Post by tharrison1 » 12 Sep 2010 19:59

tharrison1
 
Posts: 79
Joined: 04 Sep 2010 22:47

Re: Regexp filter doesn't work.

Post by tharrison1 » 12 Sep 2010 20:22

tharrison1
 
Posts: 79
Joined: 04 Sep 2010 22:47

Re: Regexp filter doesn't work.

Post by branko-r » 13 Sep 2010 19:59

This is a very good analysis of the problem.

"4-5 seconds" was my actual situation. I was using simple IP matching rules, hence the "speed". Prompted by your post, I deleted most of my rules since they were obsolete anyway. Now adding rules is much faster (possibly 10x), which suggests that Shareaza uses a brute-force approach (re-evaluating all rules across all search entries), or worse, on each change. By storing IP rules in a binary tree, one might obtain a logarithmic dependence on the number of rules. Storing them in a hash table is better. Evaluating only the rules that were actually changed is even better than that.

I agree that even if one implements it in the worst possible way, it still can't be that slow. Regexps are an order of magnitude slower than IP range matching, and e.g. Perl-compatible REs are slower still, but 10 minutes?!
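The logarithmic lookup suggested above for IP rules can be sketched as follows. This is a minimal Python illustration, not Shareaza's code: it swaps the binary tree for a sorted list searched with bisect (the lookup cost is still logarithmic), and it assumes the rules are non-overlapping, inclusive IPv4 ranges. RangeRules, blocks and filter_with_new_rule are hypothetical names made up for this example.

```python
import bisect
import ipaddress


def to_int(address):
    """Convert a dotted IPv4 string to an integer for comparison."""
    return int(ipaddress.IPv4Address(address))


class RangeRules:
    """Hypothetical store of non-overlapping blocked IPv4 ranges, kept sorted."""

    def __init__(self, ranges):
        parsed = sorted((to_int(lo), to_int(hi)) for lo, hi in ranges)
        self.starts = [lo for lo, _ in parsed]
        self.ends = [hi for _, hi in parsed]

    def blocks(self, address):
        """Binary-search for the last range starting at or before the address."""
        ip = to_int(address)
        i = bisect.bisect_right(self.starts, ip) - 1
        return i >= 0 and ip <= self.ends[i]


def filter_with_new_rule(result_ips, lo, hi):
    """Check existing results against only the newly added range."""
    lo_i, hi_i = to_int(lo), to_int(hi)
    return [ip for ip in result_ips if not lo_i <= to_int(ip) <= hi_i]


rules = RangeRules([("192.168.1.0", "192.168.1.255"),
                    ("10.0.0.0", "10.0.0.255")])
print(rules.blocks("10.0.0.5"), rules.blocks("10.0.1.0"))
print(filter_with_new_rule(["10.0.0.5", "172.16.0.1"],
                           "10.0.0.0", "10.0.0.255"))
```

The second function illustrates the "evaluate only what changed" idea: adding one rule costs one pass over the results instead of a pass per existing rule.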

Let me also tell you this: the developers are not incompetent. I've seen the source code, and it's not something that looks like it was coded by incompetent people. I know how things usually go: they've kept it simple, it works fine for the easy cases.

Actually I might take a look at the source again. I don't believe it's something that can't be solved within an hour of coding.
branko-r
 
Posts: 44
Joined: 08 Jul 2009 09:47
Location: Zagreb, Croatia

Re: Regexp filter doesn't work.

Post by old_death » 14 Sep 2010 21:45

Hey, I think many people (including me) would thank you if you were to submit a patch for this problem.
old_death
 
Posts: 1950
Joined: 13 Jun 2009 16:19

