Optimize search result persistence


Optimize search result persistence

Postby alvato » 18 Oct 2009 21:11

Hi,

here comes a low-priority optimization suggestion for Shareaza 2.4.0.0:
I keep some search tabs permanently open for my favorite music bands (which have released their music into the public domain).
When I update the search results (perform the search again without clearing the previous results), Shareaza persists the search results very frequently (I don't know exactly when; it seems to be about 10 seconds after a new result is received).
That results in a high disk write volume, on the order of 1 gigabyte of write I/O in 10 minutes.

It would be better if Shareaza persisted less aggressively; that would lower hard disk noise and consume fewer resources.


Cheers
Alvato
alvato
 
Posts: 18
Joined: 30 Aug 2009 17:22

Re: Optimize search result persistence

Postby alvato » 21 Oct 2009 19:02

Also, Shareaza becomes unresponsive for 5 out of every 20 seconds when I perform a search without clearing the previous results and a lot of results are available (~10,000 in total across all searches).

Cheers
alvato
 
Posts: 18
Joined: 30 Aug 2009 17:22

Re: Optimize search result persistence

Postby dot45 » 30 Oct 2009 02:39

How much RAM is in your computer? What version of Windows are you running?
dot45
 
Posts: 6
Joined: 30 Oct 2009 02:22

Re: Optimize search result persistence

Postby ocexyz » 02 Nov 2009 01:13

Install 2.5.0.0 and let us know if this has helped.
User avatar
ocexyz
 
Posts: 624
Joined: 15 Jun 2009 13:09

Re: Optimize search result persistence

Postby alvato » 09 Nov 2009 17:08

Thanks,

Shareaza 2.5.0.0 behaves like this when persisting search results:
* The UI blocks much less (roughly 1/6th as long as Shareaza 2.4.0.0)
* The hard disk noise is still there (3 seconds of write noise every 15 seconds)
* Write I/O is still high (~1 GB written per 10 minutes with ~15,000 search results)


Cheers
Owe
alvato
 
Posts: 18
Joined: 30 Aug 2009 17:22

Re: Optimize search result persistence

Postby ocexyz » 09 Nov 2009 18:07

I suppose something other than Shareaza is making this noise every 15 s; it doesn't sound like Shareaza's behaviour to me. In any case, I would suggest running good antivirus software.
User avatar
ocexyz
 
Posts: 624
Joined: 15 Jun 2009 13:09

Re: Optimize search result persistence

Postby raspopov » 12 Nov 2009 04:32

15 000 ?!?! :lol: Please don't rape Shareaza!
User avatar
raspopov
Project Admin
 
Posts: 945
Joined: 13 Jun 2009 12:30

Re: Optimize search result persistence

Postby ivan386 » 12 Nov 2009 14:08

If you need all these results, you can save them as an HTML page.

In Advanced (Tabbed) mode:
1. Select all results.
2. Right-click.
3. Choose Copy URI.
4. Select: Magnet HTML.
5. Save.

Last edited by kevogod on 14 Nov 2009 00:42, edited 1 time in total.
Reason: Removed URL due to inappropriate content
ivan386
 
Posts: 261
Joined: 17 Jun 2009 14:08

Re: Optimize search result persistence

Postby ailurophobe » 12 Nov 2009 17:48

Sorry, but 10 or 15 thousand results really is pretty much "raping Shareaza." Last I checked (which admittedly was a few years ago), the search and the search UI used data structures that scale linearly with the number of results: more hits means slower performance in direct proportion. So with 15 thousand results, Shareaza would update the search results roughly a thousand times slower than with 15, and adding or updating a result should be expected to take several seconds when thousands of results are present.

Also, when you update an old search, then unless you have insane amounts of RAM or haven't done anything serious since the old search, it will involve a lot of paging, and therefore a lot of disk I/O. Of these, the first issue could be fixed but probably won't be, since searches that large put too heavy a load on the network; the paging/disk I/O issue probably can't be.

I suggest you try to figure out a way to use smaller searches. I first use the web to find out more about the specific content I am interested in, and then search for files with that content. In your case, you could add the song name and preferred file extension to the search terms.
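To make the linear-scaling point concrete, here is a hypothetical sketch (not Shareaza's actual code, which is C++): each incoming hit must be checked against every existing result before it is added, so a flat list costs O(n) per insert and O(n^2) overall, while an indexed structure (a set here, standing in for "some kind of binary tree") keeps the duplicate check cheap.

```python
# Hypothetical sketch, not Shareaza's actual code.

def insert_hit_list(hits, hit):
    """Linear-scan insert, like the linked list described above."""
    if hit in hits:            # scans every existing result: O(n)
        return False           # duplicate: drop it
    hits.append(hit)
    return True

def insert_hit_set(hits, hit):
    """Indexed insert: hashing makes the duplicate check O(1) on average."""
    if hit in hits:
        return False
    hits.add(hit)
    return True

results_list, results_set = [], set()
for h in ("a.ogg", "b.ogg", "a.ogg"):      # third hit is a duplicate
    insert_hit_list(results_list, h)
    insert_hit_set(results_set, h)
assert results_list == ["a.ogg", "b.ogg"]
assert results_set == {"a.ogg", "b.ogg"}
```

With 15,000 results, the linear scan runs about a thousand times more comparisons per new hit than with 15, which matches the slowdown described above.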
ailurophobe
 
Posts: 709
Joined: 11 Nov 2009 05:25

Re: Optimize search result persistence

Postby branko-r » 13 Nov 2009 23:46

branko-r
 
Posts: 44
Joined: 08 Jul 2009 09:47
Location: Zagreb, Croatia

Re: Optimize search result persistence

Postby ailurophobe » 14 Nov 2009 03:31

True, the serialization was not designed to handle that many hits. Basically, it saves the results to disk periodically if they have changed since the last save; this is what causes the periodic disk access. I doubt it is what causes the performance problem, though, since the write should go to the cache. It presumably blocks the search during the write, so it doesn't help, but it shouldn't hurt large searches proportionally much more than small ones. (Although 20% slower matters more when the baseline is already slow.) The RAM talk was referring to when you "search again": by then the search has probably been paged out, and there should be a perceptible one-time delay while it is paged back in. I did not mean that it causes the repeating disk access, although that is also possible if you are short on RAM.

"Raping" was not referring to disk access or memory use, just to the fact that handling the data structures becomes slower the bigger they are. Iterating through 10,000 items takes a hundred times longer than iterating through 100 items, that's all. Maybe the code has since been moved to some kind of binary tree, but it used to be a linked list, and that really must be iterated through for each hit.
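The periodic save-if-dirty behaviour described above can be sketched roughly like this (a hypothetical simplification in Python, not Shareaza's actual C++ code; the 15-second interval is taken from the observations in this thread):

```python
import time

class AutoSaver:
    """Periodic save-if-dirty: rewrite the whole results file, but only
    when something changed since the last save (hypothetical sketch)."""

    def __init__(self, path, interval=15.0):
        self.path = path
        self.interval = interval        # at most one save per interval
        self.dirty = False
        self.last_save = time.monotonic()
        self.results = []

    def add_result(self, hit):
        self.results.append(hit)
        self.dirty = True               # any change marks the list dirty

    def tick(self):
        """Called regularly from the main loop."""
        now = time.monotonic()
        if self.dirty and now - self.last_save >= self.interval:
            with open(self.path, "w") as f:
                # The whole list is rewritten every time, so write volume
                # scales with (total results) x (save frequency).
                f.write("\n".join(self.results))
            self.dirty = False
            self.last_save = now
            return True
        return False
```

Because every save rewrites the entire list, a session with ~15,000 open results saved every 15 seconds produces exactly the kind of sustained write volume alvato measured, even though no single save is slow.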
ailurophobe
 
Posts: 709
Joined: 11 Nov 2009 05:25

Re: Optimize search result persistence

Postby raspopov » 14 Nov 2009 07:54

Just don't browse such crazy hosts. The Internet has limited bandwidth; your computer has limited RAM, CPU, and disk performance. That's life: you cannot handle an infinite amount of information.
User avatar
raspopov
Project Admin
 
Posts: 945
Joined: 13 Jun 2009 12:30

Re: Optimize search result persistence

Postby branko-r » 14 Nov 2009 10:32

branko-r
 
Posts: 44
Joined: 08 Jul 2009 09:47
Location: Zagreb, Croatia

Re: Optimize search result persistence

Postby ailurophobe » 14 Nov 2009 15:53

I keep telling you, serialization is not the actual problem here. A search that large is slow even without a disk write. And the write does go to the cache; that is why, when the next write happens, the previous one is truncated and replaced. The reasons the UI freezes are that the search consumes large amounts of CPU cycles, and possibly that Shareaza is updating the visible list. (I seem to recall somebody optimized it to spend fewer cycles when the list is not visible... I could be wrong.)

Just noting that you are not the person I originally answered: they had a search problem, while you have a browse problem. The underlying problem, updating a long list, is the same, but I was talking about search, not browse. That said, browse has the exact same problem: every item Shareaza receives has to be inserted into the list (it can't simply be appended, because of possible duplicates, sorry), and this gets slower the more items you have. Changing to an indexed data structure would help a lot, but raspopov is actually correct: both large searches and large browses put an unreasonable load on the network, so asking for developer support for them is unlikely to work. Improvements here will probably come as part of general optimizations, not as specific fixes. Unless, of course, you find a developer willing to code it.

The serialization is something you have to discuss with raspopov. Basically, searches (and browses) used to be serialized only on shutdown and were lost every time Shareaza or Windows crashed. People got fed up with this, and raspopov coded a quick fix for it. I don't know why it saves every 15 seconds instead of on completion; he usually does have a reason, though.
ailurophobe
 
Posts: 709
Joined: 11 Nov 2009 05:25

Re: Optimize search result persistence

Postby branko-r » 14 Nov 2009 19:40

Large searches do put an unreasonable load on the Shareaza network, but, if I'm not mistaken, large browses don't, unless "network" here means just the computer that browses and the one being browsed. I know that the original poster was talking about search, not browse, but the root cause appears to be the same.

In my experience, the GUI freezes due to disk writes. The CPU load does not get anywhere near 100%.

Ultimately, the developers decide what (not) to do, but it has to be for the right reasons.
branko-r
 
Posts: 44
Joined: 08 Jul 2009 09:47
Location: Zagreb, Croatia

Re: Optimize search result persistence

Postby ailurophobe » 15 Nov 2009 01:50

You are forgetting that the network is composed of one-to-one connections. So while a single browse does not seem like a problem, all the browses combined do put a load on the entire network. Actually, I think browses might be throttled at the host to control this; certainly I don't remember anyone ever uploading their file list to me at full speed.

I'll concede that it is possible there is a problem with the disk I/O; all I can say for sure is that even without a disk I/O problem it would still be insanely slow for the other reasons I mentioned, so you don't need an I/O problem to explain the symptoms. But it is possible, and it certainly would make things even slower. That said, browse was always slow, even before this auto-save was added, and even with far fewer files, so it is very hard for me to tell whether this is the normal extreme slowness or an abnormal one. Fortunately, if you see a disk I/O problem, you can ask for it to be fixed even without knowing for sure what symptoms it actually causes; a problem is a problem. I doubt raspopov would find it difficult to add a setting to change the auto-save period or even turn it off.

Do you have a multi-core system? Only one thread at a time can work on the list, so a multi-core system can be totally CPU-bound with every core mostly idle.
ailurophobe
 
Posts: 709
Joined: 11 Nov 2009 05:25

Re: Optimize search result persistence

Postby branko-r » 15 Nov 2009 13:48

I didn't say "I want my browse to finish sooner". It is probably throttled at the source, and it should be, since it's synchronous (no queuing, unlike file download requests). I said "I want it not to freeze my computer in the process". I also noticed that slightly larger browses (more than several thousand files) now usually stop at about the halfway point, probably because while there's a 15-second round of disk thrashing, the other node thinks I've given up.

I have a plain old single-core A64-3200. I don't have any significant performance complaints running Shareaza on this system except this one. In fact, it was my #1 gripe with 2.4.0.0, but with 2.5.0.0 it has fallen to #2 (which is a subject for a separate thread).
branko-r
 
Posts: 44
Joined: 08 Jul 2009 09:47
Location: Zagreb, Croatia

Re: Optimize search result persistence

Postby ailurophobe » 15 Nov 2009 20:10

I'd guess that is a locking problem. (I confess I misunderstood what your problem is.) While Shareaza is saving the data or adding items, nothing else behind the same lock(s) happens, and since saving or adding data takes longer on a long list, the system spends longer and longer waiting for anything else to happen. Some of these locks may be in Windows or in support DLLs, so other applications get affected by operations that shouldn't affect them, like disk writes. Other than asking a dev for a fix, my best suggestion at this point would be to check that Shareaza is not installed on a drive with write caching disabled. And don't get too set on the serialization being the culprit; UI operations can also freeze other applications in Windows.
ailurophobe
 
Posts: 709
Joined: 11 Nov 2009 05:25

Re: Optimize search result persistence

Postby ailurophobe » 16 Nov 2009 21:17

ailurophobe
 
Posts: 709
Joined: 11 Nov 2009 05:25

Re: Optimize search result persistence

Postby raspopov » 17 Nov 2009 05:05

branko-r, I think you need to check your HDD for errors (check SMART as well), update your HDD/motherboard drivers, and install all service packs for Windows. Normally, saving such a small amount of data, even every 15 seconds (in fact there is a 30-second delay in the code), cannot make a computer get stuck. 8-)
User avatar
raspopov
Project Admin
 
Posts: 945
Joined: 13 Jun 2009 12:30

Re: Optimize search result persistence

Postby ailurophobe » 17 Nov 2009 14:02

ailurophobe
 
Posts: 709
Joined: 11 Nov 2009 05:25

Re: Optimize search result persistence

Postby branko-r » 17 Nov 2009 19:37

My system is patched and SMART-monitored, and the drivers are as new as they get, since I'm using Windows 2000. That's why I unfortunately can't use ailurophobe's advice with fsutil. But still: I don't have disk performance problems with any of the other applications I'm running, so it doesn't appear to me that there's something wrong.

Turning on write caching has downsides too.

Raspopov, you have a point; this problem doesn't make sense to me either. But are you saying that you've actually tried to reproduce this with a c. 50-60 MB BrowseHosts.dat, and that everything worked fine?
branko-r
 
Posts: 44
Joined: 08 Jul 2009 09:47
Location: Zagreb, Croatia

Re: Optimize search result persistence

Postby ailurophobe » 17 Nov 2009 20:51

You already have write caching unless you are using an external drive configured for safe removal. You are right that fsutil is for XP and later, sorry. And like I said, I do not believe (just like both of you, apparently) that the actual data writes can cause this, because they should all go to the cache and shouldn't realistically affect other applications unless you are critically short of RAM. So the system bottlenecking on NTFS journaling, which is shared by all applications using the drive, is the only explanation I can see on the file system side. This might also be a GUI problem, but I think Shareaza no longer builds the UI structures if the relevant window is not displayed; raspopov would know.

Do you have a way to make Shareaza use a FAT-formatted (any version) volume for BrowseHosts.dat? I think there are tools for creating a new volume from the free space of an existing one without data loss, if you don't have a non-NTFS volume.
ailurophobe
 
Posts: 709
Joined: 11 Nov 2009 05:25

Re: Optimize search result persistence

Postby raspopov » 18 Nov 2009 05:16

A modern HDD can save a 50 MB file in less than a second. Check your system, man. :geek: Defragment the drive, remove viruses, etc.
User avatar
raspopov
Project Admin
 
Posts: 945
Joined: 13 Jun 2009 12:30

Re: Optimize search result persistence

Postby ailurophobe » 18 Nov 2009 17:34

Speaking of viruses... Your virus scanner might be causing this if it traps all file modifications.
ailurophobe
 
Posts: 709
Joined: 11 Nov 2009 05:25

Re: Optimize search result persistence

Postby branko-r » 19 Nov 2009 17:09

True, a modern HD can easily save 50 MB in a couple of seconds. But a lot of small writes...

Actually, I've just downloaded the source and taken a look. I must admit I suspected that Shareaza was doing something like 10,000 file appends, which would indeed cause disk thrashing and awful performance, but that would be stupid. I could not find anything suspect in the way the classes are serialized; on the contrary, the code is exemplary (and also very neat, BTW). But what about the stream-level buffer size, i.e. how often is data flushed to disk? I couldn't tell from the source. Can this affect performance? IIRC, BrowseHosts.dat gets highly fragmented.

I have a "disk LED" system tray icon, and it's red through and through. No disk reads, only disk writes. That does not look like the antivirus. No problems with other applications whatsoever.
branko-r
 
Posts: 44
Joined: 08 Jul 2009 09:47
Location: Zagreb, Croatia

Re: Optimize search result persistence

Postby ailurophobe » 19 Nov 2009 17:53

The serialization is apparently still done by the MFC CArchive class, so looking at the Shareaza code won't help much. Try using Process Monitor; it should be able to spot what is causing your problem. It should run on Win2k SP4; the references to XP SP2 are probably about the live version.
ailurophobe
 
Posts: 709
Joined: 11 Nov 2009 05:25

Re: Optimize search result persistence

Postby ailurophobe » 19 Nov 2009 20:16

If it is the serialization, I think I found it. CArchive::CArchive actually has two optional parameters, the first of which is "int nBufSize = 4096", explained as: "An integer that specifies the size of the internal file buffer, in bytes. Note that the default buffer size is 4,096 bytes. If you routinely archive large objects, you will improve performance if you use a larger buffer size that is a multiple of the file buffer size." Several times a minute certainly qualifies as "routinely", and 50 MB qualifies as large in comparison to 4 kB. I have no idea whether this is actually related to the problem branko-r has, but CArchive rather obviously expects you to set this buffer to a larger size when doing things like auto-saving very large lists, and will give better performance for it.

Could somebody go through the code and check that the CArchives used to serialize structures that could potentially be very large have larger-than-default buffers? Even if it has no impact on the issues in this thread, leaving them at the default will have an impact on performance at startup and shutdown.
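nBufSize is MFC-specific, but the underlying effect is generic to any buffered stream: a bigger buffer turns many tiny writes into a few large ones. A rough Python analogue of the idea (illustrative only; the record size and count are made up):

```python
import io

class CountingRawFile(io.RawIOBase):
    """Fake raw file that counts how many low-level writes reach it."""
    def __init__(self):
        super().__init__()
        self.write_calls = 0
    def writable(self):
        return True
    def write(self, b):
        self.write_calls += 1
        return len(b)

def serialize(buffer_size, record_count=10_000, record=b"x" * 64):
    """Push many small records through a buffer of the given size and
    return how many underlying writes were issued."""
    raw = CountingRawFile()
    buffered = io.BufferedWriter(raw, buffer_size=buffer_size)
    for _ in range(record_count):
        buffered.write(record)       # each record is tiny (64 bytes)
    buffered.flush()
    return raw.write_calls

default_buf = serialize(4096)        # CArchive's documented default size
big_buf = serialize(256 * 1024)      # a 256 kB buffer
assert big_buf < default_buf         # larger buffer -> far fewer real writes
```

The same trade-off applies to CArchive: the only cost of a larger nBufSize is the buffer's memory, while the win is far fewer calls into the file system per save.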
ailurophobe
 
Posts: 709
Joined: 11 Nov 2009 05:25

Re: Optimize search result persistence

Postby branko-r » 20 Nov 2009 17:30

I know very little about MFC, but CArchive looks like a (light) wrapper around the plain old FILE. I looked at the online docs yesterday and came to the same conclusion: the default 4 kB buffer is used. I'd suggest 32 kB.

Unfortunately, Process Monitor gave me a blue screen. I tried the Win2k performance monitor, but the only thing I could see was an unusually large number of disk writes during serialization, which is more or less what I already knew.

Setting the CArchive buffer to at least 32 kB throughout the app wouldn't hurt.
branko-r
 
Posts: 44
Joined: 08 Jul 2009 09:47
Location: Zagreb, Croatia

Re: Optimize search result persistence

Postby ocexyz » 20 Nov 2009 21:25

User avatar
ocexyz
 
Posts: 624
Joined: 15 Jun 2009 13:09

Re: Optimize search result persistence

Postby ailurophobe » 20 Nov 2009 22:11

Testing this in debug builds should be perfectly safe, since this is a parameter developers are supposed to optimize on a per-serialization basis; the worst that can realistically happen is that you increase memory consumption by some kilobytes. I think the easiest way to test would be to add a new advanced setting that defaults to 0, and to give all CArchive calls (or just the potentially large ones, such as security, search, and browse) the extra parameter, set to 4096 multiplied by (the new setting plus one). That way it would default to behaving exactly as before, but branko-r and anyone else would be able to see what effect changing the buffer size has, if any, without needing to reboot. Good things to test would be large searches and browses and closing with a large security list; loading large security lists might also get faster. It might also be a good idea to check whether this affects running disk-intensive applications while doing the above.
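The proposed mapping from the setting to a buffer size is trivial (setting and function names here are hypothetical, not actual Shareaza code):

```python
# Hypothetical mapping from the proposed advanced setting to nBufSize.
# The "+ 1" means the default setting of 0 keeps MFC's stock 4096-byte
# buffer, so behaviour is unchanged unless the user opts in.

DEFAULT_NBUFSIZE = 4096  # CArchive's documented default

def archive_buffer_size(serialize_buffer_setting):
    return DEFAULT_NBUFSIZE * (serialize_buffer_setting + 1)

assert archive_buffer_size(0) == 4096    # default: exactly as before
assert archive_buffer_size(7) == 32768   # branko-r's suggested 32 kB
```

Keeping the result a multiple of 4096 also satisfies the MFC documentation's advice that the buffer be a multiple of the file buffer size.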
ailurophobe
 
Posts: 709
Joined: 11 Nov 2009 05:25

Re: Optimize search result persistence

Postby branko-r » 21 Nov 2009 10:06

No harm can be done by increasing the buffer size; there are only diminishing returns past some point.
branko-r
 
Posts: 44
Joined: 08 Jul 2009 09:47
Location: Zagreb, Croatia

Re: Optimize search result persistence

Postby raspopov » 21 Nov 2009 16:59

User avatar
raspopov
Project Admin
 
Posts: 945
Joined: 13 Jun 2009 12:30

Re: Optimize search result persistence

Postby branko-r » 21 Nov 2009 21:21

branko-r
 
Posts: 44
Joined: 08 Jul 2009 09:47
Location: Zagreb, Croatia

Re: Optimize search result persistence

Postby ailurophobe » 24 Nov 2009 19:14

I think older versions of Process Monitor supported Win2k, and the old link I followed said it is supported.
ailurophobe
 
Posts: 709
Joined: 11 Nov 2009 05:25

Re: Optimize search result persistence

Postby branko-r » 05 Dec 2009 20:27

Let me just note that the problem is gone in 2.5.1.0. Thanks once again!
branko-r
 
Posts: 44
Joined: 08 Jul 2009 09:47
Location: Zagreb, Croatia

