
Quadratic large-library slowdown can be improved

PostPosted: 30 Jan 2012 02:50
by cgreevey

Re: Quadratic large-library slowdown can be improved

PostPosted: 06 Feb 2012 18:37
by old_death
We're using SQL for our file database, so behavior like that would best be addressed to the developers of the SQL solution (sqlite) we're currently using. However, I really doubt the quadratic behavior you're suggesting... I have a quite big library myself, so when rehashing I should have encountered the same thing - especially since I did pay attention to the speed at which files were added at the time.

Regards,
Old

PS.: I've got no clue who's been removing your posts.

[EDIT]
PPS.: Concerning your other (removed) post: I agree that doing calculations in the same thread that handles the GUI is not a very good approach, but unless somebody is willing to do a tremendous amount of coding for Shareaza, we're stuck with what we inherited from the past, and there's no way around that. However, since you seem to know quite well what you're talking about, I invite you to have a look at our source code. Maybe you could contribute some optimizations to Shareaza yourself. (At least I for one think that investing the time in coding is better than using it up in forums ;) )

Re: Quadratic large-library slowdown can be improved

PostPosted: 07 Feb 2012 19:41
by ailurophobe
The problem is probably with the code that checks the library folders for new files and adds them to the list of files to be hashed. That is old legacy code. (Which is why adding new files blocks the main thread and impacts other applications.) The hashing itself and the library code were updated later and should be fine. Anyway, adding the files to the library in smaller lots works around it.
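
To illustrate what I mean by not blocking the main thread, here is a rough sketch (plain C++, not Shareaza's actual code; all class and function names are made up): the folder walk and hashing run on a worker thread, and the GUI thread only polls for finished results.

Code:
#include <atomic>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

struct HashedFile { std::string path; std::string sha1; };

class LibraryScanner {
public:
    // Worker thread: walks folders and hashes files without touching the GUI.
    void Start(std::vector<std::string> folders) {
        m_worker = std::thread([this, folders]() {
            for (const std::string& folder : folders) {
                for (const std::string& path : EnumerateFiles(folder)) {
                    if (m_stop) return;
                    HashedFile result{ path, HashFile(path) };
                    std::lock_guard<std::mutex> lock(m_lock);
                    m_done.push(result);
                }
            }
        });
    }

    // GUI thread polls this (e.g. on a timer); it never blocks on disk I/O.
    bool PollResult(HashedFile& out) {
        std::lock_guard<std::mutex> lock(m_lock);
        if (m_done.empty()) return false;
        out = m_done.front();
        m_done.pop();
        return true;
    }

    void Stop() {
        m_stop = true;
        if (m_worker.joinable()) m_worker.join();
    }

private:
    // Placeholders: the real directory walk and hashing routines go here.
    std::vector<std::string> EnumerateFiles(const std::string&) { return {}; }
    std::string HashFile(const std::string&) { return ""; }

    std::thread m_worker;
    std::mutex m_lock;
    std::queue<HashedFile> m_done;
    std::atomic<bool> m_stop{ false };
};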

@ryo:
If you are reading this, could you add code that suspends checking for new files when there are more than 200 unhashed files and restarts it when the number of unhashed files drops below 100? The UI should also show "100+ files unhashed" while that is the case instead of the actual number. That should be simpler than fixing the underlying issue. (The numbers are obviously just suggestions; I usually add 500 to 600 files at a time, so larger numbers do work.)

Obviously anyone with Visual Studio could code the above suggestion; it doesn't have to be ryo.
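
Roughly something like this (just a sketch, not Shareaza code; the function names are placeholders for whatever the library code actually calls):

Code:
#include <string>

const int SUSPEND_AT = 200;   // stop scanning for new files at this backlog
const int RESUME_AT  = 100;   // start scanning again once below this

static bool scanningSuspended = false;

// Placeholders for the existing routines that do the real work.
void CheckFoldersForNewFiles() {}
void UpdateStatusText(const std::string&) {}

// Called periodically with the current number of unhashed files.
void OnLibraryTick(int unhashedCount)
{
    if (!scanningSuspended && unhashedCount >= SUSPEND_AT)
        scanningSuspended = true;      // backlog too big, pause the folder scan
    else if (scanningSuspended && unhashedCount < RESUME_AT)
        scanningSuspended = false;     // backlog worked off, resume scanning

    if (!scanningSuspended)
        CheckFoldersForNewFiles();

    // Show "100+ files unhashed" instead of a large exact count.
    if (unhashedCount >= RESUME_AT)
        UpdateStatusText("100+ files unhashed");
    else
        UpdateStatusText(std::to_string(unhashedCount) + " files unhashed");
}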

EDIT: Checked to be sure, and sqlite can handle large row counts just fine. There is a potential issue with file caching, though. Sqlite databases are stored on disk, and if you do large numbers of file reads (which often happens when hashing), that can keep pushing the database out of the file cache and drop performance drastically compared to when the accesses hit the cache. This might look like the quadratic behaviour you describe...
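
If the cache-eviction theory is right, one cheap experiment is giving sqlite a bigger private page cache, so the library database depends less on the OS file cache. Just a sketch against the sqlite3 C API (the database filename here is made up; whoever tries this would set the pragma on the connection Shareaza actually opens):

Code:
#include <sqlite3.h>
#include <cstdio>

int main()
{
    sqlite3* db = nullptr;
    // Hypothetical path; the real library database lives wherever Shareaza keeps it.
    if (sqlite3_open("library.db", &db) != SQLITE_OK) {
        std::fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
        return 1;
    }
    // Ask sqlite to keep roughly 16 MB of pages in its own cache
    // (a negative value means size in KiB), so heavy hashing I/O is less
    // likely to push the database pages out of memory.
    sqlite3_exec(db, "PRAGMA cache_size = -16000;", nullptr, nullptr, nullptr);
    sqlite3_close(db);
    return 0;
}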