ED2K: certain files corrupted in transit EVERY time?
Posted: 23 Mar 2014 20:56
Lately I've been running into a problem where certain files are consistently corrupted in transit. By "corrupted in transit", I mean BOTH that the received file has a different hash from the source's copy AND that the received file will not play back successfully (it's totally unusable). The hash mismatch shows up in several ways: the downloaded file doesn't disappear from the search results when "files you have already" are filtered out, re-downloading it doesn't prompt that the file is already in the library, and the search result neither turns green nor turns brown if the downloaded file is subsequently deleted.
This seems to be happening under very specific (and somewhat peculiar) circumstances:
a) The download is via ED2K protocol.
b) The source has lowID.
c) The file name begins with the string "__ARESTRA__".
It's that last bit that's peculiar. I could see a protocol implementation bug being specific to one protocol, and I could see it being specific to push sources (or to non-push sources), but why would it be allergic to a specific filename pattern?
The files themselves are fairly diverse. I've found "ARESTRA" files matching several unrelated searches. All share the problem that they cannot be downloaded intact from ED2K lowID sources using Shareaza -- the received file is not bit-identical to the source's copy and will not play. Every single time. It seems to be the filename itself, not the specific source or the subject matter (the original file's contents), that sets this problem off. Indeed, I've sometimes seen what are presumably near-identical files that downloaded intact: the same name with minor variations (spaces changed to underscores, releaser prefixes/suffixes added, changed, or removed), similar file sizes, and presumably the same content with slight encoding quality differences. What makes the difference is not underscores, hyphens, the subject matter, the actual content, or anything else, just whether or not the name starts with "__ARESTRA__".
And I think I've had "ARESTRA" files download intact from non-ED2K sources in the past (sometimes with minor pre-existing corruption, but none in transit, i.e. the downloaded file matched the source's shared file and the search result disappeared or turned green, etc.)
So: Why does the combo of that filename pattern, ED2K, and a lowID source consistently result in the file being garbled over the wire? I'll emphasize that these days I pretty much never see a file arrive not bit-identical to the source's copy except in this one specific pattern of circumstances. The use of hashes to verify and redownload (parts of) files on every network has made "garbled over the wire" a phenomenon almost never seen any more in a completed file. Except, somehow, for "ARESTRA" files from lowID ED2K sources.
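For anyone who wants to help narrow this down: if you have both a known-good copy and a garbled copy of the same file on disk, something like the rough Python sketch below would show which parts of the file differ. It splits each file into 9,728,000-byte pieces (the ED2K part size) and compares a digest per piece. Note the assumptions: the function names are made up for illustration, and it uses SHA-1 as a stand-in for ED2K's MD4 part hashes (MD4 is often unavailable in Python's hashlib), so the digests are only good for comparing two local files, not for matching against ED2K hashes.

```python
import hashlib

# ED2K splits files into parts of 9,728,000 bytes.
CHUNK_SIZE = 9_728_000


def chunk_digests(path, chunk_size=CHUNK_SIZE):
    """Return one SHA-1 hex digest per fixed-size chunk of the file.

    SHA-1 is a stand-in here; real ED2K part hashes use MD4.
    """
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            digests.append(hashlib.sha1(block).hexdigest())
    return digests


def differing_chunks(path_a, path_b, chunk_size=CHUNK_SIZE):
    """Return the indices of chunks that differ between the two files.

    A chunk present in only one file (different lengths) counts as differing.
    """
    a = chunk_digests(path_a, chunk_size)
    b = chunk_digests(path_b, chunk_size)
    longest = max(len(a), len(b))
    return [
        i for i in range(longest)
        if i >= len(a) or i >= len(b) or a[i] != b[i]
    ]
```

Calling `differing_chunks("good_copy.avi", "garbled_copy.avi")` (hypothetical file names) returns a list of part indices. If every part differs, that would suggest the whole stream is being mangled; if only a few differ, that would suggest damage confined to particular parts.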