Developers.Hash

Hash

Documenting hashing in Shareaza 2.0 and camper's new code.

Getting started

Features of this part of the program
Setting up the projects for testing
List of files and what they do
what is the complete list of files?

Hashing data

The hash value types
what does each thing do?
How to hash data
Let's hash some data
Watching a file get hashed by the library
The Tiger and eDonkey2000 full file hashes
The HashData function

Text encoding and URIs

Getting the hashes as text
is there a way to get the missing strings?
Hash and GUID text and URI formats
output in each format
Testing hash to text and back again in camper's code
Hash to text and back again in the 2.0 code

Tree hashes

TigerTree documented on the Web
What TigerTree can do
what are all the sizes?
TigerTree on the test file
Watching Shareaza 2.0 do TigerTree
Verifying file blocks in Shareaza 2.0
TigerTree math
Math results
Documenting TigerTree
Inside TigerTree
do
Transmitting the middle of a tree makes no sense

Server and client sample

Figuring out how big the tree will be
Running test files through it
Invalid trees
Invalid blocks
The client shortens the tree height

GUIDs

GUIDs
GUID hunt in Shareaza 2.0
GUID hunt in camper's code
Copying GUID values into camper's GUID type
Playing with GUIDs in camper's code
Special GUID requirements

Design

How file fragments are handled
A hashing library
Refactoring hashing
Buffers of different sizes
Refactoring CBuffer
Renaming types
An easy to use interface for hashing
Zootella test project
First refactor
Zootella.WithoutCCompress

Notes

doing right now

write little sample functions that demonstrate how to do hashing and guid jobs in the 2.0 and camper code
isolate the 2.0 code in the zootella test project and refactor it
refactor czlib and cbuffer, then create the socket with features class
comment the alpha code everywhere hashes and guids are used

find where Shareaza 2.0 gets a fragment, and checks if it's valid based on a tigertree hash
search google more to find out how tigertree works

copy all questions up to a single list here
write the 2.0 text encoding and uri header documentation

in camper's code, find out what else is next to where md4 is located, find the second list
figure out how to use the other types in camper's code, like

undiscovered features in camper's code

use the managed hash types
use tiger and ed2k to check in file fragments
use the 3 bittorrent hash types
use the 2 guid types

see where his code uses them

ideas for camper's hashing code

code

commented and documented code snippets for each use
refactor buffer and add encoding
assembly switch
camper library

comments

documentation comments on every line
method-by-method documentation in the wiki

documentation

how hashing works
hashing and peer-to-peer networks
a policy-based interface to hashing
guids and peer-to-peer networks
guids in shareaza

Developers.Hash.Notes2

write code snippets that

get the size of each hash
compute a hash each way
convert between binary and text encoding in each base
make a guid
do something wrong, and catch an exception
check parts into a file with tigertree
check parts into a file the edonkey2000 way
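A rough sketch of what the first few snippets could look like, written in Python for brevity rather than against the 2.0 or camper code. It covers only SHA-1 and MD5 (the hashes hashlib always provides) and a random 16-byte GUID. All function names are hypothetical. Padding is stripped from the base32 text because a 20-byte SHA-1 value fits exactly 32 characters, while other digest lengths would otherwise end in "=".

```python
import base64
import hashlib
import uuid


def digest_sizes() -> dict:
    """Size in bytes of each value type covered by this sketch."""
    return {
        "sha1": hashlib.sha1().digest_size,
        "md5": hashlib.md5().digest_size,
        "guid": len(uuid.uuid4().bytes),
    }


def to_base16(value: bytes) -> str:
    return value.hex()


def from_base16(text: str) -> bytes:
    return bytes.fromhex(text)


def to_base32(value: bytes) -> str:
    """Encode as base32 text without trailing '=' padding."""
    return base64.b32encode(value).decode("ascii").rstrip("=")


def from_base32(text: str) -> bytes:
    """Decode unpadded base32 text by restoring the padding first."""
    padding = "=" * (-len(text) % 8)
    return base64.b32decode(text.upper() + padding)


def try_from_base32(text: str):
    """Do something wrong and catch it: return None for text that is not base32."""
    try:
        return from_base32(text)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None
```

Usage: `to_base32(hashlib.sha1(data).digest())` gives the 32-character text form, and `from_base32` turns it back into the 20 raw bytes.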

new fundamental data types the code introduces, like the hash values and guid, and their size
guids are included also, how to use them for uses outside hashes, how to encode and decode them
integrate your base functions, eliminate code duplication, allow arbitrary length, simple function call
each block of assembly code next to c++ code, and a global switch to choose between them

Where the code that actually computes each kind of hash is located

Methods for text encoding and URI prefixes

Hashing and eDonkey2000

Hashing and BitTorrent

each of these talks about

web research, how hashing works
how peer to peer networks use these hashes
how Shareaza uses these hashes
where and how the shareaza 2.0 code does this
where camper's new code does the same thing
how this could be refactored

how hashing works, how tigertree works, hashing and the peer-to-peer networks, separate from shareaza and this code
design document, "a policy-based interface to hashing using generic programming", the invention of the design, why this is such a cool design for hashing
little code snippets that demonstrate each thing the hashing code does, their comments, and their documentation
comments on every line of code, and method-by-method documentation in the wiki

This is what I am trying to do right now with hashing:

Comment and document the hashing interface that the rest of Shareaza uses
Bring back the C++ 2.0 code to put in the bUseAssemblyBlocks global switch
Write a few lines of code that take some files, and compute their hashes
Make a little sample app that just contains the hashing code

do next

compile the 2.0 and cvs code side by side, compute hashes of files, see if they turn out the same
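The side-by-side comparison could be driven by a small harness like the sketch below. It is only an illustration in Python: the two functions stand in for "the 2.0 code" and "the cvs code" and are really just two ways of computing SHA-1 with hashlib (one-shot and chunked). All names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha1_one_shot(path) -> str:
    """Read the whole file into memory and hash it in one call."""
    return hashlib.sha1(Path(path).read_bytes()).hexdigest()


def sha1_chunked(path, chunk_size: int = 64 * 1024) -> str:
    """Feed the file to the hash object piece by piece."""
    hasher = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_mismatches(paths, impl_a, impl_b) -> list:
    """Return the paths for which the two implementations disagree."""
    return [str(p) for p in paths if impl_a(p) != impl_b(p)]
```

An empty list from `find_mismatches(files, sha1_one_shot, sha1_chunked)` means the two implementations agree on every test file.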

questions

is SHA the same thing as SHA1?
have I found the right files in the 2.0 code?

accomplish

little documented sample code pieces that
-hash data the different ways
-do the file-based ed2k and tiger hashes
-do whatever camper's template code does

comment and document camper's code before it even hits the head branch

organized by subject, not by file, it would be

where the code that actually computes a hash is
the hash value types
the objects that compute hashes
text encoding and uri header methods
ed2k
tigertree
bittorrent

These are data types. How many different data types are there? Are these all of them? Where is MD4? Is that just Ed2K? What's the difference between a hash and a managed hash?
