Hash
Documenting hashing in Shareaza 2.0 and camper's new code.
Getting started
- Features of this part of the program
- Setting up the projects for testing
- List of files and what they do (what is the complete list of files?)
Hashing data
- The hash value types (what does each thing do?)
- How to hash data
- Let's hash some data
- Watching a file get hashed by the library
- The Tiger and eDonkey2000 full file hashes
- The HashData function
Text encoding and URIs
- Getting the hashes as text (is there a way to get the missing strings?)
- Hash and GUID text and URI formats (output in each format)
- Testing hash to text and back again in camper's code
- Hash to text and back again in the 2.0 code
Tree hashes
- TigerTree documented on the Web
- What TigerTree can do (what are all the sizes?)
- TigerTree on the test file
- Watching Shareaza 2.0 do TigerTree
- Verifying file blocks in Shareaza 2.0
- TigerTree math
- Math results
- Documenting TigerTree
- Inside TigerTree (do)
- Transmitting the middle of a tree makes no sense
Server and client sample
- Figuring out how big the tree will be
- Running test files through it
- Invalid trees
- Invalid blocks
- The client shortens the tree height
GUIDs
- GUIDs
- GUID hunt in Shareaza 2.0
- GUID hunt in camper's code
- Copying GUID values into camper's GUID type
- Playing with GUIDs in camper's code
- Special GUID requirements
Design
- How file fragments are handled
- A hashing library
- Refactoring hashing
- Buffers of different sizes
- Refactoring CBuffer
- Renaming types
- An easy to use interface for hashing
- Zootella test project
- First refactor
- Zootella.WithoutCCompress
Notes
doing right now
- write little sample functions that demonstrate how to do hashing and guid jobs in the 2.0 and camper code
- isolate the 2.0 code in the zootella test project and refactor it
- refactor czlib and cbuffer, then create the socket with features class
- comment the alpha code everywhere hashes and guids are used
- find where shareaza 2.0 gets a fragment and checks if it's valid based on a tigertree hash
- search google more to find out how tigertree works
- copy all questions up to a single list here
- write the 2.0 text encoding and uri header documentation
- in camper's code, find out what else is next to where md4 is located, find the second list
- figure out how to use the other types in camper's code, like
undiscovered features in camper's code
- use the managed hash types
- use tiger and ed2k to check in file fragments
- use the 3 bittorrent hash types
- use the 2 guid types
- see where his code uses them
ideas for camper's hashing code
code
- commented and documented code snippets for each use
- refactor buffer and add encoding
- assembly switch
- camper library
comments
- documentation comments on every line
- method-by-method documentation in the wiki
documentation
- how hashing works
- hashing and peer-to-peer networks
- a policy-based interface to hashing
- guids and peer-to-peer networks
- guids in shareaza
write code snippets that
- get the size of each hash
- compute a hash each way
- convert between binary and text encoding in each base
- make a guid
- do something wrong, and catch an exception
- check parts into a file with tigertree
- check parts into a file the edonkey2000 way
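As a starting point for the first few snippets on this list, here is a minimal sketch in Python rather than in the Shareaza C++ code. It shows the byte size of each hash type, base16 and base32 text encoding and decoding, a `urn:sha1:` string in the base32 form commonly used on Gnutella, making a GUID, and catching an exception on bad input. Every function name here (`to_base32`, `parse_hash`, `new_guid` and so on) is made up for illustration and is not from the 2.0 code or camper's code. MD4 and Tiger appear only as sizes because Python's `hashlib` does not reliably provide them.

```python
import base64
import hashlib
import uuid

# Digest sizes in bytes for the hash types these notes mention.
# MD4 and Tiger are listed by size only; they are not computed here.
DIGEST_SIZES = {"sha1": 20, "md5": 16, "md4": 16, "tiger": 24, "guid": 16}


def to_base16(raw: bytes) -> str:
    """Encode raw hash bytes as lowercase hexadecimal text."""
    return raw.hex()


def from_base16(text: str) -> bytes:
    """Decode hexadecimal text; raises ValueError on bad input."""
    return bytes.fromhex(text)


def to_base32(raw: bytes) -> str:
    """Encode raw hash bytes as RFC 4648 base32 without '=' padding."""
    return base64.b32encode(raw).decode("ascii").rstrip("=")


def from_base32(text: str) -> bytes:
    """Decode unpadded base32 text; raises ValueError on bad input."""
    padded = text.upper() + "=" * (-len(text) % 8)
    return base64.b32decode(padded)


def sha1_urn(data: bytes) -> str:
    """Hash data with SHA-1 and format the result as a urn:sha1: string."""
    return "urn:sha1:" + to_base32(hashlib.sha1(data).digest())


def parse_hash(text: str, kind: str) -> bytes:
    """Decode base32 text and check it has the size expected for `kind`."""
    raw = from_base32(text)
    expected = DIGEST_SIZES[kind]
    if len(raw) != expected:
        raise ValueError(f"{kind} needs {expected} bytes, got {len(raw)}")
    return raw


def new_guid() -> bytes:
    """Make a random 16-byte GUID value."""
    return uuid.uuid4().bytes


if __name__ == "__main__":
    print(sha1_urn(b"some data"))
    print(to_base16(new_guid()))
    try:
        # Doing something wrong on purpose: a SHA-1 value is not a GUID.
        parse_hash(to_base32(hashlib.sha1(b"some data").digest()), "guid")
    except ValueError as error:
        print("caught:", error)
```

A 20-byte SHA-1 value becomes 32 base32 characters and a 24-byte Tiger value becomes 39, which is a quick sanity check when comparing against text produced by the 2.0 code.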
- new fundamental data types the code introduces, like the hash values and guid, and their size
- guids are included also, how to use them for uses outside hashes, how to encode and decode them
- integrate your base functions, eliminate code duplication, allow arbitrary length, simple function call
- each block of assembly code next to c++ code, and a global switch to choose between them
Where the code that actually computes each kind of hash is located
Methods for text encoding and URI prefixes
Hashing and eDonkey2000
Hashing and BitTorrent
each of these talks about
- web research, how hashing works
- how peer to peer networks use these hashes
- how shareaza uses these hashes
- where and how the shareaza 2.0 code does this
- where camper's new code does the same thing
- how this could be refactored
- how hashing works, how tigertree works, hashing and the peer-to-peer networks, separate from shareaza and this code
- design document, "a policy-based interface to hashing using generic programming", the invention of the design, why this is such a cool design for hashing
- little code snippets that demonstrate each thing the hashing code does, their comments, and their documentation
- comments on every line of code, and method-by-method documentation in the wiki
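For the "how tigertree works" item, the sketch below follows the tree layout in the published THEX (Tree Hash EXchange) description: the file is cut into 1024-byte blocks, each block is hashed with a `0x00` prefix, pairs of hashes are combined with a `0x01` prefix, and an unpaired hash moves up a level unchanged. It is not taken from the Shareaza 2.0 code or camper's code, and all function names are hypothetical. Real TigerTree uses the Tiger hash; Python's `hashlib` does not provide Tiger, so SHA-1 stands in here and the resulting values will not match real TigerTree roots.

```python
import hashlib
from typing import Callable, List

BLOCK_SIZE = 1024  # leaf block size in bytes, per the THEX description

HashFn = Callable[[bytes], bytes]


def stand_in_hash(data: bytes) -> bytes:
    """SHA-1 as a stand-in only; real TigerTree uses the Tiger hash."""
    return hashlib.sha1(data).digest()


def leaf_hashes(data: bytes, hash_fn: HashFn = stand_in_hash) -> List[bytes]:
    """Hash each 1024-byte block with a 0x00 prefix (one empty block if no data)."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    return [hash_fn(b"\x00" + block) for block in (blocks or [b""])]


def parent_level(level: List[bytes], hash_fn: HashFn = stand_in_hash) -> List[bytes]:
    """Combine pairs with a 0x01 prefix; an unpaired last hash is promoted as-is."""
    parents = []
    for i in range(0, len(level), 2):
        if i + 1 < len(level):
            parents.append(hash_fn(b"\x01" + level[i] + level[i + 1]))
        else:
            parents.append(level[i])
    return parents


def build_tree(data: bytes, hash_fn: HashFn = stand_in_hash) -> List[List[bytes]]:
    """Return every level of the tree, leaves first and the root level last."""
    levels = [leaf_hashes(data, hash_fn)]
    while len(levels[-1]) > 1:
        levels.append(parent_level(levels[-1], hash_fn))
    return levels


def root_hash(data: bytes, hash_fn: HashFn = stand_in_hash) -> bytes:
    """The single hash at the top of the tree."""
    return build_tree(data, hash_fn)[-1][0]


def tree_height(size: int) -> int:
    """Number of levels, counting the leaf level, for a file of `size` bytes."""
    nodes = max(1, (size + BLOCK_SIZE - 1) // BLOCK_SIZE)
    height = 1
    while nodes > 1:
        nodes = (nodes + 1) // 2
        height += 1
    return height


def verify_block(levels: List[List[bytes]], index: int, block: bytes,
                 hash_fn: HashFn = stand_in_hash) -> bool:
    """Check one received 1024-byte block against its stored leaf hash."""
    return hash_fn(b"\x00" + block) == levels[0][index]
```

Passing a Tiger implementation as `hash_fn` is the only change this layout should need to produce real values, though that is an assumption to test against hashes computed by the 2.0 code.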
This is what I am trying to do right now with hashing:
- Comment and document the hashing interface that the rest of Shareaza uses
- Bring back the C++ 2.0 code to put in the bUseAssemblyBlocks global switch
- Write a few lines of code that take some files, and compute their hashes
- Make a little sample app that just contains the hashing code
do next
compile the 2.0 and cvs code side by side, compute hashes of files, see if they turn out the same
questions
- is SHA the same thing as SHA1?
- have I found the right files in the 2.0 code?
accomplish little documented sample code pieces that
- hash data the different ways
- do the file-based ed2k and tiger hashes
- do whatever camper's template code does
comment and document camper's code before it even hits the head branch
organized by subject, not by file, it would be
- where the code that actually computes a hash is
- the hash value types
- the objects that compute hashes
- text encoding and uri header methods
- ed2k
- tigertree
- bittorrent
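For the bittorrent subject, a small sketch of the per-piece check may help frame what the Shareaza code has to do. In a torrent, the `pieces` value is the SHA-1 hash of each fixed-length piece, 20 bytes each, joined end to end, and a downloaded piece is accepted only if its SHA-1 matches the entry at its index. The function names below are hypothetical and are not from the 2.0 code or camper's code. The eDonkey2000 check is understood to have a similar per-block shape using MD4, but that is not shown here.

```python
import hashlib

SHA1_SIZE = 20  # bytes per SHA-1 digest


def piece_hashes(data: bytes, piece_length: int) -> bytes:
    """Concatenate the SHA-1 digest of every piece_length-sized piece of data."""
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    out = bytearray()
    for start in range(0, len(data), piece_length):
        out += hashlib.sha1(data[start:start + piece_length]).digest()
    return bytes(out)


def check_piece(pieces: bytes, index: int, piece: bytes) -> bool:
    """Return True if `piece` matches the digest stored for piece `index`."""
    if index < 0:
        raise IndexError("piece index must not be negative")
    start = index * SHA1_SIZE
    expected = pieces[start:start + SHA1_SIZE]
    if len(expected) != SHA1_SIZE:
        raise IndexError(f"no hash stored for piece {index}")
    return hashlib.sha1(piece).digest() == expected
```

The last piece is usually shorter than the rest, so the check hashes whatever bytes the piece actually holds rather than padding it.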
These are data types. How many different data types are there? Are these all of them? Where is MD4? Is that just Ed2K? What's the difference between a hash and a managed hash?