Developers.Hash

From Shareaza Wiki
Revision as of 02:33, 1 December 2009 by Cyko 01 (talk | contribs) (removed zootella links)

Hash

Documenting hashing in Shareaza 2.0 and camper's new code.

Getting started

  • Features of this part of the program
  • Setting up the projects for testing
  • List of files and what they do (what is the complete list of files?)

Hashing data

  • The hash value types (what does each thing do?)
  • How to hash data
  • Let's hash some data
  • Watching a file get hashed by the library
  • The Tiger and eDonkey2000 full file hashes
  • The HashData function

Text encoding and URIs

  • Getting the hashes as text (is there a way to get the missing strings?)
  • Hash and GUID text and URI formats (output in each format)
  • Testing hash to text and back again in camper's code
  • Hash to text and back again in the 2.0 code

Tree hashes

  • TigerTree documented on the Web
  • What TigerTree can do (what are all the sizes?)
  • TigerTree on the test file
  • Watching Shareaza 2.0 do TigerTree
  • Verifying file blocks in Shareaza 2.0
  • TigerTree math
  • Math results
  • Documenting TigerTree
  • Inside TigerTree (do)
  • Transmitting the middle of a tree makes no sense

Server and client sample

  • Figuring out how big the tree will be
  • Running test files through it
  • Invalid trees
  • Invalid blocks
  • The client shortens the tree height

GUIDs

  • GUIDs
  • GUID hunt in Shareaza 2.0
  • GUID hunt in camper's code
  • Copying GUID values into camper's GUID type
  • Playing with GUIDs in camper's code
  • Special GUID requirements

Design

  • How file fragments are handled
  • Refactoring hashing
  • Buffers of different sizes
  • Refactoring CBuffer
  • Renaming types
  • An easy to use interface for hashing
  • Zootella test project
  • First refactor

Notes

doing right now

  • write little sample functions that demonstrate how to do hashing and guid jobs in the 2.0 and camper code
  • isolate the 2.0 code in the zootella test project and refactor it
  • refactor czlib and cbuffer, then create the socket with features class
  • comment the alpha code everywhere hashes and guids are used

  • find where shareaza 2.0 gets a fragment and checks whether it's valid based on a tigertree hash
  • search google more to find out how tigertree works
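As a starting point for the tigertree research, and for the "Figuring out how big the tree will be" and "The client shortens the tree height" pages listed above, here is a minimal sketch of the tree arithmetic. It assumes the 1024-byte leaf size from the THEX / TigerTree specification and assumes each level holds half as many nodes as the level below it, rounded up. The function names are made up for this example and do not come from the Shareaza 2.0 source or from camper's code. "Height" here counts levels including the root.

```cpp
#include <cassert>
#include <cstdint>

// Leaf size used by the THEX / TigerTree specification.
constexpr std::uint64_t kLeafSize = 1024;

// Number of 1024-byte leaves needed to cover the file.
// An empty file is treated as a single empty leaf.
std::uint64_t leafCount(std::uint64_t fileSize) {
    if (fileSize == 0) return 1;
    return fileSize / kLeafSize + (fileSize % kLeafSize != 0 ? 1 : 0);
}

// Number of levels in the full tree, counting the leaf level and the root.
unsigned treeHeight(std::uint64_t fileSize) {
    unsigned height = 1;
    for (std::uint64_t nodes = leafCount(fileSize); nodes > 1; nodes = (nodes + 1) / 2) {
        ++height;
    }
    return height;
}

// Total number of hash values in the full tree, summed over all levels.
std::uint64_t nodeCount(std::uint64_t fileSize) {
    std::uint64_t total = 0;
    std::uint64_t nodes = leafCount(fileSize);
    for (;;) {
        total += nodes;
        if (nodes == 1) break;
        nodes = (nodes + 1) / 2;
    }
    return total;
}

// If only the top storedHeight levels are kept (a shortened tree), each node
// on the lowest kept level covers this many bytes of the file.
std::uint64_t blockSizeAtHeight(std::uint64_t fileSize, unsigned storedHeight) {
    assert(storedHeight >= 1);
    const unsigned full = treeHeight(fileSize);
    if (storedHeight >= full) return kLeafSize;
    return kLeafSize << (full - storedHeight);
}
```

For a 1 MiB file this gives 1024 leaves, a height of 11, and 2047 hash values. Keeping only the top 9 levels leaves 256 nodes on the lowest level, each covering 4096 bytes, which is the size of block a client could then verify on its own.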

  • copy all questions up to a single list here
  • write the 2.0 text encoding and uri header documentation

  • in camper's code, find out what else is next to where md4 is located, find the second list
  • figure out how to use the other types in camper's code, like

undiscovered features in camper's code

  • use the managed hash types
  • use tiger and ed2k to check in file fragments
  • use the 3 bittorrent hash types
  • use the 2 guid types

see where his code uses them


ideas for camper's hashing code

code

  • commented and documented code snippets for each use
  • refactor buffer and add encoding
  • assembly switch
  • camper library

comments

  • documentation comments on every line
  • method-by-method documentation in the wiki

documentation

  • how hashing works
  • hashing and peer-to-peer networks
  • a policy-based interface to hashing
  • guids and peer-to-peer networks
  • guids in shareaza

Developers.Hash.Notes2

write code snippets that

  • get the size of each hash
  • compute a hash each way
  • convert between binary and text encoding in each base
  • make a guid
  • do something wrong, and catch an exception
  • check parts into a file with tigertree
  • check parts into a file the edonkey2000 way
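A minimal sketch of three of these snippets: the size of each hash, conversion between binary and text in base 16 and base 32, and catching an exception on bad input. The type and function names are made up for illustration and are not the names used in Shareaza 2.0 or in camper's code. The digest sizes are those of the algorithms themselves (SHA-1 is 20 bytes, Tiger is 24 bytes, MD4 and MD5 are 16 bytes). The base 32 alphabet is the RFC 4648 one, written without padding.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical fixed-size hash value types, named for this example only.
using Sha1Hash  = std::array<std::uint8_t, 20>; // SHA-1 digest, 160 bits
using TigerHash = std::array<std::uint8_t, 24>; // Tiger digest, 192 bits
using Ed2kHash  = std::array<std::uint8_t, 16>; // MD4 digest, 128 bits
using Md5Hash   = std::array<std::uint8_t, 16>; // MD5 digest, 128 bits

// Binary to base 16 text, lowercase, two characters per byte.
std::string toBase16(const std::uint8_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    static const char digits[] = "0123456789abcdef";
    std::string text;
    text.reserve(size * 2);
    for (std::size_t i = 0; i < size; ++i) {
        text += digits[data[i] >> 4];
        text += digits[data[i] & 0x0F];
    }
    return text;
}

// Base 16 text to binary. Throws std::invalid_argument on malformed input.
std::vector<std::uint8_t> fromBase16(const std::string& text) {
    if (text.size() % 2 != 0) throw std::invalid_argument("odd number of hex digits");
    auto value = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        throw std::invalid_argument("not a hex digit");
    };
    std::vector<std::uint8_t> data;
    data.reserve(text.size() / 2);
    for (std::size_t i = 0; i < text.size(); i += 2) {
        const int high = value(text[i]);
        const int low = value(text[i + 1]);
        data.push_back(static_cast<std::uint8_t>((high << 4) | low));
    }
    return data;
}

// Binary to base 32 text (RFC 4648 alphabet, no '=' padding), 5 bits per character.
std::string toBase32(const std::uint8_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    static const char digits[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
    std::string text;
    std::uint32_t buffer = 0; // holds bits not yet written out
    int bits = 0;             // how many low bits of buffer are pending
    for (std::size_t i = 0; i < size; ++i) {
        buffer = (buffer << 8) | data[i];
        bits += 8;
        while (bits >= 5) {
            bits -= 5;
            text += digits[(buffer >> bits) & 0x1F];
        }
    }
    if (bits > 0) text += digits[(buffer << (5 - bits)) & 0x1F]; // zero-fill the last group
    return text;
}
```

With this encoding a 20-byte SHA-1 value becomes 32 base 32 characters and a 24-byte Tiger value becomes 39. Calling `fromBase16("xyz")` inside a `try` block and catching `std::invalid_argument` covers the "do something wrong" item.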

  • new fundamental data types the code introduces, like the hash values and guid, and their size
  • guids are included also, how to use them for uses outside hashes, how to encode and decode them
  • integrate your base functions, eliminate code duplication, allow arbitrary length, simple function call
  • each block of assembly code next to c++ code, and a global switch to choose between them


Where the code that actually computes each kind of hash is located

Methods for text encoding and URI prefixes

Hashing and eDonkey2000

Hashing and BitTorrent


each of these talks about

  • web research, how hashing works
  • how peer to peer networks use these hashes
  • how shareaza uses these hashes
  • where and how the shareaza 2.0 code does this
  • where camper's new code does the same thing
  • how this could be refactored


  • how hashing works, how tigertree works, hashing and the peer-to-peer networks, separate from shareaza and this code
  • design document, "a policy-based interface to hashing using generic programming", the invention of the design, why this is such a cool design for hashing
  • little code snippets that demonstrate each thing the hashing code does, their comments, and their documentation
  • comments on every line of code, and method-by-method documentation in the wiki


This is what I am trying to do right now with hashing:

  • Comment and document the hashing interface that the rest of Shareaza uses
  • Bring back the C++ 2.0 code to put in the bUseAssemblyBlocks global switch
  • Write a few lines of code that take some files, and compute their hashes
  • Make a little sample app that just contains the hashing code
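A rough sketch of how a global switch could choose between two versions of the same routine. Only the name bUseAssemblyBlocks comes from the notes above; how the real switch is declared and used is an assumption. Both paths here are plain C++ so the example stays portable: rotateLeftFast merely stands in for where an assembly block would go. A 32-bit rotate was picked because hash functions such as SHA-1 and MD4 use rotates heavily.

```cpp
#include <cassert>
#include <cstdint>

// Global switch, named after the one in the notes. Its real declaration is an assumption.
bool bUseAssemblyBlocks = false;

// Portable C++ reference path.
static std::uint32_t rotateLeftPortable(std::uint32_t value, unsigned count) {
    count &= 31;
    return count == 0 ? value : (value << count) | (value >> (32 - count));
}

// Stand-in for the assembly path (hypothetical; written in C++ here).
static std::uint32_t rotateLeftFast(std::uint32_t value, unsigned count) {
    count &= 31;
    return (value << count) | (value >> ((32 - count) & 31));
}

// The interface the rest of the code calls; the switch picks the path.
std::uint32_t rotateLeft(std::uint32_t value, unsigned count) {
    return bUseAssemblyBlocks ? rotateLeftFast(value, count)
                              : rotateLeftPortable(value, count);
}

// Runs both paths over every rotate count and checks that they agree,
// then puts the switch back the way it was.
void verifyPaths(std::uint32_t value) {
    const bool saved = bUseAssemblyBlocks;
    for (unsigned count = 0; count < 64; ++count) {
        bUseAssemblyBlocks = false;
        const std::uint32_t portable = rotateLeft(value, count);
        bUseAssemblyBlocks = true;
        const std::uint32_t fast = rotateLeft(value, count);
        assert(portable == fast);
        (void)portable; // keeps release builds (NDEBUG) free of unused warnings
        (void)fast;
    }
    bUseAssemblyBlocks = saved;
}
```

Keeping both paths behind one function means the same inputs can be run through each and compared, which is the same idea as compiling two versions side by side and checking that the hashes match.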


do next

  • compile the 2.0 and cvs code side by side, compute hashes of files, see if they turn out the same

questions

  • is SHA the same thing as SHA1?
  • have I found the right files in the 2.0 code?

accomplish

little documented sample code pieces that

  • hash data the different ways
  • do the file-based ed2k and tiger hashes
  • do whatever camper's template code does

comment and document camper's code before it even hits the head branch


organized by subject, not by file, it would be

  • where the code that actually computes a hash is
  • the hash value types
  • the objects that compute hashes
  • text encoding and uri header methods
  • ed2k
  • tigertree
  • bittorrent

These are data types. How many different data types are there? Are these all of them? Where is MD4? Is that just Ed2K? What's the difference between a hash and a managed hash?