Developers.Hash.TigerTree.Demo

From Shareaza Wiki

Verifying file blocks in Shareaza 2.0

I've got a sample working that shows how to verify file blocks with TigerTree in the Shareaza 2.0 code.

Here's the code: Developers.Hash.TigerTree.Demo.Code

Here are test files to run through it: [1]

Features

The function TestTiger opens the disk file, and copies its contents into a CBuffer object called file. It calls HashTiger, which computes and returns the TigerTree root hash and the entire tree. These steps simulate a server that has the file and is hashing it.

Back in TestTiger, the function creates a new CTigerTree object and gives it the entire tiger tree. A loop breaks the file into blocks, and calls BeginBlockTest, AddToTest, and FinishBlockTest on each to validate it. These steps simulate a client downloading file fragments and confirming that they are good.
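The block loop described above can be sketched roughly as follows. This is a standalone illustration, not the demo's actual code: BlockSpan and SplitIntoBlocks are hypothetical names, and the block size is simply passed in as a parameter. Each span would correspond to one round of BeginBlockTest, AddToTest, and FinishBlockTest.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type: one block of the file, as the loop would see it.
struct BlockSpan
{
    std::size_t number;  // Block number, the value FinishBlockTest would receive
    std::size_t offset;  // Byte offset of the block within the file
    std::size_t length;  // Bytes in this block; only the last one may be short
};

// Split a file of fileSize bytes into consecutive blocks of blockSize bytes.
// Returns an empty list if blockSize is zero.
std::vector<BlockSpan> SplitIntoBlocks(std::size_t fileSize, std::size_t blockSize)
{
    std::vector<BlockSpan> spans;
    if (blockSize == 0) return spans;

    std::size_t number = 0;
    for (std::size_t offset = 0; offset < fileSize; offset += blockSize)
    {
        std::size_t remaining = fileSize - offset;
        std::size_t length = remaining < blockSize ? remaining : blockSize;
        spans.push_back({number++, offset, length});
    }
    return spans;
}
```

For a 2500-byte file and a 1024-byte block size, this yields three spans, the last one covering the final 452 bytes.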

TestTiger simulates corrupted data to show how a test can fail. If the hash tree is bad, FromBytes will return FALSE. If a file block is bad, FinishBlockTest will return FALSE. Most of the time, though, the data will be good, and these methods will return TRUE.
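One simple way to simulate corruption is to flip a single bit in a copy of the data. The sketch below is an assumption about how that could be done, not necessarily how TestTiger does it, and CorruptCopy is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Return a copy of data with the lowest bit of one byte flipped.
// The original buffer is left untouched, so it can still be tested as good data.
// If index is out of range, the copy is returned unchanged.
std::vector<unsigned char> CorruptCopy(const std::vector<unsigned char>& data,
                                       std::size_t index)
{
    std::vector<unsigned char> bad = data;
    if (index < bad.size()) bad[index] ^= 0x01;
    return bad;
}
```

Handing such a copy to AddToTest in place of the real block should make FinishBlockTest return FALSE.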

Using methods on a CTigerTree object

Use BeginFile, AddToFile, and FinishFile to hash a file.

<source lang="c">
// Compute the TigerTree root hash for the file
CTigerTree hash;                    // The object that will compute the hash
hash.BeginFile(treeheight, bytes);  // Tell it the maximum tree height, and the file size
hash.AddToFile(p, bytes);           // Give it the data to hash
hash.FinishFile();                  // That's all
</source>

BeginFile takes the maximum tree height. This keeps it from making large trees on very large files. In Shareaza, the maximum tree height is 9. BeginFile also needs to know how big the file is going to be. Call AddToFile to feed the object the bytes of the file. You can do this with several calls to AddToFile even though this sample just shows one. When the whole file has been read in, call FinishFile.
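To see why a height cap keeps the tree small, here is a rough calculation. It assumes that height counts levels including the root, so a tree of height h has at most 2^(h-1) leaves, and that every node is a 24-byte Tiger hash. The first point is an assumption about how Shareaza counts height. MaxTreeNodes and MaxTreeBytes are hypothetical names.

```cpp
#include <cstdint>

// Most nodes a binary tree can hold if "height" counts levels,
// root included: 2^height - 1. Assumes 0 < height < 64.
std::uint64_t MaxTreeNodes(unsigned height)
{
    return (std::uint64_t(1) << height) - 1;
}

// Upper bound on the raw size of the tree, at 24 bytes per Tiger hash.
std::uint64_t MaxTreeBytes(unsigned height)
{
    return MaxTreeNodes(height) * 24;
}
```

Under these assumptions, a height of 9 allows at most 511 nodes, or 12,264 bytes of hashes, no matter how large the file is.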

After this, the CTigerTree object has calculated the root hash and the whole tree of hashes, and is holding them inside. Next, we'll get them out.

<source lang="c">
// Get the TigerTree root hash value
TIGEROOT value;              // The variable that will store the hash value
DWORD size = sizeof(value);  // 24 bytes
hash.GetRoot(&value);        // Get the hash value from the object

// Copy the whole tree into the given buffer
DWORD treesize = hash.GetSerialSize();  // The tree will be smaller than this
tree->EnsureBuffer(treesize);           // Make enough room
BOOL result = hash.ToBytes(&(tree->m_pBuffer), &treesize, 9);  // Writes how much it wrote in treesize
tree->m_nLength += treesize;            // Report how much it wrote
</source>

Make a TIGEROOT variable, and give it to GetRoot. A TigerTree root hash is only 24 bytes, so it's small enough to turn into base 16 text. The whole tree is larger. To hold it, we'll use a CBuffer object. Call GetSerialSize to find out how much space we need. GetSerialSize returns a number a little larger than is necessary. Then, call ToBytes to have the object give us the whole tree.

  • Is GetSerialSize really the right way to find out how much space ToBytes needs? Looking at it now, I think it returns a greater size because it's reporting the size that Serialize will write, which will include the tree and some more information, making it larger. Using it this way is a hack. I need to find the right way to do this.
  • This should be refactored to take a CBuffer object directly. It's annoying to have to call everything twice, first to find out how much space it needs, then to give it that much. CBuffer is a fundamental type in Shareaza, and should be integrated everywhere it is useful.
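As for turning the 24-byte root hash into base 16 text, a minimal sketch could look like this. ToHex is a hypothetical helper, not part of Shareaza, and the sketch assumes the TIGEROOT value can be read as 24 raw bytes.

```cpp
#include <cstddef>
#include <string>

// Encode a byte buffer as lowercase base 16 text, two characters per byte.
std::string ToHex(const unsigned char* bytes, std::size_t count)
{
    static const char digits[] = "0123456789abcdef";
    std::string text;
    text.reserve(count * 2);
    for (std::size_t i = 0; i < count; ++i)
    {
        text.push_back(digits[bytes[i] >> 4]);    // High four bits
        text.push_back(digits[bytes[i] & 0x0F]);  // Low four bits
    }
    return text;
}
```

A 24-byte root hash becomes 48 characters of text this way.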

Now, imagine we have a tiger tree, but not the file. We're downloading fragments and want to know if they're good. Here are the CTigerTree methods to use to do that.

<source lang="c">
// Make a new TigerTree object and give it the whole tree
CTigerTree tiger;  // This new TigerTree object doesn't know about the file yet
BOOL result;       // FromBytes will return true if the tiger tree we give it is valid
result = tiger.FromBytes(  // FromBytes takes a tiger tree from memory and loads it into the object
    tree.m_pBuffer,        // Pointer to the tree
    tree.m_nLength,        // Size of the tree, the number of bytes FromBytes should read there
    treeheight,            // Maximum tree height, Shareaza uses 9
    file.m_nLength);       // FromBytes also needs to know how big the file is
</source>

Make a new CTigerTree object, and give it the whole tree with the method FromBytes. This method also needs to know the maximum tree height used to create the tree, and the file size. With the tree loaded in our new CTigerTree object, we can start handing it file fragments and it will tell us if they are good.

<source lang="c">
// Test the block
tiger.BeginBlockTest();
tiger.AddToTest(block, blockfilled);         // The block to test
result = tiger.FinishBlockTest(blocknumber); // What block number that was
</source>

BeginBlockTest starts the process. AddToTest takes the file fragment to test. FinishBlockTest takes the number of the block that we just gave AddToTest. It returns TRUE if the block matches the tree and FALSE if it doesn't.

  • How does the object really know what block size we are using? What happens if you start the test by giving it the last block, which will have a blockfilled of less than the true block size, and a high blocknumber? It would make more sense if BeginBlockTest took the blocksize, but somehow it manages without that piece of information.
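One possible answer, offered as a guess rather than a reading of the Shareaza source: FromBytes is given both the maximum tree height and the file size, so the block size could be derived from those two values alone. The sketch below assumes 1024-byte base blocks, and that height counts levels including the root, so a height of 9 allows at most 256 leaves. LeafBlockSize and ExpectedBlockLength are hypothetical names.

```cpp
#include <cstdint>

// Smallest block size (1024 doubled as needed) for which the file fits
// in at most 2^(maxHeight - 1) leaves. Assumes 0 < maxHeight < 64.
std::uint64_t LeafBlockSize(std::uint64_t fileSize, unsigned maxHeight)
{
    const std::uint64_t maxLeaves = std::uint64_t(1) << (maxHeight - 1);
    std::uint64_t blockSize = 1024;
    for (;;)
    {
        // Number of blocks needed at this size, rounding up
        std::uint64_t blocks = fileSize / blockSize + (fileSize % blockSize ? 1 : 0);
        if (blocks <= maxLeaves) return blockSize;
        blockSize *= 2;
    }
}

// Number of bytes block blockNumber should contain, or 0 if it lies past the end.
std::uint64_t ExpectedBlockLength(std::uint64_t fileSize, std::uint64_t blockSize,
                                  std::uint64_t blockNumber)
{
    const std::uint64_t offset = blockNumber * blockSize;
    if (offset >= fileSize) return 0;
    const std::uint64_t remaining = fileSize - offset;
    return remaining < blockSize ? remaining : blockSize;
}
```

Under this reading, a short final block is simply whatever is left over, so the object would only need the block number to know which leaf hash to compare against and how many bytes to expect.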