EDKAich


Feature description

AICH (Advanced Intelligent Corruption Handling) hashes are based on SHA1.

Detailed feature description

Savannah tracker

http://savannah.nongnu.org/task/index.php?6396

Code snippets

opcodes.h

#define OP_AICHREQUEST			0x9B	// <HASH 16><uint16><HASH aichhashlen>
#define OP_AICHANSWER			0x9C	// <HASH 16><uint16><HASH aichhashlen> <data>
#define OP_AICHFILEHASHANS		0x9D	  
#define OP_AICHFILEHASHREQ		0x9E
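
The layout comment on OP_AICHREQUEST can be read as three consecutive fields: a 16-byte MD4 file hash, a uint16, and an AICH hash. The following is a minimal sketch of how such a payload might be assembled. It assumes that aichhashlen is 20 bytes (because AICH hashes are SHA1-based), that the uint16 is a part number, and that it is written little-endian; none of this is stated in opcodes.h itself. The function name BuildAichRequestPayload is hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// The MD4 file hash is 16 bytes; a SHA1-based AICH hash is 20 bytes.
constexpr std::size_t kMd4Len = 16;
constexpr std::size_t kAichHashLen = 20;

// Hypothetical helper: serialize <HASH 16><uint16><HASH aichhashlen>.
// Assumption: the uint16 is a part number, stored little-endian.
std::vector<std::uint8_t> BuildAichRequestPayload(
    const std::array<std::uint8_t, kMd4Len>& fileHash,
    std::uint16_t part,
    const std::array<std::uint8_t, kAichHashLen>& aichHash)
{
    std::vector<std::uint8_t> out;
    out.reserve(kMd4Len + 2 + kAichHashLen);
    out.insert(out.end(), fileHash.begin(), fileHash.end());
    out.push_back(static_cast<std::uint8_t>(part & 0xFF));  // low byte first
    out.push_back(static_cast<std::uint8_t>(part >> 8));    // then high byte
    out.insert(out.end(), aichHash.begin(), aichHash.end());
    return out;
}
```

Under these assumptions the payload is always 38 bytes long; the opcode byte and any protocol header would be added by the surrounding packet code.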

SHAHashSet.cpp

// for this version the limits are set very high, they might be lowered later
// to make a hash trustworthy, at least 10 unique IPs (netmask 255.255.128.0) must have sent it
// and if we have received more than one hash for the file, one hash has to be sent by at least 92% of all unique IPs
#define MINUNIQUEIPS_TOTRUST		10	// how many unique IPs must have sent us a hash to make it trustworthy
#define	MINPERCENTAGE_TOTRUST		92  // what percentage of clients must have sent the same hash to make it trustworthy
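
The two limits above describe a voting rule. The sketch below shows one way such a rule could be evaluated; it is not the SHAHashSet.cpp implementation. The HashVotes container, the function names, the use of hex strings as hash keys, and the choice of ">=" for both comparisons are assumptions made for illustration. IPs are masked with 255.255.128.0 so that senders from the same subnet count once.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Same values as the #defines above, typed to avoid signed/unsigned mixing.
constexpr std::size_t MINUNIQUEIPS_TOTRUST = 10;
constexpr std::size_t MINPERCENTAGE_TOTRUST = 92;

// Hypothetical container: candidate hash -> set of masked sender IPs.
using HashVotes = std::map<std::string, std::set<std::uint32_t>>;

// Mask an IPv4 address (host byte order) with 255.255.128.0.
inline std::uint32_t MaskIp(std::uint32_t ip) { return ip & 0xFFFF8000u; }

void AddVote(HashVotes& votes, const std::string& hash, std::uint32_t ip) {
    votes[hash].insert(MaskIp(ip));
}

// Returns the trusted hash, or an empty string if no hash qualifies.
// Simplification: an IP that votes for two different hashes is counted twice.
std::string TrustedHash(const HashVotes& votes) {
    std::size_t total = 0;
    std::size_t best = 0;
    std::string bestHash;
    for (const auto& entry : votes) {
        total += entry.second.size();
        if (entry.second.size() > best) {
            best = entry.second.size();
            bestHash = entry.first;
        }
    }
    if (best < MINUNIQUEIPS_TOTRUST) return "";               // too few senders
    if (100 * best / total < MINPERCENTAGE_TOTRUST) return ""; // no clear majority
    return bestHash;
}
```

With these limits, ten distinct subnets agreeing on one hash is enough, but a single dissenting subnet among eleven (about 90%) already blocks trust.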

SHAHashSet.h

/* 
 The SHA hashset basically consists of 1 tree for all parts (9.28MB) + n trees
 for all blocks (180KB), where n is the number of parts.
 This means it is NOT a complete hash tree, since the 9.28MB part size is a fixed level, in order
 to be able to create a hashset format similar to the MD4 one.

 If the number of elements for the next level is odd (for example 21 blocks to spread into 2 hashes),
 the majority of elements will go into the left branch if the parent node was a left branch
 and into the right branch if the parent node was a right branch. The first node is always
 taken as a left branch.

Example tree:
	FileSize: 19506000 Bytes = 18,6 MB

                          X (18,6)                                 MasterHash
                        /          \
                  X (18,55)          \
                 /         \           \
           X(9,28)        X(9,28)      X (0,05MB)                  PartHashs
          /      \        /      \        \
    X(4,75)  X(4,57)  X(4,57)  X(4,75)     \

                    [...............]
X(180KB)  X(180KB)  [...]  X(140KB) | X(180KB)  X(180KB)  [...]    BlockHashs
                                    v
                 Border between first and second Part (9.28MB)

Hash identifier:
When hashes are sent, each one carries a 16-bit identifier which specifies its position in the
tree (so StartPosition + HashDataSize would lead to the same hash).
The identifier basically describes the way from the top of the tree to the hash. A set bit (1)
means follow the left branch, a 0 means follow the right. The highest bit which is set is seen as the
start position (since the first node is always seen as left).

Example

                 x                          0000000000000001
               /   \
             x       \                      0000000000000011
           /   \       \
         x     _X_       x                  0000000000000110


Version 2 of AICH also supports 32-bit identifiers to support large files; see CAICHHashSet::CreatePartRecoveryData.


*/
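
The identifier scheme described in the header comment can be sketched as a pair of small helpers. This is an illustration of the 16-bit form only, not code from SHAHashSet.h: the names EncodePath and DecodePath and the 'L'/'R' path notation are made up for this example. The highest set bit marks the root; every bit below it is one step down the tree, 1 for left and 0 for right.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: turn a path such as "LR" (left, then right) into an
// identifier. The leading 1 is the root, which always counts as a left branch.
// Assumption: at most 15 steps, so the result fits into 16 bits.
std::uint16_t EncodePath(const std::string& path) {
    std::uint16_t id = 1;
    for (char step : path) {
        id = static_cast<std::uint16_t>((id << 1) | (step == 'L' ? 1 : 0));
    }
    return id;
}

// Hypothetical helper: recover the path from an identifier.
// An identifier of 0 has no start marker and yields an empty path here.
std::string DecodePath(std::uint16_t id) {
    std::string path;
    if (id == 0) return path;
    int top = 15;
    while (((id >> top) & 1) == 0) --top;  // locate the highest set bit (root)
    for (int bit = top - 1; bit >= 0; --bit) {
        path.push_back(((id >> bit) & 1) ? 'L' : 'R');
    }
    return path;
}
```

For the example tree in the comment, the marked node _X_ is reached by going left and then right from the root, which gives binary 110, matching the identifier 0000000000000110.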

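The rule for splitting an odd number of elements between two branches, as stated in the SHAHashSet.h comment, can be illustrated with a short helper. This is a sketch under the assumption that the split is counted in whole blocks; the name SplitBlocks is hypothetical and does not appear in the source.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: split `blocks` elements into (left, right) counts.
// When the count is odd, the extra element goes to the left child if the
// parent is itself a left branch, otherwise to the right child.
std::pair<std::uint64_t, std::uint64_t> SplitBlocks(std::uint64_t blocks,
                                                    bool parentIsLeft) {
    const std::uint64_t smaller = blocks / 2;
    const std::uint64_t larger = blocks - smaller;
    if (parentIsLeft) {
        return std::make_pair(larger, smaller);
    }
    return std::make_pair(smaller, larger);
}
```

For the 21 blocks mentioned in the comment, a left parent yields 11 blocks on the left and 10 on the right, while a right parent yields 10 and 11. An even count is split equally in both cases.
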
Some design thoughts

[16:40:40] <spiralvoice> pango: or create a new subdir in $MLDONKEY_DIR and keep ini files with aich hashes there named after the md4 hash
[16:40:52] <spiralvoice> pango: one ini file per md4 hash
[16:40:59] <pango> yes, that's a possibility
[16:41:32] <pango> not too efficient, but should at least work
[16:48:29] <pango> parsing, the packet computation, then garbage collection of the hashes will take some time and cpu... maybe we'll need to kind of throttling to avoid DoSes