EDKAich
Feature description
AICH (Advanced Intelligent Corruption Handling) hashes are based on SHA1.
Detailed feature description
Savannah tracker
http://savannah.nongnu.org/task/index.php?6396
Code snippets
opcodes.h
#define OP_AICHREQUEST 0x9B // <HASH 16><uint16><HASH aichhashlen>
#define OP_AICHANSWER 0x9C // <HASH 16><uint16><HASH aichhashlen> <data>
#define OP_AICHFILEHASHANS 0x9D
#define OP_AICHFILEHASHREQ 0x9E
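The layout comments above suggest that an OP_AICHREQUEST payload consists of a 16-byte file hash, a 16-bit value and an AICH hash. A minimal sketch of serializing such a payload could look as follows. The function name buildAichRequestPayload, the reading of the uint16 as a part number, the 20-byte (SHA1-sized) AICH hash length and the little-endian byte order are assumptions for illustration and are not taken from opcodes.h.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kMd4Len = 16;      // <HASH 16>
constexpr std::size_t kAichHashLen = 20; // assumed: SHA1 digest size

// Hypothetical helper: lays out <HASH 16><uint16><HASH aichhashlen>.
std::vector<std::uint8_t> buildAichRequestPayload(
    const std::array<std::uint8_t, kMd4Len>& fileHash,
    std::uint16_t part, // assumed meaning: part number
    const std::array<std::uint8_t, kAichHashLen>& aichHash)
{
    std::vector<std::uint8_t> out;
    out.reserve(kMd4Len + 2 + kAichHashLen);
    out.insert(out.end(), fileHash.begin(), fileHash.end());
    // Assumed little-endian encoding of the 16-bit field.
    out.push_back(static_cast<std::uint8_t>(part & 0xFF));
    out.push_back(static_cast<std::uint8_t>(part >> 8));
    out.insert(out.end(), aichHash.begin(), aichHash.end());
    assert(out.size() == kMd4Len + 2 + kAichHashLen);
    return out;
}
```

Under these assumptions the payload is always 38 bytes long; the opcode byte and any packet header would be added by the surrounding protocol code.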
SHAHashSet.cpp
// for this version the limits are set very high, they might be lowered later
// to make a hash trustworthy, at least 10 unique IPs (255.255.128.0) must have sent it
// and if we have received more than one hash for the file, one hash has to be sent by more than 95% of all unique IPs
#define MINUNIQUEIPS_TOTRUST 10 // how many unique IPs must have sent us a hash to make it trustworthy
#define MINPERCENTAGE_TOTRUST 92 // what percentage of clients must have sent the same hash to make it trustworthy
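The trust rule described in these comments can be sketched as a small vote counter. This is only an illustration: the class HashVotes, its methods, the choice to give each masked network a single vote overall, and the use of `>=` for both thresholds are assumptions, not the code from SHAHashSet.cpp. The percentage uses the value of the define (92), not the 95% mentioned in the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Values copied from the SHAHashSet.cpp snippet above.
constexpr std::size_t MINUNIQUEIPS_TOTRUST = 10;
constexpr std::size_t MINPERCENTAGE_TOTRUST = 92;

// 255.255.128.0 in host byte order: addresses in the same /17 count as one IP.
constexpr std::uint32_t kIpMask = 0xFFFF8000u;

class HashVotes { // hypothetical name
public:
    // Record that `ip` announced `hash`. Assumption: each /17 network gets
    // a single vote overall; later votes from the same network are ignored.
    void add(const std::string& hash, std::uint32_t ip) {
        if (seenNetworks_.insert(ip & kIpMask).second)
            ++votes_[hash];
    }

    // Trusted when enough unique networks sent this hash and they make up
    // a large enough percentage of all networks that sent any hash.
    bool isTrusted(const std::string& hash) const {
        const auto it = votes_.find(hash);
        if (it == votes_.end())
            return false;
        assert(!seenNetworks_.empty()); // a recorded vote implies a network
        const std::size_t n = it->second;
        return n >= MINUNIQUEIPS_TOTRUST
            && 100 * n / seenNetworks_.size() >= MINPERCENTAGE_TOTRUST;
    }

private:
    std::set<std::uint32_t> seenNetworks_;
    std::map<std::string, std::size_t> votes_;
};
```

With these thresholds, ten networks agreeing on one hash make it trusted, while a single dissenting eleventh network drops the share to 90% and removes the trust again.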
SHAHashSet.h
/*
The SHA hashset basically consists of 1 tree for all Parts (9.28MB) + n trees
for all blocks (180KB), where n is the number of Parts.
This means it is NOT a complete hash tree, since the 9.28MB is a given level, in order
to be able to create a hashset format similar to the MD4 one.
If the number of elements for the next level is odd (for example 21 blocks to spread into 2 hashes),
the majority of elements will go into the left branch if the parent node was a left branch
and into the right branch if the parent node was a right branch. The first node is always
taken as a left branch.
Example tree:
FileSize: 19506000 Bytes = 18.6 MB

                      X (18.6)                         MasterHash
                     /        \
               X (18.55)       \
               /       \        \
          X(9.28)     X(9.28)    X (0.05MB)            PartHashs
          /     \     /     \     \
     X(4.75) X(4.57) X(4.57) X(4.75)  \
                 [...............]
X(180KB) X(180KB) [...] X(140KB) | X(180KB) X(180KB) [...]   BlockHashs
                                 v
              Border between first and second Part (9.28MB)
HashsIdentifier:
When sending hashes, they are sent with a 16-bit identifier which specifies their position in the
tree (so StartPosition + HashDataSize would lead to the same hash).
The identifier basically describes the path from the top of the tree to the hash. A set bit (1)
means follow the left branch, a 0 means follow the right. The highest set bit is seen as the
start position (since the first node is always seen as left).
Example

              x                  0000000000000001
            /   \
          x      \               0000000000000011
        /   \     \
      x     _X_    x             0000000000000110
Version 2 of AICH also supports 32bit identifiers to support large files, check CAICHHashSet::CreatePartRecoveryData
*/
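The splitting rule and the identifier scheme from the comment above can be sketched together. The names Node, baseSize, split and root are hypothetical, and the byte sizes 9728000 (the "9.28MB" part) and 184320 (the 180KB block) are assumed from the sizes quoted in the comment. The 16-bit identifier limits this sketch to shallow trees; the comment notes that version 2 of AICH uses 32-bit identifiers for large files.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

constexpr std::uint64_t kPartSize  = 9728000; // assumed: the "9.28MB" part level
constexpr std::uint64_t kBlockSize = 184320;  // assumed: the 180KB block level

struct Node {
    std::uint64_t size; // bytes covered by this node
    bool isLeft;        // whether this node is a left branch
    std::uint16_t id;   // path identifier, highest set bit marks the root
};

// Above the part level nodes are split in whole parts, below it in whole blocks.
std::uint64_t baseSize(std::uint64_t size) {
    return size <= kPartSize ? kBlockSize : kPartSize;
}

// The first node is always taken as a left branch and gets identifier 1.
Node root(std::uint64_t fileSize) {
    return {fileSize, true, 1};
}

// Split a node into its two children. With an odd number of units the
// majority goes left under a left parent and right under a right parent.
std::pair<Node, Node> split(const Node& n) {
    const std::uint64_t base = baseSize(n.size);
    assert(n.size > base); // a node of one unit or less is a leaf
    const std::uint64_t units = (n.size + base - 1) / base;
    const std::uint64_t leftUnits = (n.isLeft ? units + 1 : units) / 2;
    const std::uint64_t leftSize = leftUnits * base;
    // Append 1 to the path for a left child, 0 for a right child.
    Node left{leftSize, true, static_cast<std::uint16_t>((n.id << 1) | 1)};
    Node right{n.size - leftSize, false, static_cast<std::uint16_t>(n.id << 1)};
    return {left, right};
}
```

For the example file of 19506000 bytes, split(root(19506000)) gives a left subtree of 19456000 bytes (two parts) and a right part of 50000 bytes, and the right child of that left subtree, the node marked _X_ in the second diagram, gets identifier 6 (binary 110).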
Some design thoughts
[16:40:40] <spiralvoice> pango: or create a new subdir in $MLDONKEY_DIR and keep ini files with aich hashes there named after the md4 hash
[16:40:52] <spiralvoice> pango: one ini file per md4 hash
[16:40:59] <pango> yes, that's a possibility
[16:41:32] <pango> not too efficient, but should at least work
[16:48:29] <pango> parsing, the packet computation, then garbage collection of the hashes will take some time and cpu... maybe we'll need to kind of throttling to avoid DoSes