Improved Open Source Backup:
Incorporating inline deduplication and sparse indexing solutions
G. P. E. Keeling
< Prev
Contents
Next >
Appendix C - Rabin fingerprinting algorithm in C
This is a reproduction of the C function containing the
rabin algorithm that was produced for the proof of concept programs.
I judge this as being the most important outcome from that exercise, although
it was modified slightly during later iterations in order to support slightly
different usage required in the final software where asyncronous i/o had to be
taken into account.
Note that the full code of the prototype can be found in the extra materials
submitted with this report.
static int blk_read(struct rconf *rconf, FILE *ofp, uint64_t multiplier,
char *buf, char *buf_end, struct win *win,
struct blk **blkbuf, int *b)
{
char c;
char *cp;
struct blk *blk;
for(cp=buf; cp!=buf_end; cp++)
{
blk=blkbuf[*b];
c=*cp;
blk->fingerprint = (blk->fingerprint * rconf->prime) + c;
win->checksum = (win->checksum * rconf->prime) + c
- (win->data[win->pos] * multiplier);
win->data[win->pos] = c;
win->pos++;
win->total_bytes++;
blk->data[blk->length++] = c;
if(win->pos == rconf->win) win->pos=0;
if( blk->length >= rconf->blk_min
&& (blk->length == rconf->blk_max
|| (win->checksum % rconf->blk_avg) == rconf->prime))
{
(*b)++;
if(*b<SIG_MAX) continue;
if(blks_output(rconf, ofp, blkbuf, b)) return -1;
}
}
return 0;
}
< Prev
Contents
Next >
|