Re: [Scsh-hackers] md5 for scsh

To: Martin Gasbichler <gasbichl@informatik.uni-tuebingen.de>
Subject: Re: [Scsh-hackers] md5 for scsh
From: Michel Schinz <Michel.Schinz@epfl.ch>
Date: Fri Jun 21 02:25:13 2002
Cc: scsh-hackers <scsh-hackers@lists.sourceforge.net>
List-id: Discussion among the implementors <scsh-hackers.lists.sourceforge.net>
Sender: scsh-hackers-admin@lists.sourceforge.net
Martin Gasbichler <gasbichl@informatik.uni-tuebingen.de> writes:

[...]

> The string returned by md5-digest->string is not the same as you
> would get from (number->string (md5-digest->number md5-digest)) but
> more compact.

OK. Is the string by chance encoded using some standard encoding,
such as base64? If so, wouldn't it be nicer to provide only
"md5-digest->number" plus separate base64 (or whatever) encoding
functions? (I could certainly see a use for fast base64
encoding/decoding in scsh.)
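
To make the size difference concrete, here is a small sketch in Python
(not scsh) comparing three textual forms of the same 16-byte MD5
digest. It is only an illustration: which encoding md5-digest->string
actually uses is not stated in this thread, and the variable names are
made up for the example.

```python
import base64
import hashlib

# An MD5 digest is always 16 raw bytes (128 bits).
digest = hashlib.md5(b"hello world").digest()

# The digest read as one big-endian integer, printed in decimal.
# A 128-bit value needs at most 39 decimal digits.
as_number = int.from_bytes(digest, "big")
decimal = str(as_number)

# The usual hexadecimal form: always 32 characters.
hex_string = digest.hex()

# Base64 of the raw bytes: always 24 characters (including padding).
b64_string = base64.b64encode(digest).decode("ascii")

print(len(decimal), len(hex_string), len(b64_string))
```

With separate encoding functions, the digest API would only have to
hand out the number (or the raw bytes), and the caller would pick the
textual form.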

> Michel> Also, I do not really see the aim of "string->md5-digest" and
> Michel> "number->md5-digest" since apparently the only thing you can do with
> Michel> the returned md5-digest is convert it back to a number or a string.
> 
> I just don't like the idea of numbers with special meanings floating
> around in the program. Scsh has enough of them already. You can't do
> anything reasonable with an md5-digest other than reading, writing
> and comparing them, so why not help the programmer by enforcing this?

I see. I like the idea, but in my opinion it departs from the scsh
tradition, and that could confuse programmers. It would be
interesting to hear what others think, though.
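
For readers weighing the two positions, here is a minimal sketch in
Python (not scsh) of an opaque digest type of the kind Martin
describes: it can be converted and compared, but not used as a number
by accident. The class name Md5Digest and its methods are hypothetical
and are not the proposed scsh interface.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Md5Digest:
    """Opaque wrapper: supports equality and conversion, nothing else."""
    raw: bytes

    def __post_init__(self):
        if len(self.raw) != 16:
            raise ValueError("an MD5 digest is exactly 16 bytes")

    def to_number(self) -> int:
        return int.from_bytes(self.raw, "big")

    @classmethod
    def from_number(cls, n: int) -> "Md5Digest":
        return cls(n.to_bytes(16, "big"))


a = Md5Digest(hashlib.md5(b"abc").digest())
b = Md5Digest.from_number(a.to_number())   # round trip through a number
print(a == b)                              # comparison works

try:
    a + 1                                  # arithmetic is rejected
except TypeError:
    print("digests are not numbers")
```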

[...]

> Michel> It might be nice to provide an "md5-digest-for-file" procedure, taking
> Michel> a file-name as argument, I guess it's a pretty common operation.
> 
> Hmm, in the scsh tradition it should be
> 
> (md5-digest-for-file fname/port/fd [buffersize]) -> md5-digest

That looks perfect.

> Michel> As a side-remark, I would add that sometimes one needs to perform
> Michel> checksums but without the cryptographic guarantees that MD5 gives. In
> Michel> these cases, one can use checksum algorithms which are much much
> Michel> faster than MD5. A good example is the FNV checksum algorithm [1],
> Michel> which appears to have a very low probability of collision, like MD5,
> Michel> while being faster and a lot easier to compute. Maybe also having this
> Michel> one in scsh would be nice.
> 
> Michel> [1] http://www.isthe.com/chongo/tech/comp/fnv/
> 
> Okay, I'll put that into my queue. I think the new FFI stub generator
> my students are working on should already handle this...

Now that I think about it, wouldn't it be nice to provide a general
"digest" interface, which could be used to obtain digests with various
algorithms, be it MD5, FNV or others? For example, your
"md5-digest-for-file" function would be transformed to something like:

  (digest-for-file algorithm fname/port/fd)

where "algorithm" would be one of the symbols "MD5", "FNV", etc. (or
some object returned by a function that creates an encoder).

For low-level functions this might not be a good idea since they might
depend heavily on their algorithm, but I think that for high-level
functions it makes sense.
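
As a rough illustration of such a high-level interface, here is a
sketch in Python (not scsh). Several things in it are assumptions: the
function name digest_for_file, the string keys "MD5" and "FNV", the
default buffer size, and the choice of the 32-bit FNV-1a variant (the
FNV page describes several variants and widths).

```python
import hashlib
import io

FNV_OFFSET_32 = 0x811C9DC5   # 32-bit FNV offset basis
FNV_PRIME_32 = 0x01000193    # 32-bit FNV prime


class Fnv1a32:
    """Incremental 32-bit FNV-1a with the same shape as hashlib objects."""

    def __init__(self):
        self.state = FNV_OFFSET_32

    def update(self, data):
        for byte in data:
            # FNV-1a: xor the byte in, then multiply, modulo 2**32.
            self.state = ((self.state ^ byte) * FNV_PRIME_32) & 0xFFFFFFFF

    def digest(self):
        return self.state.to_bytes(4, "big")


# The only algorithm-specific part: a table of hasher constructors.
ALGORITHMS = {"MD5": hashlib.md5, "FNV": Fnv1a32}


def digest_for_file(algorithm, source, buffer_size=65536):
    """Digest a file name, an integer descriptor, or an open binary stream."""
    hasher = ALGORITHMS[algorithm]()
    if hasattr(source, "read"):
        stream, owned = source, False
    else:
        # open() accepts a path or a descriptor; leave a caller's fd open.
        stream = open(source, "rb", closefd=not isinstance(source, int))
        owned = True
    try:
        for chunk in iter(lambda: stream.read(buffer_size), b""):
            hasher.update(chunk)
    finally:
        if owned:
            stream.close()
    return hasher.digest()


print(digest_for_file("MD5", io.BytesIO(b"hello world")).hex())
print(digest_for_file("FNV", io.BytesIO(b"a")).hex())
```

The reading loop is shared by all algorithms, which is why the
generalisation seems cheap at this level.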

Michel.

