In article <izhggvrukk.fsf@mocha.CS.Princeton.EDU>,
Matthias Blume <blume@mocha.cs.princeton.edu> wrote:
>In article <33607944.3852@ws6a17.gud.siemens.co.at> "Istvan.Simon"
><simo@ws6a17.gud.siemens.co.at> writes:
>
> Even if two things LOOK the same then they are not the same. If they
> were the same you could never make a copy from anything. If this is
> true in reality then there is no reason to be untrue in programming
> languages.
>
>Several misconceptions:
>
> - There are things in the "real" world that don't have an "identity".
> Numbers are one example, and we generally don't expect to be able
> to distinguish between _this_ 1 and _that_ 1. Another example is the
> electron (or other thingies that particle physicists might be interested
> in).
Hmmm... I dunno about electrons, and I'm not sure it's relevant,
but I'd say that for most purposes 1 *does* have an identity. It's
1, you know, the number 1. The reason why you can't copy the number
1 and distinguish between the copies is that it makes no sense
mathematically. But that doesn't mean there isn't a single,
distinguished conceptual object 1.

If Lisp and Scheme had completely wonderful type systems, you would
be able to ensure that 1 is always eq? to 1. The reason you
can't is just an efficiency hack---you don't want to maintain
a hash table of all numeric values to preserve object identity,
the way you do with symbols. It'd just be too slow. That's
why Scheme has the predicate eqv? and CL has the equivalent
predicate eql. These predicates smooth over
the fact that an implementation is allowed to make multiple
copies of a numeric value, and make it appear that multiple
copies are the same value.
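
A rough sketch of that trade-off, in Python rather than Scheme. The
names Boxed, intern_number, eq and eqv are made up for illustration;
this is not how any particular Lisp or Scheme implements numbers.

```python
class Boxed:
    """A heap-allocated number; two boxes may hold the same value."""
    def __init__(self, value):
        self.value = value

# Hypothetical intern table: one canonical box per (type, value).
_intern_table = {}

def intern_number(value):
    """Return the canonical box for value, creating it on first use."""
    key = (type(value), value)
    box = _intern_table.get(key)
    if box is None:
        box = Boxed(value)
        _intern_table[key] = box
    return box

def eq(x, y):
    """Identity test, in the spirit of eq?."""
    return x is y

def eqv(x, y):
    """Like eq, but boxes holding the same kind of number with the
    same value also count as the same object."""
    if isinstance(x, Boxed) and isinstance(y, Boxed):
        return type(x.value) is type(y.value) and x.value == y.value
    return x is y

a = Boxed(10**30)
b = Boxed(10**30)
print(eq(a, b))    # False: two separate copies
print(eqv(a, b))   # True: the copies are smoothed over
print(eq(intern_number(10**30), intern_number(10**30)))  # True
```

Interning buys eq-identity for numbers at the price of a table lookup
on every number created, which is the cost eqv is there to avoid.
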
I'm not saying that the default, language-supported notion of
object identity is always the right one for your purposes, but
it often is for mine. I'd say ML and Scheme both made reasonable
choices on this. It's nice to have both, whichever is the default.
> - Things that _actually_ look the same in each and every
> respect _are_ the same.
No. This is bad philosophy. Things that are indistinguishable
given a certain subjective point of view may become distinguishable
when you learn more about them.
(For example, I once learned that I'd been corresponding with two
different people named David Chapman. Conversely, things that
seem distinct at one time may later turn out to be the same,
as when it was realized that the morning star and the evening
star were the same object.)
>Things that we can distinguish between
> do _not_ look the same (by definition -- this is what we mean by
> being able to distinguish).
I think this is simplistic. I may be able to distinguish between
two identical twins based on their social security numbers, and
that may be my only means of knowing they're two separate people,
at some point in time. Similar things come up in programming
tasks.
> - Even if, for a moment, we assume that all "real world" things have an
> inherent identity (the electron is a nice counterexample), then
> it is _still_ a bad idea to extrapolate from this and make every
> value in a programming language the same. Abstract things (like
> numbers, functions, sets) do _not_ have an identity.
I don't think "counterexamples" are the right thing to work from
here. I wouldn't deny that the language-supported notion of
object identity is often the wrong one for representing conceptual
object identity. But it is often useful, even if you can always
hack up something to do the job in a language without object
identity.
> - From a denotational point of view object identity doesn't matter
> for immutable things.
It depends on the semantics you need. For example, if I'm representing
two individuals who are parents of a third individual, and I don't
really know anything else about them, I do not want to make the mistake
of thinking that they're the same person---one may be the father,
and the other the mother, so that they're entirely different
individuals.
Similar things come up all the time internally to programs, typically
when you come up with an abstraction.
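
To make the parents example concrete, here is a small Python sketch.
ValuePerson, Person and count_distinct are hypothetical names of my
own; the only point is what happens when nothing is known about
either individual.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ValuePerson:
    """Compared field by field; no identity beyond the fields."""
    name: Optional[str] = None

class Person:
    """Compared by identity, Python's default for plain classes."""
    def __init__(self, name=None):
        self.name = name

def count_distinct(people):
    """How many different individuals the collection contains."""
    return len(set(people))

# Two parents about whom nothing is known yet.
value_parents = [ValuePerson(), ValuePerson()]
identity_parents = [Person(), Person()]

print(count_distinct(value_parents))     # 1: mother and father merged
print(count_distinct(identity_parents))  # 2: still two individuals
```

Without identity you would have to invent a distinguishing field (a
serial number, say) just to keep the two records apart.
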
> - Exactly _because_ there is no notion of identity, and because
> things that look the same _are_ the same, it is possible for the
> compiler to make copies of things (or represent them differently
> at different times during execution). This is of great value for
> optimizing compilers.
I think this has much more to do with immutability than with object
identity (or the lack of it).
> You always have a memory address even if you don't want to have one.
> This address will identify your object even if you don't want it.
>
>See, here is the mistake.
I think the issue here is not so much that you have an object identity
when you don't want one, but that you've got to know whether you want
one, for the semantics you're imposing on your data structures.
If an eq? (object identity) test isn't what you want, _don't_use_it_.
Cobble up your own predicates, as you'd do in a language without
object identity. (Either way, don't confuse a memory address with
a language-level pointer. They're not the same idea at all, even
though there's often a convenient mapping that the language implementation
*may* choose to do.)
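
For instance, a cobbled-up predicate might look like the following
Python sketch. Account and same_account are invented for the example;
which fields count toward "sameness" is exactly the decision the
application has to make for itself.

```python
class Account:
    """A record whose language-level identity is not what we care about."""
    def __init__(self, number, nickname):
        self.number = number
        self.nickname = nickname

def same_account(a, b):
    """Application-level sameness: the account numbers match."""
    return a.number == b.number

x = Account(1001, "savings")
y = Account(1001, "rainy day")

print(x is y)              # False: two separate objects
print(same_account(x, y))  # True: the same account for our purposes
```

The identity test is still there for the cases where it is the right
notion; nothing forces you to use it where it is not.
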
>If you don't have the concept of identity,
>then the compiler can choose to duplicate things when it needs to, it
>can keep things in registers instead of bundling them up and allocating
>them on the heap,
Again, I think this is more a point about immutability than
about object identity. You can have both in one language.
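
A small Python illustration of why I think immutability is what gives
the freedom to copy. Point is a made-up type; Python is just an
example of a language with both immutable values and object identity.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Point:
    """An immutable value that still has an identity."""
    x: int
    y: int

p = Point(1, 2)
q = replace(p)   # a fresh object with the same fields

print(p == q)    # True: same in every field
print(p is q)    # False: only an identity test can tell them apart

# With mutable data, a copy changes what the program does.
shared = [1, 2]
alias = shared        # same object
copy = list(shared)   # duplicate
alias.append(3)

print(shared)  # [1, 2, 3]: the alias sees the update
print(copy)    # [1, 2]: the duplicate does not
```

So duplicating an immutable value is safe unless the program actually
asks about identity, while duplicating a mutable one is observable
through ordinary updates.
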
> it can use a hash-consing GC that identifies
>lookalikes the programmer didn't think of and represent them by the
>same "pointer" internally, and so on. Once you do any of this, you
>don't have _one_ pointer that identifies your value -- there might be
>two, or ten, or none at all.
I think we have a high-level vs. low-level thing going on here.
If I want to efficiently implement exactly the memoization and
hash-consing I want for my application, I'd like to have a language
with pointers so that I can do it very efficiently. Then I can
proceed to program with those pieces I built, construing object
identity or the lack of it any way that makes sense. Usually I
don't want a language that tells me there's no such thing as object
identity. I can hack around it, but why bother?
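
Here is roughly what I mean by building hash-consing on top of
pointers, sketched in Python. Cons, hcons and hlist are hypothetical
names, and a serious version would want a weak table so that unused
cells can be collected.

```python
class Cons:
    """A pair; instances compare and hash by identity (the default)."""
    __slots__ = ("car", "cdr")

    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr

# Maps (car, cdr) to the one canonical cell with those contents.
# Sub-cells are already canonical, so hashing them by identity works.
_cons_table = {}

def hcons(car, cdr):
    """Return the canonical cell for (car, cdr), allocating if needed."""
    key = (car, cdr)
    cell = _cons_table.get(key)
    if cell is None:
        cell = Cons(car, cdr)
        _cons_table[key] = cell
    return cell

def hlist(*items):
    """Build a hash-consed list, with None as the empty list."""
    result = None
    for item in reversed(items):
        result = hcons(item, result)
    return result

a = hlist("x", "y", "z")
b = hlist("x", "y", "z")

print(a is b)                               # True: lookalikes share one cell
print(a.cdr is hlist("y", "z"))             # True: tails are shared too
print(Cons("x", None) is Cons("x", None))   # False: plain conses are distinct
```

Once everything goes through hcons, structural equality collapses
into a single pointer comparison, and it is the language-level
identity test that makes that cheap.
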
BTW, I'm not at all convinced that a language-level pointer abstraction
is expensive to implement. If I were, I might lean more toward
forcing the programmer to cobble one up as necessary.
>--
>-Matthias
--
| Paul R. Wilson, Comp. Sci. Dept., U of Texas @ Austin (wilson@cs.utexas.edu)
| Papers on memory allocators, garbage collection, memory hierarchies,
| persistence and Scheme interpreters and compilers available via ftp from
| ftp.cs.utexas.edu, in pub/garbage (or http://www.cs.utexas.edu/users/wilson/)
|