scsh-users

Re: Object IDs are good ( was: Object IDs are bad )

To: scsh-news@martigny.ai.mit.edu
Subject: Re: Object IDs are good ( was: Object IDs are bad )
From: Christopher Eltschka <celtschk@physik.tu-muenchen.de>
Date: Wed, 30 Apr 1997 17:15:05 +0200
Organization: [posted via] Leibniz-Rechenzentrum, Muenchen (Germany)

Matthias Blume wrote:
> 
> In article <33607944.3852@ws6a17.gud.siemens.co.at> "Istvan.Simon" 
> <simo@ws6a17.gud.siemens.co.at> writes:
> 
>    Even if two things LOOK the same then they are not the same. If they
>    were the same you could never make a copy from anything. If this is
>    true in reality then there is no reason to be untrue in programming
>    languages.
> 
> Several misconceptions:
> 
>    - There are things in the "real" world that don't have an "identity".
>      Numbers are one example, and we generally don't expect to be able
>      to distinguish between _this_ 1 and _that_ 1.  Another example is the
>      electron (or other thingies that particle physicists might be interested
>      in).

Nice analogy - especially as it can be used to prove that it *does* make
sense to obtain a pointer to an object:

While electrons are indistinguishable (and this is an important concept
in quantum mechanics), it *does* make sense to speak of "the electron in
the 1s state" vs. "the electron in the 2p state", because of the
so-called Pauli principle: no two electrons can be in the same state
(ironically, this is true *only* for identical particles). In the same
sense it makes sense to speak of "the object at this place" and "the
object at that place", as again no two objects can be at the same place.
Of course, if you exchanged those identical objects in memory (without
updating references to them), you would see no difference - just as you
would see no difference if you exchanged the two electrons above. The
electrons are "identified" by their state, and the objects are
"identified" by their address.

> 
>    - Things that _actually_ look the same in each and every
>      respect _are_ the same.  Things that we can distinguish between
>      do _not_ look the same (by definition -- this is what we mean by
>      being able to distinguish).
> 

Even identical objects may be distinguished by their relationship to the
rest of the world. For example, replace the two electrons of the example
by two muons. Then it *would* make a difference whether the 1s muon or
the 2p muon decayed (although both are "identical"). In the same sense
it makes a difference, for example, *which* of the identical subtrees
you replace - you just get different trees by removing different
subtrees, even if they are identical. Or, simpler, take the following
list: (1, 1, 2, 3). Now it *does* make a difference whether I replace
the first 1 or the second 1 by 2. So we *can* distinguish between *this*
1 and *that* 1 (this 1 being the first in the list, that 1 being the
second).
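
The list example can be sketched in C++. This is only an illustration:
std::vector stands in for the list, and the function name
replace_one_with_two is made up here. It shows two elements that
compare equal as values but are still reachable as "this 1" and
"that 1" through pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds the list (1, 1, 2, 3), replaces either the
// first 1 (index 0) or the second 1 (any other index) by 2 through a
// pointer to that element, and returns the resulting list.
std::vector<int> replace_one_with_two(std::size_t index) {
    std::vector<int> list{1, 1, 2, 3};

    int* this_one = &list[0];  // "this 1": the first element
    int* that_one = &list[1];  // "that 1": the second element

    // Equal as values, yet distinct as places in the list.
    assert(*this_one == *that_one);
    assert(this_one != that_one);

    *(index == 0 ? this_one : that_one) = 2;
    return list;
}
```

Calling replace_one_with_two(0) yields (2, 1, 2, 3), while
replace_one_with_two(1) yields (1, 2, 2, 3): the choice of *which* 1
is observable in the result.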

>    - Even if, for a moment, we assume that all "real world" things have an
>      inherent identity (the electron is a nice counterexample), then
>      it is _still_ a bad idea to extrapolate from this and make every
>      value in a programming language the same. Abstract things (like
>      numbers, functions, sets) do _not_ have an identity.
> 

Such as the first 1 and the second 1 in the list above?

>    - From a denotational point of view object identity doesn't matter
>      for immutable things.
> 

Ok, then I'll change the task: Create a new list with the first 1 or the
second 1 replaced by 2. Now, no difference?
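
A minimal sketch of the changed task, again in C++ and with a made-up
function name (with_replaced): nothing is mutated, the original list is
left alone, and the two new lists still differ depending on which 1 was
chosen.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a *new* list equal to `list` except that
// the element at position `index` is `value`. The argument itself is
// never modified. Throws std::out_of_range for an invalid index.
std::vector<int> with_replaced(const std::vector<int>& list,
                               std::size_t index, int value) {
    std::vector<int> copy = list;
    copy.at(index) = value;
    return copy;
}
```

With original = (1, 1, 2, 3), with_replaced(original, 0, 2) gives
(2, 1, 2, 3) and with_replaced(original, 1, 2) gives (1, 2, 2, 3), so
the position of each 1 matters even though no object is ever changed.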

>    - Exactly _because_ there is no notion of identity, and because
>      things that look the same _are_ the same, it is possible for the
>      compiler to make copies of things (or represent them differently
>      at different times during execution).  This is of great value for
>      optimizing compilers.
> 

But *because* there is an identity, the compiler is not free to move
things around without updating all references to them (else it could
happen that composed objects *do* change by accident). And this in turn
means that, from the program's view, the objects are *not* moved around.
Or to say it a different way: A pointer is an object that *points* to
another object. By doing this, it produces a *relationship* between
itself and this object. And this relationship makes the object
distinguishable from other "identical" objects. The fact that pointers
are most easily implemented as physical memory addresses does nothing to
the concept (in fact, the "address" in modern processors isn't really
the address in memory - the object might even be swapped out of memory).

For this reason, a perfectly valid C++ implementation would be allowed
to store, for example, names in pointers, which then could be looked up
in a hash table to find the object itself (does this sound somewhat
familiar?). The reason this is not done is that it would be much slower
than the simple implementation that stores just the address.
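
What such a name-based "pointer" might look like can be imitated at the
library level. The following is only a sketch under stated assumptions,
not how any real C++ implementation works: the names object_table,
NamePtr and make_object are invented for this illustration, and the
objects are restricted to int to keep it short.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical global object table: plays the role of "memory", but is
// indexed by name instead of by address.
inline std::unordered_map<std::string, int>& object_table() {
    static std::unordered_map<std::string, int> table;
    return table;
}

// Hypothetical "pointer" that stores a name rather than an address.
class NamePtr {
public:
    explicit NamePtr(std::string name) : name_(std::move(name)) {}

    // Dereferencing is a hash-table lookup; throws std::out_of_range
    // if no object with this name exists.
    int& operator*() const { return object_table().at(name_); }

    // Two NamePtrs are equal when they designate the same object,
    // not when the designated objects merely hold equal values.
    friend bool operator==(const NamePtr& a, const NamePtr& b) {
        return a.name_ == b.name_;
    }
    friend bool operator!=(const NamePtr& a, const NamePtr& b) {
        return !(a == b);
    }

private:
    std::string name_;
};

// Hypothetical allocator: creates (or overwrites) the object called
// `name` and returns a NamePtr designating it.
NamePtr make_object(const std::string& name, int value) {
    object_table()[name] = value;
    return NamePtr(name);
}
```

Two objects created with make_object("x", 1) and make_object("y", 1)
hold equal values, yet their NamePtrs compare unequal, and assigning
through one leaves the other untouched. The price is a hash lookup on
every dereference, which is the slowness mentioned above.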

>    You always have a memory address even you don't want to have one.
>    This address will identify your object even if you don't want it.
> 
> See, here is the mistake.  If you don't have the concept of identity,
> then the compiler can choose to duplicate things when it needs to, it
> can keep things in registers instead of bundling them up and allocate
> them on the heap, it can use a hash-consing GC that identifies
> lookalikes the programmer didn't think of and represent them by the
> same "pointer" internally, and so on.  Once you do any of this, you
> don't have _one_ pointer that identifies your value -- there might be
> two, or ten, or none at all.
> 

But the compiler still has to keep track of which things belong to the
same object - i.e. the object identity.
And C++ compilers are also allowed to optimize this way - storing values
in registers is a common optimization technique in C++ compilers. And
all objects - including pointers - may be optimized away if the compiler
can prove that it doesn't need them to be there. And different variables
may be stored in the same place under the as-if rule as well, if the
compiler can prove that it won't change the program's behaviour. Under
the as-if rule, pointers to different objects may even have the same
physical value, if the compiler can prove that this won't change program
behaviour.

I think the problem is just the (unfortunately very common)
misunderstanding that a pointer *is* a memory address - it's no more a
memory address than the letter 'A' is the number 65: that's just how it
is (usually) implemented.
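
The comparison can be made concrete with a small C++ sketch. The
function names are invented for this illustration; the value 65 assumes
an ASCII-based implementation, and std::uintptr_t is an optional type
that common platforms provide.

```cpp
#include <cassert>
#include <cstdint>

// On ASCII-based implementations the letter 'A' is *represented* by the
// number 65; other character sets may use another number.
bool letter_a_is_encoded_as_65() {
    return static_cast<int>('A') == 65;
}

// Likewise a pointer can be converted to an integer, but which integer
// comes out is implementation-defined.
std::uintptr_t representation_of(const int* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// Converting that integer back is guaranteed to yield the original
// pointer, i.e. "that one" again, whatever the integer happened to be.
const int* pointer_from(std::uintptr_t bits) {
    return reinterpret_cast<const int*>(bits);
}
```

In both cases the number is a property of the implementation; the
letter and the pointer are what the program actually talks about.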

A pointer is an abstraction, just as a letter or a number is an
abstraction. A pointer is the abstraction of the concept "that one".
