[Grace-core] Name resolution in Grace
Michael Homer
mwh at ecs.vuw.ac.nz
Fri Jul 5 11:35:44 PDT 2013
On Sat, Jul 6, 2013 at 2:41 AM, Andrew P. Black <black at cs.pdx.edu> wrote:
> We have been having a discussion about name resolution in Grace
>
> Method requests with explicit receivers (target.method), including operator requests (target + arg), are simple: the named method is requested on the named target.
>
> Method requests with arguments (method(arg)) and an implicit receiver may mean either outer.method(arg) or self.method(arg). How can we tell? At one time we said that this was ambiguous and therefore illegal, but I believe that this is untenable. This is because one may be both in a dialect (which defines an outer object) and inheriting from a super object. Because one can't change the names of methods in either the dialect or the superobject, such conflicts will sometimes occur, and the programmer can't do anything about them. I think that the right answer is to request the outer method, if there is one, otherwise to make a self request. If programmers want the self request, they can write self.method(arg).
It is untenable because method names are part of the interface of an
object and so clearly can't be banned. Forget about inheritance and
anything else.
> The complicated case is a simple identifier like x or foo. This may be:
> (1) a lexically-bound reference to a variable, definition, or parameter; or
> (2) a dynamically-bound method request to an object that is self, or outer, or outer.outer, etc.
>
> I propose that x should in these cases always be lexically-bound, if a lexical-binding exists.
> We can tell if a lexical binding exists; that's the nature of lexical bindings. To this end, names in an "enclosing" dialect should not be in the lexical scope of entities written in the dialect. (This was Martin Odersky's suggestion — lexical means within the current compilation unit, and thus visible on the screen in front of me.)
>
> If the programmer intends to make a request, but the name of the method to be requested clashes with a lexically-bound name, then the programmer can get the desired behavior by using an explicit self or outer.
>
> If there is no lexical binding for x, then x should be treated as a request on outer or on self. Does the lookup for an appropriate method go "out then up" or "up then out"? We have discussed this before; the problem is that I don't remember the answer, and I'm concerned that nor will our students. One can make a good argument for either rule, I think. Up-then-out is consistent with normal overriding, whereas out then up allows dialects to be defined by inheriting from other dialects. Oops — we just banned inheritance from dialects, so maybe that doesn't matter any more.
You have never been able to inherit (directly) from a dialect, so I
don't know what you mean by "just".
> The apparently-simple rule that an ambiguity between up-then-out and out-then-up is an error doesn't work either, because a change to a super-object, or to a dialect, neither of which I can control, can invalidate code of mine that formerly compiled. Still, such a compilation failure is probably better than having the behaviour of the code silently change!
>
> Now I will try to write down clearly the meaning of x:
>
> - if x is a variable, definition, or parameter in the current lexical scope, then x refers to that variable, definition, or parameter. ("Current lexical scope" excludes the enclosing dialect, since names defined there are not actually visible "on the screen".)
>
> - if x is a method defined in the current object, or in any of its super-objects, then x means the method request self.x
>
> - if x is a method defined on the enclosing^n object (including inherited methods), then x means outer^n.x. Thus, if x is a method defined on the enclosing object (including inherited methods), then x means outer.x; if x is a method defined on the enclosing enclosing object (including inherited methods), then x means outer.outer.x, etc.
>
> - if more than one of the above applies, then x is ambiguous, and the programme is illegal. The programmer has to fix it, either by adding explicit selfs or outers, or by renaming the variables in the enclosing objects.
I think this is too complicated (jumping back and forth), and such
names should in general just be disambiguated by the compiler. We
can't avoid fragile base classes, or fragile dialects, and it's not
worth having a rule this complex to keep track of when it doesn't
actually fix anything.
How about:
1) Look at names in the current scope (i.e., names that would be
available if this scope existed without any surrounds). If a matching
name exists, resolve to it and terminate.
2) Go out one scope, and forget about the scope we were in before.
3) Go to 1.
"Scope", here, means any bounded contiguous location where new names
can be bound - method bodies, blocks, objects, classes.
So inside a method, local variables/parameters come first, because
they're closest, then self calls, then locals in the surrounding
scope/"self"-calls on the surrounding object, and so on. Then there is
really only a single rule and no ordering.
If you're at an object scope, you try adding "self." to the front and
see if it would work, while in a method or block you use the name as
it is, and you keep going until you find it. It's a simple enough rule
to explain and avoids "jumping" around in the program or making
dialect lookup extra magical.
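As a rough sketch of the rule above, in Python rather than Grace: the
Scope record, its "kind" strings and the resolve function are all made
up for illustration and are not taken from any Grace implementation.
It assumes that the names of an object scope already include inherited
methods, and that classes are treated as object scopes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scope:
    """Hypothetical scope record (an assumption of this sketch)."""
    kind: str                                # "object", "method" or "block"
    names: set = field(default_factory=set)  # names bound directly in this scope
    outer: Optional["Scope"] = None          # lexically surrounding scope, if any


def resolve(name, scope):
    """Resolve `name` starting at `scope` and working outwards.

    Returns the name unchanged for a method/block binding, "self.name"
    or "outer...name" for a binding on an object scope, or None when no
    surrounding scope binds the name.
    """
    objects_left = 0  # object scopes already searched and left behind
    while scope is not None:
        is_object = scope.kind == "object"
        if name in scope.names:              # step 1: look only at this scope
            if not is_object:
                return name                  # local, parameter, etc.: use as is
            if objects_left == 0:
                return "self." + name        # nearest object scope
            return ".".join(["outer"] * objects_left + [name])
        if is_object:
            objects_left += 1
        scope = scope.outer                  # step 2: go out one scope, repeat
    return None                              # nothing binds the name


# Usage: a method inside an object that is nested in another object.
outer_obj = Scope("object", {"log", "size"})
inner_obj = Scope("object", {"size", "area"}, outer=outer_obj)
method = Scope("method", {"width"}, outer=inner_obj)

print(resolve("width", method))  # width
print(resolve("size", method))   # self.size
print(resolve("log", method))    # outer.log
```

Note that "size" resolves to self.size even though the outer object
also defines it: the nearest scope wins and nothing further out is
consulted, so there is no separate ambiguity check.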
A dialect could still enforce whatever variant of the no-shadowing
rule it liked, of course, even the no-method-conflicts version.
I don't think unresolved names should be late-bound on self - that way
lies dynamic scoping madness, only worse.
-Michael