[Grace-core] Minutes of Teleconference 2011.10.16

Tue Nov 22 03:29:27 PST 2011

content
 * an intensive discussion of method request resolution --- nesting vs inheritance.
      I've attempted to write up a (long, detailed) version of the discussion; it's attached below.
    * the key result was adopting an "up OR out" rule for implicit receiver request resolution
    * plus allowing language levels to require an explicit "self.m" for inherited requests  (via an optional static checker)
 * we also talked briefly about library extensions --- e.g. adding a method into all Strings

next meeting
- Wed16 3pm  US
- Thu midday NZ
 * main topic to discuss: module systems! 
 * james will attempt to synthesize something before the meeting.

"Up OR Out: Inheritance, Nesting, and Andrew's GedankenSprache"

One of the issues we've been putting off tackling is the question of the
interaction between lexical scope (or nested objects) on one side, and
inheritance on the other. Probably the best description of the problem
is Gilad Bracha's discussion [DYLA2007] where he outlines the problem
and surveys a range of solutions.

The problem is, in a program like this:

<blockquote>
class SuperClass {
  def m := "in superclass. "
} 

def out := object { 
  def m := "in enclosing object. " 
  def inner := object extends SuperClass {
    method foo { print (m) }  
  }
}
</blockquote>

which version of "m" is bound in the method "foo": the one from the
enclosing class, or the one from the superclass? In Java, the
superclass's definition would be found; in Newspeak, the enclosing
class's definition. Gilad's paper lays out a number of more complex
cases (what if "m" isn't defined in "out" but rather in out's
superclass?) that certainly can get quite hairy.  Erik Ernst's gBETA
uses the "comb" rule, meaning that definitions may be in superclasses
and their enclosing objects, while Newspeak's rule is "out then up"
--- definitions may be in directly enclosing objects, or in superclasses,
and that's it.  Kim pointed out that one odd side effect of the Newspeak
rule is that code in an enclosing object may use a method from its
superclass, but code inside a nested object may not:

<blockquote>
// Grace syntax, Newspeak semantics 

class SuperClass {
  def m := "in out's superclass"
  method test { print(m) } // m OK here
} 

def out := object  extends SuperClass { 
  method test { print(m) } // m OK here
  def inner := object {
    method foo { print (m) }  // m not found here
  }
}
</blockquote>

In Java, all three calls of "m" would be legal.
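
The difference between the two rules can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the class "Obj", its "parent" (inheritance) and "outer" (nesting) links, and both lookup functions are hypothetical names, and the "out then up" function follows the description given in these notes rather than the Newspeak specification.

```python
class Obj:
    """Toy object: own definitions, a superclass link, an enclosing-scope link."""
    def __init__(self, defs=None, parent=None, outer=None):
        self.defs = dict(defs or {})
        self.parent = parent   # inheritance ("up")
        self.outer = outer     # lexical nesting ("out")

def up_chain(obj):
    """Yield obj, then each of its superclasses."""
    while obj is not None:
        yield obj
        obj = obj.parent

def out_chain(obj):
    """Yield obj, then each lexically enclosing object."""
    while obj is not None:
        yield obj
        obj = obj.outer

def lookup_comb(name, start):
    """Java-like "comb": at every nesting level, search the whole inheritance chain."""
    for scope in out_chain(start):
        for holder in up_chain(scope):
            if name in holder.defs:
                return holder.defs[name]
    return None

def lookup_out_then_up(name, start):
    """"Out then up" as described above: own definitions of each enclosing
    scope first, then only the superclasses of the starting object."""
    for scope in out_chain(start):
        if name in scope.defs:
            return scope.defs[name]
    for holder in up_chain(start.parent):
        if name in holder.defs:
            return holder.defs[name]
    return None

# First example: "m" is defined both in the superclass and in the enclosing object.
superclass = Obj({"m": "in superclass. "})
out = Obj({"m": "in enclosing object. "})
inner = Obj(parent=superclass, outer=out)

# Second example: "m" is defined only in the superclass of the enclosing object.
superclass2 = Obj({"m": "in out's superclass"})
out2 = Obj(parent=superclass2)
inner2 = Obj(outer=out2)

print(lookup_comb("m", inner))           # in superclass.
print(lookup_out_then_up("m", inner))    # in enclosing object.
print(lookup_comb("m", inner2))          # in out's superclass
print(lookup_out_then_up("m", inner2))   # None: "m" is not found from the nested object
```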

AmbientTalk (http://soft.vub.ac.be/amop/at/tutorial/multiparadigm)
gives another option: making a "caller side" distinction between
lexical nesting and inheritance. Requests with an explicit self (as in
"self.m") consider only the inheritance chain; requests with an implicit
self (just "m") consider only nesting.  In AmbientTalk, methods and
fields are considered nested within their enclosing object, and so
within a class defining "m", "m" can be accessed in two ways: either
lexically (just "m") or dynamically ("self.m").  The difference between
these two is only visible in subclasses: lexical references are, well,
bound lexically, so a request "m" will only ever invoke the "m" defined in
the same object, whereas "self.m" will invoke the subclass's
definition:

<blockquote>
// Grace syntax, AmbientTalk semantics

class SuperClass {
  def m := "in superclass."
  method test { print(m ++ self.m) } 
} 

class SubClass extends SuperClass {
  def m := "in subclass."
}

SuperClass.new.test // prints in superclass. in superclass. 
SubClass.new.test   // prints in superclass. in subclass.
</blockquote>
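
A rough analogue of the same distinction can be written in Python. This is only a sketch under a stated assumption: Python has no implicit receiver, so naming the defining class explicitly ("SuperClass.m") stands in for the lexically bound "m", while "self.m" is the usual dynamically bound request.

```python
class SuperClass:
    m = "in superclass. "

    def test(self):
        # SuperClass.m: fixed at the definition site (stand-in for lexical "m").
        # self.m: looked up on the receiver, so a subclass can override it.
        return SuperClass.m + self.m

class SubClass(SuperClass):
    m = "in subclass. "

print(SuperClass().test())  # in superclass. in superclass.
print(SubClass().test())    # in superclass. in subclass.
```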

One of the design principles for Grace is to have only "one way of
doing anything": the fine distinction between "m" and "self.m" means
there will be two ways for a method to call other methods (or access
fields) defined within the same class.

So, what should Grace do?

We've already made some decisions: Grace will support inheritance, so
subclass definitions can override superclass definitions. Grace will
not support lexical shadowing --- an inner nested class cannot override
its enclosing class's definitions. But what should happen when a request,
like "m", may be satisfied both via nesting and via inheritance?

Andrew's GedankenSprache

At this point, Andrew proposed his GedankenSprache --- a design
embodying one extreme design point. The GedankenSprache makes a very
strong distinction between lexical binding and inheritance, at both
definition and use. Variables, constant definitions, and functions are
always lexically bound, are invoked without any receiver
keyword, and are always bound to the nearest enclosing definition. In
contrast, methods are dynamically bound, and are always requested with
the "self" keyword:

<blockquote>
// GedankenSprache syntax and semantics

class SuperClass {
  function f { "function in superclass. " }
  method m   { "method in superclass. " }
} 

def out := object { 
  function f { "function in enclosing object. " }
  method m   { "method in enclosing object. " }

  def inner := object extends SuperClass {

    method test {
       f       // prints function in enclosing object. 
       self.f  // no such method error
       m       // no such function error
       self.m  // prints method in superclass. 
    }
  }
}
</blockquote>

The GedankenSprache design has some very nice properties: it's
unambiguous, conceptually clear, and highlights the distinction
between lexical and dynamic binding. The evaluation rule (or rules)
are very straightforward and explicable: functions are lexically bound
by "going out" the nesting hierarchy while methods are bound "going
up" the inheritance hierarchy.  Many OO languages follow some or all
of this design: Smalltalk and Python require explicit receivers on
_all_ method requests, while OCaml and Go make a similar distinction
between method and function definitions.  The AmbientTalk distinction
between lexical binding "m" and dynamic binding "self.m" is avoided
because any method or feature can only be accessed by one kind of binding.

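Since Python is one of the languages that requires explicit receivers, the GedankenSprache example can be approximated directly in it. The sketch below is an approximation under assumptions, not a faithful translation: the enclosing object is modelled as an enclosing function ("make_out", a hypothetical name) because Python class bodies are not lexical scopes for their methods, and the four requests return a list of results instead of printing.

```python
class SuperClass:
    def m(self):
        return "method in superclass. "

def make_out():
    # The enclosing function plays the role of the enclosing object "out".
    def f():
        return "function in enclosing scope. "

    class Inner(SuperClass):
        def test(self):
            results = []
            results.append(f())            # lexical: found by going "out"
            try:
                self.f()                   # dynamic: "f" is not a method of self
            except AttributeError:
                results.append("no such method")
            try:
                m()                        # lexical: no function "m" in scope
            except NameError:
                results.append("no such function")
            results.append(self.m())       # dynamic: found by going "up"
            return results

    return Inner()

for line in make_out().test():
    print(line)
```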
Unfortunately, the GedankenSprache design has some issues too.
Conceptually, the strong linguistic distinction between "method" and
"function" can be taken as implying a strong conceptual distinction.
The Scandinavian school of object-orientation has a strong ontological
argument against this distinction: SIMULA considers objects as
retained procedure activations, and BETA unifies objects and methods.
In the GedankenSprache, however, programmers would need to choose when
to use functions and when to use methods, and teachers would need to
explain this distinction.  Moving code from lexical to object scope
would mean changing all the "function" keywords and (receiverless)
function calls to be "method" keywords and method calls with an
explicit self --- quite possible with a refactoring browser but an
overhead nonetheless.  One of our original design principles (or
choices) for Grace was that there would be "one simple method request
rule" --- the GedankenSprache rules are very simple, but there are
really two evaluation rules.

More pragmatically, the GedankenSprache design makes code using
objects longer and uglier than code that doesn't use objects.
Compare:

<blockquote>
length := sqrt( dx*dx + dy*dy )
</blockquote>

with 

<blockquote>
self.length := sqrt( self.dx*self.dx + self.dy*self.dy )
</blockquote>

All those explicit "self"s don't seem to add much to the code (other
than allowing the language evaluation rules to be simplified, of
course).  Now there are other options than "self." to mark method
requests vs function calls (perhaps a prefix "@" sigil?)  but writing
"@dx * @dy" doesn't offer much more than brevity.  These are reasons
why Eiffel, Self, Java, C#, and Scala (to name a few) adopted
"implicit receiver" syntax for dynamically bound methods on self.  One
of our key goals for Grace is to be a model for object-oriented
languages: it seems important that object-oriented code in Grace looks
at least as good as non object-oriented code.  

The GedankenSprache does throw light on one important part of the
design space. Functions are bound lexically, just by going "out" the
nesting hierarchy --- resolving lexical invocations never involves the
inheritance hierarchy. Complementarily, methods are bound dynamically,
only by going "up" the inheritance hierarchy, never involving the
nesting hierarchy.  Newspeak's "out then up" rule is the
straightforward conjunction of these two rules.  The key point is that
the GedankenSprache resolves a lexically bound request using only
nesting, and a dynamically bound request using only inheritance: it never
acts like the comb rule, using a combination of nesting and inheritance to
resolve a single request.

Design Choices

Reflecting on the GedankenSprache, we decided we liked the simplicity
of the resolution rules, and especially the lack of ambiguity between
lexical and dynamic binding.  On the other hand, we didn't like the
conceptual duplication, particularly the distinction between definitions
of functions and definitions of methods.

Following this line, it's relatively easy to imagine a language design
(heading towards AmbientTalk) where method and function definitions
are not distinguished syntactically, but where message requests ---
uses of those definitions --- are still distinguished: "m" is
lexically bound, while "self.m" is dynamically bound. (The distinction
needs to be this way around as long as core language features
themselves, such as "if" or "while" are found at the top of the
lexical tree.)  We can avoid the AmbientTalk ambiguity by just
preventing the same feature being resolved by more than one binding:
in particular, methods in the current class should only be
accessed by dynamic binding rather than lexical binding.

This still leaves the aesthetic issue of having to write an explicit
"self.m" request to access object's methods and fields --- whether
defined in the current class or in superclasses.  The only way to
avoid this is to allow implicit receiver requests "m" to be resolved
by either lexical or by dynamic binding, somehow depending on context
--- but of course that takes us back to the start of this discussion, with
the ambiguous cases between lexical binding for object nesting, and
dynamic binding for inheritance.

To deal with the ambiguity between inheritance and nesting, Kim
pointed to another option in Gilad's paper [DYLA2007] --- that a
potentially ambiguous method request (that could be bound either
lexically or dynamically) could raise an error, rather than having to
choose one or other interpretation. In terms of reading programs, this
has many of the advantages of the hard distinction made in the
GedankenSprache: in particular, every implicit request in a
correct program can be resolved unambiguously.

The final piece of the puzzle is the (potentially disingenuous) option
of using Grace's support for language dialects to enforce the
GedankenSprache's distinction by requiring explicit "self.m"
requests to access dynamically bound object fields.  Teachers adopting
an "objects first" approach probably will not enforce that rule, to
ensure object-oriented code is as lightweight as possible, while
teachers adopting a functional or procedural approach, or those
wanting to focus on the distinction between lexical and dynamic
resolution, can enforce the rule to clarify the language's semantics.

We could revisit our earlier decision to forbid lexical shadowing in
the core language in the same way, pushing the "no shadowing" rule
into the checker for particular dialects, perhaps requiring a
"shadows" annotation (to parallel inheritance's "overrides"
annotation). On balance, this seems to be a minor point, and the
current no-shadowing rule seems to work well.

Resulting Grace Design

The resulting design is a compromise: an "up OR out" resolution rule
that forbids ambiguous implicit self requests, or rather requires them
to be disambiguated by an explicit "self.m" (for inheritance) or
"outer.m" (for nesting); and an optional dialect-specific restriction
requiring implicit sends to only resolve lexically.  This design
avoids the ambiguities of Newspeak or Java, and the conceptual
duplication of the GedankenSprache, at the cost of a resolution rule
that is more complex than the GedankenSprache version, but no more
complex than Newspeak and arguably simpler than Java or BETA. And
again, where we have been unable to find (or to agree upon) a single
preferred solution, the idea of dialects (language levels) lets us have
our cake, and eat it too.

After these 1800 words (crystalising at least 90 minutes of
discussion), the resulting section of the specification would read:

 * An implicit receiver request "m" may resolve to a lexically enclosing identifier, or to an inherited identifier.
 * An ambiguous request that resolves to both an enclosing and an inherited identifier is a static error.
 * Declaring an identifier that shadows a lexically enclosing identifier is a static error.
 * Identifiers that override an inherited identifier must be annotated as "overrides"; failure to do so is a static error.
 * Enclosing identifiers may be requested by an explicit "outer.m" request.
 * Inherited identifiers may be requested by an explicit "self.m" request.
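
The resolution rules in the list above can be sketched as a small Python model. This is a sketch under stated assumptions, not the Grace implementation: "Obj", "ResolutionError", "resolve" and the "explicit_self_dialect" flag are hypothetical names. The model treats the object's own definitions as part of the "up" (self) side, searches only the own definitions of enclosing scopes on the "out" side, and raises an exception at lookup time where the specification calls for a static error.

```python
class ResolutionError(Exception):
    """Stands in for a static error reported by the checker."""

class Obj:
    """Toy object: own definitions, a superclass link, an enclosing-scope link."""
    def __init__(self, defs=None, parent=None, outer=None):
        self.defs = dict(defs or {})
        self.parent = parent   # inheritance ("up")
        self.outer = outer     # lexical nesting ("out")

def find_up(name, obj):
    """Return the object or superclass that defines name, or None."""
    while obj is not None:
        if name in obj.defs:
            return obj
        obj = obj.parent
    return None

def find_out(name, obj):
    """Return the enclosing scope whose own definitions include name, or None."""
    scope = obj.outer
    while scope is not None:
        if name in scope.defs:
            return scope
        scope = scope.outer
    return None

def resolve(name, obj, explicit_self_dialect=False):
    """Resolve an implicit receiver request: "up" OR "out", never both."""
    up = find_up(name, obj)
    out = find_out(name, obj)
    if up is not None and out is not None:
        raise ResolutionError(
            f'"{name}" is ambiguous: write self.{name} or outer.{name}')
    if up is not None:
        if explicit_self_dialect:
            # Optional dialect rule: implicit requests may only resolve lexically.
            raise ResolutionError(f'this dialect requires self.{name}')
        return ("self", up.defs[name])
    if out is not None:
        return ("outer", out.defs[name])
    raise ResolutionError(f'"{name}" is not defined')

# The opening example: "m" is both inherited and lexically enclosing.
superclass = Obj({"m": "in superclass. "})
out = Obj({"m": "in enclosing object. ", "n": "only in enclosing object. "})
inner = Obj({"k": "in inner. "}, parent=superclass, outer=out)

print(resolve("n", inner))   # ('outer', 'only in enclosing object. ')
print(resolve("k", inner))   # ('self', 'in inner. ')
try:
    resolve("m", inner)
except ResolutionError as err:
    print(err)               # "m" is ambiguous: write self.m or outer.m
```

The shadowing and "overrides" rules are declaration-time checks and are not modelled here.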