[Grace-core] On Gradual Typing

Thu Jul 18 14:40:14 PDT 2013

Thanks for opening this discussion as it has been something I had hoped to get to by this point, but debugging my type-checker has taken more time than I'd like (I miss static typing!!!).  It looks like my reply will be the last in the group (I had hoped to respond yesterday, but spent 8 hours in the emergency room with my son after he had a bike accident -- lots of scrapes and a broken scapula after flying over the handlebars when someone turned in front of him).  I'm going to give my own account rather than replying to anyone else's (aside from Michael's original).

On Jul 17, 2013, at 2:56 AM, Michael Homer <mwh at ecs.vuw.ac.nz> wrote:

> So far, the conception of how gradual typing works in Grace has been
> that the typechecker is always there, but that by using "Dynamic" or
> omitting the annotations you can get dynamically typed code that the
> typechecker doesn't do anything with. I'm not sure all of that that
> has been explicitly noted anywhere, but it is how I understand it to
> be. The type annotations do not have any effect on the run-time
> behaviour, but simply allow rejecting programs that are invalid from a
> certain point of view at an early stage. Dynamic checks occur at the
> boundary between dynamic and static code when required.
> 
> I've mentioned before that I don't think all of this actually fits
> together, but interest has been low. There are three separate related
> issues with the gradual/static typing in the language that seem
> problematic to me, which are dealt with in three separate sections
> below beginning with the most fundamental. There is a so-what summary
> at the end.
> 
> Let us consider the following completely statically-typed program:
> 1    object {
> 2        method x -> Done {
> 3            self.y
> 4        }
> 5    }
> Line 3 is clearly a static type error because there is no such method
> "y" on self.
> 
> Now let us consider the following completely dynamically-typed program:
> 1    object {
> 2        method x {
> 3            self.y
> 4        }
> 5    }
> Line 3 is clearly a static type error because there is no such method
> "y" on self.

I would say that this is a static error, that should be caught at compile time in any Grace program, as it can in no circumstances be correct.  Thus, I personally would want an error (or at least a warning here) no matter how we are executing the program.

(My feelings about this are stronger now than a few weeks ago, now that I've been programming in Grace.  There are lots of these kinds of errors that slip by in my code, when my life would be much easier if they were flagged in some way.)
> 
> We have a problem here.
> 
> A key goal has been that control structures are actually methods
> inside. Let's look at a plausible definition of the for-do method:
> 1    method for(iterable)do(block) {
> 2        def iterator = iterable.iterator
> 3        while {iterator.hasNext} do {block.apply(iterator.next)}
> 4    }
> (It is possible that for-do delegates to an internal iterator method
> on the collection - that doesn't matter for these purposes).
> Of course, we want statically-typed code to stay statically-typed, so
> this method actually needs type annotations:
> 1    method for(iterable : Iterable)do(block : Block) {
> 2        def iterator = iterable.iterator
> 3        while {iterator.hasNext} do {block.apply(iterator.next)}
> 4    }
> And since collections will be generic, that actually needs to read:
> 1    method for(iterable : Iterable<E>)do(block : Block<E, Done>) {
> 2        def iterator : Iterator<E> = iterable.iterator
> 3        while {iterator.hasNext} do {block.apply(iterator.next)}
> 4    }
> But where did the E come from? So we really have:
> 1    method for<E>(iterable : Iterable<E>)do(block : Block<E, Done>) {
> 2        def iterator : Iterator<E> = iterable.iterator
> 3        while {iterator.hasNext} do {
> 4            block.apply(iterator.next)
> 5        }
> 6    }
> So far so good.
> 
> We want to write accesses to this method like control structures:
> 1    for (someList) do { v -> print(v) }
> From here we will have another problem.
> 
> In order to be able to write dynamically-typed code, omitted generic
> parameters must be populated with "Dynamic" (at runtime - we don't
> necessarily know statically what the type of the method is, so this,
> rather than static inference, must be the semantics of the language).
> Because we have retained generics that really are just additional
> formal parameters, this is actually important. So
> 1    for (someList) do { v -> print(v) }
> is a dynamically-typed request, and all is well.
> 
> In statically-typed code, though, what does that mean? If I write
> 1    for (someList) do { v : X -> print(v) }
> then that must (because we have gradual typing) mean the same thing as
> the dynamically-typed case, regardless of whether we know the type of
> the method. That rules out inference of the parameter. So in your
> statically-typed code, this loop is actually
> 1    def someList : List<Number> = generateNumberList(5)
> 2    for<Number> (someList) do { v : Number -> print(v) }
> This seems unwieldy, but perhaps it is acceptable to the people who
> are concerned with that. Internal iterators do not have this problem,
> although the general issue still stands.

I have more to say about blocks below, but I believe this is a case where local type inference can be effective (when in static checking mode).  Inferring the Number type parameter is easy given the type annotation on v.  
> 
> Now consider the last argument to the method, a block. What is its
> type? What is the type of a block in general? In one of the early
> examples, I wrote
> 1    method for(iterable : Iterable)do(block : Block) {
>     ...
> Is there even a "Block" type? The type of
> 1    { v : Number -> print(v) }
> seems like it must be roughly { apply(_ : Number) -> Done }. What about
> 2    { print "OK" }
> ? It must be { apply() -> Done }, while
> 3    { a : Number, b : Number -> print(a + b) }
> must be { apply(_ : Number, _ : Number) -> Done }. The common
> supertype of these types is {}, the top type, because the methods have
> different parameter counts. There seems to be no common "Block" type
> possible. Instead, there needs to be many "Block0", "Block1" types
> defined, or possibly variadic generics. In either case, these will all
> be totally different types.

I've been running into this problem as I try to write statically typed code.  Hence I actually have types Block0<R>, Block1<T,R>, Block2<T1,T2,R>, ...  In these the Ti are the types of the parameters to the block, while R is its return type.  I do consider each of these to be different types whose common supertype is Object.

I consider each occurrence of Block in the source code to be an abbreviation for one of Block0, Block1, ...  Occurrences of Block should be translated to one of those (internally) at parse time.  Programmers can either fill in the type parameters where necessary or try type inference.

Fortuitously, novices will only be using blocks as parameters to predefined methods like for, while, etc.  They will only be naming them or even declaring them as formal parameters in a second or later course.  As a result, they will virtually never need to worry about this issue.  However, I would like them to be informed when they have the wrong number of parameters!  (See below)
> 
> When I write
> 1    var sum := 0
> 2    for (1..5) { i -> sum := sum + i }
> I am clearly concerned with the actual values I am iterating over.
> What about if I simply want to repeat something?
> 1    for (1..5) { print "****************" }
> Firstly, is this type-correct? It seems that it cannot be, when there
> is no overarching Block type. Secondly - ignoring types - is it even
> semantically valid at runtime? Can I provide a block with excess
> arguments? People have been habitually writing code with both forms
> (it was even put in the specification), but it doesn't seem like they
> can both be well-typed.

This is statically incorrect, in my opinion, and should generate a warning or error in any typing regime -- static or dynamic.
> 
> Type inference of the return type of the block seems to have many of
> the same conceptual issues as other inference, but I don't think it
> affects runtime behaviour that isn't either already doing type
> matching or a gradual check. There doesn't seem to be any way to
> specify the type explicitly.

As I suggested above, you can specify it, but it can be painful, and thus hopefully can be inferred.
> 
> What can we take from all of this? Firstly, that the story saying that
> the typechecker is always there but is avoided by omitting type
> annotations is unsustainable. It is not possible to write entirely
> dynamically-typed code simply by leaving off type annotations, and a
> programmer expecting to do so will face unexpected type errors they
> are not prepared for. Tim has been talking about implementing the
> typechecker as a dialect for research purposes; I think that this is
> in fact the only way forward, in which the typechecker is an
> additional layer that may be bolted on for programs desiring static
> typing.
> 
> Secondly, that given all of this, anywhere that type inference is
> required is a significant problem. With the typechecker necessarily
> optional, it isn't possible for it to modify the behaviour of the
> program, but it would need to do so to allow inference to occur.
> Inferring out of blocks is one thing, but inferring generic parameters
> is simply not possible. Variants of the rule where inference only
> occurs for statically-typed code cause spooky action at a distance,
> where a new type annotation in one part of the program causes a
> behavioural change somewhere far away, and so aren't satisfactory
> either. So types must always be specified if they are to be used, or
> any inference that is utterly essential must be very narrowly
> circumscribed.

I agree that inference must be narrowly circumscribed and we should only do very local inference.  An interesting question is what type annotations should be silently inferred (e.g., "for" method) and which must be explicitly inserted (perhaps with the help of an inferencer).
> 
> Thirdly, that the run-time checks implied by gradual typing end up in
> an interesting place in this world. They presumably end up
> implementing a specific view of the types, but how that conforms to
> the view taken by the static checker is unclear. There seems to be a
> mode where there are no static checks, but fully-typed code will be
> checked dynamically at every step. No type inference is possible here.

I continue to be concerned about what will happen in practice with gradual typing.  While I'm working now solely on static type checking, once that is in hand, we can start moving toward solving the problems with gradual typing.
> 
> Finally, that exactly how blocks work needs some urgent consideration.
> They are pervasive but underspecified.

Hopefully my suggestions above help.

In the worst case, I'm prepared to fall back to a solution where each compilation unit is labelled to be either statically or dynamically checked, but that units of different flavors can be used together.  I hope that won't be necessary, but believe it would help with some of these problems.

> -Michael
> _______________________________________________
> Grace-core mailing list
> Grace-core at cecs.pdx.edu
> https://mailhost.cecs.pdx.edu/mailman/listinfo/grace-core