[Grace-core] Some comments from Tijs on parsing Grace
Tijs van der Storm
storm at cwi.nl
Mon Jul 25 02:41:34 PDT 2016
Hi all,
Interesting discussion! I'm not much of a types person, but I guess (and
would propose) that in any case runtime execution should never depend on
types. I'm pretty sure you're aiming for this (seeing the discussion above),
but if that is actually the case, it would be good to make this *very*
explicit in the spec.
Regarding the "syntactic vinegar", I wonder why this syntax was chosen in
the first place. Is it really unparseable with single [], or even <>? For
instance, as far as I can see, the only place where [] is used is in
lineups. You could require at least one space before a lineup, and disallow
a space before the [] of an actual generic type argument list. I assume type
args on call sites will always come after the selector name (or the first
keyword component). This would work with prefix ops, as in -[Int] 3, with
infix ops, as in 1 +[Int,Int] 2, and with keyword messages, as in
if[Bool,Block,Block] () then {} else {}.
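To make the whitespace idea concrete, here is a minimal sketch in Python (not Rascal, and not taken from the grammar linked in this message). The function name classify_brackets is hypothetical, and the sketch makes two assumptions of its own: a [ at the very start of the input counts as a lineup, and string literals and comments are ignored.

```python
def classify_brackets(src: str) -> list[tuple[int, str]]:
    """Classify each '[' in src by the character that precedes it.

    A '[' preceded by whitespace is taken to open a lineup; a '[' glued to
    the preceding token is taken to open a generic type argument list.
    """
    result = []
    for i, ch in enumerate(src):
        if ch != "[":
            continue
        # Assumption: start of input is treated like preceding whitespace.
        preceded_by_space = i == 0 or src[i - 1].isspace()
        result.append((i, "lineup" if preceded_by_space else "typeargs"))
    return result
```

Under this rule, a lineup passed directly as an argument would have to be written with a space after the opening parenthesis, which is a consequence of taking the proposal literally.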
James and I also discussed the offside rule. I have an implementation of it
that seems to make sense. Below you'll find an excerpt of my test suite;
perhaps the tests can be added to a/the central Grace test suite?
Basically, these functions use parseCode to parse a CodeSequence containing
one or more statements, and then assert that the number of statements is
some value (isExpr(...) is equivalent to countStats(...) == 1). You can
ignore the apostrophes; that's just Rascal's way of allowing nice
multi-line strings.
To summarize the rules as I've implemented them as post-parse filters:
- Reject a Code statement that contains any syntactic element below it with
the same or less indentation, *except* the closing curly }.
- Reject a Code declaration that contains any syntactic element below it
with the same or less indentation, *except* the closing curly }.
- Reject a CodeSequence where the constituent Code elements are on the same
line, unless the left element ends with a semicolon.
- Reject an Expression which is binaryOtherOp where the lhs is also a
binaryOtherOp but its operator is not the same as the current operator.
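As a rough illustration of these rules, here is a Python sketch. It is a line-based approximation, not the post-parse filtering over parse trees that the Rascal grammar performs. The names count_statements, mixes_other_ops and BUILTIN_OPS are hypothetical, semicolons are not handled, and the set of built-in operators is an assumption. Expressions are modelled as nested tuples (op, lhs, rhs) with strings as leaves.

```python
def count_statements(src: str) -> int:
    """Count statements using indentation only (semicolons are ignored).

    A non-blank line continues the current statement if it is indented more
    than the line that started the statement, or if it begins with a closing
    curly. Any other non-blank line starts a new statement.
    """
    count = 0
    start_indent = None
    for line in src.splitlines():
        stripped = line.lstrip()
        if not stripped:
            continue
        indent = len(line) - len(stripped)
        continues = start_indent is not None and (
            indent > start_indent or stripped.startswith("}")
        )
        if not continues:
            count += 1
            start_indent = indent
    return count


# Assumption: operators outside this set count as "other" operators.
BUILTIN_OPS = {"+", "-", "*", "/"}


def mixes_other_ops(expr) -> bool:
    """Return True if an 'other' operator has a different 'other' operator as its lhs."""
    if not isinstance(expr, tuple):
        return False
    op, lhs, rhs = expr
    if op not in BUILTIN_OPS and isinstance(lhs, tuple):
        lhs_op = lhs[0]
        if lhs_op not in BUILTIN_OPS and lhs_op != op:
            return True
    return mixes_other_ops(lhs) or mixes_other_ops(rhs)
```

For example, count_statements treats "1" followed by an indented "++ 2" as one statement and by an unindented "++ 2" as two, and mixes_other_ops flags the tree for 1 -- 2 ++ 3 but not the one for 1 ++ 2 ++ 3.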
One source of contention was zigzag, which failed in one implementation,
but was ok in another (@James, please correct me if I'm wrong).
You can find Rascal's grace grammar here:
https://github.com/cwi-swat/grace-grammar/blob/offside2/src/DynGrace.rsc
Disambiguation starts at line 273. It is ugly, verbose and (too) slow for
now; we'll just have to wait until our data-dependent parsing stuff gets
integrated into Rascal :-D
I've also attached a screenshot showing how nested methods, and defs/vars
in traits are treated as static errors, rather than semantic errors.
Cheers!
Tijs
test bool keywordMessage0()
= isExpr(parseCode("if (x) then {y}
' else {z}"));
test bool keywordMessage01()
= isExpr(parseCode("if (x) then {y} else
' {z}"));
test bool zigzag()
= isExpr(parseCode("if (x)
' then {foo}
' elsif (x)
' then {bar}
' elsif (x)
' then {baz}"));
test bool keywordMessageBla()
= isExpr(parseCode("match 1
' case 2
' do (x)
' case 3
' do (x)"));
test bool keywordMessage1()
= isExpr(parseCode("if (x) then {y} else {z}"));
test bool keywordMessage2()
= isExpr(parseCode("if (x)
' then {y}
' else {z}"));
test bool keywordMessage3()
= isExpr(parseCode("if (x)
' then {y}
' else {z}"));
test bool keywordMessage4()
= countStats(parseCode("if (x) then {y} else {z}")) == 1;
Tree pw() = parseCode("if (x)
'then {y}
'else {z}");
test bool keywordMessage5()
= countStats(parseCode("if (x)
'then {y}
'else {z}")) == 3;
test bool keywordMessage6()
= countStats(parseCode("if (x)
' then {y}
'else {z}")) == 2;
test bool keywordMessage7()
= isExpr(parseCode("if (x) then {
' y
'} else {
' z
'}"));
test bool keywordMessage7NoCurlies()
= isExpr(parseCode("if (x) then
' 3
' else
' 4"));
test bool semicolonSingleLine()
= countStats(parseCode("a; b")) == 2;
test bool semicolonMultiLine()
= countStats(parseCode("a;\n b")) == 2;
test bool semicolonMultiLineStat()
= countStats(parseCode("if (x) then
' 3
' else
' 4; x")) == 2;
test bool curlyOpenOnNextLine()
= countStats(parseCode("foo\n {x}")) == 1;
test bool otherOp1()
= isExpr(parseCode("1 ++ 2"));
test bool otherOp2()
= isExpr(parseCode("1
' ++ 2"));
test bool otherOp3()
= countStats(parseCode("1
'++ 2")) == 2;
test bool otherOp3b() // renamed: otherOp3 was declared twice
= countStats(parseCode("1 ++
' 2")) == 1;
test bool otherOp4()
= countStats(parseCode("1 ++
' 2 ++
' 3")) == 1;
test bool otherOp5()
= countStats(parseCode("1 ++
' 2
'++ 3")) == 2;
test bool otherOp6()
= countStats(parseCode("1
' ++ 2
' ++ 3")) == 1;
test bool plusOp1()
= isExpr(parseCode("1 + 2"));
test bool plusOp2()
= isExpr(parseCode("1
' + 2"));
// is there no builtin unary plus?
//test bool plusOp3()
// = countStats(parseCode("1
// '+ 2")) == 2;
test bool plusOp3()
= countStats(parseCode("1 +
' 2")) == 1;
test bool otherOpPrecedence1()
= isLeftAssoc(parseExp("1 ++ 2 ++ 3"));
test bool otherOpPrecedence2()
= expectParseError(() { parseExp("1 -- 2 ++ 3"); });
test bool otherOpParens()
= expectParseError(() {
parseCode("(x
' ++ y)");
});
// should succeed
test bool someVar1()
= countStats(parseCode("var x := y
' ++ z")) == 1;
test bool someVar2()
= countStats(parseCode("var x := y
' ++ z")) == 1;
test bool prefixOp1()
= isExpr(parseCode("++
' 3"));
test bool prefixOp2()
= isExpr(parseCode("++
' 3"));
test bool prefixOp3()
= countStats(parseCode("a
'++
' 3")) == 2;
On Mon, Jul 25, 2016 at 10:53 AM James Noble <kjx at ecs.vuw.ac.nz> wrote:
> > It’s simple. If you want explicit type parameters, you put them in,
> > using the ordinary parameter syntax and specifying that their type is
> > Type. If you don’t want explicit type parameters, you do what is shown
> > in the pdf extract.
>
> ahh OK, yep. I did remember you telling me that, the excerpt from the
> manual didn't cover it that's all
>
> > In Emerald, the number of arguments must always be the same as the
> > number of parameters. There are no defaults.
>
> other than these "implicits". But Emerald wasn't "gradually typed".
>
>
> On reflection - I wonder if I was (somewhat) confusing things in my last
> comments. There is a (big) difference between a dialect inferring types or
> type parameters and using them (locally, statically) to check things, and
> the dialect (if it could) or core compiler (which can) using inference to
> (globally, dynamically) populate the reified generic parameters.
>
> The catch is that if these two things are different we're back to Java
> generic erasure, basically.
>
> This is tricky, and I don't think we've thought through it enough.
> I fear no-one has. We should ask Jeremy & Ron again.
>
> james
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2016-07-25 at 11.28.14.pdf
Type: application/pdf
Size: 132503 bytes
Desc: not available
URL: <http://mailhost.cecs.pdx.edu/pipermail/grace-core/attachments/20160725/8cf71918/attachment-0001.pdf>