Syntax: A Casual and Formal Look
--+! This section is incomplete. Proceed with caution.
-
Call Syntax
+Function, macro, and operator calls differ little from one another. Such calls can take only a few forms, though notably more than in most other languages (due to, among other things, uniform function call syntax): hence this section.
+# The standard, unambiguous call.
+routine(1, 2, 3, 4)
+# The method call syntax equivalent.
+1.routine(2, 3, 4)
+# A block-based call. This is only really useful for macros taking in a body.
+routine
+ 1
+ 2
+ 3
+ 4
+# A parentheses-less call. This is only really useful for `print` and `dbg`.
+# Only valid at the start of a line.
+routine 1, 2, 3, 4
+
+Binary operators have some special rules.
+# Valid call syntaxes for binary operators. What can constitute a binary
+# operator is constrained for parsing's sake. Whitespace is optional.
+1 + 2
+1+2
++ 1, 2 # Only valid at the start of a line. Also, don't do this.
++(1, 2)
+
+As do unary operators.
+# The standard call for unary operators. Postfix.
+1?
+?(1)
+
+Method call syntax has a number of advantages, notably that it can be chained, acting as a natural pipe operator. Redundant parentheses can also be omitted.
+# The following statements are equivalent:
+foo.bar.baz
+foo().bar().baz()
+baz(bar(foo))
+baz
+  bar
+    foo
+baz bar(foo)
+baz foo.bar
+
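The equivalence between chained method calls and nested calls can be sketched outside of Puck. The following Python snippet is only an illustration, not part of any Puck implementation: the helper name `desugar_chain` is hypothetical, and it handles only bare identifier chains with optional empty parentheses.

```python
def desugar_chain(chain: str) -> str:
    """Rewrite `a.b.c` (or `a().b().c()`) as the nested call `c(b(a))`.

    Hypothetical helper: only bare identifiers with optional empty
    parentheses are supported, no arguments and no whitespace.
    """
    parts = []
    for part in chain.split("."):
        # Redundant parentheses carry no meaning in a chain, so drop them.
        parts.append(part[:-2] if part.endswith("()") else part)
    expr = parts[0]
    for name in parts[1:]:
        # Each later link takes everything before it as its first argument.
        expr = f"{name}({expr})"
    return expr


print(desugar_chain("foo.bar.baz"))        # baz(bar(foo))
print(desugar_chain("foo().bar().baz()"))  # baz(bar(foo))
```

Each link in the chain receives the preceding expression as its first argument, which is why chaining reads like a pipe.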
+Indentation Rules
+The tokens `=`, `then`, `do`, `of`, `else`, `block`, `const`, `block X`, and `X` (where `X` is an identifier) are scope tokens. They denote a new scope for their associated expressions (functions/macros/declarations, control flow, loops). The tokens `,`, `.` (notably not `...`), and all default binary operators (notably not `not`) are continuation tokens. An expression beginning or ending in one of them would always be a syntactic error.
+Line breaks are treated as the end of a statement, with several exceptions.
+pub func foo() =
+ print "Hello, world!"
+ print "This is from a function."
+
+pub func inline_decl() = print "Hello, world!"
+
+Indented lines following a line ending in a scope token are treated as belonging to a new scope. That is, indented lines following a line ending in a scope token form the body of the expression associated with the scope token.
+Indentation is not obligatory after a scope token. However, this necessarily constrains the body of the associated expression to one line: no lines following will be treated as an extension of the body, only the expression associated with the original scope token. (This may change in the future.)
+pub func foo(really_long_parameter: ReallyLongType,
+another_really_long_parameter: AnotherReallyLongType) = # no indentation! this is ok
+ print really_long_parameter # this line is indented relative to the first line
+ print another_really_long_parameter
+
+Lines following a line ending in a continuation token (and, additionally, `not` and `(`) are treated as a continuation of that line and can have any level of indentation (even negative). If they end in a scope token, however, the following lines must be indented relative to the indentation of the previous line.
+let really_long_parameter: ReallyLongType = ...
+let another_really_long_parameter: AnotherReallyLongType = ...
+
+really_long_parameter
+ .foo(another_really_long_parameter) # some indentation! this is ok
+
+Lines beginning in a continuation token (and, additionally, `)`), too, are treated as a continuation of the previous line and can have any level of indentation. If they end in a scope token, the following lines must be indented relative to the indentation of the previous line.
+pub func foo() =
+ print "Hello, world!"
+pub func bar() = # this line is no longer in the above scope.
+ print "Another function declaration."
+
+Dedented lines not beginning or ending with a continuation token are treated as no longer in the previous scope, returning to the scope of the corresponding indentation level.
+if cond then this
+else that
+
+match cond
+of this then ...
+of that then ...
+
+A line beginning with a scope token is treated as attached to the previous expression.
+# Technically allowed. Please don't do this.
+let foo
+= ...
+
+if cond then if cond then this
+else that
+
+for i
+in iterable
+do ...
+
+match foo of this then ...
+of that then ...
+
+match foo of this
+then ...
+of that then ...
+
+This can lead to some ugly possibilities for formatting that are best avoided.
+# Much preferred.
+
+let foo =
+ ...
+let foo = ...
+
+if cond then
+  if cond then
+    this
+else that
+if cond then
+  if cond then this
+else that
+
+for i in iterable do
+ ...
+for i in iterable do ...
+
+match foo
+of this then ...
+of that then ...
+
+The indentation rules are complex, but the effect is such that long statements can be broken almost anywhere.
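The continuation rules above can be approximated as a pre-pass that joins physical lines into logical ones. The Python sketch below is a simplification under stated assumptions and not Puck's actual lexer: the operator sets are guessed from the prelude list, comments and string literals are not handled, and scope tokens and indentation are ignored entirely. All function names are hypothetical.

```python
# Assumed continuation tokens: the prose names `,`, `.`, and "all default
# binary operators"; the concrete operator list here is a guess.
SYMBOLS = (",", ".", "+", "-", "*", "/", "<", ">", "<=", ">=", "==", "!=")
WORDS = {"and", "or", "xor", "shl", "shr", "div", "mod", "rem", "is"}


def ends_open(line):
    """True if the line ends in a continuation token, `not`, or `(`."""
    line = line.rstrip()
    if not line or line.endswith("..."):  # `...` is notably not `.`
        return False
    last_word = line.split()[-1]
    return (last_word in WORDS or last_word == "not"
            or line.endswith(SYMBOLS + ("(",)))


def begins_continuation(line):
    """True if the line begins with a continuation token or `)`."""
    line = line.lstrip()
    if not line or line.startswith("..."):
        return False
    return line.split()[0] in WORDS or line.startswith(SYMBOLS + (")",))


def join_logical_lines(lines):
    """Merge physical lines into logical lines, ignoring indentation."""
    logical = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if logical and (ends_open(logical[-1]) or begins_continuation(line)):
            # No space around `.`, `)` and after `(` so the result reads naturally.
            tight = line.startswith((".", ")")) or logical[-1].endswith("(")
            logical[-1] += ("" if tight else " ") + line
        else:
            logical.append(line)
    return logical


source = [
    "really_long_parameter",
    "  .foo(another_really_long_parameter)",
    "print 1 +",
    "2",
    "let x = ...",
    "print x",
]
for logical_line in join_logical_lines(source):
    print(logical_line)
```

Because indentation is discarded for continued lines, a long statement can be broken at any continuation token without affecting how it is grouped.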
+Expression Rules
+First, a word on the distinction between expressions and statements. Expressions return a value. Statements do not. That is all.
+There are some syntactic constructs unambiguously recognizable as statements: all declarations, modules, and `use` statements. There are no syntactic constructs unambiguously recognizable as expressions. As calls returning `void` are treated as statements, and expressions that return a type could possibly return `void`, there is no explicit distinction between expressions and statements made in the parser: or anywhere before type-checking.
+Expressions can go almost anywhere. Our indentation rules above allow for it.
+# Some different formulations of valid expressions.
+
+if cond then
+ this
+else
+ that
+
+if cond then this
+else that
+
+if cond
+then this
+else that
+
+if cond then this else that
+
+let foo =
+  if cond then
+    this
+  else
+    that
+
+# Some different formulations of *invalid* expressions.
+# These primarily break the rule that everything following a scope token
+# (ex. `=`, `do`, `then`) not at the end of the line must be self-contained.
+
+let foo = if cond then
+ this
+ else
+ that
+
+let foo = if cond then this
+ else that
+
+let foo = if cond then this
+else that
+
+# todo: how to handle this?
+if cond then if cond then that
+else that
+
+# shrimple
+if cond then
+ if cond then that
+else that
+
+# this should be ok
+if cond then this
+else that
+
+match foo of
+this then ...
+of that then ...
+
Reserved Keywords
The following keywords are reserved:
- variables: `let` `var` `const`
- - control flow: `if` `elif` `else`
+ - control flow: `if` `then` `elif` `else`
- pattern matching: `match` `of`
- - loops: `loop` `while` `for` `in`
- - blocks: `block` `break` `continue` `return`
- - functions: `func` `mut` `static` `varargs`
+ - error handling: `try` `with` `finally`
+ - loops: `while` `do` `for` `in`
+ - blocks: `loop` `block` `break` `continue` `return`
- modules: `pub` `mod` `use` `as`
- - error handling: `try` `catch` `finally`
+ - functions: `func` `varargs`
- metaprogramming: `macro` `quote` `when`
- - types: `type` `distinct` `ref`
- - types: `struct` `tuple` `union` `enum` `interface`
- - reserved:
+ - ownership: `lent` `mut` `ref` `refc`
+ - types: `type` `struct` `tuple` `union` `enum` `class`
+
The following keywords are not reserved, but liable to become so.
- - `impl` `object` `class` `concept` `auto` `empty` `effect` `case`
- - `suspend` `resume` `spawn` `pool` `thread` `closure`
+ - `impl` `object` `interface` `concept` `auto` `effect` `case`
+ - `suspend` `resume` `spawn` `pool` `thread` `closure` `static`
- `cyclic` `acyclic` `sink` `move` `destroy` `copy` `trace` `deepcopy`
The following identifiers are in use by the standard prelude:
- logic: `not` `and` `or` `xor` `shl` `shr` `div` `mod` `rem`
- logic: `+` `-` `*` `/` `<` `>` `<=` `>=` `==` `!=` `is`
- async: `async` `await`
- - types: `int` `uint` `float`
+ - types: `int` `uint` `float` `i\d+` `u\d+`
- - `i8` `i16` `i32` `i64` `i128`
- - `u8` `u16` `u32` `u64` `u128`
- `f32` `f64` `f128`
- `dec64` `dec128`
- `=` (assignment)
- `.` (chaining)
- - `,` (params)
+ - `,` (parameters)
- `;` (statements)
- `:` (types)
- `#` (comment)
+ - `@` (attributes)
- `_` (unused bindings)
- `|` (generics)
- `\` (string/char escaping)
- - `()` (params, tuples)
- - `{}` (scope, structs)
+ - `()` (parameters, tuples)
- `[]` (generics, lists)
- - `""` (strings)
+ - `{}` (scope, structs)
+ - `""` (strings)
- `''` (chars)
- `` (unquoting)
- - unused: `~` `@` `$` `%`
+ - unused on qwerty: `~` `%` `^` `$`
+   - perhaps leave `$` unused. but `~`, `%`, and `^` totally could be...
A Formal Grammar
-We now shall take a look at a more formal description of Puck's syntax.
+We now shall take a look at a more formal description of Puck's syntax.
Syntax rules are described in extended Backus–Naur form (EBNF). However, most rules surrounding whitespace, scope, and line breaks are modified to reflect how they would appear after a lexing step.
Identifiers
Ident ::= (Letter | '_') (Letter | Digit | '_')*
@@ -258,81 +435,84 @@ HexDigit ::= Digit | 'A'..'F' | 'a'..'f'
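As a rough illustration, the `Ident` rule maps directly onto a regular expression. The Python sketch below assumes that `Letter` means ASCII letters and `Digit` means `0`-`9`; neither definition appears in this excerpt, so treat both as assumptions. The name `is_ident` is hypothetical.

```python
import re

# Ident ::= (Letter | '_') (Letter | Digit | '_')*
# Assumption: Letter is A-Z / a-z and Digit is 0-9.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_ident(text: str) -> bool:
    """Return True if the whole string matches the Ident rule."""
    return IDENT.fullmatch(text) is not None


print(is_ident("_foo1"))  # True
print(is_ident("1foo"))   # False
```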
CHAR ::= '\'' (PRINT - '\'' | '\\\'')* '\''
STRING ::= SINGLE_LINE_STRING | MULTI_LINE_STRING
COMMENT ::= SINGLE_LINE_COMMENT | MULTI_LINE_COMMENT | EXPRESSION_COMMENT
-SINGLE_LINE_STRING ::= '"' (PRINT - '"' | '\\"')* '"'
-MULTI_LINE_STRING ::= '"""' (PRINT | '\n' | '\r')* '"""'
+SINGLE_LINE_STRING ::= '"' (PRINT - '"' | '\\"')* '"'
+MULTI_LINE_STRING ::= '"""' (PRINT | '\n' | '\r')* '"""'
SINGLE_LINE_COMMENT ::= '#' PRINT*
MULTI_LINE_COMMENT ::= '#[' (PRINT | '\n' | '\r' | MULTI_LINE_COMMENT)* ']#'
EXPRESSION_COMMENT ::= '#;' SINGLE_STMT
PRINT ::= LETTER | DIGIT | OPR |
- '"' | '#' | "'" | '(' | ')' | # notably the dual of OPR
+ '"' | '#' | "'" | '(' | ')' | # notably the dual of OPR
 ',' | ';' | '[' | ']' | '_' | '`' | '{' | '}' | ' ' | '\t'
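The `MULTI_LINE_COMMENT` rule is recursive, so comments nest. One way to scan them is with a depth counter, sketched below in Python. This is an illustrative assumption about how a lexer might handle the rule, not Puck's implementation; the function name `skip_multiline_comment` is hypothetical, and the scanner greedily treats every `#[` and `]#` inside a comment as a delimiter.

```python
def skip_multiline_comment(src: str, start: int) -> int:
    """Return the index just past the `]#` closing the `#[` at `start`.

    Nested `#[ ... ]#` pairs are tracked with a depth counter.
    Raises ValueError if there is no comment at `start` or it never closes.
    """
    if not src.startswith("#[", start):
        raise ValueError("no multi-line comment at this position")
    depth = 0
    i = start
    while i < len(src):
        if src.startswith("#[", i):
            depth += 1
            i += 2
        elif src.startswith("]#", i):
            depth -= 1
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated multi-line comment")


text = "#[ outer #[ inner ]# still outer ]# code"
end = skip_multiline_comment(text, 0)
print(repr(text[end:]))  # ' code'
```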
Values
Value ::= Int | Float | String | Char | Array | Tuple | Struct
Array ::= '[' (Expr (',' Expr)*)? ']'
-Tuple ::= '(' (Ident ':')? Expr (',' (Ident ':')? Expr)* ')'
-Struct ::= '{' Ident ':' Expr (',' Ident ':' Expr)* '}'
+Tuple ::= '(' (Ident '=')? Expr (',' (Ident '=')? Expr)* ')'
+Struct ::= '{' Ident '=' Expr (',' Ident '=' Expr)* '}'
Variables
Decl ::= Let | Var | Const | Func | Type
-Let ::= 'let' Pattern Annotation? '=' Expr
-Var ::= 'var' Pattern Annotation? ('=' Expr)?
-Const ::= 'pub'? 'const' Pattern Annotation? '=' Expr
-Pattern ::= Char | String | Number | Float | Ident | '(' Pattern (',' Pattern)* ')'
- Ident '(' Pattern (',' Pattern)* ')'
+Let ::= 'let' Pattern (':' Type)? '=' Expr
+Var ::= 'var' Pattern (':' Type)? ('=' Expr)?
+Const ::= 'pub'? 'const' Pattern (':' Type)? '=' Expr
+Pattern ::= (Ident ('as' Ident)?) | Char | String | Number | Float |
+ Ident? '(' Pattern (',' Pattern)* ')'
Declarations
-Func ::= 'pub'? 'func' Ident Generics? Parameters? Annotation? '=' Body
-Macro ::= 'pub'? 'macro' Ident Generics? Parameters? Annotation? '=' Body
-Generics ::= '[' Ident Annotation? (',' Ident Annotation?)* ']'
-Parameters ::= '(' Ident Annotation? (',' Ident Annotation?)* ')'
-Annotation ::= ':' Type
+
+Func ::= 'pub'? 'func' Ident Generics? Parameters? (':' Type)? '=' Body
+Macro ::= 'pub'? 'macro' Ident Generics? Parameters? (':' Type)? '=' Body
+Generics ::= '[' Ident (':' Type)? (',' Ident (':' Type)?)* ']'
+Parameters ::= '(' Ident (':' Type)? (',' Ident (':' Type)?)* ')'
All arguments to functions must have a type. This is resolved at the semantic level, however. (Arguments to macros may lack types. This signifies a generic node.)
Types
TypeDecl ::= 'pub'? 'type' Ident Generics? '=' Type
-Type ::= StructType | TupleType | EnumType | UnionType | Interface |
- (('distinct' | 'ref' | 'ptr' | 'mut' | 'static') (Type | ('[' Type ']'))?)
-StructType ::= 'struct' ('[' Ident ':' Type (',' Ident ':' Type)* ']')?
-UnionType ::= 'union' ('[' Ident ':' Type (',' Ident ':' Type)* ']')?
-TupleType ::= 'tuple' ('[' (Ident ':')? Type (',' (Ident ':')? Type)* ']')?
-EnumType ::= 'enum' ('[' Ident ('=' Expr)? (',' Ident ('=' Expr)?)* ']')?
-Interface ::= 'interface' ('[' Signature (',' Signature)* ']')?
-Signature ::= Ident Generics? ('(' Type (',' Type)* ')')? Annotation?
+Type ::= TypeStruct | TypeTuple | TypeEnum | TypeUnion | SugarUnion |
+ TypeClass | (Modifier* (Type | ('[' Type ']')))
+TypeStruct ::= 'struct' ('[' Ident ':' Type (',' Ident ':' Type)* ']')?
+TypeUnion ::= 'union' ('[' Ident ':' Type (',' Ident ':' Type)* ']')?
+SugarUnion ::= '(' Ident ':' Type (',' Ident ':' Type)* ')'
+TypeTuple ::= 'tuple' ('[' (Ident ':')? Type (',' (Ident ':')? Type)* ']')?
+TypeEnum ::= 'enum' ('[' Ident ('=' Expr)? (',' Ident ('=' Expr)?)* ']')?
+TypeClass ::= 'class' ('[' Signature (',' Signature)* ']')?
+Modifier ::= 'ref' | 'refc' | 'ptr' | 'lent' | 'mut' | 'const'
+Signature ::= Ident Generics? ('(' Type (',' Type)* ')')? (':' Type)?
Control Flow
-If ::= 'if' Expr ':' Body ('elif' Expr ':' Body)* ('else' ':' Body)?
-When ::= 'when' Expr ':' Body ('elif' Expr ':' Body)* ('else' ':' Body)?
-Try ::= 'try' ':' Body
- ('except' Ident ('as' Ident)? (',' Ident ('as' Ident)?)*) ':' Body)*
- ('finally' ':' Body)?
-Match ::= 'match' Expr ('of' Pattern (',' Pattern)* ('where' Expr)? ':' Body)+
-Block ::= 'block' Ident? ':' Body
-Block ::= 'static' ':' Body
-Loop ::= 'loop' ':' Body
-While ::= 'while' Expr ':' Body
-For ::= 'for' Pattern 'in' Expr Body
+If ::= 'if' Expr 'then' Body ('elif' Expr 'then' Body)* ('else' Body)?
+When ::= 'when' Expr 'then' Body ('elif' Expr 'then' Body)* ('else' Body)?
+Try ::= 'try' Body
+ ('except' Ident ('as' Ident)? (',' Ident ('as' Ident)?)* 'then' Body)+
+ ('finally' Body)?
+Match ::= 'match' Expr ('of' Pattern (',' Pattern)* ('where' Expr)? 'then' Body)+
+While ::= 'while' Expr 'do' Body
+For ::= 'for' Pattern 'in' Expr 'do' Body
+Loop ::= 'loop' Body
+Block ::= 'block' Ident? Body
+Const ::= 'const' Body
+Quote ::= 'quote' QuoteBody
Modules
-Mod ::= 'pub'? 'mod' Ident ':' Body
-Use ::= 'use' Ident ('/' Ident)* ('/' ('[' Ident (',' Ident)* ']'))?
+Mod ::= 'pub'? 'mod' Ident '=' Body
+Use ::= 'use' Ident ('.' Ident)* ('.' ('[' Ident (',' Ident)* ']'))?
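To make the updated `Use` rule concrete, here is a small Python sketch of a matcher for it. It is an illustration under assumptions rather than a description of Puck's parser: identifiers are taken to be ASCII, whitespace handling is guessed, and the module names in the usage lines (`std`, `io`, `ascii`, `unicode`) are hypothetical examples.

```python
import re

# Use ::= 'use' Ident ('.' Ident)* ('.' ('[' Ident (',' Ident)* ']'))?
# Assumption: ASCII identifiers; optional whitespace inside the brackets.
_IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
_USE = re.compile(
    rf"use\s+({_IDENT}(?:\.{_IDENT})*)"
    rf"(?:\.\[\s*({_IDENT}(?:\s*,\s*{_IDENT})*)\s*\])?"
)


def parse_use(stmt: str):
    """Split a `use` statement into (module path, imported names)."""
    match = _USE.fullmatch(stmt.strip())
    if match is None:
        raise ValueError(f"not a valid use statement: {stmt!r}")
    path = match.group(1).split(".")
    names = []
    if match.group(2):
        names = [name.strip() for name in match.group(2).split(",")]
    return path, names


print(parse_use("use std.io"))                # (['std', 'io'], [])
print(parse_use("use std.[ascii, unicode]"))  # (['std'], ['ascii', 'unicode'])
```

Note that the path separator is `.` in the new rule, replacing the `/` of the removed one.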
Operators
Operator ::= 'and' | 'or' | 'not' | 'xor' | 'shl' | 'shr' |
- 'div' | 'mod' | 'rem' | 'is' | 'in' |
- Opr+
+ 'div' | 'mod' | 'rem' | 'is' | 'in' | Opr+
Opr ::= '=' | '+' | '-' | '*' | '/' | '<' | '>' | '@' | '$' | '~' | '&' | '%' | '|' | '!' | '?' | '^' | '.' | ':' | '\\'
Calls and Expressions
+This section is (quite) inaccurate due to complexities with respect to significant indentation. Proceed with caution.
Call ::= Ident ('[' Call (',' Call)* ']')? ('(' (Ident '=')? Call (',' (Ident '=')? Call)* ')')? |
 Ident Call (',' Call)* |
 Call Operator Call? |
- Call ':' Body
-Expr ::= Let | Var | Const | Func | Type | Mod | Use | Block | Static |
- For | While | Loop | If | When | Try | Match | Call
-Body ::= Expr | ('{' Expr (';' Expr)* '}')
+ Call Body
+Stmt ::= Let | Var | Const | Func | Type | Mod | Use | Expr
+Expr ::= Block | Const | For | While | Loop | If | When | Try | Match | Call
+Body ::= (Stmt ';')* Expr
References:
@@ -372,22 +552,6 @@ Body ::= Expr | ('{' Expr (';' Expr)* '}')