Diffstat (limited to 'docs/SYNTAX.md')
-rw-r--r-- docs/SYNTAX.md 277
1 files changed, 251 insertions, 26 deletions
diff --git a/docs/SYNTAX.md b/docs/SYNTAX.md
index 6561acb..4e57b04 100644
--- a/docs/SYNTAX.md
+++ b/docs/SYNTAX.md
@@ -1,6 +1,228 @@
# Syntax: A Casual and Formal Look
-> ! This section is **incomplete**. Proceed with caution.
+## Call Syntax
+
+There is little difference between a function call, a macro call, and an operator call. Such calls can take only a few forms, though notably more than in most other languages (due to, among other things, uniform function call syntax), hence this section.
+
+```
+# The standard, unambiguous call.
+routine(1, 2, 3, 4)
+# The method call syntax equivalent.
+1.routine(2, 3, 4)
+# A block-based call. This is only really useful for macros taking in a body.
+routine
+ 1
+ 2
+ 3
+ 4
+# A parentheses-less call. This is only really useful for `print` and `dbg`.
+# Only valid at the start of a line.
+routine 1, 2, 3, 4
+```
+
+Binary operators have some special rules.
+
+```
+# Valid call syntaxes for binary operators. What can constitute a binary
+# operator is constrained for parsing's sake. Whitespace is optional.
+1 + 2
+1+2
++ 1, 2 # Only valid at the start of a line. Also, don't do this.
++(1, 2)
+```
+
+As do unary operators.
+
+```
+# The standard call for unary operators. Postfix.
+1?
+?(1)
+```
+
+Method call syntax has a number of advantages, notably that it can be *chained*, acting as a natural pipe operator. Redundant parentheses can also be omitted.
+
+```
+# The following statements are equivalent:
+foo.bar.baz
+foo().bar().baz()
+baz(bar(foo))
+baz
+  bar
+    foo
+baz bar(foo)
+baz foo.bar
+```
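
The equivalence between method call syntax and prefix calls can be sketched as a simple fold. The following Python is only an illustration under stated assumptions, not the language's implementation: `desugar_chain` is a hypothetical helper, and expressions are modelled as plain strings.

```python
def desugar_chain(receiver, calls):
    """Fold a method chain into nested prefix calls.

    `calls` is a list of (name, extra_args) pairs, applied left to right.
    Hypothetical helper: expressions are plain strings, not a real AST.
    """
    expr = receiver
    for name, extra_args in calls:
        # The receiver becomes the first argument of the call.
        args = [expr, *extra_args]
        expr = f"{name}({', '.join(args)})"
    return expr


# 1.routine(2, 3, 4) corresponds to routine(1, 2, 3, 4)
print(desugar_chain("1", [("routine", ["2", "3", "4"])]))
# foo.bar.baz corresponds to baz(bar(foo))
print(desugar_chain("foo", [("bar", []), ("baz", [])]))
```

Each step wraps the expression built so far as the first argument of the next call, which is why a chain reads left to right while the nested form reads inside out.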
+
+## Indentation Rules
+
+The tokens `=`, `then`, `do`, `of`, `else`, `block`, `const`, `block X`, and `X` (where `X` is an identifier) are *scope tokens*. They denote a new scope for their associated expressions (functions/macros/declarations, control flow, loops). The tokens `,`, `.` (notably not `...`), and all default binary operators (notably not `not`) are *continuation tokens*. Taken on its own, an expression beginning or ending in one of them would always be a syntax error, so such a line can safely be read as part of a longer expression.
+
+Line breaks are treated as the end of a statement, with several exceptions.
+
+```puck
+pub func foo() =
+ print "Hello, world!"
+ print "This is from a function."
+
+pub func inline_decl() = print "Hello, world!"
+```
+
+Indented lines following a line ending in a *scope token* are treated as belonging to a new scope: they form the body of the expression associated with that scope token.
+
+Indentation is not obligatory after a scope token. However, this necessarily constrains the body of the associated expression to one line: no following lines will be treated as an extension of the body, only of the expression associated with the original scope token. (This may change in the future.)
+
+```puck
+pub func foo(really_long_parameter: ReallyLongType,
+another_really_long_parameter: AnotherReallyLongType) = # no indentation! this is ok
+ print really_long_parameter # this line is indented relative to the first line
+ print another_really_long_parameter
+```
+
+Lines following a line ending in a *continuation token* (or in `not` or `(`) are treated as a continuation of that line and can have any level of indentation, even less than that of the line they continue. If they end in a scope token, however, the following lines must be indented relative to the indentation of the previous line.
+
+```puck
+let really_long_parameter: ReallyLongType = ...
+let another_really_long_parameter: AnotherReallyLongType = ...
+
+really_long_parameter
+ .foo(another_really_long_parameter) # some indentation! this is ok
+```
+
+Lines *beginning* with a continuation token (or with `)`) are likewise treated as a continuation of the previous line and can have any level of indentation. If they end in a scope token, the following lines must be indented relative to the indentation of the previous line.
+
+```puck
+pub func foo() =
+ print "Hello, world!"
+pub func bar() = # this line is no longer in the above scope.
+ print "Another function declaration."
+```
+
+Dedented lines *not* beginning or ending with a continuation token are treated as no longer in the previous scope, returning to the scope of the corresponding indentation level.
+
+```puck
+if cond then this
+else that
+
+match cond
+of this then ...
+of that then ...
+```
+
+A line beginning with a scope token is treated as attached to the previous expression.
+
+```
+# Technically allowed. Please don't do this.
+let foo
+= ...
+
+if cond then if cond then this
+else that
+
+for i
+in iterable
+do ...
+
+match foo of this then ...
+of that then ...
+
+match foo of this
+then ...
+of that then ...
+```
+
+This *can* lead to some ugly possibilities for formatting that are best avoided.
+
+```
+# Much preferred.
+
+let foo =
+ ...
+let foo = ...
+
+if cond then
+  if cond then
+    this
+else that
+if cond then
+ if cond then this
+else that
+
+for i in iterable do
+ ...
+for i in iterable do ...
+
+match foo
+of this then ...
+of that then ...
+```
+
+The indentation rules are complex, but the effect is such that long statements can be broken *almost* anywhere.
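
As a rough illustration of the continuation rules above, the following Python sketch decides whether one line extends the statement on the line before it. It is a simplification under stated assumptions, not the actual parser: the helper names are hypothetical, the binary-operator set is a guess at "all default binary operators", comments are stripped naively, and indentation and the prefix-operator call form (`+ 1, 2`) are ignored.

```python
# Token sets taken from the prose above. BINARY_OPS is an assumption and is
# probably incomplete; `block X` and bare identifiers are not modelled.
SCOPE_WORDS = {"=", "then", "do", "of", "else", "block", "const"}
BINARY_OPS = {"+", "-", "*", "/", "<", ">", "<=", ">=", "==", "!=",
              "and", "or", "xor", "shl", "shr", "div", "mod", "rem", "is"}


def strip_comment(line):
    """Drop a trailing `# ...` comment. Naive: breaks on `#` inside strings."""
    return line.split("#", 1)[0].strip()


def ends_open(line):
    """True if the line ends in a continuation token, `not`, or `(`."""
    text = strip_comment(line)
    if not text:
        return False
    last = text.split()[-1]
    if last in BINARY_OPS or last == "not":
        return True
    if text.endswith((",", "(")):
        return True
    # `.` continues a line, but `...` does not.
    return text.endswith(".") and not text.endswith("...")


def starts_attached(line):
    """True if the line begins with a continuation token, `)`, or a scope token."""
    text = strip_comment(line)
    if not text:
        return False
    first = text.split()[0]
    if first in BINARY_OPS or first in SCOPE_WORDS:
        return True
    if text.startswith((",", ")")):
        return True
    return text.startswith(".") and not text.startswith("...")


def continues_previous(prev_line, line):
    """Decide whether `line` extends the statement on `prev_line`."""
    return ends_open(prev_line) or starts_attached(line)


print(continues_previous("really_long_parameter", "  .foo(other)"))  # True
print(continues_previous('  print "Hello, world!"', "pub func bar() ="))  # False
```

A line ending in a scope token such as `=` is deliberately not treated as "open" here: per the rules above it starts a new scope, which depends on indentation rather than on continuation.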
+
+## Expression Rules
+
+First, a word on the distinction between *expressions* and *statements*. Expressions return a value. Statements do not. That is all.
+
+There are some syntactic constructs unambiguously recognizable as statements: all declarations, modules, and `use` statements. There are no syntactic constructs unambiguously recognizable as expressions. As calls returning `void` are treated as statements, and expressions that return a type could possibly return `void`, the parser makes no explicit distinction between expressions and statements, nor does any stage before type-checking.
+
+Expressions can go almost anywhere. Our indentation rules above allow for it.
+
+```
+# Some different formulations of valid expressions.
+
+if cond then
+ this
+else
+ that
+
+if cond then this
+else that
+
+if cond
+then this
+else that
+
+if cond then this else that
+
+let foo =
+  if cond then
+    this
+  else
+    that
+```
+
+```
+# Some different formulations of *invalid* expressions.
+# These primarily break the rule that everything following a scope token
+# (ex. `=`, `do`, `then`) not at the end of the line must be self-contained.
+
+let foo = if cond then
+ this
+ else
+ that
+
+let foo = if cond then this
+ else that
+
+let foo = if cond then this
+else that
+
+# todo: how to handle this?
+if cond then if cond then that
+else that
+
+# shrimple
+if cond then
+ if cond then that
+else that
+
+# this should be ok
+if cond then this
+else that
+
+match foo of
+this then ...
+of that then ...
+```
## Reserved Keywords
@@ -8,26 +230,25 @@ The following keywords are reserved:
- variables: `let` `var` `const`
- control flow: `if` `then` `elif` `else`
- pattern matching: `match` `of`
-- error handling: `try` `catch` `finally`
+- error handling: `try` `with` `finally`
- loops: `while` `do` `for` `in`
- blocks: `loop` `block` `break` `continue` `return`
- modules: `pub` `mod` `use` `as`
- functions: `func` `varargs`
-- metaprogramming: `macro` `quote` `static` `when`
+- metaprogramming: `macro` `quote` `when`
- ownership: `lent` `mut` `ref` `refc`
-- types: `type` `distinct` `struct` `tuple` `union` `enum` `class`
-- reserved:
- - `impl` `object` `interface` `concept` `auto` `empty` `effect` `case`
- - `suspend` `resume` `spawn` `pool` `thread` `closure`
- - `cyclic` `acyclic` `sink` `move` `destroy` `copy` `trace` `deepcopy`
+- types: `type` `struct` `tuple` `union` `enum` `class`
+
+The following keywords are not reserved, but are liable to become so:
+- `impl` `object` `interface` `concept` `auto` `effect` `case`
+- `suspend` `resume` `spawn` `pool` `thread` `closure` `static`
+- `cyclic` `acyclic` `sink` `move` `destroy` `copy` `trace` `deepcopy`
The following identifiers are in use by the standard prelude:
- logic: `not` `and` `or` `xor` `shl` `shr` `div` `mod` `rem`
- logic: `+` `-` `*` `/` `<` `>` `<=` `>=` `==` `!=` `is`
- async: `async` `await`
-- types: `int` `uint` `float`
- - `i8` `i16` `i32` `i64` `i128`
- - `u8` `u16` `u32` `u64` `u128`
+- types: `int` `uint` `float` `i\d+` `u\d+`
- `f32` `f64` `f128`
- `dec64` `dec128`
- types: `bool` `byte` `char` `str`
@@ -51,7 +272,8 @@ The following punctuation is taken:
- `""` (strings)
- `''` (chars)
- ``` `` ``` (unquoting)
-- unused: `~` `$` `%`
+- unused on qwerty: `~` `%` `^` `$`
+ - perhaps leave `$` unused. but `~`, `%`, and `^` totally could be...
## A Formal Grammar
@@ -99,8 +321,8 @@ PRINT ::= LETTER | DIGIT | OPR |
```
Value ::= Int | Float | String | Char | Array | Tuple | Struct
Array ::= '[' (Expr (',' Expr)*)? ']'
-Tuple ::= '(' (Ident ':')? Expr (',' (Ident ':')? Expr)* ')'
-Struct ::= '{' Ident ':' Expr (',' Ident ':' Expr)* '}'
+Tuple ::= '(' (Ident '=')? Expr (',' (Ident '=')? Expr)* ')'
+Struct ::= '{' Ident '=' Expr (',' Ident '=' Expr)* '}'
```
### Variables
@@ -109,8 +331,8 @@ Decl ::= Let | Var | Const | Func | Type
Let ::= 'let' Pattern (':' Type)? '=' Expr
Var ::= 'var' Pattern (':' Type)? ('=' Expr)?
Const ::= 'pub'? 'const' Pattern (':' Type)? '=' Expr
-Pattern ::= Char | String | Number | Float | Ident | '(' Pattern (',' Pattern)* ')'
- Ident '(' Pattern (',' Pattern)* ')'
+Pattern ::= (Ident ('as' Ident)?) | Char | String | Number | Float |
+ Ident? '(' Pattern (',' Pattern)* ')'
```
### Declarations
@@ -121,20 +343,20 @@ Generics ::= '[' Ident (':' Type)? (',' Ident (':' Type)?)* ']'
Parameters ::= '(' Ident (':' Type)? (',' Ident (':' Type)?)* ')'
```
-All arguments to functions must have a type. This is resolved at the semantic level, however.
-(Arguments to macros may lack types. This signifies a generic node.)
+All arguments to functions must have a type. This is resolved at the semantic level, however. (Arguments to macros may lack types. This signifies a generic node.)
### Types
```
TypeDecl ::= 'pub'? 'type' Ident Generics? '=' Type
-Type ::= TypeStruct | TypeTuple | TypeEnum | TypeUnion | TypeClass |
- (Modifier* (Type | ('[' Type ']')))
+Type ::= TypeStruct | TypeTuple | TypeEnum | TypeUnion | SugarUnion |
+ TypeClass | (Modifier* (Type | ('[' Type ']')))
TypeStruct ::= 'struct' ('[' Ident ':' Type (',' Ident ':' Type)* ']')?
TypeUnion ::= 'union' ('[' Ident ':' Type (',' Ident ':' Type)* ']')?
+SugarUnion ::= '(' Ident ':' Type (',' Ident ':' Type)* ')'
TypeTuple ::= 'tuple' ('[' (Ident ':')? Type (',' (Ident ':')? Type)* ']')?
TypeEnum ::= 'enum' ('[' Ident ('=' Expr)? (',' Ident ('=' Expr)?)* ']')?
TypeClass ::= 'class' ('[' Signature (',' Signature)* ']')?
-Modifier ::= 'distinct' | 'ref' | 'refc' | 'ptr' | 'lent' | 'mut' | 'static'
+Modifier ::= 'ref' | 'refc' | 'ptr' | 'lent' | 'mut' | 'const'
Signature ::= Ident Generics? ('(' Type (',' Type)* ')')? (':' Type)?
```
@@ -150,13 +372,13 @@ While ::= 'while' Expr 'do' Body
For ::= 'for' Pattern 'in' Expr 'do' Body
Loop ::= 'loop' Body
Block ::= 'block' Ident? Body
-Static ::= 'static' Body
+Const ::= 'const' Body
Quote ::= 'quote' QuoteBody
```
## Modules
```
-Mod ::= 'pub'? 'mod' Ident Body
+Mod ::= 'pub'? 'mod' Ident '=' Body
Use ::= 'use' Ident ('.' Ident)* ('.' ('[' Ident (',' Ident)* ']'))?
```
@@ -170,14 +392,17 @@ Opr ::= '=' | '+' | '-' | '*' | '/' | '<' | '>' |
```
## Calls and Expressions
+
+This section is (quite) inaccurate due to complexities with respect to significant indentation. Proceed with caution.
+
```
Call ::= Ident ('[' Call (',' Call)* ']')? ('(' (Ident '=')? Call (',' (Ident '=')? Call)* ')')? |
Ident Call (',' Call)* |
Call Operator Call? |
Call Body
-Expr ::= Let | Var | Const | Func | Type | Mod | Use | Block | Static |
- For | While | Loop | If | When | Try | Match | Call
-Body ::= Expr | (Expr (';' Expr)*)
+Stmt ::= Let | Var | Const | Func | Type | Mod | Use | Expr
+Expr ::= Block | Const | For | While | Loop | If | When | Try | Match | Call
+Body ::= (Stmt ';')* Expr
```
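
The revised `Body ::= (Stmt ';')* Expr` rule says that a body may contain any number of statements but must end in an expression. A minimal Python sketch of that check follows, assuming nodes are represented only by their kind names from the grammar above; `is_valid_body` is a hypothetical helper, and `Const` is accepted in both positions because the grammar lists it as both a declaration and a block.

```python
# Node kinds copied from the Stmt and Expr productions above.
DECL_KINDS = {"Let", "Var", "Const", "Func", "Type", "Mod", "Use"}
EXPR_KINDS = {"Block", "Const", "For", "While", "Loop", "If", "When",
              "Try", "Match", "Call"}
STMT_KINDS = DECL_KINDS | EXPR_KINDS


def is_valid_body(kinds):
    """Check a sequence of node kinds against Body ::= (Stmt ';')* Expr."""
    if not kinds:
        return False  # a body needs at least the final Expr
    *leading, last = kinds
    return all(k in STMT_KINDS for k in leading) and last in EXPR_KINDS


print(is_valid_body(["Let", "Call"]))  # True
print(is_valid_body(["Call", "Let"]))  # False
```

Under this reading, a body consisting of a lone declaration such as `Let` is rejected, since nothing is left to serve as the trailing expression.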
---