# Syntax: A Casual and Formal Look

> ! This section is **incomplete**. Proceed with caution.

## Reserved Keywords

The following keywords are reserved:
- variables: `let` `var` `const`
- control flow: `if` `elif` `else`
- pattern matching: `match` `of`
- loops: `loop` `while` `for` `in`
- blocks: `block` `break` `continue` `return`
- functions: `func` `mut` `static` `varargs`
- modules: `pub` `mod` `use` `as`
- error handling: `try` `catch` `finally`
- metaprogramming: `macro` `quote` `when`
- types: `type` `distinct` `ref`
- types: `struct` `tuple` `union` `enum` `interface`
- reserved:
  - `impl` `object` `class` `concept` `auto` `empty` `effect` `case`
  - `suspend` `resume` `spawn` `pool` `thread` `closure`
  - `cyclic` `acyclic` `sink` `move` `destroy` `copy` `trace` `deepcopy`
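
As a rough illustration of how a lexer might use this list, the sketch below copies the keywords above into a Python set and checks candidate words against it. The name `is_reserved` is hypothetical and is not part of Puck or its tooling.

```python
# Reserved keywords, copied verbatim from the list above.
RESERVED = frozenset("""
let var const if elif else match of loop while for in
block break continue return func mut static varargs
pub mod use as try catch finally macro quote when
type distinct ref struct tuple union enum interface
impl object class concept auto empty effect case
suspend resume spawn pool thread closure
cyclic acyclic sink move destroy copy trace deepcopy
""".split())


def is_reserved(word: str) -> bool:
    """Return True if `word` appears in the reserved keyword list."""
    return word in RESERVED
```

Prelude identifiers such as `int` or `and` are deliberately not in this set: the document lists them as names in use by the standard prelude, not as reserved keywords.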

The following identifiers are in use by the standard prelude:
- logic: `not` `and` `or` `xor` `shl` `shr` `div` `mod` `rem`
- logic: `+` `-` `*` `/` `<` `>` `<=` `>=` `==` `!=` `is`
- async: `async` `await`
- types: `int` `uint` `float`
  - `i8` `i16` `i32` `i64` `i128`
  - `u8` `u16` `u32` `u64` `u128`
  - `f32` `f64` `f128`
  - `dec64` `dec128`
- types: `bool` `byte` `char` `str`
- types: `void` `never`
- strings: `&` (string append)

The following punctuation is taken:
- `=` (assignment)
- `.` (chaining)
- `,` (params)
- `;` (statements)
- `:` (types)
- `#` (comment)
- `_` (unused bindings)
- `|` (generics)
- `\` (string/char escaping)
- `()` (params, tuples)
- `{}` (scope, structs)
- `[]` (generics, lists)
- `""` (strings)
- `''` (chars)
- ``` `` ``` (unquoting)
- unused: `~` `@` `$` `%`

## A Formal Grammar

We now take a look at a more formal description of Puck's syntax.

Syntax rules are described in [extended Backus–Naur form](https://en.wikipedia.org/wiki/Extended_Backus–Naur_form) (EBNF). However, most rules concerning whitespace, scope, and line breaks are written as they would appear after a lexing step.

### Identifiers
```
Ident  ::= (Letter | '_') (Letter | Digit | '_')*
Letter ::= 'A'..'Z' | 'a'..'z' | '\x80'..'\xff' # todo
Digit  ::= '0'..'9'
```
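
The `Ident` rule can be transcribed into a regular expression almost directly. The following is a minimal sketch, assuming source text is UTF-8 and that the `'\x80'..'\xff'` range (marked as a todo in the grammar) is meant bytewise; `is_ident` is a hypothetical helper name.

```python
import re

# Direct transcription of the Ident rule. The '\x80'..'\xff' range is taken
# literally from the grammar, so this sketch matches on bytes, not on text.
IDENT = re.compile(rb"[A-Za-z_\x80-\xff][A-Za-z0-9_\x80-\xff]*")


def is_ident(text: str) -> bool:
    """Return True if the UTF-8 encoding of `text` is exactly one Ident."""
    return IDENT.fullmatch(text.encode("utf-8")) is not None
```

Under this reading any non-ASCII character is accepted inside an identifier, because every byte of its UTF-8 encoding falls in the `\x80`..`\xff` range. This sketch does not check the reserved keyword list.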

### Literals
```
Int ::= '-'? (DecLit | HexLit | OctLit | BinLit)
Float ::= '-'? DecLit '.' DecLit
BinLit ::= '0b' BinDigit ('_'? BinDigit)*
OctLit ::= '0o' OctDigit ('_'? OctDigit)*
HexLit ::= '0x' HexDigit ('_'? HexDigit)*
DecLit ::= Digit ('_'? Digit)*
BinDigit ::= '0'..'1'
OctDigit ::= '0'..'7'
HexDigit ::= Digit | 'A'..'F' | 'a'..'f'
```
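
To make the underscore and prefix rules concrete, here is a small sketch that recognizes `Int` and `Float` literals with regular expressions and converts them to Python numbers. The function name `parse_number` is an assumption made for illustration only.

```python
import re

DEC = r"[0-9](?:_?[0-9])*"
INT = re.compile(
    r"-?(?:0b[01](?:_?[01])*"
    r"|0o[0-7](?:_?[0-7])*"
    r"|0x[0-9A-Fa-f](?:_?[0-9A-Fa-f])*"
    rf"|{DEC})"
)
FLOAT = re.compile(rf"-?{DEC}\.{DEC}")


def parse_number(text: str):
    """Convert a literal matching Int or Float into a Python number."""
    if FLOAT.fullmatch(text):
        return float(text.replace("_", ""))
    if INT.fullmatch(text):
        digits = text.replace("_", "")
        sign = -1 if digits.startswith("-") else 1
        body = digits.lstrip("-")
        # Pick the base from the prefix; anything else is decimal.
        base = {"0b": 2, "0o": 8, "0x": 16}.get(body[:2], 10)
        return sign * int(body if base == 10 else body[2:], base)
    raise ValueError(f"not a numeric literal: {text!r}")
```

As in the grammar, an underscore may only appear between two digits, so `1__0` and `0x_FF` are rejected, and a `Float` needs digits on both sides of the dot.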

### Chars, Strings, and Comments
```
Char    ::= '\'' (Print - '\'' | '\\\'')* '\''
String  ::= SingleLineString | MultiLineString
Comment ::= SingleLineComment | MultiLineComment | ExpressionComment
SingleLineString  ::= '"' (Print - '"' | '\\"')* '"'
MultiLineString   ::= '"""' (Print | '\n' | '\r')* '"""'
SingleLineComment ::= '#' Print*
MultiLineComment  ::= '#[' (Print | '\n' | '\r' | MultiLineComment)* ']#'
ExpressionComment ::= '#;' Expr
Print ::= Letter | Digit | Opr |
          '"' | '#' | "'" | '(' | ')' | # notably the dual of Opr
          ',' | ';' | '[' | ']' | '_' |
          '`' | '{' | '}' | ' ' | '\t'
```
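
Because the multi-line comment rule refers to itself, such comments nest, and a lexer has to track depth instead of stopping at the first `]#`. The sketch below shows one way to do that. It assumes every `#[` inside a comment opens a nested comment, which the grammar leaves ambiguous; `skip_block_comment` is a hypothetical name.

```python
def skip_block_comment(src: str, start: int) -> int:
    """Return the index just past the `]#` closing the comment at `start`."""
    if not src.startswith("#[", start):
        raise ValueError("no block comment at this position")
    depth = 0
    i = start
    while i < len(src):
        if src.startswith("#[", i):
            depth += 1          # entering a (possibly nested) comment
            i += 2
        elif src.startswith("]#", i):
            depth -= 1          # leaving the innermost open comment
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated block comment")
```

For example, given `#[ a #[ b ]# c ]# x`, the function skips both the outer and the inner comment and returns the position of the text that follows them.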

### Values
```
Value ::= Int | Float | String | Char | Array | Tuple | Struct
Array  ::= '[' (Expr (',' Expr)*)? ']'
Tuple  ::= '(' (Ident ':')? Expr (',' (Ident ':')? Expr)* ')'
Struct ::= '{' Ident ':' Expr (',' Ident ':' Expr)* '}'
```

### Variables
```
Decl  ::= Let | Var | Const | Func | TypeDecl
Let   ::= 'let' Pattern Annotation? '=' Expr
Var   ::= 'var' Pattern Annotation? ('=' Expr)?
Const ::= 'pub'? 'const' Pattern Annotation? '=' Expr
Pattern ::= Char | String | Int | Float | Ident | '(' Pattern (',' Pattern)* ')' |
            Ident '(' Pattern (',' Pattern)* ')'
```

### Declarations
```
Func  ::= 'pub'? 'func' Ident Generics? Parameters? Annotation? '=' Body
Macro ::= 'pub'? 'macro' Ident Generics? Parameters? Annotation? '=' Body
Generics   ::= '[' Ident Annotation? (',' Ident Annotation?)* ']'
Parameters ::= '(' Ident Annotation? (',' Ident Annotation?)* ')'
Annotation ::= ':' Type
```

### Types
```
TypeDecl ::= 'pub'? 'type' Ident Generics? '=' Type
Type ::= StructType | TupleType | EnumType | UnionType | Interface |
         (('distinct' | 'ref' | 'ptr' | 'mut' | 'static') (Type | ('[' Type ']'))?)
StructType ::= 'struct' ('[' Ident ':' Type (',' Ident ':' Type)* ']')?
UnionType  ::= 'union'  ('[' Ident ':' Type (',' Ident ':' Type)* ']')?
TupleType  ::= 'tuple' ('[' (Ident ':')? Type (',' (Ident ':')? Type)* ']')?
EnumType   ::= 'enum'  ('[' Ident ('=' Expr)? (',' Ident ('=' Expr)?)* ']')?
Interface ::= 'interface' ('[' Signature (',' Signature)* ']')?
Signature ::= Ident Generics? ('(' Type (',' Type)* ')')? Annotation?
```

### Control Flow
```
If    ::= 'if' Expr ':' Body ('elif' Expr ':' Body)* ('else' ':' Body)?
When  ::= 'when' Expr ':' Body ('elif' Expr ':' Body)* ('else' ':' Body)?
Try   ::= 'try' ':' Body
          ('except' Ident ('as' Ident)? (',' Ident ('as' Ident)?)* ':' Body)*
          ('finally' ':' Body)?
Match ::= 'match' Expr ('of' Pattern (',' Pattern)* ('where' Expr)? ':' Body)+
Block ::= 'block' Ident? ':' Body
Static ::= 'static' ':' Body
Loop  ::= 'loop' ':' Body
While ::= 'while' Expr ':' Body
For   ::= 'for' Pattern 'in' Expr ':' Body
```

### Modules
```
Mod ::= 'pub'? 'mod' Ident ':' Body
Use ::= 'use' Ident ('/' Ident)* ('/' ('[' Ident (',' Ident)* ']'))?
```
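
As an illustration of the `Use` rule, the following sketch splits a `use` line into its path segments and the optional bracketed name list. It works on raw text and so makes its own assumptions about whitespace, whereas the grammar describes the token stream after lexing. Identifiers are restricted to ASCII for brevity, and `parse_use` and the module names in the usage note are hypothetical.

```python
import re

IDENT = r"[A-Za-z_][A-Za-z0-9_]*"  # ASCII-only simplification of Ident
USE = re.compile(
    rf"use\s+(?P<path>{IDENT}(?:/{IDENT})*)"
    rf"(?:/\[\s*(?P<names>{IDENT}(?:\s*,\s*{IDENT})*)\s*\])?"
)


def parse_use(line: str):
    """Split a `use` line into (path segments, bracketed names)."""
    m = USE.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"not a use statement: {line!r}")
    path = m.group("path").split("/")
    names = m.group("names")
    return path, [n.strip() for n in names.split(",")] if names else []
```

With made-up module names, `parse_use("use std/[ascii, unicode]")` yields the path `["std"]` and the names `["ascii", "unicode"]`, while `parse_use("use std/io")` yields the path `["std", "io"]` and an empty name list.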

### Operators
```
Operator ::= 'and' | 'or' | 'not' | 'xor' | 'shl' | 'shr' |
             'div' | 'mod' | 'rem' | 'is' | 'in' |
             Opr+
Opr ::= '=' | '+' | '-' | '*' | '/' | '<' | '>' |
        '@' | '$' | '~' | '&' | '%' | '|' |
        '!' | '?' | '^' | '.' | ':' | '\\'
```
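
The `Opr+` alternative means that any run of operator characters forms a single operator, so a lexer would typically match such runs greedily. Below is a minimal sketch of that idea, with word operators handled by a separate lookup; `operator_at` is a hypothetical helper. A real lexer would probably also have to treat a lone `=`, `.`, or `:` as punctuation, which this sketch does not attempt.

```python
import re

WORD_OPERATORS = frozenset(
    ["and", "or", "not", "xor", "shl", "shr", "div", "mod", "rem", "is", "in"]
)
# One or more Opr characters, matched greedily (maximal munch).
SYMBOL_OPERATOR = re.compile(r"[=+\-*/<>@$~&%|!?^.:\\]+")
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def operator_at(src: str, pos: int = 0):
    """Return the operator token starting at `pos`, or None if there is none."""
    symbol = SYMBOL_OPERATOR.match(src, pos)
    if symbol:
        return symbol.group()
    word = WORD.match(src, pos)
    if word and word.group() in WORD_OPERATORS:
        return word.group()
    return None
```

Matching a whole word before the lookup keeps identifiers such as `android` from being mistaken for the operator `and`.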

### Calls and Expressions
```
Call ::= Ident ('[' Call (',' Call)* ']')? ('(' (Ident '=')? Call (',' (Ident '=')? Call)* ')')? |
         Ident Call (',' Call)* |
         Call Operator Call? |
         Call ':' Body
Expr ::= Let | Var | Const | Func | TypeDecl | Mod | Use | Block | Static |
         For | While | Loop | If | When | Try | Match | Call
Body ::= Expr | ('{' Expr (';' Expr)* '}')
```

---

References:
- [Statements vs. Expressions](https://www.joshwcomeau.com/javascript/statements-vs-expressions/)
- [Swift's Lexical Structure](https://docs.swift.org/swift-book/ReferenceManual/LexicalStructure.html)
- [The Nim Programming Language](https://nim-lang.github.io/Nim/manual.html)
- [Pietro's Notes on Compilers](https://pgrandinetti.github.io/compilers/)