author | JJ | 2023-12-28 10:18:50 +0000 |
---|---|---|
committer | JJ | 2023-12-28 10:40:26 +0000 |
commit | 774a35ae21dada36af48ae32c862b22587fba107 (patch) | |
tree | 5dfc8153a061192cab76ebaa6c56615dac9e4b09 /docs | |
parent | 0121346e894fb0e7f60312b16986d82109e4d86b (diff) |
docs: sweeping changes. cement an understanding of error handling, async, modules, and unions. rewrite documentation on interfaces. complete the introductory overview. many minor cleanups.
Diffstat (limited to 'docs')
-rw-r--r-- | docs/ASYNC.md | 57 |
-rw-r--r-- | docs/BASIC.md | 139 |
-rw-r--r-- | docs/ERRORS.md | 83 |
-rw-r--r-- | docs/INTEROP.md | 18 |
-rw-r--r-- | docs/METAPROGRAMMING.md | 44 |
-rw-r--r-- | docs/TYPES.md | 211 |
6 files changed, 366 insertions, 186 deletions
diff --git a/docs/ASYNC.md b/docs/ASYNC.md index 5b9fa7e..ec610ca 100644 --- a/docs/ASYNC.md +++ b/docs/ASYNC.md @@ -1,17 +1,50 @@ # Asynchronous Programming -I don't know enough about asynchronous programming to get started with this section. +Puck has [colourless](https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/) async/await, heavily inspired by [Zig's implementation](https://kristoff.it/blog/zig-colorblind-async-await/). -Existing systems to learn from: -- https://github.com/status-im/nim-chronos -- https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/ -- https://tokio.rs/tokio/tutorial -- https://morestina.net/blog/1686/rust-async-is-colored -- https://ziglearn.org/chapter-5/ -- https://kristoff.it/blog/zig-colorblind-async-await/ -- https://en.wikipedia.org/wiki/Async/await -- https://old.reddit.com/r/elixir/np688d/ +```puck +pub func fetch(url: str): str = ... -Asynchronous programming is hard to design and hard to use. Even Rust doesn't do a great job. It *shouldn't* need built-in language support - we should be able to encode it as a type and provide any special syntax via macros. Note that async is not just threading! threading is solved well by Rust's rayon and Go's (blugh) goroutines. +let a: Future[T] = async fetch_html() +let b: T = a.await +let c: T = await async fetch_html() +``` -Is async worth having separate from effects? +Puck's async implementation relies heavily on its metaprogramming system. + +The `async` macro will wrap a call returning `T` in a `Future[T]` and compute it asynchronously. The `await` function takes in a `Future[T]` and will block until it returns a value (or error). The `Future[T]` type is opaque, containing internal information useful for the `async` and `await` routines. + +```puck +pub macro async(self): Future[T] = + ... todo ... +``` + +```puck +pub func await[T](self: Future[T]): T = + while not self.ready: + block + self.value! # apply callbacks? 
+``` + +This implementation differs from standard async/await implementations quite a bit. +In particular, this means there is no concept of an "async function" - any block of computation that resolves to a value can be made asynchronous. This allows for "anonymous" async functions, among other things. + +<!-- Asynchronous programming is hard to design and hard to use. Even Rust doesn't do a great job. It *shouldn't* need built-in language support - we should be able to encode it as a type and provide any special syntax via macros. Note that async is not just threading! threading is solved well by Rust's rayon and Go's (blugh) goroutines. --> + +## Threading + +How threads work deserves somewhat of a mention. todo + +References: +- [What color is your function?](https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/) +- [What is Zig's "colorblind" async/await?](https://kristoff.it/blog/zig-colorblind-async-await/) +- [Zig Learn: Async](https://ziglearn.org/chapter-5/) +- [Rust async is colored and that's not a big deal](https://morestina.net/blog/1686/rust-async-is-colored) +- [Why is there no need for async/await in Elixir?](https://old.reddit.com/r/elixir/np688d/) +- [Async/await on Wikipedia](https://en.wikipedia.org/wiki/Async/await) +- [nim-chronos](https://github.com/status-im/nim-chronos) +- [nim-cps](https://github.com/nim-works/cps) +- [tokio](https://tokio.rs/tokio/tutorial) +- [Zig-style async/await for Nim](https://forum.nim-lang.org/t/7347) + +Is async worth having separate from effect handlers? I think so... diff --git a/docs/BASIC.md b/docs/BASIC.md index 36a2aa3..ee8e68f 100644 --- a/docs/BASIC.md +++ b/docs/BASIC.md @@ -1,6 +1,8 @@ # An Overview of Puck -Puck is an experimental, high-level, memory-safe, statically-typed, whitespace-sensitive, interface-oriented, imperative programming language with functional underpinnings. 
It attempts to explore designs in making functional programming paradigms comfortable to those familiar with imperative and object-oriented languages, as well as deal with some more technical problems along the way, such as integrated refinement types and cross-language interop. Primarily, however, this is the language I keep in my head. It reflects the way I think and reason about code. I do hope others enjoy it. +Puck is an experimental, high-level, memory-safe, statically-typed, whitespace-sensitive, interface-oriented, imperative programming language with functional underpinnings. It attempts to explore designs in making functional programming paradigms comfortable to those familiar with imperative and object-oriented languages, as well as deal with some more technical problems along the way, such as integrated refinement types and cross-language interop. Primarily, however, this is the language I keep in my head. It reflects the way I think and reason about code. + +I do hope others enjoy it. ```puck let ident: int = 413 @@ -9,22 +11,22 @@ var phrase = "Hello, world!" const compile_time = when linux: "linux" else: "windows" ``` -Variables may be mutable (`var`), immutable (`let`), or evaluated at compile-time and immutable (`const`). +Variables may be mutable (`var`), immutable (`let`), or compile-time evaluated and immutable (`const`). Type annotations on variables and other bindings follow the name of the binding (with `: Type`), and are typically optional. -Variables are conventionally written in `camelCase`. Types are conventionally written in `PascalCase`. -The type system is comprehensive, and complex enough to [warrant its own document](TYPES.md). +Variables are conventionally written in `snake_case`. Types are conventionally written in `PascalCase`. +The type system is comprehensive, and complex enough to warrant delaying covering until the end. Comments are declared with `#` and run until the end of the line. 
-Documentation comments may be declared with `##` and will be parsed by language servers and other tooling. +Documentation comments are declared with `##` and may be parsed by language servers and other tooling. Multi-line comments are declared with `#[ ]#` and may be nested. -Taking cues from the Lisp family of languages, a top-level expression may be commented out with `#;` preceding. +Taking cues from the Lisp family of languages, any expression may be commented out with a preceding `#;`. ```puck func reverse(s: str): str = let half_len = s.len div 2 s.get(half_len, s.len)!.reverse() & s.get(half_len, s.len)!.reverse() -pub pure func... +pub func... # May fail! `yeet` denotes functions that can throw pub yeet func pretty_print[T](value: T) = @@ -33,21 +35,20 @@ pub yeet func pretty_print[T](value: T) = print!(value) ``` -Functions are declared with the `func` keyword. They take an (optional) list of generic parameters (in brackets), an (optional) list of parameters (in parentheses), and must be annotated with a return type if they return a type. Every (non-generic) parameter not annotated with a type takes its type from the next parameter. Generic parameters may be each optionally annotated with a type functioning as a *constraint*. +Functions are declared with the `func` keyword. They take an (optional) list of generic parameters (in brackets), an (optional) list of parameters (in parentheses), and **must** be annotated with a return type if they return a value. Every (non-generic) parameter must be annotated with a type. Generic parameters may each be optionally annotated with a type functioning as a _constraint_. -Functions, constants, types, and modules may be optionally prefixed with a `pub` modifier denoting visibility outside the current scope (more specifically: module). Functions may also be prefixed with one or more of the following additional modifiers: -- `pure`: denotes a function as a "pure function", lacking side effects, i.e. 
IO or nondeterminism or parameter mutability -- `yeet`: denotes a function as a "throwing function", that may raise exceptions. -- `async`: marks a function as asynchronous which may only be called by other asynchronous functions or brought to a value with the `await` function -<!-- todo? more? --> +Every function parameter must be explicitly annotated with a type. Their type may also be prefixed with `mut` or `static`: denoting a *mutable* type (types are copied into functions and thus immutable by default), or a *static* type (known to the compiler at compile time, and usable in `const` exprs). -Whitespace is flexible, and functions may be declared entirely on one line if so desired. A new level of indentation after certain tokens (`:`, `=`) denotes a new level of scope. There are some places where arbitrary indentation and line breaks are allowed - as a general rule of thumb, after operators, commas, and opening parentheses. +<!-- Functions, constants, types, and modules may be optionally prefixed with a `pub` modifier denoting visibility outside the current scope / module. More on the module system later. --> -A list of parameters, surrounded by parentheses and separated by commas, may follow the function name. These are optional and a function with no parameters may be followed with `()` or simply nothing at all. More information on function parameters (and return types) is available in the [type system overview](TYPES.md). +Whitespace is significant but flexible: functions may be declared entirely on one line if so desired. A new level of indentation after certain tokens (`:`, `=`) denotes a new level of scope. There are some places where arbitrary indentation and line breaks are allowed - as a general rule of thumb, after operators, commas, and opening parentheses. The particular rules governing indentation may be found in the [syntax guide](SYNTAX.md#indentation-rules). 
+ +```puck +``` Puck supports *uniform function call syntax*: and so any function may be called using the typical syntax for method calls, that is, the first parameter of any function may be appended with a `.` and moved to precede it, in the style of a typical method. (There are no methods in Puck. All functions are statically dispatched. This may change in the future.) -This allows for a number of syntactic cleanups. Arbitrary functions with compatible types may be chained with no need for a special pipe operator. Struct/tuple field access, module field access, and function calls are unified, reducing the need for getters and setters. Given a first type, IDEs using dot-autocomplete can fill in all the functions defined for that type. Programmers from object-oriented languages may find the lack of classes more bearable. UFCS is implemented in shockingly few languages, and so Puck joins the tiny club that previously consisted of just D and Nim. +This allows for a number of syntactic cleanups. Arbitrary functions with compatible types may be chained with no need for a special pipe operator. Object field access, module member access, and function calls are unified, reducing the need for getters and setters. Given a first type, IDEs using dot-autocomplete can fill in all the functions defined for that type. Programmers from object-oriented languages may find the lack of classes more bearable. UFCS is implemented in shockingly few languages, and so Puck joins the tiny club that previously consisted of just D and Nim. ```puck ``` @@ -59,29 +60,40 @@ Boolean logic and integer operations are standard and as one would expect out of - boolean operators are bitwise and also apply to integers and floats - more operators are available via the standard library -The above operations are performed with *operators*, special functions that take a prefixed first argument and (often) a suffixed second argument. 
Custom operators may be declared like functions, with their name in backticks, and the restriction that they must be composed of the following punctuation tokens: todo. This restriction is to ensure the parser remains context free. +The above operations are performed with *operators*, special functions that take a prefixed first argument and (often) a suffixed second argument. Custom operators may be implemented, but they must consist of only a combination of the symbols `=` `+` `-` `*` `/` `<` `>` `@` `$` `~` `&` `%` `|` `!` `?` `^` `\` for the purpose of keeping the grammar context-free. They are declared identically to functions. + +Term (in)equality is expressed with the `==` and `!=` operators. Type equality is expressed with `is`. Subtyping relations may be queried with `of`, which has the additional property of introducing new bindings in the current scope (more on this in the [types document](TYPES.md)). Membership of collections is expressed with `in`, and is overloaded for most types. -Term in/equality is expressed with the `==` and `!=` operators. Type in/equality is expressed with `is` and `not (T is U)`, and subtyping may be queried with `of` (more on this in the [types document](TYPES.md)). Set logic is expressed with `in` and `not (x in Y)`, and is overloaded for not just sets but collections of any sort. +```puck +``` -String concatenation uses a distinct `&` operator rather than overloading the `+` operator (as the complement `-` has no natural meaning for strings). Strings are unified, mutable, internally a byte array, externally a char array, and are stored as a pointer to heap data after their length and capacity (fat pointer). Slices of strings are stored as a length followed by a pointer to string data, and have non-trivial interactions with the memory management system. Chars are four bytes and represent a Unicode character in UTF-8 encoding. More details can be found in the [type system overview](TYPES.md). 
+String concatenation uses a distinct `&` operator rather than overloading the `+` operator (as the complement `-` has no natural meaning for strings). Strings are unified, mutable, internally a byte array, externally a char array, and are stored as a pointer to heap data after their length and capacity (fat pointer). Chars are four bytes and represent a Unicode character in UTF-8 encoding. Slices of strings are stored as a length followed by a pointer to string data, and have non-trivial interactions with the memory management system. More details can be found in the [type system overview](TYPES.md). ```puck ``` -Basic conditional control flow is standard via `if/elif/else` statements. The `when` statement provides a compile-time `if`. It also takes `elif` and `else` branches and is syntactic sugar for an `if` statement within a `static` block (more on those later). Exhaustive structural pattern matching is available with the `match/of` statement, and is particularly useful for the `union` type. Branches of a `match` statement take a *pattern*, of which the unbound identifiers within will be injected into the branch's scope. Multiple patterns may be used for one branch provided they all bind the same identifiers of the same type. Branches may be *guarded* with the `where` keyword, which takes a conditional, and will necessarily remove the branch from exhaustivity checks. +Basic conditional control flow uses standard `if`/`elif`/`else` statements. The `when` statement provides a compile-time `if`. It also takes `elif` and `else` branches and is syntactic sugar for an `if` statement within a `static` block (more on those later). + +All values in Puck must be handled, or explicitly discarded. This allows for conditional statements and many other control flow constructs to function as *expressions*, and evaluate to a value, when an unbound value is left at the end of each of their branches' scopes. 
This is particularly relevant for *functions*, where it is often idiomatic to omit an explicit `return` statement. There is no attempt made to differentiate without context, and so expressions and statements often look identical in syntax. -The `of` statement also stands on its own as a conditional for querying subtype equality. It retains the variable injection properties of its counterpart within `match` statements. This allows it to be used as a compact and coherent alternative to `if let` statements in other languages. +```puck +``` -All values in Puck must be handled, or explicitly discarded. This allows for conditional statements and many other control flow constructs to function as *expressions*, and evaluate to a value, when an unbound value is left at the end of each of their branches' scopes. Puck makes no attempt to determine this without context, and so expressions and statements look identical in syntax and semantics (AST). +Exhaustive structural pattern matching is available with the `match`/`of` statement, and is particularly useful for the `struct` and `union` types. Branches of a `match` statement take a *pattern*, of which the unbound identifiers within will be injected into the branch's scope. Multiple patterns may be used for one branch provided they all bind the same identifiers of the same type. Branches may be *guarded* with the `where` keyword, which takes a conditional, and will necessarily remove the branch from exhaustivity checks. + +<!-- todo: structural matching of lists and arrays --> + +The `of` statement also stands on its own as a conditional for querying subtype equality. Used as a conditional in `if` statements, it retains the variable injection properties of its `match` counterpart. This allows it to be used as a compact <!-- and coherent --> alternative to `if let` statements in other languages. 
```puck +func may_fail: Result[T, ref Err] ``` -Error handling is done with a fusion of imperative `try/catch` statements and functional `Option/Result` types, with much syntactic sugar. Functions may `raise` errors, but should return `Option[T]` or `Result[T, E]` types instead by convention. Those that `raise` errors or call functions that `raise` errors without handling them must additionally be explicitly marked as `yeet`. This is purely to encourage safe error handling, and is not absolute - there will likely be several builtins considered safe by compiler magic. (??? what are those?) +Error handling is done via a fusion of imperative `try`/`catch` statements and functional `Option`/`Result` types, with much syntactic sugar. Functions may `raise` errors, but should return `Option[T]` or `Result[T, E]` types instead by convention. <!-- Those that `raise` errors or call functions that `raise` errors without handling them must additionally be explicitly marked as `yeet`. This is purely to encourage safe error handling, and is not absolute - there will likely be several builtins considered safe by compiler magic.--> <!-- todo --> -A bevy of helper functions and macros are available for `Option/Result` types, and are documented and available in the `std/options` module (imported by default). Two in particular are of note: the `?` macro accesses the inner value of a `Result[T, E]` or propagates (returns in context) the `Error(e)`, and the `!` accesses the inner value of an `Option[T]` or `Result[T, E]` or raises the `Error(e)` or a an error on `None` or `Error`. Both are operators taking one parameter and so are postfix. +A bevy of helper functions and macros are available for `Option`/`Result` types, and are documented and available in the `std.options` and `std.results` modules (included in the prelude by default). 
Two in particular are of note: the `?` macro accesses the inner value of a `Result[T, E]` or propagates (returns in context) the `Error(e)`, and the `!` accesses the inner value of an `Option[T]` / `Result[T, E]` or raises an error on `None` / the specific `Error(e)`. Both operators take one parameter and so are postfix. (There is additionally another `?` postfix macro, taking in a type, as a shorthand for `Option[T]`) -The utility of the `?` macro is readily apparent to anyone who has written code in Rust or Swift. The utility of the `!` function is perhaps less so obvious. These errors raised by `!`, however, are known to the compiler: and they may be comprehensively caught by a single or sequence of `catch` statements. This allows for users used to a `try/catch` error handling style to do so with ease, with only the need to add one additional character to a function call. +The utility of the `?` macro is readily apparent to anyone who has written code in Rust or Swift. The utility of the `!` function is perhaps less so obvious. These errors raised by `!`, however, are known to the compiler: and they may be comprehensively caught by a single or sequence of `catch` statements. This allows for users used to a `try`/`catch` error handling style to do so with ease, with only the need to add one additional character to a function call. More details may be found in [error handling overview](ERRORS.md). @@ -90,9 +102,9 @@ loop: break ``` -Three types of loops are available: `for` loops, `while` loops, and infinite loops (`loop` loops). For loops take a binding (which may be structural, see pattern matching) and an iterable object and will loop until the iterable object is spent. While loops take a condition that is executed upon the beginning of each iteration to determine whether to keep looping. Infinite loops are, well, infinite and must be manually broken out of. +Three types of loops are available: `for` loops, `while` loops, and infinite loops (`loop` loops). 
For loops take a binding (which may be structural, see pattern matching) and an iterable object and will loop until the iterable object is spent. While loops take a condition that is executed upon the beginning of each iteration to determine whether to keep looping. Infinite loops are infinite and must be manually broken out of. -There is no special concept of iterators: iterable objects are any object that implements the `Iter[T]` interface (more on those in [the type system document](TYPES.md)), that is, provides a `self.next()` function returning an Optional type. For loops can be thought of as while loops that unwrap the result of the `next()` function and end iteration upon a `None` value. While loops, in turn, can be thought of as infinite loops with an explicit conditional break. +There is no special concept of iterators: iterable objects are any object that implements the `Iter[T]` interface (more on those in [the type system document](TYPES.md)), that is, provides a `self.next()` function returning an Optional type. As such, iterators are first-class constructs. For loops can be thought of as while loops that unwrap the result of the `next()` function and end iteration upon a `None` value. While loops, in turn, can be thought of as infinite loops with an explicit conditional break. The `break` keyword immediately breaks out of the current loop, and the `continue` keyword immediately jumps to the next iteration of the current loop. Loops may be used in conjunction with blocks for more fine-grained control flow manipulation. @@ -105,9 +117,10 @@ let x = block: transform_input(y) block foo: - block bar: - for i in 0..=100: + for i in 0 ..= 100: + block bar: if i == 10: break foo + print i ``` Blocks provide arbitrary scope manipulation. They may be labelled or unlabelled. 
The `break` keyword additionally functions inside of blocks and without any parameters will jump out of the current enclosing block (or loop). It may also take a block label as a parameter for fine-grained scope control. @@ -117,30 +130,82 @@ Blocks provide arbitrary scope manipulation. They may be labelled or unlabelled. Code is segmented into modules. Modules may be made explicit with the `mod` keyword followed by a name, but there is also an implicit module structure in every codebase that follows the structure and naming of the local filesystem. For compatibility with filesystems, and for consistency, module names are exclusively lowercase (following the same rules as Windows). -Within modules, constants, functions, types, and other modules may be *exported* for use by other modules with the `pub` keyword. All such identifiers are private by default within a module and only accessible locally. The imported modules, constants, functions, types, etc within imported modules may be *re-exported* for use by other modules with the `export` keyword. Modules are first-class and may be bound, inspected, modified, and returned. +A module can be imported into another module by use of the `use` keyword, taking a path to a module or modules. Contrary to the majority of languages ex. Python, unqualified imports are *encouraged* - in fact, are idiomatic (and the default) - type-based disambiguation and official LSP support are intended to remove any ambiguity. -A module can be imported into another module by use of the `use` keyword, taking a path to a module or modules. Contrary to the majority of languages ex. Python, unqualified imports are *encouraged*: type-based disambiguation and official LSP support are intended to remove any ambiguity. +Within a module, functions, types, constants, and other modules may be *exported* for use by other modules with the `pub` keyword. All such identifiers are private by default and only accessible module-locally without. 
Modules are first-class and may be bound, inspected, modified, and returned. As such, imported modules may be *re-exported* for use by other modules by binding them to a public constant, i.e. `use my_module; pub const my_module = my_module`. More details may be found in the [modules document](MODULES.md). ```puck ``` -Compile-time programming may be done via the previously-mentioned `const` keyword and `when` statements: or via `static` blocks. All code within a `static` block is evaluated at compile-time and all assignments made are propagated to the compiled binary. +Compile-time programming may be done via the previously-mentioned `const` keyword and `when` statements: or via `const` *blocks*. All code within a `const` block is evaluated at compile-time and all assignments and allocations made are propagated to the compiled binary as static data. Further compile-time programming may be done via metaprogramming: compile-time manipulation of the abstract syntax tree. The macro system is complex, and a description may be found in the [metaprogramming document](METAPROGRAMMING.md). ```puck -``` +func await(promise: Promise) +pub async func -Threading support is complex and regulated to external libraries (with native syntax via macros). OS-provided primitives will likely provide a `spawn` function, and there will be substantial restrictions for memory safety. I haven't thought much about this. +await +``` -Async support is complex and relegated to external libraries (with native syntax via macros). More details may be found in the [async document](ASYNC.md). It is likely that this will look like Zig, with `async`/`await`/`suspend`/`resume`. +The async system is *colourblind*: the special `async` macro will turn any function *call* returning a `T` into an asynchronous call returning a `Future[T]`. The special `await` function will wait for any `Future[T]` and return a `T` (or an error). 
Async support is included in the standard library in `std.async` in order to allow for competing implementations. More details may be found in the [async document](ASYNC.md). -Effects are complex and lack any sort of design structure. More details may be found in the [effects document](EFFECTS.md). +Threading support is complex and also relegated to external libraries. OS-provided primitives will likely provide a `spawn` function, and there will be substantial restrictions for memory safety. I really haven't given much thought to this. ```puck ``` Details on memory safety, references and pointers, and deep optimizations may be found in the [memory management overview](MEMORY_MANAGEMENT.md). -The memory model intertwines deeply with the type system. +The memory model intertwines deeply with the type system. <!-- todo --> + +```puck +``` + +Finally, a few notes on the type system are in order. + +Types are declared with the `type` keyword and are transparent aliases. +That is, `type Foo = Bar` means that any function defined for `Bar` is defined for `Foo` - that is, objects of type `Foo` can be used any time an object of type `Bar` is called for. +If such behavior is not desired, the `distinct` keyword forces explicit qualification and conversion of types. `type Foo = distinct Baz` will force a type `Foo` to be wrapped in a call to the constructor `Baz()` before being passed to such functions. + +Types, like functions, can be *generic*: declared with "holes" that may be filled in with other types upon usage. A type must have all its holes filled before it can be constructed. The syntax for generics in types much resembles the syntax for generics in functions, and *constraints* and the like also apply. + +```puck +let myStruct = struct + a: int + b: int +let myTuple = tuple[int, b: int] +print myTuple.1 +``` + +Struct and tuple types are declared with `struct[<fields>]` and `tuple[<fields>]`, respectively. 
Their declarations make them look similar at a glance, but they differ fairly fundamentally. Structs are *unordered*, and every field must be named. They may be constructed with `{}` brackets. Tuples are *ordered* and so field names are optional - names are just syntactic sugar for positional access. Tuples may be constructed with `()` parentheses. + +Puck's type system is *structural*, and there is no better example of what this entails than with structs... todo. This allows for power at the cost of clarity, zero boilerplate multiple inheritance, etc + +It is worth noting that there is no concept of `pub` at a field level on structs - a type is either fully transparent, or fully opaque. This is because such partial transparency breaks with structural initialization (how could one provide for hidden fields?). An idiomatic workaround is to model the desired field structure with a public-facing interface. + +```puck +type Expr = union + Variable(int) + Abstraction() + Application() # much better +``` + +```puck +pub type Iter[T] = interface + next(mut Self): T? + +pub type Peek[T] = interface + next(mut Self): T? + peek(mut Self): T? + peek_nth(mut Self, int): T? +``` + +Interface types function much as type classes in Haskell or traits in Rust do. They are not concrete types, and cannot be constructed - instead, their utility is via indirection, as parameters or as `ref` types, providing constraints that some concrete type must meet. They consist of a list of function signatures, implementations of which must exist for the given type in order to compile. + +Their major difference, however, is that Puck's interfaces are *implicit*: there is no `impl` block that implementations of their associated functions have to go under. If functions for a concrete type exist satisfying some interface, the type implements that interface. 
This does run the risk of accidentally implementing an interface one does not desire to, but the author believes such situations are few and far between, well worth the decreased syntactic and semantic complexity, and mitigatable with tactical usage of the `distinct` keyword. + +As the compiler makes no such distinction between fields and single-argument functions on a type when determining identifier conflicts, interfaces similarly make no such distinction. They *do* distinguish mutable and immutable parameters, those being part of the type signature. + +Interfaces are widely used throughout the standard library to provide general implementations of such conveniences like iteration, debug and display printing, generic error handling, and much more. diff --git a/docs/ERRORS.md b/docs/ERRORS.md index 4a4b206..a6e7a6a 100644 --- a/docs/ERRORS.md +++ b/docs/ERRORS.md @@ -1,24 +1,89 @@ # Error Handling -Error handling should perhaps be abstracted into a more general effects system. -But if not, then this document lays out some potential ideas. +Puck's error handling is shamelessly stolen from Swift. +It uses a combination of Option/Result types and try/catch/finally statements, and leans somewhat on Puck's metaprogramming capabilities. ---- +```puck +func get_debug[T](): T = + let value: Option[T] = self.unsafe_get(413) + try: + let value = value! + catch Exception(e) + + +try: + .. +catch: + .. +finally: + print "No such errors" +``` + +There are several ways to handle errors in Puck. If the error is encoded in the type, one can: +1. `match` on the error +2. compactly match on the error with `if ... of` +3. propagate the error with `?` +4. throw the error with `!` + +If an error is thrown, one must explicitly handle (or disregard) it with a `try/catch` block. +This method of error handling may feel more familiar to Java programmers. 
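
The routes listed above (inspect the error, propagate it, or throw it) can be modelled outside Puck. The following is a minimal Python analogue, not Puck code and not part of this commit: `Okay`, `Error`, `parse_int`, `double`, and `unwrap` are hypothetical names chosen for illustration, and raising `ValueError` is an assumption standing in for Puck's `raise`.

```python
from dataclasses import dataclass


@dataclass
class Okay:
    """Success variant of a Result-like union (hypothetical name)."""
    value: object


@dataclass
class Error:
    """Failure variant of a Result-like union (hypothetical name)."""
    error: object


def parse_int(text):
    """A fallible function whose error is encoded in the return value."""
    try:
        return Okay(int(text))
    except ValueError:
        return Error(f"not an integer: {text!r}")


def double(text):
    """Analogue of `?`: on Error, hand the error back to the caller unchanged."""
    result = parse_int(text)
    if isinstance(result, Error):
        return result  # propagate instead of handling here
    return Okay(result.value * 2)


def unwrap(result):
    """Analogue of `!`: return the inner value, or raise the error as an exception."""
    if isinstance(result, Okay):
        return result.value
    raise ValueError(result.error)
```

In this sketch, `double` never raises: callers decide whether to inspect the returned variant or to call `unwrap` and catch the exception, which mirrors the choice between type-first handling and `try`/`catch` described above.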
+ +## Errors as monads + +Puck provides [`Option[T]`](std/default/options.pk) and a [`Result[T, E]`](std/default/results.pk) types, imported by default. These are `union` types and so must be pattern matched upon to be useful: but the standard library provides [a bevy of helper functions](std/default/results.pk). +Two in particular are of note. The `?` operator unwraps a Result or propagates its error up a function call (and may only be used in type-appropriate contexts). The `!` operator unwraps an Option or Result directly or throws an exception in the case of None or Error. ```puck +pub macro `?`[T, E](self: Result[T, E]) = + quote: + match `self` + of Okay(x): x + of Error(e): return Error(e) ``` -Puck provides [`Option[T]`](std/default/options.pk) and a [`Result[T, E]`](std/default/results.pk) types, imported by default. These are `union` types and so must be pattern matched upon to be useful: but the standard library provides a bevy of helper functions. +```puck +pub func `!`[T](self: Option[T]): T = + match self + of Some(x): x + of None: raise EmptyValue -Two in particular are of note. The `?` operator unwraps a Result or propagates its error up a function call. The `!` operator unwraps an Option or Result directly or throws an exception in the case of None or Error. +pub func `!`[T, E](self: Result[T, E]): T = + of Okay(x): x + of Error(e): raise e +``` + +The utility of the provided helpers in [`std.options`](std/default/options.pk) and [`std.results`](std/default/results.pk) should not be understated. While encoding errors into the type system may appear restrictive at first glance, some syntactic sugar goes a long way in writing compact and idiomatic code. Java programmers in particular are urged to give type-first errors a try, before falling back on unwraps and `try`/`catch`. + +A notable helpful type is the aliasing of `Result[T]` to `Result[T, ref Err]`, for when the particular error does not matter. 
This breaks `try`/`catch` exhaustion (as `ref Err` denotes a reference to *any* Error), but is particularly useful when used in conjunction with the propagation operator. + +## Errors as catchable exceptions + +Errors raised by `raise`/`throw` (or subsequently the `!` operator) must be explicitly caught and handled via a `try`/`catch`/`finally` statement. +If an exception is not handled within a function body, the function must be explicitly marked as a throwing function via the `yeet` prefix (name to be determined). The compiler will statically determine which exceptions in particular are thrown from any given function, and enforce them to be explicitly handled or explicitly ignored. + +Errors are types. An error thrown from an unwrapped `Result[T, E]` is of type `E`. `catch` statements, then, may pattern match upon possible errors, behaving similarly to `of` branches. ```puck +try: + ... +catch "Error": + ... +finally: + ... ``` -Errors raised by the `!` operator must be explicitly caught and handled via a `try/catch/finally` statement. +This creates a distinction between two types of error handling, working in sync: functional error handling with [Option](https://en.wikipedia.org/wiki/Option_type) and [Result](https://en.wikipedia.org/wiki/Result_type) types, and object-oriented error handling with [catchable exceptions](https://en.wikipedia.org/wiki/Exception_handling). These styles may be swapped between with minimal syntax overhead. Libraries, however, should universally use `Option`/`Result`, as this provides the best support for both styles. + +<!-- [nullable types](https://en.wikipedia.org/wiki/Nullable_type)?? --> + +## Unrecoverable exceptions -If an exception is not handled within a function body, the function must be explicitly marked as a throwing function via the `yeet` prefix (final name to be determined). The compiler will statically determine which exceptions in particular are thrown from any given function. 
+There exist errors from which a program cannot reasonably recover. These are the following: +- `Assertion Failure`: a call to an `assert` function has returned false at runtime. +- `Out of Memory`: the executable is out of memory. +- `Stack Overflow`: the executable has overflowed the stack. +- any others? -This creates a distinction between two types of error handling, working in sync: functional error handling with [Option](https://en.wikipedia.org/wiki/Option_type) and [Result](https://en.wikipedia.org/wiki/Result_type) types, and object-oriented error handling with [nullable types](https://en.wikipedia.org/wiki/Nullable_type) and [exceptions](https://en.wikipedia.org/wiki/Exception_handling). These styles may be swapped between with minimal syntax overhead. Libraries, however, should universally use Options and Results, as this provides best for both usages. +They are not recoverable, but the user should be aware of them as possible failure conditions. -References: [std/options](std/default/options.pk), [std/results](std/default/results.pk), [Error Handling in Swift](https://docs.swift.org/swift-book/documentation/the-swift-programming-language/errorhandling) (shamelessly stolen) +References: [Error Handling in Swift](https://docs.swift.org/swift-book/documentation/the-swift-programming-language/errorhandling) diff --git a/docs/INTEROP.md b/docs/INTEROP.md index 805f747..f595afa 100644 --- a/docs/INTEROP.md +++ b/docs/INTEROP.md @@ -1,32 +1,32 @@ # Interop with Other Languages -A major goal of Puck is _universal, minimal-overhead language interoperability while maintaining type safety_. +A major goal of Puck is _minimal-overhead language interoperability_ while maintaining type safety. There are three issues that complicate language interop: 1. Conflicting memory management systems, i.e. Boehm GC vs. reference counting 2. Conflicting type systems, i.e. Python vs. Rust -3. The C ABI. +3. The language of communication, i.e. the C ABI.
For the first, Puck uses what amounts to a combination of ownership and reference counting: and thus it is exchangeable in this regard with Nim (same system), Rust (ownership), Swift (reference counting), and many others. (It should be noted that ownership systems are broadly compatible with reference counting systems). For the second, Puck has a type system of similar capability to that of Rust, Nim, and Swift: and thus interop with those languages should be straightforward for the user. Its type system is strictly more powerful than that of Python or C, and so interop requires additional help. Its type system is equally as powerful as but somewhat orthogonal to Java's, and so interop is a little more difficult. -For the third, Puck is being written at the same time as the crABI ABI spec is in development. crABI promises a C-ABI-compatible, cross-language ABI spec, which would *dramatically* simplify the task of linking to object files produced by other languages. It is being led by the Rust language team, and both Nim and Swift have expressed interest in it. Which bodes quite well for future... +For the third, Puck is being written at the same time as the crABI ABI spec is in development. crABI promises a C-ABI-compatible, cross-language ABI spec, which would *dramatically* simplify the task of linking to object files produced by other languages. It is being led by the Rust language team, and both the Nim and Swift teams have expressed interest in it, which bodes quite well for its future. Languages often focus on interop from purely technical details. This *is* very important: but typically no thought is given to usability (and often none can be, for necessity of compiler support), and so using foreign function interfaces very much feel like using *foreign* interfaces. Puck attempts to change that. ...todo... 
Existing systems to learn from: -- https://doc.rust-lang.org/reference/abi.html +- [The Rust ABI](https://doc.rust-lang.org/reference/abi.html) - https://www.hobofan.com/rust-interop/ -- https://github.com/eqrion/cbindgen +- [CBindGen](https://github.com/eqrion/cbindgen) - https://github.com/chinedufn/swift-bridge - https://kotlinlang.org/docs/native-c-interop.html - https://github.com/crackcomm/rust-lang-interop - https://doc.rust-lang.org/reference/abi.html - https://doc.rust-lang.org/reference/items/functions.html#extern-function-qualifier -- https://github.com/yglukhov/nimpy -- https://github.com/yglukhov/jnim -- https://github.com/PMunch/futhark -- https://lib.haxe.org/p/callfunc/ +- [NimPy](https://github.com/yglukhov/nimpy) +- [JNim](https://github.com/yglukhov/jnim) +- [Futhark](https://github.com/PMunch/futhark) +- [Haxe's `callfunc`](https://lib.haxe.org/p/callfunc/) diff --git a/docs/METAPROGRAMMING.md b/docs/METAPROGRAMMING.md index fd928a8..b6a4165 100644 --- a/docs/METAPROGRAMMING.md +++ b/docs/METAPROGRAMMING.md @@ -1,14 +1,14 @@ # Metaprogramming -Puck has rich metaprogramming support. Many features that would have to be at the compiler level in most languages (error propagation `?`, `std.fmt.print`, ...) are instead implemented as macros within the standard library. +Puck has rich metaprogramming support, heavily inspired by Nim. Many features that would have to be at the compiler level in most languages (error propagation `?`, `std.fmt.print`, `async`/`await`) are instead implemented as macros within the standard library. Macros take in fragments of the AST within their scope, transform them with arbitrary compile-time code, and spit back out transformed AST fragments to be injected and checked for validity. This is similar to what Nim and the Lisp family of languages do. 
-By keeping an intentionally minimal AST, some things not possible to express in literal code may be expressible in the AST: in particular, bindings can be injected in many places they could not be injected in ordinarily. +By keeping an intentionally minimal AST, some things not possible to express in literal code may be expressible in the AST: in particular, bindings can be injected in many places they could not be injected in ordinarily. (A minimal AST also has the benefit of being quite predictable.) -Macros may not change Puck's syntax: the syntax is flexible enough. Code is syntactically checked (parsed), but *not* semantically checked (typechecked) before being passed to macros. This may change in the future (to require arguments to be semantically correct). Macros have the same scope as other routines, that is: +Macros may not change Puck's syntax: the syntax is flexible enough. Code is syntactically checked (parsed), but *not* semantically checked (typechecked) before being passed to macros. This may change in the future<!-- (to require arguments to be semantically correct)-->. Macros have the same scope as other routines, that is: **function scope**: takes the arguments within or following a function call -``` +```puck macro print(params: varargs) = for param in params: result.add(quote(stdout.write(`params`.str))) @@ -18,7 +18,7 @@ print "hello", " ", "world", "!" ``` **block scope**: takes the expression following a colon as a single argument -``` +```puck macro my_macro(body) my_macro: @@ -27,31 +27,43 @@ my_macro: 3 4 ``` + **operator scope**: takes one or two parameters either as a postfix (one parameter) or an infix (two parameters) operator -``` -macro `+=`(a, b) = +```puck +macro +=(a, b) = quote: `a` = `a` + `b` a += b +``` + +Macros typically take a list of parameters *without* types, but they optionally may be given a type to constrain the usage of a macro. 
Regardless: as macros operate at compile time, their parameters are not instances of a type, but rather an `Expr` expression representing a portion of the *abstract syntax tree*. +Similarly, macros always return an `Expr` to be injected into the abstract syntax tree despite the usual absence of an explicit return type, but the return type may be specified to additionally typecheck the returned `Expr`. -macro `?`[T, E](self: Result[T, E]) = +```puck +``` + +As macros operate at compile time, they may not inspect the *values* that their parameters evaluate to. However, parameters may be marked with `static[T]`: in which case they will be treated like parameters in functions: as values. (note static parameters may be written as `static[T]` or `static T`.) There are many restrictions on what might be `static` parameters. Currently, it is constrained to literals i.e. `1`, `"hello"`, etc, though this will hopefully be expanded to any function that may be evaluated statically in the future. + +```puck +macro ?[T, E](self: Result[T, E]) = quote: match self of Okay(x): x of Error(e): return Error(e) -func meow(): Result[bool, ref Err] = - ... +func meow: Result[bool, ref Err] = + let a = stdin.get()? ``` -Macros typically take a list of parameters *without* types, but they optionally may be given a type to constrain the usage of a macro. Regardless: as macros operate at compile time, their parameters are not instances of a type, but rather an `Expr` expression representing a portion of the *abstract syntax tree*. -Similarly, macros always return an `Expr` to be injected into the abstract syntax tree despite the usual absence of an explicit return type, but the return type may be specified to additionally typecheck the returned `Expr`. +The `quote` macro is special. It takes in literal code and returns that code **as the AST**. 
Within quoted data, backticks may be used to break out in order to evaluate and inject arbitrary code: though the code must evaluate to an expression of type `Expr`. <!-- Variables (of type `Expr`) may be *injected* into the literal code by wrapping them in backticks. This reuse of backticks does mean that defining new operators is impossible within quoted code. --> -As macros operate at compile time, they may not inspect the *values* that their parameters evaluate to. However, parameters may be marked with `static[T]`: in which case they will be treated as parameters in functions, as values. (note static parameters may be written as `static[T]` or `static T`.) There are many restrictions on what might be `static` parameters. Currently, it is constrained to literals i.e. `1`, `"hello"`, etc, though this will hopefully be expanded to any function that may be evaluated statically in the future. +```puck +``` -The `quote` macro (yes, macros may exist within macros) is special. It takes in literal code and returns that code **as the AST**. Within quoted data, backticks may be used to break out to evaluate arbitrary code: though it must evaluate to an expression of type `Expr`. Variables (of type `Expr`) may be *injected* into the literal code by wrapping them in backticks. This reuse of backticks does mean that defining new operators is impossible within quoted code. +The `Expr` type is available from `std.ast`, as are many helpers, and combined they provide the construction of arbitrary syntax trees (indeed, `quote` relies on and emits types of it). It is a `union` type with its variants directly corresponding to the variants of the internal AST of Puck. -The `Expr` type is available from `std.ast`, as are many helpers, and combined they provide the construction of arbitrary syntax trees (indeed, `quote` relies on and emits types of it). It is a `union` type with variants directly corresponding to the variants of the internal AST of Puck. 
+```puck +``` -Construction of macros can be difficult: several helpers are provided to ease debugging. The `Debug` and `Display` interfaces are implemented for abstract syntax trees: `dbg` will print a representation of the passed syntax tree as an object, and `print` will print a best-effort representation as literal code. Together with `quote` and optionally with `static`, these can be used to quickly get the representation of arbitrary code. todo: `std.ast.expand`... +Construction of macros can be difficult: and so several helpers are provided to ease debugging. The `Debug` and `Display` interfaces are implemented for abstract syntax trees: `dbg` will print a representation of the passed syntax tree as an object, and `print` will print a best-effort representation as literal code. Together with `quote` and optionally with `static`, these can be used to quickly get the representation of arbitrary code. diff --git a/docs/TYPES.md b/docs/TYPES.md index e674739..0e463ec 100644 --- a/docs/TYPES.md +++ b/docs/TYPES.md @@ -1,6 +1,6 @@ # Typing in Puck -Puck has a comprehensive static type system. +Puck has a comprehensive static type system, inspired by the likes of Nim, Rust, and Swift. ## Basic types @@ -10,7 +10,6 @@ Basic types can be one-of: - `uint`: same as `int`, but unsigned for more precision. - `i8`, `i16`, `i32`, `i64`, `i128`: specified integer size - `u8`, `u16`, `u32`, `u64`, `u128`: specified integer size - - overflows into bigints for safety and ease of cryptographical code. - `float`: floating-point number. - `f32`, `f64`: specified float sizes - `byte`: an alias to `u8`. @@ -35,8 +34,8 @@ todo Strings are: - mutable -- internally a byte (uint8) array -- externally a char (uint32) array +- internally a byte array +- externally a char (four bytes) array - prefixed with their length and capacity - automatically resize like a list @@ -45,15 +44,15 @@ They are also quite complicated. 
Puck has full support for Unicode and wishes to Container types, broadly speaking, are types that contain other types. These exclude the types in [advanced types](#advanced-types). -### Iterable types +### Container types -Iterable types can be one-of: -- `array[S, T]`: Static arrays. Can only contain one type `T`. Of size `S` and cannot grow/shrink. - - Initialize in-place with `array(a, b, c)`. Should we do this? otherwise `[a, b, c]`. +Container types can be one-of: +- `array[S, T]`: Fixed-size arrays. Can only contain one type `T`. Of size `S` and cannot grow/shrink. + - Initialized in-place with `[a, b, c]`. - `list[T]`: Dynamic arrays. Can only contain one type `T`. May grow/shrink dynamically. - - Initialize in-place with `list(a, b, c)`. Should we do this? otherwise `@[a, b, c]`. + - Initialized in-place with `[a, b, c]`. (this is the same as arrays!) <!-- Disambiguated from arrays in much the same way uints are disambiguated from ints. --> - `slice[T]`: Slices. Used to represent a "view" into some sequence of elements of type `T`. - - Cannot be directly constructed. May be initialized from an array, list, or string, or may be used as a generic parameter on functions (more on that later). + - Cannot be directly constructed. They are **unsized**. <!-- May be initialized from an array, list, or string, or may be used as a generic parameter on functions (more on that later). --> - Slices cannot grow/shrink. Their elements may be accessed and mutated. As they are underlyingly a reference to an array or list, they **must not** outlive the data they reference. - `str`: Strings. Complicated, they are alternatively `list[byte]` or `list[char]` depending on who's asking. In the context of iterable types, they are treated as `list[char]`. @@ -68,9 +67,9 @@ Elements of container types can be accessed by the `container[index]` syntax. Sl There are an additional suite of related types: abstract data types. 
While falling under container types, these do not have a necessarily straightforward or best implementation, and so multiple implementations are provided. Abstract data types can be one-of: -- `set[T]`: high-performance sets implemented as a bit array. +- `BitSet[T]`: high-performance sets implemented as a bit array. - These have a maximum data size, at which point the compiler will suggest using a `HashSet[T]` instead. -- `table[T, U]`: simple symbol tables implemented as an association list. +- `AssocTable[T, U]`: simple symbol tables implemented as an association list. - These do not have a maximum size. However, at some point the compiler will suggest using a `HashTable[T, U]` instead. - `HashSet[T]`: standard hash sets. - `HashTable[T, U]`: standard hash tables. @@ -114,18 +113,20 @@ func length[T](a: T) = # lots of things use .len, but only a few called by this do. # throws a warning if exported for lack of specificity. -func length[T: str | list](a: T) = +func length(a: str | list) = return a.len ``` The syntax for generics is `func`, `ident`, followed by the names of the generic parameters in brackets `[T, U, V]`, followed by the function's parameters (which may then refer to the generic types). -Generics are replaced with concrete types at compile time (monomorphization) based on their usage in functions within the main function body. +Generics are replaced with concrete types at compile time (monomorphization) based on their usage in function calls within the main function body. -Constrained generics have two syntaxes: the constraint can be directly on a parameter, leaving off the `[T]` box, or it may be defined within the box as `[T: int | float]` for easy reuse in the parameters. +Constrained generics have two syntaxes: the constraint can be defined directly on a parameter, leaving off the `[T]` box, or it may be defined within the box as `[T: int | float]` for easy reuse in the parameters.
-## Reference types +Other constructions like modules and type declarations themselves may also be generic. -Types are typically constructed by value on the stack. That is, without any level of indirection: and so type declarations that recursively refer to one another, or involve unsized types (notably including parameter types and interfaces!), would not be allowed. However, Puck provides two avenues for indirection. +## Reference Types + +Types are typically constructed by value on the stack. That is, without any level of indirection: and so type declarations that recursively refer to one another, or involve unsized types (notably including parameter types), would not be allowed. However, Puck provides two avenues for indirection. Reference types can be one-of: - `ref T`: An automatically-managed reference to type `T`. This is a pointer of size `uint` (native). @@ -151,37 +152,22 @@ type UnsafeTree = struct right: ptr UnsafeTree ``` -The `ref` prefix may be placed at the top level of type declarations, or inside on a field of a structural type. `ref` types may often be more efficient when dealing with large data structures. They also provide for the usage of parameter types (except for `static` and `var`) within type declarations. +The `ref` prefix may be placed at the top level of type declarations, or inside on a field of a structural type. `ref` types may often be more efficient when dealing with large data structures. They also provide for the usage of unsized types (functions, interfaces, slices) within type declarations. -The compiler abstracts over `ref` types to provide optimization for reference counts: and so neither a distinction between `Rc`/`Arc`/`Box`, nor a `*` dereference operator is needed. Much care has been given to make references efficient and safe, and so `ptr` should be avoided if at all possible. The compiler will yell at you if you use it (or any other unsafe features). 
+The compiler abstracts over `ref` types to provide optimization for reference counts: and so a distinction between `Rc`/`Arc`/`Box` is not needed. Furthermore, access implicitly dereferences (with address access available via `.addr`), and so a `*` dereference operator is also not needed. Much care has been given to make references efficient and safe, and so `ptr` should be avoided if at all possible. The compiler will yell at you if you use it (or any other unsafe features). -The implementation of reference types is delved into in further detail in the [document on memory management](MEMORY_MANAGEMENT.md). +The implementation of `ref` is delved into in further detail in the [memory management document](MEMORY_MANAGEMENT.md). ## Advanced Types -The `type` keyword is used to declare custom data types. These are *algebraic*: they function by composition. - -Algebraic data types can be one-of: -- `tuple`: An ordered collection of types. Optionally named. +The `type` keyword is used to declare aliases to custom data types. These types are *algebraic*: they function by composition. Algebraic data types can be one-of: - `struct`: An unordered, named collection of types. May have default values. +- `tuple`: An ordered collection of types. Optionally named. - `enum`: Ordinal labels, that may hold values. Their default values are their ordinality. - `union`: Powerful matchable tagged unions a la Rust. Sum types. -- `interface`: Usage-based typeclasses. User-defined duck typing. -- `distinct`: a type that must be explicitly converted -- type aliases, declared as `type Identifier = Alias` - -### tuples - -Tuples are an *ordered* collection of either named or unnamed types. +- `interface`: Implicit typeclasses. User-defined duck typing. -They are declared with `tuple[Type, identifier: Type, ...]` and initialized with parentheses: `(413, "hello", value: 40000)`. - -They are exclusively ordered - named types within tuples are just syntax sugar for positional access. 
Passing a fully unnamed tuple into a context that expects a tuple with a named parameter is allowed so long as the types line up in order. - -```puck -``` - -Tuples are particularly useful for "on-the-fly" types. Creating type aliases to tuples is discouraged - structs are generally a better choice for custom type declarations. +There also exist `distinct` types: while `type` declarations define an alias to an existing or new type, `distinct` types define a type that must be explicitly converted to/from. This is useful for having some level of separation from the implicit interfaces that abound. ### structs @@ -211,12 +197,28 @@ prints_data(node) Structs are *structural* and so structs composed entirely of fields with the same signature (identical in name and type) are considered *equivalent*. This is part of a broader structural trend in the type system, and is discussed in detail in [subtyping](#subtyping). +### tuples + +Tuples are an *ordered* collection of either named and/or unnamed types. + +They are declared with `tuple[Type, identifier: Type, ...]` and initialized with parentheses: `(413, "hello", value: 40000)`. Syntax sugar allows for them to be declared with `()` as well. + +They are exclusively ordered - named types within tuples are just syntax sugar for positional access. Passing a fully unnamed tuple into a context that expects a tuple with a named parameter is allowed so long as the types line up in order. + +```puck +let grouping = (1, 2, 3) + +func foo: tuple[string, string] = ("hello", "world") +``` + +Tuples are particularly useful for "on-the-fly" types. Creating type aliases to tuples is discouraged - structs are generally a better choice for custom type declarations. + ### enums -Enums are *ordinal labels* that may have associated values. +Enums are *ordinal labels* that may have *associated values*. They are declared with `enum[Label, AnotherLabel = 4, ...]` and are never initialized (their values are known statically). 
-Enums may be accessed directly by their label, and are ordinal and iterable regardless of their associated value. +Enums may be accessed directly by their label, and are ordinal and iterable regardless of their associated value. They are useful in collecting large numbers of "magic values", that would otherwise be constants. ```puck type Keys = enum @@ -225,59 +227,63 @@ type Keys = enum B = "b" ``` -In the case of an identifier conflict (with other enum labels, or types, or...) they must be prefixed with the name of their associated type (separated by a dot). (this is standard for identifier conflicts: and is discussed in more detail in the [modules document](MODULES.md).) +In the case of an identifier conflict (with other enum labels, or types, or...) they must be prefixed with the name of their associated type (separated by a dot). This is standard for identifier conflicts: and is discussed in more detail in the [modules document](MODULES.md). ### unions -Unions are *tagged* type unions. They provide a high-level wrapper over an inner type that must be accessed via pattern matching. +Unions are *tagged* type unions. They provide a high-level wrapper over an inner type that must be safely accessed via pattern matching. -They are declared with `union[Variant: Type, ...]` and initialized with the name of a variant followed by its inner type constructor in brackets: `Square(side: 5)`. Tuple and struct types are special-cased to eliminate extraneous parentheses. +They are declared with `union[Variant(Type), ...]` and initialized with the name of a variant followed by its inner type constructor in brackets: `Square(side: 5)`. Tuples and structs are special-cased to eliminate extraneous parentheses. 
```puck -type Value = uint64 -type Ident = string +type Value = u64 +type Ident = str type Expr = ref union - Literal: Value - Variable: Ident - Abstraction: struct[param: Ident, body: Expr] - Application: struct[body, arg: Expr] - Conditional: struct - cond, then_case, else_case: Expr + Literal(Value) + Variable(Ident) + Abstraction(param: Ident, body: Expr) + Application(body: Expr, arg: Expr) + Conditional( + condition: Expr + then_case: Expr + else_case: Expr + ) ``` -They take up as much space in memory as the largest variant, plus the size of the tag (typically one byte). +They take up as much space in memory as the largest variant, plus the size of the tag (one byte). #### pattern matching Unions abstract over differing types. In order to *safely* be used, their inner types must be accessed via *pattern matching*: leaving no room for type confusion. Pattern matching in Puck relies on two syntactic constructs: the `match` statement, forcing qualification and handling of all possible types of a variable, and the `of` statement, querying type equality while simultaneously binding new identifiers to underspecified portions of variables. ```puck -import std/tables +use std.tables -func eval(expr: Expr, context: var HashTable[Ident, Value]): Result[Value] +func eval(context: mut HashTable[Ident, Value], expr: Expr): Result[Value] match expr - of Literal(value): value - of Variable(ident): context.get(ident) - of Application{body, arg}: - if body of Abstraction{param, body as inner_body}: - context.set(param, eval(arg)) - inner_body.eval(context) + of Literal(value): Okay(value) + of Variable(ident): + context.get(ident).err("Variable not in context") + of Application(body, arg): + if body of Abstraction(param, body as inner_body): + context.set(param, context.eval(arg)?) 
# from std.tables + context.eval(inner_body) else: - Error(InvalidExpr) - of Conditional{cond, then_case, else_case}: - if eval(cond, context): - then_case.eval(context) + Error("Expected Abstraction, found {}".fmt(body)) + of Conditional(condition, then_case, else_case): + if context.eval(condition)? == "true": + context.eval(then_case) else: - else_case.eval(context) - of _: - Error(InvalidExpr) + context.eval(else_case) + of expr: + Error("Invalid expression {}".fmt(expr)) ``` -The match statement takes exclusively a list of `of` sub-expressions, and checks for exhaustivity; but the `variable of Type(binding)` syntax can be reused as a conditional, in `if` statements and elsewhere. Each branch of a match expression can have a *guard*: an arbitrary conditional that must be met in order for it to match. Guards are written as `where cond` and immediately follow the last pattern in an `of` branch, preceding the colon. +The match statement takes exclusively a list of `of` sub-expressions, and checks for exhaustivity. The `expr of Type(binding)` syntax can be reused as a conditional, in `if` statements and elsewhere. The `of` *operator* is similar to the `is` operator in that it queries type equality, returning a boolean. However, unbound identifiers within `of` expressions are bound to appropriate values (if matched) and injected into the scope. This allows for succinct handling of `union` types in situations where `match` is overkill. -`match` expressions and `of` operators have some special rules. When matching unions with an inner product type (structs or tuples), external extraneous parenthesis are elided. todo: others? +Each branch of a match expression can also have a *guard*: an arbitrary conditional that must be met in order for it to match. Guards are written as `where cond` and immediately follow the last pattern in an `of` branch, preceding the colon. 
 ### interfaces
@@ -287,44 +293,44 @@ The `interface` type is composed of a list of function signatures that refer to
 ```puck
 type Stack[T] = interface
-  func push(self: var Self, val: T)
-  func pop(self: var Self): T
-  func peek(self: Self): T
+  push(self: mut Self, val: T)
+  pop(self: mut Self): T
+  peek(self: Self): T
 func takes_any_stack(stack: Stack[int]) =
   # only stack.push, stack.pop, and stack.peek are available methods
 ```
-Differing from Rust, Haskell, and many others, there is no explicit `impl` block. If there exists a function that matches one of an interface's signatures, it is considered to match and the interface typechecks. This may seem strange and ambiguous - but again, static typing and uniform function call syntax help make this reasonable. The purpose of explicit `impl` blocks in ex. Rust is two-fold: to explicitly group together associated code, but primarily to provide a limited form of uniform function call syntax. As any function can be called as either `foo(a, b)` or `a.foo(b)`, `impl` blocks would only serve to group together associated code: which is better done with modules.
+Differing from Rust, Haskell, and many others, there is no explicit `impl` block. If there exist functions for a type that satisfy all of an interface's signatures, it is considered to match and the interface typechecks. This may seem strange and ambiguous - but again, static typing and uniform function call syntax help make this a more reasonable design. The purpose of explicit `impl` blocks in ex. Rust is three-fold: to provide a limited form of uniform function call syntax; to explicitly group together associated code; and to disambiguate. UFCS provides for the first, the module system provides for the second, and the third is proposed to not matter.
-Interfaces cannot be constructed because they are not sized. They serve purely as a list of valid operations on a type within a context: no information about their memory layout is relevant. The concrete type fulfilling an interface is known at compile time, however, and so there are no issues surrounding interfaces as parameters, just when attempted to be used as (part of) a concrete type. They can be used as part of a concrete type with *indirection*, however: `type Foo = struct[a: int, b: ref interface[...]]` is perfectly valid.
+Interfaces cannot be constructed because they are **unsized**. They serve purely as a list of valid operations on a type within a context: no information about their memory layout is relevant. The concrete type fulfilling an interface is known at compile time, however, and so there are no issues surrounding interfaces as parameters, just when attempted to be used as (part of) a concrete type. They can be used as part of a concrete type with *indirection*, however: `type Foo = struct[a: int, b: ref interface[...]]` is perfectly valid.
-Interfaces compose with [modules](MODULES.md) to offer fine grained access control.
+Interfaces also *cannot* extend or rely upon other interfaces in any way. There is no concept of an interface extending an interface. There is no concept of a parameter satisfying two interfaces. In the author's experience, while such constructions are powerful, they are also an immense source of complexity, leading to less-than-useful interface hierarchies seen in languages like Java, and yes, Rust.
-todo: I have not decided whether the names of parameters is / should be relevant, or enforcable, or present. I'm leaning towards them not being present. But if they are enforcable, it makes it harder to implicitly implement the wrong interface. Design notes to consider: https://blog.rust-lang.org/2015/05/11/traits.html
-
-### generic types
+Instead, if one wishes to form an interface that *also* satisfies another interface, they must include all of the other interface's associated functions within the new interface. Given that interfaces overwhelmingly only have a handful of associated functions, and if you're using more than one interface you *really* should be using a concrete type, the hope is that this will provide explicitness.
-Types, like functions, can be *generic*: defined for an unknown arbitrary type, monomorphized at compile time. Indeed, we have already seen them before: in the above interface example. The syntax much follows the syntax for generic functions.
+<!-- While functions are the primary way of performing operations on types, they are not the only way, and listing all explicitly can be painful - instead, it can be desired to be able to *associate a type* and any field access or existing functions on that type with the interface. todo: i have not decided on the syntax for this yet. -->
-```puck
-```
-
-### distinct types
+Interfaces compose with [modules](MODULES.md) to offer fine grained access control.
-Puck is a structurally typed language. If you have `type Foo = struct[a: int]` and `type Bar = struct[a, b: int]`, passing a term of type `Bar` into a context that expects `Foo` is allowed: even encouraged. This usage-based paradigm is a consistent underlying theme of Puck.
+<!-- todo: I have not decided whether the names of parameters is / should be relevant, or enforcable, or present. I'm leaning towards them not being present. But if they are enforcable, it makes it harder to implicitly implement the wrong interface. Design notes to consider: https://blog.rust-lang.org/2015/05/11/traits.html -->
-`distinct` types break this coercion convention. They are primarily useful for extended (type,\_) safety: a `type SqlStr = distinct str` provides an extra barrier to SQL injections, for example, or a `type Point = distinct struct[x, y: int]` can block confusion with a `type Rectangle = distinct struct[x, y: int]`.
+### type aliases and distinct types
-`distinct` types can still be *explicitly* converted to an equivalent type with `as`.
+Any type can be declared as an *alias* to a type simply by assigning it to such. All functions defined on the original type carry over, and functions expecting one type may receive the other with no issues.
-### type aliases
+```puck
+type Float = float
+```
-Finally, any type can be declared as an *alias* to a type simply by assigning it to such.
+It is no more than an alias. When explicit conversion between types is desired and functions carrying over is undesired, `distinct` types may be used.
+```puck
+type MyFloat = distinct float
+let foo: MyFloat = MyFloat(192.68)
 ```
-type MyFloat = float
-```
+
+Types then must be explicitly converted via constructors.
 ## Errata
@@ -339,19 +345,18 @@ But always explicitly initializing types is syntactically verbose, and so most t
 - `float`, etc: `0.0`
 - `char`: `'\0'`
 - `str`: `""`
-- `void`, `never`: n/a
+- `void`, `never`: unconstructable
 - `array[T]`, `list[T]`: `[]`
 - `set[T]`, `table[T, U]`: `{}`
 - `tuple[T, U, ...]`: `(default values of its fields)`
 - `struct[T, U, ...]`: `{default values of its fields}`
-- `enum[One, Two, ...]`: **disallowed**
+- `enum[One, Two, ...]`: `<first label>`
 - `union[T, U, ...]`: **disallowed**
 - `slice[T]`, `func`: **disallowed**
 - `ref`, `ptr`: **disallowed**
-For `ref` and `ptr` types, however, this is trickier. There is no reasonable "default" for these types *aside from* null.
-Instead of giving in, the compiler will instead disallow any non-initializations or other cases in which a default value would be inserted.
-(`slice[T]` and `func` types are references under the hood and also behave in this way)
+For unions, slices, references, and pointers, this is a bit trickier. They all have no reasonable "default" for these types *aside from* null.
+Instead of giving in, the compiler instead disallows any non-initializations or other cases in which a default value would be inserted.
 todo: consider user-defined defaults (ex. structs)
@@ -360,13 +365,13 @@ todo: consider user-defined defaults (ex. structs)
 Mention of subtyping has been on occasion in contexts surrounding structural type systems, particularly the section on distinct types, but no explicit description of what the subtyping rules are have been given.
 Subtyping is the implicit conversion of compatible types, usually in a one-way direction. The following types are implicitly convertible:
-- `uint ==> int`
-- `int ==> float`
-- `uint ==> float`
-- `string ==> list[char]` (the opposite no, use `pack`)
-- `array[T; n] ==> list[T]`
-- `struct[a: T, b: U, ...] ==> struct[a: T, b: U]`
-- `union[A: T, B: U] ==> union[A: T, B: U, ...]`
+- `uint` ==> `int`
+- `int` ==> `float`
+- `uint` ==> `float`
+- `string` ==> `list[char]` (the opposite no, use `pack`)
+- `array[T; n]` ==> `list[T]`
+- `struct[a: T, b: U, ...]` ==> `struct[a: T, b: U]`
+- `union[A: T, B: U]` ==> `union[A: T, B: U, ...]`
 ### inheritance |