From c911dc4ac89d07ec2af44fd8e30c6ceb5562ab47 Mon Sep 17 00:00:00 2001 From: JJ Date: Tue, 11 Jul 2023 21:32:29 -0700 Subject: move docs into docs folder and update the readme --- ASYNC.md | 17 ------ BASIC.md | 120 ------------------------------------------- ERRORS.md | 24 --------- README.md | 33 +++++++----- TYPES.md | 151 ------------------------------------------------------ docs/ASYNC.md | 17 ++++++ docs/BASIC.md | 120 +++++++++++++++++++++++++++++++++++++++++++ docs/ERRORS.md | 24 +++++++++ docs/TYPES.md | 159 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 9 files changed, 339 insertions(+), 326 deletions(-) delete mode 100644 ASYNC.md delete mode 100644 BASIC.md delete mode 100644 ERRORS.md delete mode 100644 TYPES.md create mode 100644 docs/ASYNC.md create mode 100644 docs/BASIC.md create mode 100644 docs/ERRORS.md create mode 100644 docs/TYPES.md diff --git a/ASYNC.md b/ASYNC.md deleted file mode 100644 index 5b9fa7e..0000000 --- a/ASYNC.md +++ /dev/null @@ -1,17 +0,0 @@ -# Asynchronous Programming - -I don't know enough about asynchronous programming to get started with this section. - -Existing systems to learn from: -- https://github.com/status-im/nim-chronos -- https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/ -- https://tokio.rs/tokio/tutorial -- https://morestina.net/blog/1686/rust-async-is-colored -- https://ziglearn.org/chapter-5/ -- https://kristoff.it/blog/zig-colorblind-async-await/ -- https://en.wikipedia.org/wiki/Async/await -- https://old.reddit.com/r/elixir/np688d/ - -Asynchronous programming is hard to design and hard to use. Even Rust doesn't do a great job. It *shouldn't* need built-in language support - we should be able to encode it as a type and provide any special syntax via macros. Note that async is not just threading! threading is solved well by Rust's rayon and Go's (blugh) goroutines. - -Is async worth having separate from effects? 
diff --git a/BASIC.md b/BASIC.md deleted file mode 100644 index 47c13c8..0000000 --- a/BASIC.md +++ /dev/null @@ -1,120 +0,0 @@ -# Basic Usage of Puck - -```puck -``` - -Mutable variables are declared with `var`. -Immutable variables are declared with `let`. -Compile-time evaluated immutable variables are declared with `const`. - -Comments are declared with `#`. -Documentation comments are declared with `##`. -Multi-line comments are declared with `#[ ]#` and may be nested. - -Type annotations on variable declarations follow the name with `: Type` and are typically optional. The compiler is quite capable of variable inference. - -The type system is comprehensive, and complex enough to [warrant its own document](TYPES.md). - -```puck -``` - -Functions are declared with the `func` keyword, followed by the function name, followed by an (optional) list of parameters surrounded in parenthesis, followed by a type annotation. Functions may be prefixed with one or more of the following modifiers: -- `pub`: exports the function for use by external files -- `pure`: denotes a function as a "pure function", lacking side effects, i.e. IO or nondeterminism or parameter mutability -- `yeet`: denotes a function as a "throwing function", meaning it may raise exceptions. -- `async`: marks a function as asynchronous which may only be called by other asynchronous functions or with the `await` keyword - - - - - -A list of parameters, surrounded by parentheses and separated by commas, may follow the function name. These are optional and a function with no parameters may be followed with `()` or simply nothing at all. More information on function parameters (and return types) is available in the [type system overview](TYPES.md). - -Type annotations on function declarations follow the name and parameters (if any) with `: Type` and are typically required. The compiler is not particularly capable of function type inference (and it is good practice to annotate them anyway). 
- -Uniform function call syntax (UFCS) is supported: and so arbitrary functions with compatible types may be chained with no more than the `.` operator. - - - -```puck -``` - -Boolean logic and integer operations are standard and as one would expect out of a typed language: `and`, `or`, `xor`, `not`, `shl`, `shr`, `+`, `-`, `*`, `/`, `<`, `>`, `<=`, `>=`, `div`, `mod`, `rem`. Notably: -- the words `and`/`or`/`not`/`shl`/`shr` are used instead of the symbolic `&&`/`||`/`!`/`<<`/`>>` -- integer division is expressed with the keyword `div` while floating point division uses `/` -- `%` is absent and replaced with distinct modulus and remainder operators -- boolean operators are bitwise and also apply to integers and floats -- more operators are available via the standard library - -Term in/equality is expressed with `==` and `!=`. Type in/equality is expressed with `is` and `isnot` (more on this in the [types document](TYPES.md)). Set logic is expressed with `in` and `notin`, and is applicable to not just sets but collections of any sort. - -String concatenation uses `&` rather than overloading the `+` operator (as the complement `-` has no natural meaning for strings). Strings are also unified and mutable. More details can be found in the [type system overview](TYPES.md). - -```puck -``` - -Basic conditional control flow is standard via `if`, `elif`, and `else` statements. - -There is a distinction between statements, which do not produce a value but rather only execute computations, and expressions, which evaluate to a value. Several control flow constructs - conditionals, block statements, and pattern matches - may be used as both statements and expressions. - -The special `discard` statement allows for throwing an expression's value away. On its own, it provides a no-op. All (non-void) expressions must be handled: however, a non-discarded expression at the end of a scope functions as an implicit return. This allows for significant syntactic reduction. 
- -```puck -``` - -Three types of loops are available: `while` loops, `for` loops, and infinite loops (`loop` loops). While loops take a condition that is executed upon the beginning of each iteration to determine whether to keep looping. For loops take a binding (which may be structural, see pattern matching) and an iterable object and will loop until the iterable object is spent. Infinite loops are, well, infinite and must be manually broken out of. - -There is no special concept of iterators: iterable objects are any object that implements the Iterable interface (more on those in [the type system document](TYPES.md)), that is, provides a `self.next()` function returning an Optional type. For loops desugar to while loops that unwrap the result of the `next()` function and end iteration upon a `None` value. While loops, in turn, desugar to infinite loops with an explicit conditional break. - -The `break` keyword immediately breaks out of the current loop. -The `continue` keyword immediately jumps to the next iteration of the current loop. -Loops may be used in conjunction with blocks for more fine-grained control flow manipulation. - -```puck -``` - -Blocks provide arbitrary scope manipulation. They may be labelled or unlabelled. The `break` keyword additionally functions inside of blocks and without any parameters will jump out of the current enclosing block (or loop). It may also take a block label as a parameter for fine-grained scope control. - -All forms of control flow ultimately desugar to continuations: https://github.com/nim-works/cps/tree/master/docs - -```puck -``` - -Exhaustive structural pattern matching is available, particularly useful for tagged unions, and discussed in detail in the [types document](TYPES.md). This is frequently a better alternative to a series of `if` statements. - -```puck -``` - -I am undecided on how the import/module system will work and particularly how it will play into the type system. UFCS *will* be supported. 
todo - -```puck -``` - -Compile-time programming may be done via the previously-mentioned `const` keyword: or via `static` blocks. All code within a `static` block is evaluated at compile-time and all assignments made are propagated to the compiled binary. As a result, `static` blocks are only available in the global context (not within functions). - -Compile-time programming may also be intertwined in the codebase with the use of the `when` statement. It functions similarly to `if`, but may only take a static operation as its parameter, and will directly replace code accordingly at compile-time. The `else` statement is overloaded to complement this. - -```puck -``` - -Metaprogramming is done via compile-time introspection on the abstract syntax tree. -Two distinct language constructs of differing complexity are provided: templates for raw substitution, and macros for direct manipulation of the abstract syntax tree. These are complex, and more details may be found in the [metaprogramming document](METAPROGRAMMING.md). - -```puck -``` - -Error handling is typically done via explicitly matching upon Optional and Result values (with the help of the `?` operator), but such functions can be made to explicitly throw exceptions (which may then be caught via `try`/`catch`/`finally` or thrown with `raise`) with the help of the `!` operator. This is complex and necessarily verbose, although a bevy of helper functions and syntactic sugar are available to ease the pain. More details may be found in [error handling overview](ERRORS.md). - -```puck -``` - -Threading support is complex and regulated to external libraries (with native syntax via macros). OS-provided primitives will likely provide a `spawn` function, and there will be substantial restrictions for memory safety. I haven't thought much about this. - -Async support is complex and relegated to external libraries (with native syntax via macros). More details may be found in the [async document](ASYNC.md). 
It is likely that this will look like Zig, with `async`/`await`/`suspend`/`resume`. - -Effects are complex and relegated to external libraries (with native syntax via macros). More details may be found in the [effects document](EFFECTS.md). - -```puck -``` - -Details on memory safety, references and pointers, and deep optimizations may be found in the [memory management overview](MEMORY_MANAGEMENT.md). This intertwines deeply with the [type system](TYPES.md). diff --git a/ERRORS.md b/ERRORS.md deleted file mode 100644 index 4a4b206..0000000 --- a/ERRORS.md +++ /dev/null @@ -1,24 +0,0 @@ -# Error Handling - -Error handling should perhaps be abstracted into a more general effects system. -But if not, then this document lays out some potential ideas. - ---- - -```puck -``` - -Puck provides [`Option[T]`](std/default/options.pk) and a [`Result[T, E]`](std/default/results.pk) types, imported by default. These are `union` types and so must be pattern matched upon to be useful: but the standard library provides a bevy of helper functions. - -Two in particular are of note. The `?` operator unwraps a Result or propagates its error up a function call. The `!` operator unwraps an Option or Result directly or throws an exception in the case of None or Error. - -```puck -``` - -Errors raised by the `!` operator must be explicitly caught and handled via a `try/catch/finally` statement. - -If an exception is not handled within a function body, the function must be explicitly marked as a throwing function via the `yeet` prefix (final name to be determined). The compiler will statically determine which exceptions in particular are thrown from any given function. 
- -This creates a distinction between two types of error handling, working in sync: functional error handling with [Option](https://en.wikipedia.org/wiki/Option_type) and [Result](https://en.wikipedia.org/wiki/Result_type) types, and object-oriented error handling with [nullable types](https://en.wikipedia.org/wiki/Nullable_type) and [exceptions](https://en.wikipedia.org/wiki/Exception_handling). These styles may be swapped between with minimal syntax overhead. Libraries, however, should universally use Options and Results, as this provides best for both usages. - -References: [std/options](std/default/options.pk), [std/results](std/default/results.pk), [Error Handling in Swift](https://docs.swift.org/swift-book/documentation/the-swift-programming-language/errorhandling) (shamelessly stolen) diff --git a/README.md b/README.md index cce1336..67080fd 100644 --- a/README.md +++ b/README.md @@ -3,10 +3,13 @@ A place where I can make some bad decisions. Puck is an experimental, memory safe, strongly typed, multi-paradigm programming language. -It aims to be clean and succinct while performant: having the ease of use of [Python](https://www.python.org/) with the performance/safety guarantees of [Rust](https://www.rust-lang.org/) and the flexibility/metaprogramming of [Nim](https://nim-lang.org/). +It aims to be clean and succinct while performant: having the flexibility/metaprogramming of [Nim](https://nim-lang.org/) with the performance/safety guarantees of [Rust](https://www.rust-lang.org/) and the error handling of [Swift](https://www.swift.org/). You may judge for yourself if Puck meets these ideals. +```puck +``` + ## Why Puck? Puck is primarily a testing ground and should not be used in any important capacity. 
@@ -20,23 +23,25 @@ That said: in the future, once somewhat stabilized, reasons why you *would* use - The **interop system**, allowing foreign functions to be usable with native semantics from a bevy of popular languages +This is the language I keep in my head. It sprung from a series of unstructured notes I kept on language design, that finally became something more comprehensive in early 2023. The overarching goal is to provide a language capable of elegantly expressing any problem, and explore ownership and interop along the way. + ## How do I learn more? -- The [basic usage](BASIC.md) document lays out the fundamental grammar of Puck. -- The [syntax](SYNTAX.md) document provides a deeper and formal look into the syntax choices made. -- The [type system](TYPES.md) document gives an in-depth analysis of Puck's extensive type system. -- The [memory management](MEMORY_MANAGEMENT.md) document gives an overview of Puck's memory model. -- The [metaprogramming](METAPROGRAMMING.md) document explains how using metaprogramming to extend the language works. -- The [asynchronous](ASYNC.md) document gives an overview of the intertwining of Puck's asynchronous support with other language features. -- The [effect system](EFFECTS.md) document gives a description of how Puck's effect handler system works. -- The [interop](INTEROP.md) document gives an overview of how the first-class language interop system works. -- The [modules](MODULES.md) document provides a more detailed look at imports and how they relate to the type system. -- The [standard library](STDLIB.md) document provides an overview and examples of usage of the standard library. -- The [roadmap](ROADMAP.md) provides a clear view of the current state and future plans of the language's development. +- The [basic usage](docs/BASIC.md) document lays out the fundamental grammar of Puck. +- The [syntax](docs/SYNTAX.md) document provides a deeper and formal look into the syntax choices made. 
+- The [type system](docs/TYPES.md) document gives an in-depth analysis of Puck's extensive type system. +- The [memory management](docs/MEMORY_MANAGEMENT.md) document gives an overview of Puck's memory model. +- The [metaprogramming](docs/METAPROGRAMMING.md) document explains how using metaprogramming to extend the language works. +- The [asynchronous](docs/ASYNC.md) document gives an overview of the intertwining of Puck's asynchronous support with other language features. +- The [interop](docs/INTEROP.md) document gives an overview of how the first-class language interop system works. +- The [modules](docs/MODULES.md) document provides a more detailed look at imports and how they relate to the type system. + +- The [standard library](docs/STDLIB.md) document provides an overview and examples of usage of the standard library. +- The [roadmap](docs/ROADMAP.md) provides a clear view of the current state and future plans of the language's development. These are best read in order. -Note that all of these documents (and parts of this README) are written as if everything already exists. Nothing already exists! You can see the [roadmap](ROADMAP.md) for an actual sense as to the state of the language. I simply found writing in the present tense to be an easier way to collect my thoughts. +Note that all of these documents (and parts of this README) are written as if everything already exists. Nothing already exists! You can see the [roadmap](docs/ROADMAP.md) for an actual sense as to the state of the language. I simply found writing in the present tense to be an easier way to collect my thoughts. ## Acknowledgements @@ -44,4 +49,4 @@ First and foremost, this language is *heavily* inspired by Nim. Many ideas - gen The error handling model, and purity system, were essentially directly lifted from Swift (and to an extent, Nim). The underlying type system is mostly copied from Rust, with significant changes to the interface (trait) and module system. 
The memory model is based upon similar successful models in Lobster, Nim, and Rust. Performance annotations are somewhat inspired by Nim. -The effects system is unique, with inspiration from the few languages successfully implementing effects systems, namely Koka and Unison. + diff --git a/TYPES.md b/TYPES.md deleted file mode 100644 index 6950f78..0000000 --- a/TYPES.md +++ /dev/null @@ -1,151 +0,0 @@ -# Typing in Puck - -Puck has a comprehensive static type system. - -## Basic types - -Basic types can be one-of: -- `bool`: internally an enum. -- `int`: integer number. x bits of precision by default. - - `uint`: unsigned integer for more precision. - - `i8`, `i16`, `i32`, `i64`, `i28`: specified integer size - - `u8`, `u16`, `u32`, `u64`, `u128`: specified integer size -- `float`: floating-point number. - - `f32`, `f64`: specified float sizes -- `char`: a distinct 0-127 character. For working with ascii. -- `rune`: a Unicode character. -- `str`: a string type. mutable. internally a char-array? must also support unicode. -- `void`: an internal type designating the absence of a value. - - possibly, the empty tuple. then would `empty` be better? -- `never`: a type that denotes functions that do not return. distinct from returning nothing. - - the bottom type. - -`bool`, `int`/`uint` and siblings, `float` and siblings, `char`, and `rune` are all considered **primitive types** and are _always_ [[copied]] (unless passed as `var`). - -Basic types as a whole include the primitive types, as well as `str`, `void`, and `never`. Basic types can further be broken down into the following categories: -- boolean types: `bool` -- numeric types: `int`, `float`, and siblings -- textual types: `char`, `rune`, `str` -- funky types: `void`, `never` - -Funky types will rarely be referenced by name: instead, the absence of a type typically implicitly denotes one or the other. Still, having a name is helpful in some situations: like with [[concepts]]. 
- -## Function types - -Functions can also be types. -- `func(T, U): V`: denotes a type that is a function taking arguments of type T and U and returning a value of type V. - - The syntactical sugar `(T, U) -> (V)` is available, to consolidate type declarations and disambiguate when dealing with many `:`s. Is this a good idea? should i universally use `:` or `->`? - -## Container types - -Container types, broadly speaking, are types that contain other types. These exclude the types in [[advanced types]]. - -### Iterable types - -Iterable types can be one-of: -- `array[S, T]`: Static arrays. Can only contain one type `T`. Of size `S` and cannot grow/shrink. - - Initialize in-place with `array(a, b, c)`. Should we do this? otherwise `[a, b, c]`. -- `list[T]`: Dynamic arrays. Can only contain one type `T`. May grow/shrink dynamically. - - Initialize in-place with `list(a, b, c)`. Should we do this? otherwise `@[a, b, c]`. -- `slice[T]`: Slices. Used to represent a "view" into some sequence of elements of type `T`. - - Cannot be directly constructed. May be initialized from an array, list, or string, or may be used as a generic parameter on functions (more on that later). - - Slices cannot grow/shrink. Their elements may be accessed and mutated. As they are underlyingly a reference to an array or list, they **must not** outlive the data they reference. -- `str`: Strings. Contain the `rune` type or alternatively `char`s or `bytes`?? {undecided} - -All of these above types are some sort of sequence: and so have a length, and so can be _iterated_. -For convenience, a special `iterable` generic type is defined for use in parameters: that abstracts over all of the container types. This `iterable` type is also extended to any collection with a length of a single type (and also tuples). It is functionally equivalent to the `openarray` type in Nim: but hopefully a bit more powerful? -- Aside: how do we implement this? 
rust-style (impl `iter()`), or monomorphize the hell out of it? i think compiler magic is the way to go for specifically this... -- Aside: `iterable` may need a better name. it implies iterators right now which it is distinctly Unrelated to. unless i don't have iterators? that may be the way to go... - -Elements of container types can be accessed by the `container[index]` syntax. Slices of container types can be accessed by the `container[lowerbound..upperbound]` syntax. Slices of non-consecutive elements can be accessed by the `container[a,b,c..d]` syntax, and indeed, the previous example expands to these. They can also be combined: `container[a,b,c..d]`. -- Aside: take inspiration from Rust here? they make it really safe if a _little_ inconvenient - -### Abstract data types - -There are an additional suite of related types: abstract data types. While falling under container types, these do not have a necessarily straightforward or best implementation, and so multiple implementations are provided. - -Abstract data types can be one-of: -- `set[T]`: high-performance sets implemented as a bit array. - - These have a maximum data size, at which point the compiler will suggest using a `HashSet[T]` instead. -- `table[T, U]`: simple symbol tables implemented as an association list. - - These do not have a maximum size. However, at some point the compiler will suggest using a `HashTable[T, U]` instead. -- `HashSet[T]`: standard hash sets. -- `HashTable[T, U]`: standard hash tables. - -Unlike iterable types, abstract data types are not iterable by default: as they are not ordered, and thus, it is not clear how they should be iterated. Despite this: for utility purposes, an `elems()` iterator based on a normalization of the elements is provided for `set` and `HashSet`, and `keys()`, `values()`, and `pairs()` iterators are provided for `table` and `HashTable` based on a normalization of the keys. 
This is deterministic to prevent user reliance on shoddy randomization, see Golang. - -## Parameter types - -Some types are only valid when being passed to a function, or in similar contexts. -No variables may be assigned these types, nor may any function return them. -These are monomorphized into more specific functions at compile-time if needed. - -Parameter types can be one-of: -- generic: `func foo[T](a: list[T], b: T)`: The standard implementation of generics, where a parameter's exact type is not listed, and instead statically dispatched based on usage. -- constrained: `func foo(a: str | int | float)`: A basic implementation of generics, where a parameter can be one-of several listed types. Makes for particularly straightforward monomorphization. - - Separated with the bitwise or operator `|` rather than the symbolic or `||` or a raw `or` to give the impression that there isn't a corresponding "and" operation (the `&` operator is preoccupied with strings). -- mutable: `func foo(a: var str)`: Denotes the mutability of a parameter. Parameters are immutable by default. - - Passed as a `ref` if not one already, and marked mutable. -- a built-in typeclass: `func foo[T](a: slice[T])`: Included, special typeclasses for being generic over [[advanced types]]. - - Of note is how `slice[T]` functions: it is generic over `lists` and `arrays` of any length. - -### Generic types - -Functions can take a _generic_ type, that is, be defined for a number of types at once: - -``` -func add[T](a: list[T], b: T) = - return a.add(b) - -func length[T](a: T) = - return a.len # monomorphizes based on usage. - # lots of things use .len, but only a few called by this do. - # throws a warning if exported for lack of specitivity. - -func length[T: str | list](a: T) = - return a.len -``` - -The syntax for generics is `func`, `ident`, followed by the names of the generic parameters in brackets `[T, U, V]`, followed by the function's parameters (which may refer to the generic types). 
-Generics are replaced with concrete types at compile time (monomorphization) based on their usage in functions within the main function body. - -Constrained generics have two syntaxes: the constraint can be directly on a parameter, leaving off the `[T]` box, or it may be defined within the box as `[T: int | float]` for easy reuse in the parameters. - -## Reference types - -Types are typically constructed as value on the stack. That is, without any level of indirection: and so type declarations that recursively refer to one another would not be allowed. However, Puck provides two avenues for indirection. - -Reference types can be one-of: -- `ref T`: An automatically-managed reference to type `T`. -- `ptr T`: A manually-managed pointer to type `T`. (very) unsafe. - -In addition, `var T` may somewhat be considered a reference type as it may implicitly create a `ref` for mutability if the type is not already `ref`: but it is only applicable on parameters. - -``` -type Node = ref struct - left: Node - right: Node - -type AnotherNode = struct - left: ref AnotherNode - right: ref AnotherNode - -type BinaryTree = ref struct - left: BinaryTree - right: BinaryTree -``` - -The compiler abstracts over `ref` types to provide optimization for reference counts: and so neither a distinction between `Rc`/`Arc`/`Box`, nor a `*` dereference operator is needed. -Much care has been given to make references efficient and safe, and so `ptr` should be avoided if at all possible. The compiler will yell at you if you use it (or any other unsafe features). -These types are delved into in further detail in the section on memory management. - -## Advanced Types - -The `type` keyword is used to declare custom types. - -Custom types can be one-of: -- `tuple`: An ordered collection of types. Optionally named. -- `struct`: An unordered, named collection of types. May have default values. -- `enum`: Powerful algebraic data types a la Rust. -- `concept`: typeclasses. 
they have some unique syntax -- `distinct`: a type that must be explicitly converted diff --git a/docs/ASYNC.md b/docs/ASYNC.md new file mode 100644 index 0000000..5b9fa7e --- /dev/null +++ b/docs/ASYNC.md @@ -0,0 +1,17 @@ +# Asynchronous Programming + +I don't know enough about asynchronous programming to get started with this section. + +Existing systems to learn from: +- https://github.com/status-im/nim-chronos +- https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/ +- https://tokio.rs/tokio/tutorial +- https://morestina.net/blog/1686/rust-async-is-colored +- https://ziglearn.org/chapter-5/ +- https://kristoff.it/blog/zig-colorblind-async-await/ +- https://en.wikipedia.org/wiki/Async/await +- https://old.reddit.com/r/elixir/np688d/ + +Asynchronous programming is hard to design and hard to use. Even Rust doesn't do a great job. It *shouldn't* need built-in language support - we should be able to encode it as a type and provide any special syntax via macros. Note that async is not just threading! threading is solved well by Rust's rayon and Go's (blugh) goroutines. + +Is async worth having separate from effects? diff --git a/docs/BASIC.md b/docs/BASIC.md new file mode 100644 index 0000000..e266b29 --- /dev/null +++ b/docs/BASIC.md @@ -0,0 +1,120 @@ +# Basic Usage of Puck + +```puck +``` + +Mutable variables are declared with `var`. +Immutable variables are declared with `let`. +Compile-time evaluated immutable variables are declared with `const`. + +Comments are declared with `#`. +Documentation comments are declared with `##`. +Multi-line comments are declared with `#[ ]#` and may be nested. + +Type annotations on variable declarations follow the name with `: Type` and are typically optional. The compiler is quite capable of variable inference. + +The type system is comprehensive, and complex enough to [warrant its own document](TYPES.md). 
+ +```puck +``` + +Functions are declared with the `func` keyword, followed by the function name, followed by an (optional) list of parameters surrounded by parentheses, followed by a type annotation. Functions may be prefixed with one or more of the following modifiers: +- `pub`: exports the function for use by external files +- `pure`: denotes a function as a "pure function", lacking side effects, i.e. IO or nondeterminism or parameter mutability +- `yeet`: denotes a function as a "throwing function", meaning it may raise exceptions. +- `async`: marks a function as asynchronous which may only be called by other asynchronous functions or with the `await` keyword + + + + + +A list of parameters, surrounded by parentheses and separated by commas, may follow the function name. These are optional and a function with no parameters may be followed with `()` or simply nothing at all. More information on function parameters (and return types) is available in the [type system overview](TYPES.md). + +Type annotations on function declarations follow the name and parameters (if any) with `: Type` and are typically required. The compiler is not particularly capable of function type inference (and it is good practice to annotate them anyway). + +Uniform function call syntax (UFCS) is supported: and so arbitrary functions with compatible types may be chained with no more than the `.` operator. + + + +```puck +``` + +Boolean logic and integer operations are standard and as one would expect out of a typed language: `and`, `or`, `xor`, `not`, `shl`, `shr`, `+`, `-`, `*`, `/`, `<`, `>`, `<=`, `>=`, `div`, `mod`, `rem`.
Notably: +- the words `and`/`or`/`not`/`shl`/`shr` are used instead of the symbolic `&&`/`||`/`!`/`<<`/`>>` +- integer division is expressed with the keyword `div` while floating point division uses `/` +- `%` is absent and replaced with distinct modulus and remainder operators +- boolean operators are bitwise and also apply to integers and floats +- more operators are available via the standard library + +Term in/equality is expressed with `==` and `!=`. Type in/equality is expressed with `is` and `isnot` (more on this in the [types document](TYPES.md)). Set logic is expressed with `in` and `notin`, and is applicable to not just sets but collections of any sort. + +String concatenation uses `&` rather than overloading the `+` operator (as the complement `-` has no natural meaning for strings). Strings are also unified and mutable. More details can be found in the [type system overview](TYPES.md). + +```puck +``` + +Basic conditional control flow is standard via `if`, `elif`, and `else` statements. + +There is a distinction between statements, which do not produce a value but rather only execute computations, and expressions, which evaluate to a value. Several control flow constructs - conditionals, block statements, and pattern matches - may be used as both statements and expressions. + +The special `discard` statement allows for throwing an expression's value away. On its own, it provides a no-op. All (non-void) expressions must be handled: however, a non-discarded expression at the end of a scope functions as an implicit return. This allows for significant syntactic reduction. + +```puck +``` + +Three types of loops are available: `while` loops, `for` loops, and infinite loops (`loop` loops). While loops take a condition that is executed upon the beginning of each iteration to determine whether to keep looping. For loops take a binding (which may be structural, see pattern matching) and an iterable object and will loop until the iterable object is spent. 
Infinite loops are, well, infinite and must be manually broken out of. + +There is no special concept of iterators: iterable objects are any object that implements the Iterable interface (more on those in [the type system document](TYPES.md)), that is, provides a `self.next()` function returning an Optional type. For loops desugar to while loops that unwrap the result of the `next()` function and end iteration upon a `None` value. While loops, in turn, desugar to infinite loops with an explicit conditional break. + +The `break` keyword immediately breaks out of the current loop. +The `continue` keyword immediately jumps to the next iteration of the current loop. +Loops may be used in conjunction with blocks for more fine-grained control flow manipulation. + +```puck +``` + +Blocks provide arbitrary scope manipulation. They may be labelled or unlabelled. The `break` keyword additionally functions inside of blocks and without any parameters will jump out of the current enclosing block (or loop). It may also take a block label as a parameter for fine-grained scope control. + +All forms of control flow ultimately desugar to continuations: https://github.com/nim-works/cps/tree/master/docs + +```puck +``` + +Exhaustive structural pattern matching is available and particularly useful for tagged unions. This is frequently a better alternative to a series of `if` statements. + +```puck +``` + +I am undecided on how the import/module system will work and particularly how it will play into the type system. UFCS *will* be supported. todo + +More details may be found in the [modules document](MODULES.md). + +```puck +``` + +Compile-time programming may be done via the previously-mentioned `const` keyword: or via `static` blocks. All code within a `static` block is evaluated at compile-time and all assignments made are propagated to the compiled binary. As a result, `static` blocks are only available in the global context (not within functions). 
+ +Compile-time programming may also be intertwined in the codebase with the use of the `when` statement. It functions similarly to `if`, but may only take a static operation as its parameter, and will directly replace code accordingly at compile-time. The `else` statement is overloaded to complement this. + +Further compile-time programming may be done via metaprogramming: compile-time introspection on the abstract syntax tree. +Two distinct language constructs of differing complexity are provided: templates for raw substitution, and macros for direct manipulation of the abstract syntax tree. These are complex, and more details may be found in the [metaprogramming document](METAPROGRAMMING.md). + +```puck +``` + +Error handling is typically done via explicitly matching upon Optional and Result values (with the help of the `?` operator), but such functions can be made to explicitly throw exceptions (which may then be caught via `try`/`catch`/`finally` or thrown with `raise`) with the help of the `!` operator. This is complex and necessarily verbose, although a bevy of helper functions and syntactic sugar are available to ease usage. More details may be found in the [error handling overview](ERRORS.md). + +```puck +``` + +Threading support is complex and relegated to external libraries (with native syntax via macros). OS-provided primitives will likely provide a `spawn` function, and there will be substantial restrictions for memory safety. I haven't thought much about this. + +Async support is complex and relegated to external libraries (with native syntax via macros). More details may be found in the [async document](ASYNC.md). It is likely that this will look like Zig, with `async`/`await`/`suspend`/`resume`. + +Effects are complex and relegated to external libraries (with native syntax via macros). More details may be found in the [effects document](EFFECTS.md).
+ +```puck +``` + +Details on memory safety, references and pointers, and deep optimizations may be found in the [memory management overview](MEMORY_MANAGEMENT.md). +The memory model intertwines deeply with the type system. diff --git a/docs/ERRORS.md b/docs/ERRORS.md new file mode 100644 index 0000000..4a4b206 --- /dev/null +++ b/docs/ERRORS.md @@ -0,0 +1,24 @@ +# Error Handling + +Error handling should perhaps be abstracted into a more general effects system. +But if not, then this document lays out some potential ideas. + +--- + +```puck +``` + +Puck provides [`Option[T]`](std/default/options.pk) and [`Result[T, E]`](std/default/results.pk) types, imported by default. These are `union` types and so must be pattern matched upon to be useful: but the standard library provides a bevy of helper functions. + +Two in particular are of note. The `?` operator unwraps a Result or propagates its error up to the caller. The `!` operator unwraps an Option or Result directly or throws an exception in the case of None or Error. + +```puck +``` + +Errors raised by the `!` operator must be explicitly caught and handled via a `try/catch/finally` statement. + +If an exception is not handled within a function body, the function must be explicitly marked as a throwing function via the `yeet` prefix (final name to be determined). The compiler will statically determine which exceptions in particular are thrown from any given function. + +This creates a distinction between two types of error handling, working in sync: functional error handling with [Option](https://en.wikipedia.org/wiki/Option_type) and [Result](https://en.wikipedia.org/wiki/Result_type) types, and object-oriented error handling with [nullable types](https://en.wikipedia.org/wiki/Nullable_type) and [exceptions](https://en.wikipedia.org/wiki/Exception_handling). These styles may be swapped between with minimal syntax overhead.
Libraries, however, should universally use Options and Results, as this best serves both usages. + +References: [std/options](std/default/options.pk), [std/results](std/default/results.pk), [Error Handling in Swift](https://docs.swift.org/swift-book/documentation/the-swift-programming-language/errorhandling) (shamelessly stolen) diff --git a/docs/TYPES.md b/docs/TYPES.md new file mode 100644 index 0000000..68c02d0 --- /dev/null +++ b/docs/TYPES.md @@ -0,0 +1,159 @@ +# Typing in Puck + +Puck has a comprehensive static type system. + +## Basic types + +Basic types can be one-of: +- `bool`: internally an enum. +- `int`: integer number. x bits of precision by default. + - `uint`: unsigned integer for more precision. + - `i8`, `i16`, `i32`, `i64`, `i128`: specified integer size + - `u8`, `u16`, `u32`, `u64`, `u128`: specified integer size +- `float`: floating-point number. + - `f32`, `f64`: specified float sizes +- `char`: a distinct 0-127 character. For working with ASCII. +- `rune`: a Unicode character. +- `str`: a string type. mutable. internally a char-array? must also support Unicode. +- `void`: an internal type designating the absence of a value. + - possibly, the empty tuple. then would `empty` be better? or `unit`? +- `never`: a type that denotes functions that do not return. + - distinct from returning nothing. + - the bottom type. + +`bool`, `int`/`uint` and siblings, `float` and siblings, `char`, and `rune` are all considered **primitive types** and are _always_ [[copied]] (unless passed as `var`). + +Basic types as a whole include the primitive types, as well as `str`, `void`, and `never`. Basic types can further be broken down into the following categories: +- boolean types: `bool` +- numeric types: `int`, `float`, and siblings +- textual types: `char`, `rune`, `str` +- funky types: `void`, `never` + +Funky types will rarely be referenced by name: instead, the absence of a type typically implicitly denotes one or the other.
Still, having a name is helpful in some situations. + +## Function types + +Functions can also be types. +- `func(T, U): V`: denotes a type that is a function taking arguments of type T and U and returning a value of type V. + - The syntactical sugar `(T, U) -> (V)` is available, to consolidate type declarations and disambiguate when dealing with many `:`s. + - purity of functions? + +## Container types + +Container types, broadly speaking, are types that contain other types. These exclude the types in [[advanced types]]. + +### Iterable types + +Iterable types can be one-of: +- `array[S, T]`: Static arrays. Can only contain one type `T`. Of size `S` and cannot grow/shrink. + - Initialize in-place with `array(a, b, c)`. Should we do this? otherwise `[a, b, c]`. +- `list[T]`: Dynamic arrays. Can only contain one type `T`. May grow/shrink dynamically. + - Initialize in-place with `list(a, b, c)`. Should we do this? otherwise `@[a, b, c]`. +- `slice[T]`: Slices. Used to represent a "view" into some sequence of elements of type `T`. + - Cannot be directly constructed. May be initialized from an array, list, or string, or may be used as a generic parameter on functions (more on that later). + - Slices cannot grow/shrink. Their elements may be accessed and mutated. As they are underlyingly a reference to an array or list, they **must not** outlive the data they reference. +- `str`: Strings. Contain the `rune` type or alternatively `char`s or `bytes`?? {undecided} + +All of these above types are some sort of sequence: and so have a length, and so can be _iterated_. +For convenience, a special `iterable` generic type is defined for use in parameters: that abstracts over all of the container types. This `iterable` type is also extended to any collection with a length of a single type (and also tuples). It is functionally equivalent to the `openarray` type in Nim. +- Under the hood, this is an interface. +- Aside: how do we implement this? 
Rust-style (impl `iter()`), or monomorphize the hell out of it? i think compiler magic is the way to go for specifically this... +- Aside: does `slice` fill this role? +- todo. many questions abound. + +Elements of container types can be accessed by the `container[index]` syntax. Slices of container types can be accessed by the `container[lowerbound..upperbound]` syntax. Slices of non-consecutive elements can be accessed by the `container[a,b,c..d]` syntax, and indeed, the previous example expands to these. They can also be combined: `container[a,b,c..d]`. +- Aside: take inspiration from Rust here? they make it really safe if a _little_ inconvenient + +### Abstract data types + +There is an additional suite of related types: abstract data types. While falling under container types, these do not necessarily have a straightforward or best implementation, and so multiple implementations are provided. + +Abstract data types can be one-of: +- `set[T]`: high-performance sets implemented as a bit array. + - These have a maximum data size, at which point the compiler will suggest using a `HashSet[T]` instead. +- `table[T, U]`: simple symbol tables implemented as an association list. + - These do not have a maximum size. However, at some point the compiler will suggest using a `HashTable[T, U]` instead. +- `HashSet[T]`: standard hash sets. +- `HashTable[T, U]`: standard hash tables. + +Unlike iterable types, abstract data types are not iterable by default: as they are not ordered, it is not clear how they should be iterated. Despite this: for utility purposes, an `elems()` iterator based on a normalization of the elements is provided for `set` and `HashSet`, and `keys()`, `values()`, and `pairs()` iterators are provided for `table` and `HashTable` based on a normalization of the keys. This is deterministic to prevent user reliance on shoddy randomization, see Golang.
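
The deterministic iteration over unordered collections described above can be sketched in Python (used here only because Puck has no implementation to run against). This is a rough illustration, not the intended design: it assumes that "normalization" simply means visiting elements or keys in sorted order, which the text does not specify, and the function names `elems` and `pairs` merely mirror the iterators named above.

```python
from typing import Dict, Iterator, Set, Tuple


def elems(items: Set[str]) -> Iterator[str]:
    """Yield the elements of an unordered set in a fixed order.

    Assumption: "normalization" is modelled as sorting the elements.
    """
    yield from sorted(items)


def pairs(table: Dict[str, int]) -> Iterator[Tuple[str, int]]:
    """Yield (key, value) pairs ordered by key, not by insertion or hash order."""
    for key in sorted(table):
        yield key, table[key]


if __name__ == "__main__":
    # The same output is produced on every run, regardless of hash seeding.
    print(list(elems({"red", "green", "blue"})))  # ['blue', 'green', 'red']
    print(list(pairs({"b": 2, "a": 1})))          # [('a', 1), ('b', 2)]
```

Sorting on every traversal costs O(n log n), so a real implementation would likely want a cheaper normalization; the point of the sketch is only that callers can never come to depend on an incidental ordering.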
+ +## Parameter types + +Some types are only valid when being passed to a function, or in similar contexts. +No variables may be assigned these types, nor may any function return them. +These are monomorphized into more specific functions at compile-time if needed. + +Parameter types can be one-of: +- generic: `func foo[T](a: list[T], b: T)`: The standard implementation of generics, where a parameter's exact type is not listed, and is instead statically dispatched based on usage. +- constrained: `func foo(a: str | int | float)`: A basic implementation of generics, where a parameter can be one-of several listed types. Makes for particularly straightforward monomorphization. + - Separated with the bitwise or operator `|` rather than the symbolic or `||` or a raw `or` to give the impression that there isn't a corresponding "and" operation (the `&` operator is already taken by string concatenation). +- mutable: `func foo(a: var str)`: Denotes the mutability of a parameter. Parameters are immutable by default. + - Passed as a `ref` if not one already, and marked mutable. +- a built-in typeclass: `func foo[T](a: slice[T])`: Included, special typeclasses for being generic over [[advanced types]]. + - Of note is how `slice[T]` functions: it is generic over `lists` and `arrays` of any length. + +### Generic types + +Functions can take a _generic_ type, that is, be defined for a number of types at once: + +``` +func add[T](a: list[T], b: T) = + return a.add(b) + +func length[T](a: T) = + return a.len # monomorphizes based on usage. + # lots of things use .len, but only a few called by this do. + # throws a warning if exported for lack of specificity. + +func length[T: str | list](a: T) = + return a.len +``` + +The syntax for generics is `func`, `ident`, followed by the names of the generic parameters in brackets `[T, U, V]`, followed by the function's parameters (which may refer to the generic types).
+Generics are replaced with concrete types at compile time (monomorphization) based on their usage in functions within the main function body. + +Constrained generics have two syntaxes: the constraint can be directly on a parameter, leaving off the `[T]` box, or it may be defined within the box as `[T: int | float]` for easy reuse in the parameters. + +## Reference types + +Types are typically constructed by value on the stack. That is, without any level of indirection: and so type declarations that recursively refer to one another would not be allowed. However, Puck provides two avenues for indirection. + +Reference types can be one-of: +- `ref T`: An automatically-managed reference to type `T`. +- `ptr T`: A manually-managed pointer to type `T`. (very) unsafe. + +In addition, `var T` may somewhat be considered a reference type as it may implicitly create a `ref` for mutability if the type is not already `ref`: but it is only applicable on parameters. + +``` +type Node = ref struct + left: Node + right: Node + +type AnotherNode = struct + left: ref AnotherNode + right: ref AnotherNode + +type BinaryTree = ref struct + left: BinaryTree + right: BinaryTree +``` + +The compiler abstracts over `ref` types to provide optimization for reference counts: and so neither a distinction between `Rc`/`Arc`/`Box`, nor a `*` dereference operator is needed. +Much care has been given to make references efficient and safe, and so `ptr` should be avoided if at all possible. The compiler will yell at you if you use it (or any other unsafe features). + +These types are delved into in further detail in the section on memory management. +The indirection that `ref` types provide is explored a little further in the section in this document on interfaces. + +## Advanced Types + +The `type` keyword is used to declare custom data types. These are *algebraic*: they function by composition. + +Algebraic data types can be one-of: +- `tuple`: An ordered collection of types. Optionally named. 
+- `struct`: An unordered, named collection of types. May have default values. +- `enum`: Ordinal labels, that may hold values. Their default values are their ordinality. +- `union`: Powerful matchable tagged unions a la Rust. Sum types. +- `interface`: Usage-based typeclasses. User-defined duck typing. +- `distinct`: a type that must be explicitly converted +- type aliases, declared as `type Identifier = Alias` -- cgit v1.2.3-70-g09d2