author | JJ | 2023-07-12 04:32:29 +0000 |
---|---|---|
committer | JJ | 2023-07-12 04:46:35 +0000 |
commit | c911dc4ac89d07ec2af44fd8e30c6ceb5562ab47 (patch) | |
tree | e56017442b596fc13a6126ee074c1d260e25cd5c /docs/TYPES.md | |
parent | 16a548e9c3d0e20486463bac9719840b886d9f8b (diff) |
move docs into docs folder and update the readme
Diffstat (limited to 'docs/TYPES.md')
-rw-r--r-- | docs/TYPES.md | 159 |
1 files changed, 159 insertions, 0 deletions
diff --git a/docs/TYPES.md b/docs/TYPES.md
new file mode 100644
index 0000000..68c02d0
--- /dev/null
+++ b/docs/TYPES.md
@@ -0,0 +1,159 @@
+# Typing in Puck
+
+Puck has a comprehensive static type system.
+
+## Basic types
+
+Basic types can be one-of:
+- `bool`: internally an enum.
+- `int`: integer number. x bits of precision by default.
+  - `uint`: unsigned integer for more precision.
+  - `i8`, `i16`, `i32`, `i64`, `i28`: specified integer size
+  - `u8`, `u16`, `u32`, `u64`, `u128`: specified integer size
+- `float`: floating-point number.
+  - `f32`, `f64`: specified float sizes
+- `char`: a distinct 0-127 character. For working with ascii.
+- `rune`: a Unicode character.
+- `str`: a string type. mutable. internally a char-array? must also support unicode.
+- `void`: an internal type designating the absence of a value.
+  - possibly, the empty tuple. then would `empty` be better? or `unit`?
+- `never`: a type that denotes functions that do not return.
+  - distinct from returning nothing.
+  - the bottom type.
+
+`bool`, `int`/`uint` and siblings, `float` and siblings, `char`, and `rune` are all considered **primitive types** and are _always_ [[copied]] (unless passed as `var`).
+
+Basic types as a whole include the primitive types, as well as `str`, `void`, and `never`. Basic types can further be broken down into the following categories:
+- boolean types: `bool`
+- numeric types: `int`, `float`, and siblings
+- textual types: `char`, `rune`, `str`
+- funky types: `void`, `never`
+
+Funky types will rarely be referenced by name: instead, the absence of a type typically implicitly denotes one or the other. Still, having a name is helpful in some situations.
+
+## Function types
+
+Functions can also be types.
+- `func(T, U): V`: denotes a type that is a function taking arguments of type T and U and returning a value of type V.
+  - The syntactical sugar `(T, U) -> (V)` is available, to consolidate type declarations and disambiguate when dealing with many `:`s.
+  - purity of functions?
+
+## Container types
+
+Container types, broadly speaking, are types that contain other types. These exclude the types in [[advanced types]].
+
+### Iterable types
+
+Iterable types can be one-of:
+- `array[S, T]`: Static arrays. Can only contain one type `T`. Of size `S` and cannot grow/shrink.
+  - Initialize in-place with `array(a, b, c)`. Should we do this? otherwise `[a, b, c]`.
+- `list[T]`: Dynamic arrays. Can only contain one type `T`. May grow/shrink dynamically.
+  - Initialize in-place with `list(a, b, c)`. Should we do this? otherwise `@[a, b, c]`.
+- `slice[T]`: Slices. Used to represent a "view" into some sequence of elements of type `T`.
+  - Cannot be directly constructed. May be initialized from an array, list, or string, or may be used as a generic parameter on functions (more on that later).
+  - Slices cannot grow/shrink. Their elements may be accessed and mutated. As they are underlyingly a reference to an array or list, they **must not** outlive the data they reference.
+- `str`: Strings. Contain the `rune` type or alternatively `char`s or `bytes`?? {undecided}
+
+All of these above types are some sort of sequence: and so have a length, and so can be _iterated_.
+For convenience, a special `iterable` generic type is defined for use in parameters: that abstracts over all of the container types. This `iterable` type is also extended to any collection with a length of a single type (and also tuples). It is functionally equivalent to the `openarray` type in Nim.
+- Under the hood, this is an interface.
+- Aside: how do we implement this? rust-style (impl `iter()`), or monomorphize the hell out of it? i think compiler magic is the way to go for specifically this...
+- Aside: does `slice` fill this role?
+- todo. many questions abound.
+
+Elements of container types can be accessed by the `container[index]` syntax. Slices of container types can be accessed by the `container[lowerbound..upperbound]` syntax. Slices of non-consecutive elements can be accessed by the `container[a,b,c..d]` syntax, and indeed, the previous example expands to these. They can also be combined: `container[a,b,c..d]`.
+- Aside: take inspiration from Rust here? they make it really safe if a _little_ inconvenient
+
+### Abstract data types
+
+There are an additional suite of related types: abstract data types. While falling under container types, these do not have a necessarily straightforward or best implementation, and so multiple implementations are provided.
+
+Abstract data types can be one-of:
+- `set[T]`: high-performance sets implemented as a bit array.
+  - These have a maximum data size, at which point the compiler will suggest using a `HashSet[T]` instead.
+- `table[T, U]`: simple symbol tables implemented as an association list.
+  - These do not have a maximum size. However, at some point the compiler will suggest using a `HashTable[T, U]` instead.
+- `HashSet[T]`: standard hash sets.
+- `HashTable[T, U]`: standard hash tables.
+
+Unlike iterable types, abstract data types are not iterable by default: as they are not ordered, and thus, it is not clear how they should be iterated. Despite this: for utility purposes, an `elems()` iterator based on a normalization of the elements is provided for `set` and `HashSet`, and `keys()`, `values()`, and `pairs()` iterators are provided for `table` and `HashTable` based on a normalization of the keys. This is deterministic to prevent user reliance on shoddy randomization, see Golang.
+
+## Parameter types
+
+Some types are only valid when being passed to a function, or in similar contexts.
+No variables may be assigned these types, nor may any function return them.
+These are monomorphized into more specific functions at compile-time if needed.
+
+Parameter types can be one-of:
+- generic: `func foo[T](a: list[T], b: T)`: The standard implementation of generics, where a parameter's exact type is not listed, and instead statically dispatched based on usage.
+- constrained: `func foo(a: str | int | float)`: A basic implementation of generics, where a parameter can be one-of several listed types. Makes for particularly straightforward monomorphization.
+  - Separated with the bitwise or operator `|` rather than the symbolic or `||` or a raw `or` to give the impression that there isn't a corresponding "and" operation (the `&` operator is preoccupied with strings).
+- mutable: `func foo(a: var str)`: Denotes the mutability of a parameter. Parameters are immutable by default.
+  - Passed as a `ref` if not one already, and marked mutable.
+- a built-in typeclass: `func foo[T](a: slice[T])`: Included, special typeclasses for being generic over [[advanced types]].
+  - Of note is how `slice[T]` functions: it is generic over `lists` and `arrays` of any length.
+
+### Generic types
+
+Functions can take a _generic_ type, that is, be defined for a number of types at once:
+
+```
+func add[T](a: list[T], b: T) =
+  return a.add(b)
+
+func length[T](a: T) =
+  return a.len # monomorphizes based on usage.
+               # lots of things use .len, but only a few called by this do.
+               # throws a warning if exported for lack of specitivity.
+
+func length[T: str | list](a: T) =
+  return a.len
+```
+
+The syntax for generics is `func`, `ident`, followed by the names of the generic parameters in brackets `[T, U, V]`, followed by the function's parameters (which may refer to the generic types).
+Generics are replaced with concrete types at compile time (monomorphization) based on their usage in functions within the main function body.
+
+Constrained generics have two syntaxes: the constraint can be directly on a parameter, leaving off the `[T]` box, or it may be defined within the box as `[T: int | float]` for easy reuse in the parameters.
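
For readers more familiar with Rust, the generics described in the added text can be loosely mirrored there, since Rust also monomorphizes generic functions at compile time. This is only an analogy sketched under assumptions, not how Puck is implemented: the `HasLen` trait is a hypothetical stand-in for the `T: str | list` constraint, because Rust has no inline `A | B` constraint syntax, and `Vec<T>` stands in for `list[T]`.

```rust
// Hypothetical trait standing in for Puck's `T: str | list` constraint.
trait HasLen {
    fn length(&self) -> usize;
}

impl HasLen for String {
    fn length(&self) -> usize {
        self.len() // byte length of the string
    }
}

impl<T> HasLen for Vec<T> {
    fn length(&self) -> usize {
        self.len() // number of elements
    }
}

// Analogue of `func add[T](a: list[T], b: T)`: one definition, many types.
fn add<T>(mut a: Vec<T>, b: T) -> Vec<T> {
    a.push(b);
    a
}

// Analogue of the constrained `length`: only types implementing HasLen compile.
fn length<T: HasLen>(a: &T) -> usize {
    a.length()
}

fn main() {
    // The compiler emits separate copies of `length` for Vec<i32> and String.
    let numbers = add(vec![1, 2], 3);
    let word = String::from("puck");
    println!("{} {}", length(&numbers), length(&word)); // prints "3 4"
}
```

Calling `length(&42)` would be rejected at compile time, which is roughly the behaviour the constrained form in the diff is after.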
+
+## Reference types
+
+Types are typically constructed by value on the stack. That is, without any level of indirection: and so type declarations that recursively refer to one another would not be allowed. However, Puck provides two avenues for indirection.
+
+Reference types can be one-of:
+- `ref T`: An automatically-managed reference to type `T`.
+- `ptr T`: A manually-managed pointer to type `T`. (very) unsafe.
+
+In addition, `var T` may somewhat be considered a reference type as it may implicitly create a `ref` for mutability if the type is not already `ref`: but it is only applicable on parameters.
+
+```
+type Node = ref struct
+  left: Node
+  right: Node
+
+type AnotherNode = struct
+  left: ref AnotherNode
+  right: ref AnotherNode
+
+type BinaryTree = ref struct
+  left: BinaryTree
+  right: BinaryTree
+```
+
+The compiler abstracts over `ref` types to provide optimization for reference counts: and so neither a distinction between `Rc`/`Arc`/`Box`, nor a `*` dereference operator is needed.
+Much care has been given to make references efficient and safe, and so `ptr` should be avoided if at all possible. The compiler will yell at you if you use it (or any other unsafe features).
+
+These types are delved into in further detail in the section on memory management.
+The indirection that `ref` types provide is explored a little further in the section in this document on interfaces.
+
+## Advanced Types
+
+The `type` keyword is used to declare custom data types. These are *algebraic*: they function by composition.
+
+Algebraic data types can be one-of:
+- `tuple`: An ordered collection of types. Optionally named.
+- `struct`: An unordered, named collection of types. May have default values.
+- `enum`: Ordinal labels, that may hold values. Their default values are their ordinality.
+- `union`: Powerful matchable tagged unions a la Rust. Sum types.
+- `interface`: Usage-based typeclasses. User-defined duck typing.
+- `distinct`: a type that must be explicitly converted
+- type aliases, declared as `type Identifier = Alias`
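
Since the list of algebraic types above points to Rust for its tagged unions, a rough Rust rendering of most of the list may help place the terms. It is a sketch by analogy only: every name here (`Point`, `Config`, `Color`, `Shape`, `Feet`, `Meters`) is hypothetical, and `interface` is left out because Rust traits need explicit `impl` blocks rather than being usage-based.

```rust
// tuple + type alias: an ordered, positional collection behind a new name.
type Point = (i32, i32);

// struct: named fields; deriving Default approximates "may have default values".
#[derive(Debug, Default, PartialEq)]
struct Config {
    name: String,
    retries: u32,
}

// enum: ordinal labels whose default values are their positions (0, 1, 2).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Color {
    Red,
    Green,
    Blue,
}

// union: a tagged union (sum type) that must be matched exhaustively.
#[allow(dead_code)]
enum Shape {
    Circle(f64),
    Rect { w: f64, h: f64 },
}

fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle(r) => std::f64::consts::PI * r * r,
        Shape::Rect { w, h } => w * h,
    }
}

// distinct: newtypes over f64 that never convert implicitly.
struct Feet(f64);
struct Meters(f64);

impl From<Feet> for Meters {
    fn from(f: Feet) -> Self {
        Meters(f.0 * 0.3048) // explicit, user-written conversion
    }
}

fn main() {
    let origin: Point = (0, 0);
    let config = Config::default();
    let rect = Shape::Rect { w: 2.0, h: 3.0 };
    let height = Meters::from(Feet(10.0));
    println!("{:?} {:?} {}", origin, config, Color::Green as u8);
    println!("{} {:.3}", area(&rect), height.0);
}
```

Passing a `Feet` where a `Meters` is expected fails to compile without the explicit `Meters::from`, which is the property the `distinct` entry describes.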