author JJ 2023-07-12 04:32:29 +0000
committer JJ 2023-07-12 04:46:35 +0000
commit c911dc4ac89d07ec2af44fd8e30c6ceb5562ab47 (patch)
tree e56017442b596fc13a6126ee074c1d260e25cd5c /docs/TYPES.md
parent 16a548e9c3d0e20486463bac9719840b886d9f8b (diff)
move docs into docs folder and update the readme
Diffstat (limited to 'docs/TYPES.md')
-rw-r--r-- docs/TYPES.md | 159
1 files changed, 159 insertions, 0 deletions
diff --git a/docs/TYPES.md b/docs/TYPES.md
new file mode 100644
index 0000000..68c02d0
--- /dev/null
+++ b/docs/TYPES.md
@@ -0,0 +1,159 @@
+# Typing in Puck
+
+Puck has a comprehensive static type system.
+
+## Basic types
+
+Basic types can be one-of:
+- `bool`: internally an enum.
+- `int`: integer number. x bits of precision by default.
+ - `uint`: unsigned integer, for a larger non-negative range.
+ - `i8`, `i16`, `i32`, `i64`, `i128`: signed integers of a specified size
+ - `u8`, `u16`, `u32`, `u64`, `u128`: unsigned integers of a specified size
+- `float`: floating-point number.
+ - `f32`, `f64`: specified float sizes
+- `char`: a distinct 0-127 character. For working with ASCII.
+- `rune`: a Unicode character.
+- `str`: a string type. mutable. internally a char-array? must also support Unicode.
+- `void`: an internal type designating the absence of a value.
+ - possibly, the empty tuple. then would `empty` be better? or `unit`?
+- `never`: a type that denotes functions that do not return.
+ - distinct from returning nothing.
+ - the bottom type.
+
+`bool`, `int`/`uint` and siblings, `float` and siblings, `char`, and `rune` are all considered **primitive types** and are _always_ [[copied]] (unless passed as `var`).
+
+Basic types as a whole include the primitive types, as well as `str`, `void`, and `never`. Basic types can further be broken down into the following categories:
+- boolean types: `bool`
+- numeric types: `int`, `float`, and siblings
+- textual types: `char`, `rune`, `str`
+- funky types: `void`, `never`
+
+Funky types will rarely be referenced by name: instead, the absence of a type typically implicitly denotes one or the other. Still, having a name is helpful in some situations.
+
+## Function types
+
+Functions can also be types.
+- `func(T, U): V`: denotes a function type taking arguments of types `T` and `U` and returning a value of type `V`.
+ - The syntactic sugar `(T, U) -> (V)` is available to consolidate type declarations and to disambiguate when dealing with many `:`s.
+ - purity of functions?
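+
+As a hedged sketch of how the two spellings might be used (the alias names, the `apply` function, its return annotation, and the call syntax `f(a, b)` are illustrative assumptions, not settled Puck syntax):
+
+```
+type BinaryOp = func(int, int): int
+type SameBinaryOp = (int, int) -> (int) # sugar for the line above
+
+func apply(f: BinaryOp, a: int, b: int): int =
+ return f(a, b)
+```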
+
+## Container types
+
+Container types, broadly speaking, are types that contain other types. These exclude the types in [[advanced types]].
+
+### Iterable types
+
+Iterable types can be one-of:
+- `array[S, T]`: Static arrays. Can only contain one type `T`. Of size `S` and cannot grow/shrink.
+ - Initialize in-place with `array(a, b, c)`. Should we do this? otherwise `[a, b, c]`.
+- `list[T]`: Dynamic arrays. Can only contain one type `T`. May grow/shrink dynamically.
+ - Initialize in-place with `list(a, b, c)`. Should we do this? otherwise `@[a, b, c]`.
+- `slice[T]`: Slices. Used to represent a "view" into some sequence of elements of type `T`.
+ - Cannot be directly constructed. May be initialized from an array, list, or string, or may be used as a generic parameter on functions (more on that later).
+ - Slices cannot grow/shrink. Their elements may be accessed and mutated. Because a slice is internally a reference to an array or list, it **must not** outlive the data it references.
+- `str`: Strings. Contain the `rune` type or alternatively `char`s or `bytes`?? {undecided}
+
+All of the above types are sequences of some sort: they have a length, and so can be _iterated_.
+For convenience, a special `iterable` generic type is defined for use in parameters, abstracting over all of the container types. This `iterable` type also extends to any collection of a single element type that has a length (and to tuples). It is functionally equivalent to the `openarray` type in Nim.
+- Under the hood, this is an interface.
+- Aside: how do we implement this? rust-style (impl `iter()`), or monomorphize the hell out of it? i think compiler magic is the way to go for specifically this...
+- Aside: does `slice` fill this role?
+- todo. many questions abound.
+
+Elements of container types can be accessed with the `container[index]` syntax. Slices of container types can be accessed with the `container[lowerbound..upperbound]` syntax. Slices of non-consecutive elements can be accessed with the `container[a,b,c]` syntax, and indeed, the range syntax expands to this form. The two can also be combined: `container[a,b,c..d]`.
+- Aside: take inspiration from Rust here? they make it really safe if a _little_ inconvenient
+
+### Abstract data types
+
+There is an additional suite of related types: abstract data types. While these fall under container types, they do not necessarily have a single straightforward or best implementation, and so multiple implementations are provided.
+
+Abstract data types can be one-of:
+- `set[T]`: high-performance sets implemented as a bit array.
+ - These have a maximum data size, beyond which the compiler will suggest using a `HashSet[T]` instead.
+- `table[T, U]`: simple symbol tables implemented as an association list.
+ - These do not have a maximum size. However, at some point the compiler will suggest using a `HashTable[T, U]` instead.
+- `HashSet[T]`: standard hash sets.
+- `HashTable[T, U]`: standard hash tables.
+
+Unlike iterable types, abstract data types are not iterable by default: they are not ordered, and so it is not clear how they should be iterated. For utility purposes, however, an `elems()` iterator based on a normalization of the elements is provided for `set` and `HashSet`, and `keys()`, `values()`, and `pairs()` iterators based on a normalization of the keys are provided for `table` and `HashTable`. The iteration order is deterministic, to prevent user reliance on shoddy randomization (see Golang).
+
+## Parameter types
+
+Some types are only valid when being passed to a function, or in similar contexts.
+No variables may be assigned these types, nor may any function return them.
+These are monomorphized into more specific functions at compile-time if needed.
+
+Parameter types can be one-of:
+- generic: `func foo[T](a: list[T], b: T)`: The standard implementation of generics, where a parameter's exact type is not listed and is instead statically dispatched based on usage.
+- constrained: `func foo(a: str | int | float)`: A basic implementation of generics, where a parameter can be one-of several listed types. Makes for particularly straightforward monomorphization.
+ - Separated with the bitwise or operator `|` rather than the symbolic or `||` or a raw `or`, to give the impression that there isn't a corresponding "and" operation (the `&` operator is already taken by strings).
+- mutable: `func foo(a: var str)`: Denotes the mutability of a parameter. Parameters are immutable by default.
+ - Passed as a `ref` if not one already, and marked mutable.
+- a built-in typeclass: `func foo[T](a: slice[T])`: Special typeclasses, included with the language, for being generic over [[advanced types]].
+ - Of note is how `slice[T]` functions: it is generic over `list`s and `array`s of any length.
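+
+A minimal sketch of the built-in typeclass case, assuming a purely illustrative `first` function (the return annotation and zero-based indexing are also assumptions):
+
+```
+# accepts a list[T], or an array[S, T] of any size S
+func first[T](a: slice[T]): T =
+ return a[0]
+```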
+
+### Generic types
+
+Functions can take a _generic_ type, that is, be defined for a number of types at once:
+
+```
+func add[T](a: list[T], b: T) =
+ return a.add(b)
+
+func length[T](a: T) =
+ return a.len # monomorphizes based on usage.
+ # lots of things use .len, but only a few called by this do.
+ # throws a warning if exported for lack of specificity.
+
+func length[T: str | list](a: T) =
+ return a.len
+```
+
+The syntax for generics is `func`, `ident`, followed by the names of the generic parameters in brackets `[T, U, V]`, followed by the function's parameters (which may refer to the generic types).
+Generics are replaced with concrete types at compile time (monomorphization), based on how they are used by the functions called within the generic function's body.
+
+Constrained generics have two syntaxes: the constraint can be directly on a parameter, leaving off the `[T]` box, or it may be defined within the box as `[T: int | float]` for easy reuse in the parameters.
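+
+For illustration, the two spellings side by side (the function names `double` and `sum`, and the use of `+` on numeric types, are assumptions made for the sake of the sketch):
+
+```
+# constraint written directly on the parameter
+func double(a: int | float) =
+ return a + a
+
+# constraint named in the box and reused across parameters
+func sum[T: int | float](a: T, b: T) =
+ return a + b
+```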
+
+## Reference types
+
+Types are typically constructed by value on the stack, that is, without any level of indirection, and so type declarations that recursively refer to one another would not be allowed. However, Puck provides two avenues for indirection.
+
+Reference types can be one-of:
+- `ref T`: An automatically-managed reference to type `T`.
+- `ptr T`: A manually-managed pointer to type `T`. (very) unsafe.
+
+In addition, `var T` may loosely be considered a reference type, as it may implicitly create a `ref` for mutability if the type is not already a `ref`. It is only applicable to parameters.
+
+```
+type Node = ref struct
+ left: Node
+ right: Node
+
+type AnotherNode = struct
+ left: ref AnotherNode
+ right: ref AnotherNode
+
+type BinaryTree = ref struct
+ left: BinaryTree
+ right: BinaryTree
+```
+
+The compiler abstracts over `ref` types to optimize reference counting, and so neither a distinction between `Rc`/`Arc`/`Box` nor a `*` dereference operator is needed.
+Much care has been given to make references efficient and safe, and so `ptr` should be avoided if at all possible. The compiler will yell at you if you use it (or any other unsafe features).
+
+These types are covered in further detail in the section on memory management.
+The indirection that `ref` types provide is explored a little further in this document's section on interfaces.
+
+## Advanced types
+
+The `type` keyword is used to declare custom data types. These are *algebraic*: they function by composition.
+
+Algebraic data types can be one-of:
+- `tuple`: An ordered collection of types. Optionally named.
+- `struct`: An unordered, named collection of types. May have default values.
+- `enum`: Ordinal labels, that may hold values. Their default values are their ordinality.
+- `union`: Powerful matchable tagged unions a la Rust. Sum types.
+- `interface`: Usage-based typeclasses. User-defined duck typing.
+- `distinct`: A type that must be explicitly converted.
+- Type aliases, declared as `type Identifier = Alias`.