From 4a3796c1ff54b44b1493f3afed18d0dd71b5f19f Mon Sep 17 00:00:00 2001 From: JJ Date: Wed, 12 Jul 2023 10:04:49 -0700 Subject: begin discussion of advanced types and some minor changes --- docs/TYPES.md | 171 +++++++++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 145 insertions(+), 26 deletions(-) diff --git a/docs/TYPES.md b/docs/TYPES.md index 68c02d0..727482f 100644 --- a/docs/TYPES.md +++ b/docs/TYPES.md @@ -21,7 +21,7 @@ Basic types can be one-of: - distinct from returning nothing. - the bottom type. -`bool`, `int`/`uint` and siblings, `float` and siblings, `char`, and `rune` are all considered **primitive types** and are _always_ [[copied]] (unless passed as `var`). +`bool`, `int`/`uint` and siblings, `float` and siblings, `char`, and `rune` are all considered **primitive types** and are _always_ copied (unless passed as `var`). More on when parameters are passed by value (copied) vs. passed by reference can be found in the [memory management document](MEMORY_MANAGEMENT.md). Basic types as a whole include the primitive types, as well as `str`, `void`, and `never`. Basic types can further be broken down into the following categories: - boolean types: `bool` @@ -31,16 +31,9 @@ Basic types as a whole include the primitive types, as well as `str`, `void`, an Funky types will rarely be referenced by name: instead, the absence of a type typically implicitly denotes one or the other. Still, having a name is helpful in some situations. -## Function types - -Functions can also be types. -- `func(T, U): V`: denotes a type that is a function taking arguments of type T and U and returning a value of type V. - - The syntactical sugar `(T, U) -> (V)` is available, to consolidate type declarations and disambiguate when dealing with many `:`s. - - purity of functions? - ## Container types -Container types, broadly speaking, are types that contain other types. These exclude the types in [[advanced types]]. +Container types, broadly speaking, are types that contain other types. These exclude the types in [advanced types](#advanced-types). ### Iterable types @@ -55,14 +48,10 @@ Iterable types can be one-of: - `str`: Strings. Contain the `rune` type or alternatively `char`s or `bytes`?? {undecided} All of these above types are some sort of sequence: and so have a length, and so can be _iterated_. -For convenience, a special `iterable` generic type is defined for use in parameters: that abstracts over all of the container types. This `iterable` type is also extended to any collection with a length of a single type (and also tuples). It is functionally equivalent to the `openarray` type in Nim. -- Under the hood, this is an interface. -- Aside: how do we implement this? rust-style (impl `iter()`), or monomorphize the hell out of it? i think compiler magic is the way to go for specifically this... -- Aside: does `slice` fill this role? -- todo. many questions abound. +For convenience, a special `iterable` generic type is defined for use in parameters: that abstracts over all of the container types. This `iterable` type is also extended to any collection of a single type with a length. +- Is this redundant with `slice[T]`? todo Elements of container types can be accessed by the `container[index]` syntax. Slices of container types can be accessed by the `container[lowerbound..upperbound]` syntax. Slices of non-consecutive elements can be accessed by the `container[a,b,c..d]` syntax, and indeed, the previous example expands to these. They can also be combined: `container[a,b,c..d]`. -- Aside: take inspiration from Rust here? they make it really safe if a _little_ inconvenient ### Abstract data types @@ -85,19 +74,28 @@ No variables may be assigned these types, nor may any function return them. These are monomorphized into more specific functions at compile-time if needed. Parameter types can be one-of: +- mutable: `func foo(a: var str)`: Denotes the mutability of a parameter. Parameters are immutable by default. + - Passed as a `ref` if not one already, and marked mutable. +- static: `func foo(a: static str)`: Denotes a parameter that's value must be known at compile-time. Useful with `when` for writing generic code. - generic: `func foo[T](a: list[T], b: T)`: The standard implementation of generics, where a parameter's exact type is not listed, and instead statically dispatched based on usage. - constrained: `func foo(a: str | int | float)`: A basic implementation of generics, where a parameter can be one-of several listed types. Makes for particularly straightforward monomorphization. - Separated with the bitwise or operator `|` rather than the symbolic or `||` or a raw `or` to give the impression that there isn't a corresponding "and" operation (the `&` operator is preoccupied with strings). -- mutable: `func foo(a: var str)`: Denotes the mutability of a parameter. Parameters are immutable by default. - - Passed as a `ref` if not one already, and marked mutable. -- a built-in typeclass: `func foo[T](a: slice[T])`: Included, special typeclasses for being generic over [[advanced types]]. +- functions: `func foo(a: func (x, y: int): int)`: First-class functions. + - Functions may be prefixed with modifiers: one-of `pure`, `yeet`, `async`. + - Syntactic sugar is available: `func foo(a: (int, int) -> int)`. This is not usable with modifiers. +- interfaces: `func foo(a: Stack[int])`: User-defined duck typing. Equivalent to and more general than Rust's traits or Haskell's typeclasses. + - ex. for above: the `Stack[T]` interface specifies generic `push` and `pop` operations + - More on interfaces in the [interfaces section](#interfaces). +- built-in interface: `func foo(a: enum)`: Included, special interfaces for being generic over [advanced types](#advanced-types). - Of note is how `slice[T]` functions: it is generic over `lists` and `arrays` of any length. +These parameter types (except `static`) share a common trait: they are not *sized*. The exact type is not generally known until compilation - and in the case of interfaces, sometimes not even during compilation! As the size is not always rigorously known, problems arise when attempting to construct parameter types or compose them with other types: and so this is disallowed. Not all is lost, however, as they may still be used with *indirection* - detailed in the [section on reference types](#reference-types). + ### Generic types Functions can take a _generic_ type, that is, be defined for a number of types at once: -``` +```puck func add[T](a: list[T], b: T) = return a.add(b) @@ -117,15 +115,13 @@ Constrained generics have two syntaxes: the constraint can be directly on a para ## Reference types -Types are typically constructed by value on the stack. That is, without any level of indirection: and so type declarations that recursively refer to one another would not be allowed. However, Puck provides two avenues for indirection. +Types are typically constructed by value on the stack. That is, without any level of indirection: and so type declarations that recursively refer to one another, or involve parameter types, would not be allowed. However, Puck provides two avenues for indirection. Reference types can be one-of: -- `ref T`: An automatically-managed reference to type `T`. +- `ref T`: An automatically-managed reference to type `T`. This is a pointer of size `uint` (native). - `ptr T`: A manually-managed pointer to type `T`. (very) unsafe. -In addition, `var T` may somewhat be considered a reference type as it may implicitly create a `ref` for mutability if the type is not already `ref`: but it is only applicable on parameters. - -``` +```puck type Node = ref struct left: Node right: Node @@ -139,11 +135,12 @@ type BinaryTree = ref struct right: BinaryTree ``` +The `ref` prefix may be placed at the top level of type declarations, or inside on a field of a structural type. `ref` types may often be more efficient when dealing with large data structures. They also provide for the usage of parameter types (except `static`, `var`, and standard generics) within type declarations. + The compiler abstracts over `ref` types to provide optimization for reference counts: and so neither a distinction between `Rc`/`Arc`/`Box`, nor a `*` dereference operator is needed. Much care has been given to make references efficient and safe, and so `ptr` should be avoided if at all possible. The compiler will yell at you if you use it (or any other unsafe features). -These types are delved into in further detail in the section on memory management. -The indirection that `ref` types provide is explored a little further in the section in this document on interfaces. +The implementation of reference types is delved into in further detail in the [document on memory management](MEMORY_MANAGEMENT.md). ## Advanced Types @@ -157,3 +154,125 @@ Algebraic data types can be one-of: - `interface`: Usage-based typeclasses. User-defined duck typing. - `distinct`: a type that must be explicitly converted - type aliases, declared as `type Identifier = Alias` + +### tuples + +Tuples are an *ordered* collection of either named or unnamed types. + +They are declared with `tuple[Type, identifier: Type, ...]` and initialized with parentheses: `(413, "hello", value: 40000)`. + +```puck +``` + +Tuples are particularly useful for "on-the-fly" types. +Structs are generally a better choice for custom type declarations. + +### structs + +Structs are an *unordered* collection of named types. + +They are declared with `struct[identifier: Type, ...]` and initialized with their name, followed by their fields in brackets: `Struct(field: "value", another: 500)`. + +```puck +``` + +Structs are *structural* and so structs composed entirely of fields with the same signature (identical in name and type) are considered *equivalent*. +This is part of a broader structural trend in the type system, and is discussed in detail in [subtyping](#subtyping). + +### enums + +Enums are *ordinal labels* that may have associated values. + +They are declared with `enum[Label, AnotherLabel = 4, ...]` and are never initialized (their values are known statically). +Enums may be accessed directly by their label, and are ordinal and iterable regardless of their associated value. + +In the case of an identifier conflict (with other enum labels, or types, or...) they must be prefixed with the name of their associated type (separated by a dot). (this is standard for identifier conflicts: and is discussed in more detail in the [modules document](MODULES.md).) + +### unions + +Unions are *tagged* type unions. They provide a high-level wrapper over an inner type that must be accessed via pattern matching. + +They are declared with `union[Variant: Type, ...]` and initialized with the name of a variant followed by its inner type constructor in brackets: `Square(side: 5)`. Tuple and struct types are special-cased to eliminate extraneous parentheses. + +```puck +``` + +They take up as much space in memory as the largest variant, plus the size of the tag (typically one byte). + +#### pattern matching + +todo... + +```puck +``` + +The `of` operator is similar to the `is` operator in that it queries type equality. However, unbound identifiers within `of` expressions are bound to appropriate values (if matched) and injected into the scope. + +### interfaces + +todo... + +### distinct types + +todo... + +### type aliases + +Finally, any type can be declared as an *alias* to a type simply by assigning it to such. + + +## Errata + +### composition of types + +Algebraic data types are, well, *algebraic*, and we've mentioned that they function by composition but have yet to show an extended example of what this means for the end user. + +```puck +``` + +Note the existence of *anonymous* algebraic types: structs, tuples, unions, even interfaces may be nested within other algebraic data types and declared without an identifier. todo... + +### default values + +Puck does not have any concept of `null`: all values *must* be initialized. +But always explicitly initializing types is loud, and so most types have an associated "default value". + +**Default values**: +- `bool`: `false` +- `int`, `uint`, etc: `0` +- `float`, etc: `0.0` +- `char`: `char(0)` +- `rune`: `rune(0)` +- `str`: `""` +- `void`, `never`: n/a +- `array[T]`, `list[T]`: `[]`/`@[]` +- `set[T]`, `table[T, U]`: `{}` +- `tuple[T, U, ...]`: `(default values of its fields)` +- `struct[T, U, ...]`: `Struct(default values of its fields)` +- `enum[One, Two, ...]`: `One` (the first label) + - todo: is this correct? +- `union[T, U, ...]`: todo +- `slice[T]`, `func`: **disallowed** +- `ref`, `ptr`: **disallowed** + +For `ref` and `ptr` types, however, this is trickier. There is no reasonable "default" for these types *aside from* null. +Instead of giving in, the compiler will instead disallow any non-initializations or other cases in which a default value would be inserted. +(`slice[T]` and `func` types are references under the hood and also behave in this way) + +### subtyping + +todo... + +### inheritance + +Puck is not an object-oriented language. Idiomatic design patterns in object-oriented languages will be harder to accomplish and not quite as idiomatic. + +But, Puck has a number of features that somewhat support the object-oriented paradigm, including: +- uniform function call syntax +- structural typing / subtyping +- interfaces + +```puck +``` + +These features may *compose* into code that closely resembles its object-oriented counterpart. But make no mistake! Puck is static first and functional somewhere in there: dynamic dispatch and the like are not accessible (currently). -- cgit v1.2.3-70-g09d2