From d28ddab6ac5591705b4b87a408e68f14ad9e82d8 Mon Sep 17 00:00:00 2001 From: JJ Date: Tue, 15 Aug 2023 13:52:59 -0700 Subject: docs: interfaces, distinct types, subtyping, and cleanups --- docs/TYPES.md | 198 ++++++++++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 151 insertions(+), 47 deletions(-) diff --git a/docs/TYPES.md b/docs/TYPES.md index 727482f..59cfd2f 100644 --- a/docs/TYPES.md +++ b/docs/TYPES.md @@ -7,29 +7,33 @@ Puck has a comprehensive static type system. Basic types can be one-of: - `bool`: internally an enum. - `int`: integer number. x bits of precision by default. - - `uint`: unsigned integer for more precision. + - `uint`: same as `int`, but unsigned for more precision. - `i8`, `i16`, `i32`, `i64`, `i28`: specified integer size - `u8`, `u16`, `u32`, `u64`, `u128`: specified integer size + - overflows into bigints for safety and ease of cryptographical code. - `float`: floating-point number. - `f32`, `f64`: specified float sizes -- `char`: a distinct 0-127 character. For working with ascii. -- `rune`: a Unicode character. -- `str`: a string type. mutable. internally a char-array? must also support unicode. +- `byte`: an alias to `u8`. +- `char`: a *four-byte* value. For working with Unicode. An alias to `u32`. + - these are *packed* when part of a string: and so indexing directly into a string is a no-op. string access is O(n), swift-style. +- `str`: a string type. mutable. internally a byte-array: externally a char-array. - `void`: an internal type designating the absence of a value. - possibly, the empty tuple. then would `empty` be better? or `unit`? - `never`: a type that denotes functions that do not return. - distinct from returning nothing. - the bottom type. -`bool`, `int`/`uint` and siblings, `float` and siblings, `char`, and `rune` are all considered **primitive types** and are _always_ copied (unless passed as `var`). More on when parameters are passed by value (copied) vs. passed by reference can be found in the [memory management document](MEMORY_MANAGEMENT.md). +`bool`, `int`/`uint` and siblings, `float` and siblings, `byte`, and `char` are all considered **primitive types** and are _always_ copied (unless passed as `var`). More on when parameters are passed by value (copied) vs. passed by reference can be found in the [memory management document](MEMORY_MANAGEMENT.md). -Basic types as a whole include the primitive types, as well as `str`, `void`, and `never`. Basic types can further be broken down into the following categories: -- boolean types: `bool` -- numeric types: `int`, `float`, and siblings -- textual types: `char`, `rune`, `str` -- funky types: `void`, `never` +Primitive types combine with `str`, `void`, and `never` to form the **basic types**. `void` and `never` will rarely be referenced by name: instead, the absence of a type typically implicitly denotes one or the other. Still, having a name is helpful in some situations. -Funky types will rarely be referenced by name: instead, the absence of a type typically implicitly denotes one or the other. Still, having a name is helpful in some situations. +### integers + +todo + +### strings + +todo ## Container types @@ -45,7 +49,7 @@ Iterable types can be one-of: - `slice[T]`: Slices. Used to represent a "view" into some sequence of elements of type `T`. - Cannot be directly constructed. May be initialized from an array, list, or string, or may be used as a generic parameter on functions (more on that later). - Slices cannot grow/shrink. Their elements may be accessed and mutated. As they are underlyingly a reference to an array or list, they **must not** outlive the data they reference. -- `str`: Strings. Contain the `rune` type or alternatively `char`s or `bytes`?? {undecided} +- `str`: Strings. Complicated, they are alternatively `list[byte]` or `list[char]` depending on who's asking. In the context of iterable types, they are treated as `list[char]`. All of these above types are some sort of sequence: and so have a length, and so can be _iterated_. For convenience, a special `iterable` generic type is defined for use in parameters: that abstracts over all of the container types. This `iterable` type is also extended to any collection of a single type with a length. @@ -115,30 +119,35 @@ Constrained generics have two syntaxes: the constraint can be directly on a para ## Reference types -Types are typically constructed by value on the stack. That is, without any level of indirection: and so type declarations that recursively refer to one another, or involve parameter types, would not be allowed. However, Puck provides two avenues for indirection. +Types are typically constructed by value on the stack. That is, without any level of indirection: and so type declarations that recursively refer to one another, or involve unsized types (notably including parameter types and interfaces!), would not be allowed. However, Puck provides two avenues for indirection. Reference types can be one-of: - `ref T`: An automatically-managed reference to type `T`. This is a pointer of size `uint` (native). - `ptr T`: A manually-managed pointer to type `T`. (very) unsafe. ```puck -type Node = ref struct - left: Node - right: Node - -type AnotherNode = struct - left: ref AnotherNode - right: ref AnotherNode - type BinaryTree = ref struct left: BinaryTree right: BinaryTree + +type AbstractTree[T] = interface + func left(self: Self): Option[AbstractTree[T]] + func right(self: Self): Option[AbstractTree[T]] + func data(self: Self): T + +type AbstractRoot[T] = struct + left: ref AbstractTree[T] + right: ref AbstractTree[T] + +# allowed, but unsafe & strongly discouraged +type UnsafeTree = struct + left: ptr UnsafeTree + right: ptr UnsafeTree ``` -The `ref` prefix may be placed at the top level of type declarations, or inside on a field of a structural type. `ref` types may often be more efficient when dealing with large data structures. They also provide for the usage of parameter types (except `static`, `var`, and standard generics) within type declarations. +The `ref` prefix may be placed at the top level of type declarations, or inside on a field of a structural type. `ref` types may often be more efficient when dealing with large data structures. They also provide for the usage of parameter types (except for `static` and `var`) within type declarations. -The compiler abstracts over `ref` types to provide optimization for reference counts: and so neither a distinction between `Rc`/`Arc`/`Box`, nor a `*` dereference operator is needed. -Much care has been given to make references efficient and safe, and so `ptr` should be avoided if at all possible. The compiler will yell at you if you use it (or any other unsafe features). +The compiler abstracts over `ref` types to provide optimization for reference counts: and so neither a distinction between `Rc`/`Arc`/`Box`, nor a `*` dereference operator is needed. Much care has been given to make references efficient and safe, and so `ptr` should be avoided if at all possible. The compiler will yell at you if you use it (or any other unsafe features). The implementation of reference types is delved into in further detail in the [document on memory management](MEMORY_MANAGEMENT.md). @@ -164,16 +173,31 @@ They are declared with `tuple[Type, identifier: Type, ...]` and initialized with ```puck ``` -Tuples are particularly useful for "on-the-fly" types. -Structs are generally a better choice for custom type declarations. +Tuples are particularly useful for "on-the-fly" types. Creating type aliases to tuples is discouraged - structs are generally a better choice for custom type declarations. ### structs Structs are an *unordered* collection of named types. -They are declared with `struct[identifier: Type, ...]` and initialized with their name, followed by their fields in brackets: `Struct(field: "value", another: 500)`. +They are declared with `struct[identifier: Type, ...]` and initialized with brackets: `{field: "value", another: 500}`. ```puck +type LinkedNode[T] = struct + previous, next: Option[ref LinkedNode[T]] + data: T + +let node = { + previous, next = None + data = 413 +} + +func pretty_print(node: LinkedNode[int]) = + print node.data + if node.next of Some(node): + node.pretty_print() + +# structural typing! +prints_data(node) ``` Structs are *structural* and so structs composed entirely of fields with the same signature (identical in name and type) are considered *equivalent*. @@ -186,6 +210,13 @@ Enums are *ordinal labels* that may have associated values. They are declared with `enum[Label, AnotherLabel = 4, ...]` and are never initialized (their values are known statically). Enums may be accessed directly by their label, and are ordinal and iterable regardless of their associated value. +```puck +type Keys = enum + Left, Right, Up, Down + A = "a" + B = "b" +``` + In the case of an identifier conflict (with other enum labels, or types, or...) they must be prefixed with the name of their associated type (separated by a dot). (this is standard for identifier conflicts: and is discussed in more detail in the [modules document](MODULES.md).) ### unions @@ -195,6 +226,14 @@ Unions are *tagged* type unions. They provide a high-level wrapper over an inner They are declared with `union[Variant: Type, ...]` and initialized with the name of a variant followed by its inner type constructor in brackets: `Square(side: 5)`. Tuple and struct types are special-cased to eliminate extraneous parentheses. ```puck +type Ident = string +type Expr = union + Literal: Term + Variable: Ident + Abstraction: struct[param: Ident, func: Expr] + Application: struct[func, arg: Expr] + Conditional: struct + if, then, else: Expr ``` They take up as much space in memory as the largest variant, plus the size of the tag (typically one byte). @@ -210,48 +249,68 @@ The `of` operator is similar to the `is` operator in that it queries type equali ### interfaces -todo... +Interfaces can be thought of as analogous to Rust's traits, without explicit `impl` blocks and without need for the `derive` macro. -### distinct types +The `interface` type is composed of a list of function signatures that refer to the special type `Self` that must exist for a type to be valid. +Interfaces cannot be constructed and so are only of use as parameter types and the like, and so are always used in a *type conversion*. +The special type `Self` is replaced with the type being converted at compile time in order to typecheck. -todo... +They are declared with `interface[signature, ...]`. -### type aliases +```puck +type Stack[T] = interface + func push(self: var Self, val: T) + func pop(self: var Self): T + func peek(self: Self): T -Finally, any type can be declared as an *alias* to a type simply by assigning it to such. +func takes_any_stack(stack: Stack[int]) = + # only stack.push, stack.pop, and stack.peek are available methods +``` +Differing from Rust, Haskell, and others, there is no explicit `impl` block. If there exists a function that matches one of an interface's signatures, it is considered to match and the interface typechecks. This may seem strange and ambiguous - but again, static typing and uniform function call syntax help make this fine and idiomatic. The purpose of explicit `impl` blocks in ex. Rust is two-fold: to group together associated code, but primarily to provide a limited form of uniform function call syntax. As any function can be called as either `foo(a, b)` or `a.foo(b)`, `impl` blocks would only serve to group together associated code: which is better done with modules. -## Errata +Interfaces cannot be constructed because they are not sized. They serve purely as a list of valid operations on a type within a context: no information about their memory layout is relevant. The concrete type fulfilling the interface is known at compile time, however, and so there are no issues surrounding interfaces as parameters, just when attempted to be used as (part of) a concrete type. They can be used as part of a concrete type with *indirection*, however: `type Foo = struct[a: int, b: ref interface[...]]` is perfectly valid. -### composition of types +While functions are the primary way of performing operations on types, they are not the only way, and listing all explicitly can be painful - instead, it can be desired to be able to *associate a type* and any field access or existing functions on that type with the interface. todo: i have not decided on the syntax for this yet. -Algebraic data types are, well, *algebraic*, and we've mentioned that they function by composition but have yet to show an extended example of what this means for the end user. +Interfaces also compose with [the module system](MODULES.md) to offer fine grained crafting of types. -```puck +### distinct types + +Puck is a structurally typed language. If you have `type Foo = struct[a: int]` and `type Bar = struct[a, b: int]`, passing a term of type `Bar` into a context that expects `Foo` is allowed: even encouraged. This usage-based paradigm is a consistent underlying theme of Puck. + +`distinct` types break this coercion convention. They are primarily useful for extended (type,\_) safety: a `type SqlStr = distinct str` provides an extra barrier to SQL injections, for example, or a `type Point = distinct struct[x, y: int]` can block confusion with a `type Rectangle = distinct struct[x, y: int]`. + +`distinct` types can still be *explicitly* converted to an equivalent type with `as`. + +### type aliases + +Finally, any type can be declared as an *alias* to a type simply by assigning it to such. + +``` +type MyFloat = float ``` -Note the existence of *anonymous* algebraic types: structs, tuples, unions, even interfaces may be nested within other algebraic data types and declared without an identifier. todo... +## Errata ### default values Puck does not have any concept of `null`: all values *must* be initialized. -But always explicitly initializing types is loud, and so most types have an associated "default value". +But always explicitly initializing types is syntactically verbose, and so most types have an associated "default value". **Default values**: - `bool`: `false` - `int`, `uint`, etc: `0` - `float`, etc: `0.0` -- `char`: `char(0)` -- `rune`: `rune(0)` +- `char`: `'\0'` - `str`: `""` - `void`, `never`: n/a -- `array[T]`, `list[T]`: `[]`/`@[]` +- `array[T]`, `list[T]`: `[]` - `set[T]`, `table[T, U]`: `{}` - `tuple[T, U, ...]`: `(default values of its fields)` -- `struct[T, U, ...]`: `Struct(default values of its fields)` -- `enum[One, Two, ...]`: `One` (the first label) - - todo: is this correct? -- `union[T, U, ...]`: todo +- `struct[T, U, ...]`: `{default values of its fields}` +- `enum[One, Two, ...]`: **disallowed** +- `union[T, U, ...]`: **disallowed** - `slice[T]`, `func`: **disallowed** - `ref`, `ptr`: **disallowed** @@ -259,13 +318,24 @@ For `ref` and `ptr` types, however, this is trickier. There is no reasonable "de Instead of giving in, the compiler will instead disallow any non-initializations or other cases in which a default value would be inserted. (`slice[T]` and `func` types are references under the hood and also behave in this way) +todo: consider user-defined defaults (ex. structs) + ### subtyping -todo... +Mention of subtyping has been on occasion in contexts surrounding structural type systems, particularly the section on distinct types, but no explicit description of what the subtyping rules are have been given. + +Subtyping is the implicit conversion of compatible types, usually in a one-way direction. The following types are implicitly convertible: +- `uint ==> int` +- `int ==> float` +- `uint ==> float` +- `string ==> list[char]` (the opposite no, use `pack`) +- `array[T; n] ==> list[T]` +- `struct[a: T, b: U, ...] ==> struct[a: T, b: U]` +- `union[A: T, B: U] ==> union[A: T, B: U, ...]` ### inheritance -Puck is not an object-oriented language. Idiomatic design patterns in object-oriented languages will be harder to accomplish and not quite as idiomatic. +Puck is not an object-oriented language. Idiomatic design patterns in object-oriented languages are harder to accomplish and not idiomatic here. But, Puck has a number of features that somewhat support the object-oriented paradigm, including: - uniform function call syntax @@ -273,6 +343,40 @@ But, Puck has a number of features that somewhat support the object-oriented par - interfaces ```puck +type Building = struct + size: struct[length, width: uint] + color: enum[Red, Blue, Green] + location: tuple[longitude, latitude: float] + +type House = struct + size: struct[length, width: uint] + color: enum[Red, Blue, Green] + location: tuple[longitude, latitude: float] + occupant: str + +func init(_: type[House]): House = + { size: {length, width: 500}, color: Red + location: (0.0, 0.0), occupant: "Barry" } + +func address(building: Building): str = + let number = int(building.location.0 / building.location.1).abs + let street = "Logan Lane" + return number.str & " " & street + +# subtyping! methods! +print House.init().address() + +func address(house: House): str = + let number = int(house.location.0 - house.location.1).abs + let street = "Logan Lane" + return number.str & " " & street + +# overriding! (will warn) +print address(House.init()) + +# abstract types! inheritance! +type Addressable = interface for Building + func address(self: Self) ``` These features may *compose* into code that closely resembles its object-oriented counterpart. But make no mistake! Puck is static first and functional somewhere in there: dynamic dispatch and the like are not accessible (currently). -- cgit v1.2.3-70-g09d2