Typing in Puck
! This section needs a rewrite. Proceed with low standards.
Puck has a comprehensive static type system, inspired by the likes of Nim, Rust, and Swift.
Basic types
Basic types can be one-of:
- bool: internally an enum.
- int: integer number. x bits of precision by default.
- uint: same as int, but unsigned, trading negative values for a larger positive range.
- i8, i16, i32, i64, i128: specified integer size
- u8, u16, u32, u64, u128: specified integer size
- float: floating-point number.
- f32, f64: specified float sizes
- decimal: precision decimal number.
- dec32, dec64, dec128: specified decimal sizes
- byte: an alias to u8.
- char: a distinct alias to u32. For working with Unicode.
- str: a string type. mutable. internally a byte-array: externally a char-array.
- void: an internal type designating the absence of a value. often elided.
- never: a type that denotes functions that do not return. distinct from returning nothing.
bool and int/uint/float and siblings (and subsequently byte and char) are all considered primitive types and are always copied (unless passed as mutable). More on when parameters are passed by value vs. passed by reference can be found in the memory management document.
Primitive types combine with str, void, and never to form basic types. void and never will rarely be referenced by name: instead, the absence of a type typically implicitly denotes one or the other. Still, having a name is helpful in some situations.
integers
todo
strings
Strings are:
- mutable
- internally a byte array
- externally a char (four bytes) array
- prefixed with their length and capacity
- automatically resize like a list
They are also quite complicated. Puck has full support for Unicode and wishes to be intuitive, performant, and safe, as all languages wish to be. Strings present a problem that much effort has been expended on in (primarily) Swift and Rust to solve.
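The split between the internal byte array and the external char array can be illustrated outside of Puck. The following is a minimal sketch in Python, not Puck. It assumes a UTF-8 internal encoding (the text above does not name one) and uses Python's code points as a stand-in for Puck's four-byte char.

```python
# Python stand-in, not Puck: a str is a sequence of code points,
# and .encode() exposes one possible underlying byte array.
text = "h\u00e9llo"  # "héllo", with é written as a single code point

chars = list(text)                 # the "external" view: one entry per code point
raw = list(text.encode("utf-8"))   # the "internal" view, assuming UTF-8

print(len(chars))  # 5
print(len(raw))    # 6, because é occupies two bytes in UTF-8

# Indexing by char and indexing by byte disagree once non-ASCII text appears.
print(chars[1] == "\u00e9")        # True
print(raw[1:3] == [0xC3, 0xA9])    # True
```

The same index refers to different things in each view, which is part of why a string type offering both views needs care.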
Abstract Types
Abstract types, broadly speaking, are types described by their behavior rather than their implementation. They are more commonly known as abstract data types, which is confusingly similar to "algebraic data types", another term for the advanced types they are built out of under the hood. We refer to them here as "abstract types" to mitigate some confusion.
iterable types
Iterable types can be one-of:
- array[S, T]: Fixed-size arrays. Can only contain one type T. Of a fixed size S and cannot grow/shrink, but can mutate. Initialized in-place with [a, b, c].
- list[T]: Dynamic arrays. Can only contain one type T. May grow/shrink dynamically. Initialized in-place with [a, b, c]. (this is the same as arrays!)
- slice[T]: Slices. Used to represent a "view" into some sequence of elements of type T. Cannot be directly constructed: they are unsized. Cannot grow/shrink, but their elements may be accessed and mutated. As they are underlyingly a reference to an array or list, they must not outlive the data they reference: this is non-trivial, and so slices interact in complex ways with the memory management system.
- str: Strings. Described above. They are alternatively treated as either list[byte] or list[char], depending on who's asking. Initialized in-place with "abc".
These iterable types are commonly used, and bits and pieces of compiler magic are used here and there (mostly around initialization and ownership) to ease use. All of these types are some sort of sequence and implement the Iter interface, and so can be iterated (hence the name).
other abstract types
Unlike the iterable types above, these abstract types do not have a necessarily straightforward or best implementation, and so multiple implementations are provided in the standard library.
These abstract data types can be one-of:
- BitSet[T]: high-performance sets implemented as a bit array.
  - These have a maximum data size, at which point the compiler will suggest using a HashSet[T] instead.
- AssocTable[T, U]: simple symbol tables implemented as an association list.
  - These do not have a maximum size. However, at some point the compiler will suggest using a HashTable[T, U] instead.
- HashSet[T]: standard hash sets.
- HashTable[T, U]: standard hash tables.
These abstract types do not have a natural ordering, unlike the iterable types above, and thus do not implement Iter. Despite this: for utility an elems() iterator based on a normalization of the elements is provided for set and HashSet, and keys(), values(), and pairs() iterators are provided for table and HashTable (based on a normalization of the keys).
Parameter Types
Some types are only valid when being passed to a function, or in similar contexts. No variables may be assigned these types, nor may any function return them. These are monomorphized into more specific functions at compile-time if needed.
Parameter types can be one-of:
- mutable: func foo(a: mut str): Marks a parameter as mutable (parameters are immutable by default). Passed as a ref if not one already.
- static: func foo(a: static str): Denotes a parameter whose value must be known at compile-time. Useful in macros, and with when for writing generic code.
- generic: func foo[T](a: list[T], b: T): The standard implementation of generics, where a parameter's exact type is not listed, and instead statically dispatched based on usage.
- constrained: func foo(a: str | int | float): A basic implementation of generics, where a parameter can be one-of several listed types. The only allowed operations on such parameters are those shared by each type. Makes for particularly straightforward monomorphization.
- functions: func foo(a: (int, int) -> int): First-class functions. All functions are first class - function declarations implicitly have this type, and may be bound in variable declarations. However, the function type is only terribly useful as a parameter type.
- slices: func foo(a: slice[...]): Slices of existing lists, strings, and arrays. Generic over length. These are references under the hood, may be either immutable or mutable (with mut), and interact non-trivially with Puck's ownership system.
- interfaces: func foo(a: Stack[int]): Implicit typeclasses. More in the interfaces section.
  - ex. for above: type Stack[T] = interface[push(mut Self, T); pop(mut Self): T]
- built-in interfaces: func foo(a: struct): Included, special interfaces for being generic over advanced types. These include struct, tuple, union, enum, interface, and others.
Several of these parameter types - specifically, slices, functions, and interfaces - share a common trait: they are not sized. The exact size of the type is not generally known until compilation - and in some cases, not even during compilation! As the size is not always rigorously known, problems arise when attempting to construct these parameter types or compose them with other types: and so this is disallowed. They may still be used with indirection, however - detailed in the section on reference types.
generic types
Functions can take a generic type, that is, be defined for a number of types at once:
func add[T](a: list[T], b: T) =
  return a.add(b)

func length[T](a: T) =
  return a.len # monomorphizes based on usage.
  # lots of things use .len, but only a few called by this do.
  # throws a warning if exported for lack of specificity.

func length(a: str | list) =
  return a.len
The syntax for generics is func, ident, followed by the names of the generic parameters in brackets [T, U, V], followed by the function's parameters (which may then refer to the generic types).
Generics are replaced with concrete types at compile time (monomorphization) based on their usage in function calls within the main function body.
Constrained generics have two syntaxes: the constraint can be defined directly on a parameter, leaving off the [T] box, or it may be defined within the box as [T: int | float] for easy reuse in the parameters.
Other constructions like modules and type declarations themselves may also be generic.
Reference Types
Types are typically constructed by value on the stack. That is, without any level of indirection: and so type declarations that recursively refer to one another, or involve unsized types (notably including parameter types), would not be allowed. However, Puck provides two avenues for indirection.
Reference types can be one-of:
- ref T: An automatically-managed reference to type T. This is a pointer of size uint (native).
- ptr T: A manually-managed pointer to type T. (very) unsafe. The compiler will yell at you.
type BinaryTree = ref struct
  left: BinaryTree
  right: BinaryTree

type AbstractTree[T] = interface
  func left(self: Self): Option[AbstractTree[T]]
  func right(self: Self): Option[AbstractTree[T]]
  func data(self: Self): T

type AbstractRoot[T] = struct
  left: ref AbstractTree[T]
  right: ref AbstractTree[T]

# allowed, but unsafe & strongly discouraged
type UnsafeTree = struct
  left: ptr UnsafeTree
  right: ptr UnsafeTree
The ref prefix may be placed at the top level of type declarations, or inside on a field of a structural type. ref types may often be more efficient when dealing with large data structures. They also provide for the usage of unsized types (functions, interfaces, slices) within type declarations.
The compiler abstracts over ref types to provide optimization for reference counts: and so a distinction between Rc/Arc/Box is not needed. Furthermore, access implicitly dereferences (with address access available via .addr), and so a * dereference operator is also not needed. Much care has been given to make references efficient and safe, and so ptr should be avoided if at all possible. The compiler will yell at you if you use it (or any other unsafe features).
The implementation of ref is delved into in further detail in the memory management document.
Advanced Types
The type keyword is used to declare aliases to custom data types. These types are algebraic: they function by composition. Algebraic data types can be one-of:
- struct: An unordered, named collection of types. May have default values.
- tuple: An ordered collection of types. Optionally named.
- enum: Ordinal labels, that may hold values. Their default values are their ordinality.
- union: Powerful matchable tagged unions a la Rust. Sum types.
- interface: Implicit typeclasses. User-defined duck typing.
There also exist distinct types: while type declarations define an alias to an existing or new type, distinct types define a type that must be explicitly converted to/from. This is useful for having some level of separation from the implicit interfaces that abound.
structs
Structs are an unordered collection of named types.
They are declared with struct[identifier: Type, ...] and initialized with brackets: {field: "value", another: 500}.
type LinkedNode[T] = struct
  previous, next: Option[ref LinkedNode[T]]
  data: T

let node = {
  previous: None, next: None
  data: 413
}

func pretty_print(node: LinkedNode[int]) =
  print node.data
  if node.next of Some(node):
    node.pretty_print()

# structural typing! the untyped literal above matches LinkedNode[int]
pretty_print(node)
Structs are structural and so structs composed entirely of fields with the same signature (identical in name and type) are considered equivalent. This is part of a broader structural trend in the type system, and is discussed in detail in the section on subtyping.
tuples
Tuples are an ordered collection of named and/or unnamed types.
They are declared with tuple[Type, identifier: Type, ...] and initialized with parentheses: (413, "hello", value: 40000). Syntax sugar allows for them to be declared with () as well.
They are exclusively ordered - named types within tuples are just syntax sugar for positional access. Passing a fully unnamed tuple into a context that expects a tuple with a named parameter is allowed so long as the types line up in order.
let grouping = (1, 2, 3)
func foo: tuple[str, str] = ("hello", "world")
Tuples are particularly useful for "on-the-fly" types. Creating type aliases to tuples is discouraged - structs are generally a better choice for custom type declarations.
enums
Enums are ordinal labels that may have associated values.
They are declared with enum[Label, AnotherLabel = 4, ...] and are never initialized (their values are known statically).
Enums may be accessed directly by their label, and are ordinal and iterable regardless of their associated value. They are useful in collecting large numbers of "magic values", that would otherwise be constants.
type Keys = enum
  Left, Right, Up, Down
  A = "a"
  B = "b"
In the case of an identifier conflict (with other enum labels, or types, or...) they must be prefixed with the name of their associated type (separated by a dot). This is standard for identifier conflicts: and is discussed in more detail in the modules document.
unions
Unions are tagged type unions. They provide a high-level wrapper over an inner type that must be safely accessed via pattern matching.
They are declared with union[Variant(Type), ...] and initialized with the name of a variant followed by its inner type constructor in brackets: Square(side: 5). Tuples and structs are special-cased to eliminate extraneous parentheses.
type Value = u64
type Ident = str
type Expr = ref union
  Literal(Value)
  Variable(Ident)
  Abstraction(param: Ident, body: Expr)
  Application(body: Expr, arg: Expr)
  Conditional(
    condition: Expr
    then_case: Expr
    else_case: Expr
  )
They take up as much space in memory as the largest variant, plus the size of the tag (one byte).
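As a worked example of that sizing rule, the following Python sketch computes the size of a tagged union from the sizes of its variants. The variant names and byte counts are hypothetical, chosen only for illustration, and the sketch ignores alignment and padding, which the text above does not discuss.

```python
def union_size(variant_sizes: dict[str, int], tag_size: int = 1) -> int:
    """Size in bytes of a tagged union: the largest variant plus the tag."""
    return max(variant_sizes.values()) + tag_size

# Hypothetical payload sizes in bytes, not taken from any Puck implementation.
shape = {"Square": 8, "Rectangle": 16, "Point": 0}

print(union_size(shape))  # 17: the 16-byte Rectangle plus a one-byte tag
```

Every value of the union pays for the largest variant, which is one reason a large or recursive union such as Expr above is declared as a ref union.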
pattern matching
Unions abstract over differing types. In order to safely be used, their inner types must be accessed via pattern matching: leaving no room for type confusion. Pattern matching in Puck relies on two syntactic constructs: the match statement, forcing qualification and handling of all possible types of a variable, and the of statement, querying type equality while simultaneously binding new identifiers to underspecified portions of variables.
use std.tables

func eval(context: mut HashTable[Ident, Value], expr: Expr): Result[Value] =
  match expr
  of Literal(value): Okay(value)
  of Variable(ident):
    context.get(ident).err("Variable not in context")
  of Application(body, arg):
    if body of Abstraction(param, body as inner_body):
      context.set(param, context.eval(arg)?) # from std.tables
      context.eval(inner_body)
    else:
      Error("Expected Abstraction, found {}".fmt(body))
  of Conditional(condition, then_case, else_case):
    if context.eval(condition)? != 0: # Value is a u64: nonzero counts as true
      context.eval(then_case)
    else:
      context.eval(else_case)
  of expr:
    Error("Invalid expression {}".fmt(expr))
The match statement takes exclusively a list of of sub-expressions, and checks for exhaustivity. The expr of Type(binding) syntax can be reused as a conditional, in if statements and elsewhere.
The of operator is similar to the is operator in that it queries type equality, returning a boolean. However, unbound identifiers within of expressions are bound to appropriate values (if matched) and injected into the scope. This allows for succinct handling of union types in situations where match is overkill.
Each branch of a match expression can also have a guard: an arbitrary conditional that must be met in order for it to match. Guards are written as where cond and immediately follow the last pattern in an of branch, preceding the colon.
interfaces
Interfaces can be thought of as analogous to Rust's traits, without explicit impl blocks and without need for the derive macro. Types that have functions fulfilling the interface requirements implicitly implement the associated interface.
The interface type is composed of a list of function signatures that refer to the special type Self that must exist for a type to be valid. The special type Self is replaced with the concrete type at compile time in order to typecheck. They are declared with interface[signature, ...].
type Stack[T] = interface
  push(self: mut Self, val: T)
  pop(self: mut Self): T
  peek(self: Self): T

func takes_any_stack(stack: Stack[int]) =
  # only stack.push, stack.pop, and stack.peek are available methods
Differing from Rust, Haskell, and many others, there is no explicit impl block. If there exist functions for a type that satisfy all of an interface's signatures, it is considered to match and the interface typechecks. This may seem strange and ambiguous - but again, static typing and uniform function call syntax help make this a more reasonable design. The purpose of explicit impl blocks in, for example, Rust is three-fold: to provide a limited form of uniform function call syntax; to explicitly group together associated code; and to disambiguate. UFCS provides for the first, the module system provides for the second, and the third is proposed to not matter.
Interfaces cannot be constructed because they are unsized. They serve purely as a list of valid operations on a type within a context: no information about their memory layout is relevant. The concrete type fulfilling an interface is known at compile time, however, and so there are no issues surrounding interfaces as parameters, only when one is used as (part of) a concrete type. They can be used as part of a concrete type with indirection, however: type Foo = struct[a: int, b: ref interface[...]] is perfectly valid.
Interfaces also cannot extend or rely upon other interfaces in any way. There is no concept of an interface extending an interface. There is no concept of a parameter satisfying two interfaces. In the author's experience, while such constructions are powerful, they are also an immense source of complexity, leading to less-than-useful interface hierarchies seen in languages like Java, and yes, Rust.
Instead, if one wishes to form an interface that also satisfies another interface, they must include all of the other interface's associated functions within the new interface. Given that interfaces overwhelmingly only have a handful of associated functions, and if you're using more than one interface you really should be using a concrete type, the hope is that this will provide explicitness.
Interfaces compose with modules to offer fine grained access control.
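Implicit satisfaction of an interface can be tried out in Python with typing.Protocol, which is also structural. The sketch below is an analogy rather than Puck: ListStack and push_twice_and_peek are hypothetical names, and Python looks for methods on the class, whereas Puck matches any functions in scope through uniform function call syntax.

```python
from typing import Protocol, TypeVar, runtime_checkable

T = TypeVar("T")


@runtime_checkable
class Stack(Protocol[T]):
    """The interface: three signatures and nothing about memory layout."""

    def push(self, val: T) -> None: ...
    def pop(self) -> T: ...
    def peek(self) -> T: ...


class ListStack:
    """Never mentions Stack, yet provides every required method."""

    def __init__(self) -> None:
        self.items: list[int] = []

    def push(self, val: int) -> None:
        self.items.append(val)

    def pop(self) -> int:
        return self.items.pop()

    def peek(self) -> int:
        return self.items[-1]


def push_twice_and_peek(stack: Stack[int], val: int) -> int:
    # Only push, pop, and peek may be assumed here.
    stack.push(val)
    stack.push(val + 1)
    return stack.peek()


print(push_twice_and_peek(ListStack(), 1))  # 2
print(isinstance(ListStack(), Stack))       # True: checked by method names alone
```

The runtime isinstance check only looks at method names. A static checker such as mypy compares the full signatures, which is closer to what is described above.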
type aliases and distinct types
Any type can be declared as an alias to a type simply by assigning it to such. All functions defined on the original type carry over, and functions expecting one type may receive the other with no issues.
type Float = float
It is no more than an alias. When explicit conversion between types is desired and functions carrying over is undesired, distinct types may be used.
type MyFloat = distinct float
let foo: MyFloat = MyFloat(192.68)
Types then must be explicitly converted via constructors.
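The difference between the two can be mimicked in Python, with the caveat that this is a loose analogy and not how Puck implements either feature. A plain assignment gives an alias, while a one-field wrapper class (hypothetical, named after the example above) behaves like a distinct type in that nothing defined on float carries over.

```python
from dataclasses import dataclass

# A plain alias: Float and float are the very same type.
Float = float


@dataclass(frozen=True)
class MyFloat:
    """Stand-in for `distinct float`: float operations do not carry over."""

    value: float


def halve(x: float) -> float:
    return x / 2


foo = MyFloat(192.68)        # explicit conversion in, via the constructor
half = halve(foo.value)      # explicit conversion out, via the field

print(Float is float)        # True
print(half == 192.68 / 2)    # True

try:
    halve(foo)               # MyFloat does not support "/"
except TypeError as err:
    print("rejected:", err)
```

Python rejects the mismatched call only when it runs. The text above describes the conversion as required by the type system itself.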
Errata
default values
Puck does not have any concept of null: all values must be initialized.
But always explicitly initializing types is syntactically verbose, and so most types have an associated "default value".
Default values:
- bool: false
- int, uint, etc: 0
- float, etc: 0.0
- char: '\0'
- str: ""
- void, never: unconstructable
- array[T], list[T]: []
- set[T], table[T, U]: {}
- tuple[T, U, ...]: (default values of its fields)
- struct[T, U, ...]: {default values of its fields}
- enum[One, Two, ...]: <first label>
- union[T, U, ...]: disallowed
- slice[T], func: disallowed
- ref, ptr: disallowed
For unions, slices, references, and pointers, this is a bit trickier: none of them has a reasonable "default" aside from null. Rather than giving in, the compiler disallows any non-initialization, or any other case in which a default value would be inserted.
todo: consider user-defined defaults (ex. structs)
signatures and overloading
Puck supports overloading - that is, there may exist multiple functions, or multiple types, or multiple modules, with the same name, so long as they have different signatures. The signature of a function / type / module is important. Interfaces, among other constructs, depend on the user having some understanding of what the compiler considers to be a signature. So, it is stated here explicitly:
- The signature of a function is its name and the types of each of its parameters, in order. Optional parameters are ignored. Generic parameters are ???
  - ex. ...
- The signature of a type is its name and the number of generic parameters.
  - ex. both Result[T] and Result[T, E] are defined in std.results
- The signature of a module is just its name. This may change in the future.
subtyping
Subtyping has been mentioned on occasion in contexts surrounding structural type systems, particularly the section on distinct types, but no explicit description of the subtyping rules has been given.
Subtyping is the implicit conversion of compatible types, usually in a one-way direction. The following types are implicitly convertible:
- uint ==> int
- int ==> float
- uint ==> float
- str ==> list[char] (not the reverse: use pack)
- array[T; n] ==> list[T]
- struct[a: T, b: U, ...] ==> struct[a: T, b: U]
- union[A: T, B: U] ==> union[A: T, B: U, ...]
inheritance
Puck is not an object-oriented language. Idiomatic design patterns in object-oriented languages are harder to accomplish and not idiomatic here.
But, Puck has a number of features that somewhat support the object-oriented paradigm, including:
- uniform function call syntax
- structural typing / subtyping
- interfaces
type Building = struct
  size: struct[length, width: uint]
  color: enum[Red, Blue, Green]
  location: tuple[longitude, latitude: float]

type House = struct
  size: struct[length, width: uint]
  color: enum[Red, Blue, Green]
  location: tuple[longitude, latitude: float]
  occupant: str

func init(_: type[House]): House =
  { size: {length, width: 500}, color: Red
    location: (0.0, 0.0), occupant: "Barry" }

func address(building: Building): str =
  let number = int(building.location.0 / building.location.1).abs
  let street = "Logan Lane"
  return number.str & " " & street

# subtyping! methods!
print House.init().address()

func address(house: House): str =
  let number = int(house.location.0 - house.location.1).abs
  let street = "Logan Lane"
  return number.str & " " & street

# overriding! (will warn)
print address(House.init())

# abstract types! inheritance!
type Addressable = interface for Building
  func address(self: Self)
These features may compose into code that closely resembles its object-oriented counterpart. But make no mistake! Puck is static first and functional somewhere in there: dynamic dispatch and the like are not accessible (currently).