An Overview of Puck

Puck is an experimental, high-level, memory-safe, statically-typed, whitespace-sensitive, interface-oriented, imperative programming language with functional underpinnings.

It explores designs that make functional programming paradigms comfortable to those familiar with imperative and object-oriented languages, and tackles some more technical problems along the way, such as integrated refinement types and typesafe interop.

This is the language I keep in my head. It reflects the way I think and reason about code.

I do hope others enjoy it.

Variables and Comments

let ident: int = 413
# type annotations are optional
var phrase = "Hello, world!"
const compile_time = std.os.file_name

Variables may be mutable (var), immutable (let), or compile-time evaluated and immutable (const). Type annotations on variables and other bindings follow the name of the binding (with : Type), and are typically optional. Variables are conventionally written in snake_case. Types are conventionally written in PascalCase. The type system is comprehensive, and complex enough to warrant delaying full coverage until the end. Some basic types are of note, however:

  • int, uint: signed and unsigned integers
    • i[\d]+, u[\d]+: arbitrary fixed-size counterparts
  • float, decimal: floating-point numbers
    • f32/f64/f128: their fixed-size counterparts
    • dec64/dec128: their fixed-size counterparts
  • byte: an alias to u8, representing one byte
  • char: an alias to u32, representing one Unicode character
  • bool: defined as union[false, true]
  • array[T, size]: primitive fixed-size arrays
  • list[T]: dynamic lists
  • str: mutable strings. internally a list[byte], externally a list[char]
  • slice[T]: borrowed "views" into the three types above

Comments are declared with # and run until the end of the line. Documentation comments are declared with ## and may be parsed by language servers and other tooling. Multi-line comments are declared with #[ ]# and may be nested. Taking cues from the Lisp family of languages, any expression may be commented out with a preceding #;.

Functions and Indentation

Functions are declared with the func keyword. They take an (optional) list of generic parameters (in brackets) and an (optional) list of parameters (in parentheses), and must be annotated with a return type if they return a value. Every function parameter must be annotated with a type. A parameter's type may optionally be prefixed with lent, mut, or const, denoting an immutable or mutable borrow (more on these later) or a constant type (known to the compiler at compile time, and usable in const exprs). Generic parameters may each be optionally annotated with a type functioning as a constraint.

Whitespace is significant but flexible: functions may be declared entirely on one line if so desired. A new level of indentation after certain tokens (=, do, then) denotes a new level of scope. There are some places where arbitrary indentation and line breaks are allowed - as a general rule of thumb, after operators, commas, and opening parentheses. The particular rules governing indentation may be found in the syntax guide.

Uniform Function Call Syntax

func inc(self: list[int], by: int): list[int] =
  self.map(x => x + by)

print inc([1, 2, 3], len("four")) # 5, 6, 7
print [1, 2, 3].inc(1)  # 2, 3, 4
print [1].len # 1

Puck supports uniform function call syntax (UFCS): any function may be called using the typical syntax for method calls. That is, the first argument of any function call may be moved in front of the function name and joined to it with a ., in the style of a typical method call. (There are no methods in Puck. All functions are statically dispatched. This may change in the future.)

This allows for a number of syntactic cleanups. Arbitrary functions with compatible types may be chained with no need for a special pipe operator. Object field access, module member access, and function calls are unified, reducing the need for getters and setters. Given a first type, IDEs using dot-autocomplete can fill in all the functions defined for that type. Programmers from object-oriented languages may find the lack of object-oriented classes more bearable. UFCS is implemented in shockingly few languages, and so Puck joins the tiny club that previously consisted of just D, Nim, Koka, and Effekt.

Basic Types

Boolean logic and integer operations are standard and as one would expect out of a typed language: and, or, xor, not, shl, shr, +, -, *, /, <, >, <=, >=, div, mod, rem. Notably:

  • the words and/or/not/shl/shr are used instead of the symbolic &&/||/!/<</>>
  • integer division is expressed with the keyword div while floating point division uses /
  • % is absent and replaced with distinct modulus and remainder operators
  • boolean operators are bitwise and also apply to integers and floats
  • more operators are available via the standard library (std.math.exp and std.math.log)

The above operations are performed with operators, special functions that take a prefixed first argument and (often) a suffixed second argument. Custom operators may be implemented, but they must consist of only a combination of the symbols = + - * / < > @ $ ~ & % | ! ? ^ \ for the purpose of keeping the grammar context-free. They are declared identically to functions.
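
The difference between div, mod, and rem only shows up with negative operands. As a rough illustration in Python rather than Puck (the exact rounding rules Puck uses are not stated here, so the floor/truncate pairing below is an assumption, and the function names are made up for the example), a modulus takes the sign of the divisor while a remainder takes the sign of the dividend:

```python
import math


def floor_div(a: int, b: int) -> int:
    """Integer division rounding toward negative infinity."""
    return a // b


def modulus(a: int, b: int) -> int:
    """Floored modulus: the result takes the sign of the divisor."""
    return a % b


def remainder(a: int, b: int) -> int:
    """Truncated remainder: the result takes the sign of the dividend."""
    return int(math.fmod(a, b))


# The three operations agree for positive operands and diverge for negative ones.
print(floor_div(7, 2), modulus(7, 2), remainder(7, 2))
print(floor_div(-7, 2), modulus(-7, 2), remainder(-7, 2))
```

A single % operator has to pick one of these behaviours silently, which is presumably why separate names are preferred.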

Term (in)equality is expressed with the == and != operators. Type equality is expressed with is. Subtyping relations may be queried with of, which has the additional property of introducing new bindings to the current scope in certain contexts (more on this in the types document).

let phrase: str = "I am a string! Wheeee! ✨"
for c in phrase do
  stdout.write(c) # I am a string! Wheeee! ✨
for b in phrase.bytes() do
  stdout.write(b.char) # Error: cannot convert from byte to char
print phrase.last() # ✨

String concatenation uses a distinct & operator rather than overloading the + operator (as the complement - has no natural meaning for strings). Strings are unified, mutable, internally a byte array, externally a char array, and are stored as a pointer to heap data after their length and capacity (fat pointer). Chars are four bytes and represent a Unicode character in UTF-8 encoding. Slices of strings are stored as a length followed by a pointer to string data, and have non-trivial interactions with the memory management system. More details can be found in the type system overview.
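
The split between the internal byte view and the external char view can be seen in any language with UTF-8 strings. The following is a small Python analogy, not Puck code: list(...) stands in for iterating a str by char, and encode("utf-8") stands in for phrase.bytes().

```python
phrase = "Wheeee! ✨"

chars = list(phrase)            # one entry per Unicode character
raw = phrase.encode("utf-8")    # the underlying UTF-8 bytes

# The character count and the byte count differ as soon as
# a character needs more than one byte in UTF-8.
print(len(chars))
print(len(raw))
print(chars[-1])
```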

Conditionals and Pattern Matching

Basic conditional control flow uses standard if/elif/else statements. The when statement provides a compile-time if. It also takes elif and else branches and is syntactic sugar for an if statement within a const expression (more on those later).

All values in Puck must be handled, or explicitly discarded. This allows for conditional statements and many other control flow constructs to function as expressions: and evaluate to a value when an unbound value is left at the end of each of their branches' scopes. This is particularly relevant for functions, where it is often idiomatic to omit an explicit return statement. There is no attempt made to differentiate without context, and so expressions and statements often look identical in syntax.

Exhaustive structural pattern matching is available with the match/of statement, and is particularly useful for the struct and union types. Each of branch of a match statement takes a pattern, and any unbound identifiers within that pattern are injected into the branch's scope. Multiple patterns may be used for one branch provided they all bind the same identifiers of the same type. Branches may be guarded with the where keyword, which takes a conditional, and will necessarily remove the branch from exhaustivity checks.

The of statement also stands on its own as an operator for querying subtype equality. Used as a conditional in if statements or while loops, it retains the variable injection properties of its match counterpart. This allows it to be used as a compact and coherent alternative to if let statements in other languages.

Error Handling

type Result[T] = Result[T, ref Err]
func may_fail: Result[T] = ...

Error handling is done via a fusion of functional monadic types and imperative exceptions, with much syntactic sugar. Functions may raise exceptions, but by convention should return Option[T] or Result[T, E] types instead: these may be handled in match or if/of statements. The effect system built into the compiler will track functions that raise errors, and warn on those that are not handled explicitly via try/with statements anywhere on the call stack.

A bevy of helper functions and macros are available for Option/Result types, and are documented and available in the std.options and std.results modules (included in the prelude by default). Two in particular are of note: the ? macro accesses the inner value of a Result[T, E] or propagates (returns in context) the Error(e), and the ! accesses the inner value of an Option[T] / Result[T, E] or raises an error on None / the specific Error(e). Both operators take one parameter and so are postfix. The ? and ! macros are overloaded and additionally function on types as shorthand for Option[T] and Result[T] respectively.

The utility of the ? macro is readily apparent to anyone who has written code in Rust or Swift. The utility of the ! function is perhaps less obvious. The errors raised by !, however, are known to the compiler: and they may be comprehensively caught by a single with statement or a sequence of them. This allows users accustomed to a try/with error handling style to keep using it with ease, with only the need to add one additional character to a function call.
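
The control flow behind the two operators can be spelled out by hand. The sketch below is Python, not Puck, and every name in it (Ok, Error, parse_int, unwrap, add_propagating, add_raising) is hypothetical: add_propagating does manually what a postfix ? would do, and unwrap does what a postfix ! would do.

```python
from dataclasses import dataclass


@dataclass
class Ok:
    value: int


@dataclass
class Error:
    err: Exception


def parse_int(text: str):
    """Return Ok(n) on success, or Error(e) instead of raising."""
    try:
        return Ok(int(text))
    except ValueError as e:
        return Error(e)


def add_propagating(a: str, b: str):
    """The `?` style: on Error, return it to the caller immediately."""
    left = parse_int(a)
    if isinstance(left, Error):
        return left
    right = parse_int(b)
    if isinstance(right, Error):
        return right
    return Ok(left.value + right.value)


def unwrap(result) -> int:
    """The `!` style: on Error, raise the specific wrapped error."""
    if isinstance(result, Error):
        raise result.err
    return result.value


def add_raising(a: str, b: str) -> int:
    return unwrap(parse_int(a)) + unwrap(parse_int(b))


print(add_propagating("1", "2"))
print(add_raising("1", "2"))
```

The first style keeps the error in the return type all the way up; the second converts it into an exception that a surrounding handler can catch.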

More details may be found in the error handling overview.

Blocks and Loops

loop
  print "This will never normally exit."
  break

for i in 0 .. 3 do # exclusive
  for j in 0 ..= 3 do # inclusive
    print "{} {}".fmt(i, j)

Three types of loops are available: for loops, while loops, and infinite loops (loop loops). For loops take a binding (which may be structural, see pattern matching) and an iterable object and will loop until the iterable object is spent. While loops take a condition that is evaluated at the beginning of each iteration to determine whether to keep looping. Infinite loops are infinite are infinite are infinite are infinite are infinite are infinite and must be manually broken out of.

There is no special concept of iterators: iterable objects are any object that implements the Iter[T] class (more on those in the type system document): that is, provides a self.next() function returning an Option[T]. As such, iterators are first-class constructs. For loops can be thought of as while loops that unwrap the result of the next() function and end iteration upon a None value. While loops, in turn, can be thought of as infinite loops with an explicit conditional break.
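
The desugaring described above can be written out explicitly. This is a Python sketch of the idea rather than Puck: Countdown is a made-up type, its next() returns a value for Some and Python's None for the None variant, and collect is the hand-expanded form of a for loop.

```python
class Countdown:
    """A made-up iterable: yields start, start - 1, ..., 1."""

    def __init__(self, start: int) -> None:
        self.current = start

    def next(self):
        if self.current == 0:
            return None          # stands in for the None variant
        self.current -= 1
        return self.current + 1  # stands in for Some(value)


def collect(iterable) -> list:
    """What `for item in iterable do ...` expands to, step by step."""
    items = []
    while True:                  # the infinite loop underneath
        item = iterable.next()   # ask for the next element
        if item is None:         # the iterable is spent: break out
            break
        items.append(item)       # the loop body
    return items


print(collect(Countdown(3)))
```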

The break keyword immediately breaks out of the current loop, and the continue keyword immediately jumps to the next iteration of the current loop. Loops may be used in conjunction with blocks for more fine-grained control flow manipulation.

block
  statement

let x = block
  let y = read_input()
  transform_input(y)

block foo
  for i in 0 ..= 100 do
    block bar
      if i == 10 then break foo
      print i

Blocks provide arbitrary scope manipulation. They may be labelled or unlabelled. The break keyword additionally functions inside of blocks and without any parameters will jump out of the current enclosing block (or loop). It may also take a block label as a parameter for fine-grained scope control.

Module System

Code is segmented into modules. Modules may be made explicit with the mod keyword followed by a name, but there is also an implicit module structure in every codebase that follows the structure and naming of the local filesystem. For compatibility with filesystems, and for consistency, module names are exclusively lowercase (following the same rules as Windows).

A module can be imported into another module by use of the use keyword, taking a path to a module or modules. Unlike in most languages (Python, for example), unqualified imports are encouraged: in fact, they are idiomatic and the default. Type-based disambiguation and official LSP support are intended to remove any ambiguity.

Within a module, functions, types, constants, and other modules may be exported for use by other modules with the pub keyword. All such identifiers are private by default and only accessible module-locally without it. Modules are first-class and may be bound, inspected, modified, and returned. As such, imported modules may be re-exported for use by other modules by binding them to a public constant.

More details may be found in the modules document.

Compile-time Programming

## Arbitrary code may execute at compile-time.
const compile_time =
  match std.os.platform # known at compile-time
    of Windows then "windows"
    of MacOS then "darwin"
    of Linux then "linux"
    of Wasi then "wasm"
    of _ then "unknown platform"

## The propagation operator is a macro so that `return` is injected into the function scope.
pub macro ?[T](self: Option[T]) =
  quote
    match `self`
    of Some(x) then x
    of None then return None

## Type annotations and overloading allow us to define syntactic sugar for `Option[T]`, too.
pub macro ?(T: type) =
  quote Option[`T`]

Compile-time programming may be done via the previously-mentioned const keyword and when statements: or via macros. Macros operate directly on the abstract syntax tree at compile-time: taking in syntax objects, transforming them, and returning them to be injected. They are hygienic and will not capture identifiers not passed as parameters. While parameters are syntax objects, they can be annotated with types to constrain applications of macros and allow for overloading. Macros are written in ordinary Puck: there is thus no need to learn a separate "macro language", as syntax objects are just standard unions. Additionally, support for quoting removes much of the need to operate on raw syntax objects. A full description may be found in the metaprogramming document.

Async System and Threading

The async system is colourblind: the special async macro will turn any function call returning a T into an asynchronous call returning a Future[T]. The special await function will wait for any Future[T] and return a T (or an error). Async support is included in the standard library in std.async in order to allow for competing implementations. More details may be found in the async document.
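
A loose analogy for this colourblind model exists in Python's concurrent.futures, where any ordinary function can be turned into a future at the call site without being declared specially. The sketch below is Python and says nothing about the actual std.async API: pool.submit plays the role of the async macro, future.result() plays the role of await, and slow_square and run_in_background are hypothetical names.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def slow_square(n: int) -> int:
    """An ordinary function: nothing in its definition marks it as async."""
    return n * n


def run_in_background(pool: ThreadPoolExecutor, func, *args) -> Future:
    """Turn a plain call returning T into a Future of T."""
    return pool.submit(func, *args)


with ThreadPoolExecutor(max_workers=1) as pool:
    future = run_in_background(pool, slow_square, 7)
    # result() blocks until the value is ready, and re-raises
    # any error the function raised.
    print(future.result())
```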

Threading support is complex and is also relegated to external libraries. OS-provided primitives will likely provide a spawn function, and there will be substantial restrictions for memory safety. I really haven't given much thought to this.

Memory Management

# Differences in Puck and Rust types in declarations and at call sites.
# note: this notation is not valid and is for illustrative purposes only
func foo(a:
  lent T → &'a T
  mut T → &'a mut T
  T → T
):
  lent T → &'a T
  mut T → &'a mut T
  T → T

let t: T = ...
foo( # this is usually elided
  lent t → &t
  mut t → &mut t
  t → t
)

Puck copies Rust-style ownership near verbatim. &T corresponds to lent T, &mut T to mut T, and T to T: with T implicitly convertible to lent T and mut T at call sites. A major goal of Puck is for all lifetimes to be inferred: there is no overt support for lifetime annotations, and it is likely code with strange lifetimes will be rejected before it can be inferred. (Total inference, however, is a goal.)

Another major difference is the consolidation of Box, Rc, Arc, Cell, RefCell into just two (magic) types: ref and refc. ref takes the role of Box, and refc both the role of Rc and Arc: while Cell and RefCell are disregarded. The underlying motivation for compiler-izing these types is to make deeper compiler optimizations accessible: particularly with refc, where the existing ownership framework is used to eliminate unnecessary counts. Details on memory safety, references and pointers, and deep optimizations may be found in the memory management overview.

Type System

# The type Foo is defined here as an alias to a list of bytes.
type Foo = list[byte]

# implicit conversion to Foo in declarations
let foo: Foo = [1, 2, 3]

func fancy_dbg(self: Foo) =
  print "Foo:"
  # iteration is defined for list[byte]
  # so it implicitly carries over: and is defined on Foo
  for elem in self do
    dbg(elem)

# NO implicit conversion to Foo on calls
[4, 5, 6].fancy_dbg # this fails!

Foo([4, 5, 6]).fancy_dbg # prints: Foo: 4 5 6

Finally, a few notes on the type system are in order. Types are declared with the type keyword and are aliases: all functions defined on a type carry over to its alias, though the opposite is not true. Functions defined on the alias must take an object known to be a type of that alias: exceptions are made for type declarations, but at call sites this means that conversion must be explicit.

# We do not want functions defined on list[byte] to carry over,
# as strings function differently (operating on chars).
# So we declare `str` as a struct, rather than a type alias.
pub type str = struct
  data: list[byte]

# However, the underlying `empty` function is still useful.
# So we expose it in a one-liner alias.
# In the future, a `with` macro may be available to ease carryover.
pub func empty(self: str): bool = self.data.empty

# Alternatively, if we want total transparent type aliasing, we can use constants.
pub const MyAlias: type = VeryLongExampleType

If one wishes to define a new type without previous methods accessible, the newtype paradigm is preferred: declaring a single-field struct, and manually implementing functions that carry over. It can also be useful to have transparent type aliases, that is, simply a shorter name to refer to an existing type. These do not require type conversion, implicit or explicit, and can be used freely and interchangeably with their alias. This is done with constants.
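
The contrast between the newtype paradigm and a transparent alias is not specific to Puck. As a rough Python analogy (Str, Text, and the forwarded empty method are illustrative names, and a Python class is only an approximation of a Puck struct), a wrapper exposes only what is forwarded by hand, while an alias is simply a second name for the same type:

```python
from dataclasses import dataclass


@dataclass
class Str:
    """Newtype-style wrapper: none of the wrapped value's functions carry over."""

    data: bytes

    def empty(self) -> bool:
        # Forwarded by hand from the wrapped value.
        return len(self.data) == 0


# A transparent alias: no conversion needed, fully interchangeable.
Text = Str

s = Text(b"")
print(s.empty())          # the forwarded function is available
print(hasattr(s, "hex"))  # bytes.hex was not forwarded, so it is absent
print(Text is Str)        # the alias and the original are the same type
```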

Types, like functions, can be generic: declared with "holes" that may be filled in with other types upon usage. A type must have all its holes filled before it can be constructed. The syntax for generics in types much resembles the syntax for generics in functions, and generic constraints and the like also apply.

Structs and Tuples

# standard alternative syntax to inline declarations
type MyStruct = struct
  a: str
  b: str

# syntactic sugar for tuple[str, b: str]
type MyTuple = (str, b: str)

let a: MyTuple = ("hello", "world")
print a.1 # world
print a.b # world

let c: MyStruct = {a = a.0, b = a.1}
print c.b # world

Struct and tuple types are declared with struct[<fields>] and tuple[<fields>], respectively. Their declarations make them look similar at a glance: but they differ fairly fundamentally. Structs are unordered and every field must be named. They may be constructed with braces. Tuples are ordered and so field names are optional - names are just syntactic sugar for positional access. Tuples are both constructed and optionally declared with parentheses.

It is worth noting that there is no concept of pub at a field level on structs - a type is either fully transparent, or fully opaque. This is because such partial transparency breaks with structural initialization (how could one provide for hidden fields?). The @[opaque] attribute allows for expressing that the internal fields of a struct are not to be accessed or initialized: this, however, is only a compiler warning and can be totally suppressed with @[allow(opaque)].

Unions and Enums

type Expr = union
  Literal(int)
  Variable(str)
  Abstraction(param: str, body: ref Expr)
  Application(body: ref Expr, arg: ref Expr)

Union types are composed of a list of variants. Each variant has a tag and an inner type the union wraps over. Before the inner type can be accessed, the tag must be pattern matched upon, in order to handle all possible values. These are also known as sum types or tagged unions in other languages.

Union types are the bread and butter of structural pattern matching. Composed with structs and tuples, unions provide for a very general programming construct commonly referred to as an algebraic data type. This is often useful as an idiomatic and safer replacement for inheritance.

type Opcode = enum
  BRK INC POP NIP SWP ROT DUP OVR EQU NEQ GTH LTH JMP JCN JSR STH JCI JMI
  LDZ STZ LDR STR LDA STA DEI DEO ADD SUB MUL DIV AND ORA EOR SFT JSI LIT

print Opcode.BRK # 0
...

Enum types are similarly composed of a list of variants. These variants, however, are static values: assigned at compile-time, and represented under the hood by a single integer. They function similarly to unions, and can be passed through to functions and pattern matched upon; however, their underlying simplicity and default values mean they are much more useful for collecting constants and acting as flags than anything else.
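
Integer-backed enums behave much the same way in other languages. A minimal Python analogy follows, using only the first three variants of the listing above; that INC and POP map to 1 and 2 is an assumption based on declaration order, since only BRK printing as 0 is shown.

```python
from enum import IntEnum


class Opcode(IntEnum):
    # Values are assumed to follow declaration order, starting from zero.
    BRK = 0
    INC = 1
    POP = 2


# Each variant is a named constant backed by a single integer...
print(int(Opcode.BRK))
# ...and an integer can be mapped back to its variant.
print(Opcode(2).name)
```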

Classes

pub type Iter[T] = class
  next(mut Self): T?

pub type Peek[T] = class
  next(mut Self): T?
  peek(mut Self): T?
  peek_nth(mut Self, int): T?

Class types function much as type classes in Haskell or traits in Rust do. They are not concrete types, and cannot be constructed - instead, their utility is via indirection, as parameters in functions or as ref types in structures, providing constraints that some concrete type must meet. They consist of a list of function signatures, implementations of which must exist for any type passed in for the code to compile.

Their major difference, however, is that Puck's classes are implicit: there is no impl block that implementations of their associated functions have to go under. If functions for a concrete type exist satisfying some class, the type implements that class. This does run the risk of accidentally implementing a class one does not desire to, but the author believes such situations are few and far between and well worth the decreased syntactic and semantic complexity. As a result, however, classes are entirely unable to guarantee any invariants hold (like PartialOrd or Ord in Rust do).
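
Implicit satisfaction of this kind is sometimes called structural typing, and Python's typing.Protocol offers a close analogy. The sketch below is Python, not Puck; Ticker and Unrelated are made-up types, and the runtime isinstance check only looks for the presence of a next attribute, not its full signature.

```python
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class Iter(Protocol):
    """A class in the Puck sense: just a list of required signatures."""

    def next(self) -> Optional[int]: ...


class Ticker:
    """Never mentions Iter, yet satisfies it by providing next()."""

    def __init__(self, limit: int) -> None:
        self.count = 0
        self.limit = limit

    def next(self) -> Optional[int]:
        if self.count >= self.limit:
            return None
        self.count += 1
        return self.count


class Unrelated:
    pass


print(isinstance(Ticker(2), Iter))    # satisfied without any declaration
print(isinstance(Unrelated(), Iter))  # no next(), so not satisfied
```

The same property that makes this convenient also allows accidental conformance: any type that happens to define next would count as an Iter.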

As the compiler makes no such distinction between fields and single-argument functions on a type when determining identifier conflicts, classes similarly make no such distinction. Structs may be described with their fields written as methods. They do distinguish borrowed/mutable/owned parameters, those being part of the type signature.

Classes are widely used throughout the standard library to provide general implementations of such conveniences as iteration, debug and display printing, generic error handling, and much more.