From 6017d62db7600af491592e4f0d78611f33dc6b5e Mon Sep 17 00:00:00 2001 From: JJ Date: Fri, 27 Oct 2023 00:51:37 -0700 Subject: minor updates --- README.md | 2 +- docs/MODULES.md | 6 ++-- docs/TYPES.md | 85 +++++++++++++++++++++++++++----------------------- src/ast.rs | 2 +- src/tree.rs | 7 ++++- std/default/format.pk | 3 +- std/default/options.pk | 26 +++++++-------- std/default/results.pk | 38 +++++++++++----------- 8 files changed, 90 insertions(+), 79 deletions(-) diff --git a/README.md b/README.md index d026a6a..e6b5cb1 100644 --- a/README.md +++ b/README.md @@ -3,7 +3,7 @@ A place where I can make some bad decisions. Puck is an experimental, memory safe, structurally typed, imperative and functional programming language. -It aims to be clean and succinct while performant: inspired by the metaprogramming of [Nim](https://nim-lang.org/), the error handling of [Swift](https://www.swift.org/), and the performance/safety guarantees of [Rust](https://www.rust-lang.org/). +It aims to be clean and succinct while performant: inspired by the syntax and metaprogramming of [Nim](https://nim-lang.org/), the error handling of [Swift](https://www.swift.org/), the performance and safety guarantees of [Rust](https://www.rust-lang.org/), and the module system of [OCaml](https://ocaml.org/). ```nim import std/tables diff --git a/docs/MODULES.md b/docs/MODULES.md index 911d207..4f5bb70 100644 --- a/docs/MODULES.md +++ b/docs/MODULES.md @@ -1,8 +1,8 @@ # Modules and Namespacing -Puck has a rich module system, inspired by such expressive systems in the ML family of languages, notably including Rust and OCaml. Unlike these systems, however, opening modules i.e. unqualified imports is *encouraged* - even in the global scope - which at first would appear to run contrary to the point of a module system. Puck cleans up such "namespace pollution" by **type-based disambiguation**. 
+Puck has a rich module system, inspired by such expressive systems in the ML family of languages, notably including Rust and OCaml. Unlike these systems, however, opening modules, i.e. unqualified imports, is *encouraged* - even in the global scope - which at first would appear to run contrary to the point of a module system. Such "namespace pollution" is made a non-issue by **type-based disambiguation** (and can be avoided with qualified imports anyway). -A major goal of Puck's module system is to allow the same level of expressiveness as the ML family - while cutting down on the extraneous syntax and boilerplate needed to do so. As such, modularity features are written directly inline with their declaration, and the file system structure is reused to form an implicit module system for internal use. +A major goal of Puck's module system is to allow the same level of expressiveness as the ML family - while cutting down on the extraneous syntax and boilerplate needed to do so. As such, access modifiers are written directly inline with their declaration, and the file system structure is reused to form an implicit module system for internal use. ```puck import std/[ascii, unicode] @@ -22,7 +22,7 @@ These unqualified imports by default may seem like madness to a Python or C prog ```puck ``` -Multiple modules can be imported in the same scope, of course, and so conflicts may arise on imports. In the pursuit of *some* explicitness, no attempt is made to guess the proper identifier from usage. These must be disambiguated by prefixing the module name, followed by a dot. This disambiguation breaks uniform function call syntax on functions: yet because functions only conflict when both their name and entire function signature overlap, this is a rare occurrence. If so desired, an import followed by `/[]` will force full qualification of all identifiers in the module - yet this is an antipattern and not recommended. 
+Multiple modules can be imported in the same scope, and so conflicts may arise on imports. In the pursuit of *some* explicitness, no attempt is made to guess the proper identifier from usage. These must be disambiguated by prefixing the module name, followed by a dot. This disambiguation breaks uniform function call syntax on functions: yet because functions only conflict when both their name and entire function signature overlap, this is a rare occurrence. If so desired, an import followed by `/[]` will force full qualification of all identifiers in the module - yet this is an antipattern and not recommended. Unrelated to the module system - but note that functions only differing in return type are allowed, albeit discouraged. Extra type annotations may be needed for the compiler to properly infer in such cases. diff --git a/docs/TYPES.md b/docs/TYPES.md index b16c508..e674739 100644 --- a/docs/TYPES.md +++ b/docs/TYPES.md @@ -8,7 +8,7 @@ Basic types can be one-of: - `bool`: internally an enum. - `int`: integer number. x bits of precision by default. - `uint`: same as `int`, but unsigned for more precision. - - `i8`, `i16`, `i32`, `i64`, `i28`: specified integer size + - `i8`, `i16`, `i32`, `i64`, `i128`: specified integer size - `u8`, `u16`, `u32`, `u64`, `u128`: specified integer size - overflows into bigints for safety and ease of cryptographical code. - `float`: floating-point number. @@ -33,8 +33,14 @@ todo ### strings -todo +Strings are: +- mutable +- internally a byte (uint8) array +- externally a char (uint32) array +- prefixed with their length and capacity +- automatically resize like a list +They are also quite complicated. Puck has full support for Unicode and wishes to be intuitive, performant, and safe, as all languages wish to be. Strings present a problem that much effort has been spent on in Swift and Rust (primarily) to solve. ## Container types Container types, broadly speaking, are types that contain other types. 
These exclude the types in [advanced types](#advanced-types). @@ -78,12 +84,12 @@ No variables may be assigned these types, nor may any function return them. These are monomorphized into more specific functions at compile-time if needed. Parameter types can be one-of: -- mutable: `func foo(a: var str)`: Denotes the mutability of a parameter. Parameters are immutable by default. +- mutable: `func foo(a: mut str)`: Denotes the mutability of a parameter. Parameters are immutable by default. - Passed as a `ref` if not one already, and marked mutable. -- static: `func foo(a: static str)`: Denotes a parameter that's value must be known at compile-time. Useful with `when` for writing generic code. +- static: `func foo(a: static str)`: Denotes a parameter whose value must be known at compile-time. Useful with `when` for writing generic code. - generic: `func foo[T](a: list[T], b: T)`: The standard implementation of generics, where a parameter's exact type is not listed, and instead statically dispatched based on usage. - constrained: `func foo(a: str | int | float)`: A basic implementation of generics, where a parameter can be one-of several listed types. Makes for particularly straightforward monomorphization. - - Separated with the bitwise or operator `|` rather than the symbolic or `||` or a raw `or` to give the impression that there isn't a corresponding "and" operation (the `&` operator is preoccupied with strings). + - functions: `func foo(a: func (x, y: int): int)`: First-class functions. - Functions may be prefixed with modifiers: one-of `pure`, `yeet`, `async`. - Syntactic sugar is available: `func foo(a: (int, int) -> int)`. This is not usable with modifiers. @@ -95,7 +101,7 @@ Parameter types can be one-of: These parameter types (except `static`) share a common trait: they are not *sized*. The exact type is not generally known until compilation - and in the case of interfaces, sometimes not even during compilation! 
As the size is not always rigorously known, problems arise when attempting to construct parameter types or compose them with other types: and so this is disallowed. Not all is lost, however, as they may still be used with *indirection* - detailed in the [section on reference types](#reference-types). -### Generic types +### generic types Functions can take a _generic_ type, that is, be defined for a number of types at once: @@ -112,7 +118,7 @@ func length[T: str | list](a: T) = return a.len ``` -The syntax for generics is `func`, `ident`, followed by the names of the generic parameters in brackets `[T, U, V]`, followed by the function's parameters (which may refer to the generic types). +The syntax for generics is `func`, `ident`, followed by the names of the generic parameters in brackets `[T, U, V]`, followed by the function's parameters (which may then refer to the generic types). Generics are replaced with concrete types at compile time (monomorphization) based on their usage in functions within the main function body. Constrained generics have two syntaxes: the constraint can be directly on a parameter, leaving off the `[T]` box, or it may be defined within the box as `[T: int | float]` for easy reuse in the parameters. @@ -123,7 +129,7 @@ Types are typically constructed by value on the stack. That is, without any leve Reference types can be one-of: - `ref T`: An automatically-managed reference to type `T`. This is a pointer of size `uint` (native). -- `ptr T`: A manually-managed pointer to type `T`. (very) unsafe. +- `ptr T`: A manually-managed pointer to type `T`. (very) unsafe. The compiler will yell at you. ```puck type BinaryTree = ref struct @@ -170,6 +176,8 @@ Tuples are an *ordered* collection of either named or unnamed types. They are declared with `tuple[Type, identifier: Type, ...]` and initialized with parentheses: `(413, "hello", value: 40000)`. +They are exclusively ordered - named types within tuples are just syntax sugar for positional access. 
Passing a fully unnamed tuple into a context that expects a tuple with a named parameter is allowed so long as the types line up in order. + ```puck ``` @@ -248,42 +256,34 @@ import std/tables func eval(expr: Expr, context: var HashTable[Ident, Value]): Result[Value] match expr - of Literal(value): - value - of Variable(ident): - context.get(ident) - of Application{body, arg}: - if body of Abstraction{param, body as inner_body}: - context.set(param, eval(arg)) - inner_body.eval(context) - else: - Error(InvalidExpr) - of Conditional{cond, then_case, else_case}: - if eval(cond, context): - then_case.eval(context) - else: - else_case.eval(context) - of _: + of Literal(value): value + of Variable(ident): context.get(ident) + of Application{body, arg}: + if body of Abstraction{param, body as inner_body}: + context.set(param, eval(arg)) + inner_body.eval(context) + else: Error(InvalidExpr) + of Conditional{cond, then_case, else_case}: + if eval(cond, context): + then_case.eval(context) + else: + else_case.eval(context) + of _: + Error(InvalidExpr) ``` -The match statement takes exclusively a list of `of` sub-expressions, and checks for exhaustivity; but the `variable of Type(x)` syntax can be reused as a conditional, in `if` statements and elsewhere. - -The `of` operator is similar to the `is` operator in that it queries type equality. However, unbound identifiers within `of` expressions are bound to appropriate values (if matched) and injected into the scope. +The match statement takes exclusively a list of `of` sub-expressions, and checks for exhaustivity; but the `variable of Type(binding)` syntax can be reused as a conditional, in `if` statements and elsewhere. Each branch of a match expression can have a *guard*: an arbitrary conditional that must be met in order for it to match. Guards are written as `where cond` and immediately follow the last pattern in an `of` branch, preceding the colon. 
-When matching unions with an inner product type (structs or tuples), external extraneous parenthesis are elided. +The `of` *operator* is similar to the `is` operator in that it queries type equality, returning a boolean. However, unbound identifiers within `of` expressions are bound to appropriate values (if matched) and injected into the scope. This allows for succinct handling of `union` types in situations where `match` is overkill. -todo: guards, either `where` or `if` +`match` expressions and `of` operators have some special rules. When matching unions with an inner product type (structs or tuples), extraneous outer parentheses are elided. todo: others? ### interfaces -Interfaces can be thought of as analogous to Rust's traits, without explicit `impl` blocks and without need for the `derive` macro. +Interfaces can be thought of as analogous to Rust's traits, without explicit `impl` blocks and without need for the `derive` macro. Types that have functions fulfilling the interface requirements implicitly implement the associated interface. -The `interface` type is composed of a list of function signatures that refer to the special type `Self` that must exist for a type to be valid. -Interfaces cannot be constructed and so are only of use as parameter types and the like, and so are always used in a *type conversion*. -The special type `Self` is replaced with the type being converted at compile time in order to typecheck. - -They are declared with `interface[signature, ...]`. +The `interface` type is composed of a list of function signatures that refer to the special type `Self` that must exist for a type to be valid. The special type `Self` is replaced with the concrete type at compile time in order to typecheck. They are declared with `interface[signature, ...]`. 
```puck type Stack[T] = interface @@ -295,13 +295,20 @@ func takes_any_stack(stack: Stack[int]) = # only stack.push, stack.pop, and stack.peek are available methods ``` -Differing from Rust, Haskell, and others, there is no explicit `impl` block. If there exists a function that matches one of an interface's signatures, it is considered to match and the interface typechecks. This may seem strange and ambiguous - but again, static typing and uniform function call syntax help make this fine and idiomatic. The purpose of explicit `impl` blocks in ex. Rust is two-fold: to group together associated code, but primarily to provide a limited form of uniform function call syntax. As any function can be called as either `foo(a, b)` or `a.foo(b)`, `impl` blocks would only serve to group together associated code: which is better done with modules. +Differing from Rust, Haskell, and many others, there is no explicit `impl` block. If there exists a function that matches one of an interface's signatures, it is considered to match and the interface typechecks. This may seem strange and ambiguous - but again, static typing and uniform function call syntax help make this reasonable. The purpose of explicit `impl` blocks in e.g. Rust is two-fold: to explicitly group together associated code, but primarily to provide a limited form of uniform function call syntax. As any function can be called as either `foo(a, b)` or `a.foo(b)`, `impl` blocks would only serve to group together associated code: which is better done with modules. + +Interfaces cannot be constructed because they are not sized. They serve purely as a list of valid operations on a type within a context: no information about their memory layout is relevant. The concrete type fulfilling an interface is known at compile time, however, and so there are no issues surrounding interfaces as parameters, only when one attempts to use them as (part of) a concrete type. 
They can be used as part of a concrete type with *indirection*, however: `type Foo = struct[a: int, b: ref interface[...]]` is perfectly valid. -Interfaces cannot be constructed because they are not sized. They serve purely as a list of valid operations on a type within a context: no information about their memory layout is relevant. The concrete type fulfilling the interface is known at compile time, however, and so there are no issues surrounding interfaces as parameters, just when attempted to be used as (part of) a concrete type. They can be used as part of a concrete type with *indirection*, however: `type Foo = struct[a: int, b: ref interface[...]]` is perfectly valid. +Interfaces compose with [modules](MODULES.md) to offer fine-grained access control. -While functions are the primary way of performing operations on types, they are not the only way, and listing all explicitly can be painful - instead, it can be desired to be able to *associate a type* and any field access or existing functions on that type with the interface. todo: i have not decided on the syntax for this yet. +todo: I have not decided whether the names of parameters are / should be relevant, or enforceable, or present. I'm leaning towards them not being present. But if they are enforceable, it becomes harder to implicitly implement the wrong interface. Design notes to consider: https://blog.rust-lang.org/2015/05/11/traits.html -Interfaces also compose with [the module system](MODULES.md) to offer fine grained crafting of types. +### generic types + +Types, like functions, can be *generic*: defined for an unknown arbitrary type, monomorphized at compile time. Indeed, we have already seen them before: in the above interface example. The syntax closely follows the syntax for generic functions. 
+ +```puck +``` ### distinct types diff --git a/src/ast.rs b/src/ast.rs index 71daa9b..e55b473 100644 --- a/src/ast.rs +++ b/src/ast.rs @@ -63,7 +63,7 @@ pub enum Binding { kind: Option, value: Box }, - Func { + FuncDecl { public: bool, effect: Option, id: Id, diff --git a/src/tree.rs b/src/tree.rs index 02dc749..8e2e1d0 100644 --- a/src/tree.rs +++ b/src/tree.rs @@ -1,4 +1,4 @@ -/// A simple append-only tree data structure. Represented as a Vec. +/// A simple flat append-only tree data structure. Represented as a Vec. pub struct Tree<T>(Vec<Node<T>>); /// The associated Node to the Tree. @@ -12,11 +12,16 @@ pub struct Node<T> { data: T } +/// Nodes themselves are not held onto and used, instead we use a NodeId, +/// only valid in the context of a particular Tree. There is not a great +/// way to enforce this validity that I know of, unfortunately. #[derive(Copy, Clone)] pub struct NodeId(usize); impl<T> Tree<T> { /// Create a new Tree with a root element. + /// You should only make one Tree in any given context. + /// Otherwise there are safety risks with using the wrong NodeId for a Tree. pub fn new(data: T) -> Tree<T> { Tree(vec![Node { parent: None, previous_sibling: None, diff --git a/std/default/format.pk b/std/default/format.pk index c7d7ab8..ec573ef 100644 --- a/std/default/format.pk +++ b/std/default/format.pk @@ -13,10 +13,11 @@ pub type Debug = interface dbg(Self): str ## Prints all of its arguments to the command line. -pub func print(params: varargs[Display]) = +pub proc print(params: varargs[Display]) = stdout.write(params.map(x => x.str).join(" "), "\n") ## Prints all of its arguments to the command line, in Debug form. +## Note: this function is special! 
does not count as a side effect pub func dbg(params: varargs[Debug]) = stdout.write(params.map(x => x.dbg).join(" "), "\n") diff --git a/std/default/options.pk b/std/default/options.pk index 3aaea49..bbe0d9c 100644 --- a/std/default/options.pk +++ b/std/default/options.pk @@ -12,30 +12,30 @@ pub func is_some[T](self: Option[T]): bool = pub func is_none[T](self: Option[T]): bool = not self.is_some -## Converts an Option[T] to a Result[T, E] given a user-provided error. +## Converts an `Option[T]` to a `Result[T, E]` given a user-provided error. pub func err[T, E](self: Option[T], error: E): Result[T, E] = if self of Some(x): Okay(x) else: Error(error) -## Applies a function to T, if it exists. -pub func map[T, U](self: Option[T], proc: T -> U): Option[U] = +## Applies a function to `T`, if it exists. +pub func map[T, U](self: Option[T], fn: T -> U): Option[U] = if self of Some(x): - Some(x.proc) + Some(fn(x)) else: None -## Converts T to a None, if proc returns false and it exists. -pub func filter[T](self: Option[T], proc: T -> bool): Option[T] = - if self of Some(x) and proc(x): +## Converts `T` to a `None`, if `fn` returns false and it exists. +pub func filter[T](self: Option[T], fn: T -> bool): Option[T] = + if self of Some(x) and fn(x): Some(x) else: None -## Applies a function to T, if it exists. Equivalent to .map(func).flatten. -pub func flatmap[T, U](self: Option[T], proc: T -> Option[U]): Option[U] = +## Applies a function to T, if it exists. Equivalent to `self.map(fn).flatten`. +pub func flatmap[T, U](self: Option[T], fn: T -> Option[U]): Option[U] = if self of Some(x): - x.proc + fn(x) else: None ## Converts from Option[Option[T]] to Option[T]. @@ -55,14 +55,14 @@ pub yeet func `!`[T](self: Option[T]): T = if self of Some(x): x else: raise Exception # todo: syntax?? -## Indirect access. Propagates None. +## Indirect access. Propagates `None`. 
pub macro `?`[T](self: Option[T]) = quote: match self of Some(x): x of None: return None -## Overloads the == operation for use on Options. +## Overloads the `==` operation for use on Options. pub func `==`[T](a, b: Option[T]): bool = match (a, b) of (Some(x), Some(y)): @@ -70,7 +70,7 @@ pub func `==`[T](a, b: Option[T]): bool = of _: false -## Overloads the str() function for use on Options. +## Overloads the `str()` function for use on Options. pub func str[T](self: Option[T]): str = if self of Some(x): fmt("some({})", x.str) diff --git a/std/default/results.pk b/std/default/results.pk index 187ece9..4e4d27a 100644 --- a/std/default/results.pk +++ b/std/default/results.pk @@ -7,8 +7,6 @@ pub type Result[T, E] = union Okay: T Error: E -# todo: determine the difference between interfaces and types -# ErrorInterface? Errorable? Err? pub type Error = interface str(Self): str dbg(Self): str @@ -20,42 +18,42 @@ pub func is_ok[T, E](self: Result[T, E]): bool = pub func is_err[T, E](self: Result[T, E]): bool = not self.is_ok -## Converts from a Result[T, E] to an Option[T]. +## Converts from a `Result[T, E]` to an `Option[T]`. pub func ok[T, E](self: Result[T, E]): Option[T] = if self of Okay(x): Some(x) else: None() -## Converts from a Result[T, E] to an Option[E]. +## Converts from a `Result[T, E]` to an `Option[E]`. pub func err[T, E](self: Result[T, E]): Option[E] = if self of Error(x): Some(x) else: None() -## Applies a function to T, if self is Okay. -pub func map[T, E, U](self: Result[T, E], proc: T -> U): Result[U, E] = +## Applies a function to `T`, if self is `Okay`. +pub func map[T, E, U](self: Result[T, E], fn: T -> U): Result[U, E] = match self of Okay(x): - Okay(x.proc) + Okay(fn(x)) of Error(e): Error(e) -## Applies a function to E, if self is Error. -pub func map_err[T, E, F](self: Result[T, E], proc: E -> F): Result[T, F] = +## Applies a function to `E`, if self is `Error`. 
+pub func map_err[T, E, F](self: Result[T, E], fn: E -> F): Result[T, F] = match self of Error(e): - Error(e.proc) + Error(fn(e)) of Okay(x): Okay(x) -## Applies a function to T, if it exists. Equivalent to .map(func).flatten. -pub func flatmap[T, E, U](self: Result[T, E], proc: T -> Result[U, E]): Result[U, E] = +## Applies a function to `T`, if it exists. Equivalent to `self.map(fn).flatten`. +pub func flatmap[T, E, U](self: Result[T, E], fn: T -> Result[U, E]): Result[U, E] = match self of Okay(x): - x.proc + fn(x) of Error(e): Error(e) -## Converts from a Result[Result[T, E], E] to a Result[T, E]. +## Converts from a `Result[Result[T, E], E]` to a `Result[T, E]`. pub func flatten[T, E](self: Result[Result[T, E], E]): Result[T, E] = match self of Okay(Okay(x)): @@ -63,14 +61,14 @@ pub func flatten[T, E](self: Result[Result[T, E], E]): Result[T, E] = of Okay(Error(e)), Error(e): Error(e) -## Transposes a Result[Option[T], E] to an Option[Result[T, E]]. +## Transposes a `Result[Option[T], E]` to an `Option[Result[T, E]]`. pub func transpose[T, E](self: Result[Option[T], E]): Option[Result[T, E]] = match self of Okay(Some(x)): Some(Okay(x)) of Okay(None()), Error(_): None() -## Transposes an Option[Result[T, E]] to a Result[Option[T], E]. Takes a default error. +## Transposes an `Option[Result[T, E]]` to a `Result[Option[T], E]`. Takes a default error. pub func transpose[T, E](self: Option[Result[T, E]], error: E): Result[Option[T], E] = match self of Some(Okay(x)): @@ -84,7 +82,7 @@ pub func transpose[T, E](self: Option[Result[T, E]], error: E): Result[Option[T] pub func get_or[T, E](self: Result[T, E], default: T): T = if self of Okay(x): x else: default -## Directly accesses the inner value. Throws an exception if Error(e). +## Directly accesses the inner value. Throws an exception if `Error`. 
pub yeet func `!`[T, E](self: Result[T, E]): T = match self of Okay(x): x @@ -94,14 +92,14 @@ pub yeet func get_err[T, E](self: Result[T, E]): E = of Error(e): e of Okay(x): raise Exception(x) # todo: syntax?? -## Indirect access. Propagates Error. +## Indirect access. Propagates `Error`. macro `?`[T, E](self: Result[T, E]) = quote: match self of Okay(x): x of Error(e): return Error(e) -## Overloads the == operation for use on Results. +## Overloads the `==` operation for use on Results. pub func `==`[T, E, F](a: Result[T, E], b: Result[T, F]): bool = match (a, b) of (Okay(x), Okay(y)): @@ -109,7 +107,7 @@ pub func `==`[T, E, F](a: Result[T, E], b: Result[T, F]): bool = of _: false -## Overloads the str() function for use on Results. +## Overloads the `str()` function for use on Results. pub func str[T, E](self: Result[T, E]): str = match self of Some(x): -- cgit v1.2.3-70-g09d2
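The doc comments added to src/tree.rs describe a flat, append-only tree addressed through `NodeId` handles, and warn that nothing ties a `NodeId` to the `Tree` it came from. The following is a minimal Rust sketch of that idea, not the project's actual implementation. Only `Tree`, `Node`, `NodeId`, `new`, `parent`, `previous_sibling`, and `data` appear in the patch; the fields `next_sibling`, `first_child`, `last_child` and the methods `root`, `add_child`, `parent`, `get`, `children` are hypothetical names introduced for illustration.

```rust
/// Index into one particular `Tree`'s backing `Vec`.
/// It is a plain integer, so nothing stops it being used with another tree.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct NodeId(usize);

#[allow(dead_code)]
pub struct Node<T> {
    parent: Option<NodeId>,
    previous_sibling: Option<NodeId>,
    // Assumed fields: the patch elides the rest of `Node`.
    next_sibling: Option<NodeId>,
    first_child: Option<NodeId>,
    last_child: Option<NodeId>,
    data: T,
}

/// Flat append-only tree: nodes are only ever pushed, never removed,
/// so an existing `NodeId` stays valid for the lifetime of its tree.
pub struct Tree<T>(Vec<Node<T>>);

impl<T> Tree<T> {
    /// Create a new tree holding only a root element.
    pub fn new(data: T) -> Tree<T> {
        Tree(vec![Node {
            parent: None,
            previous_sibling: None,
            next_sibling: None,
            first_child: None,
            last_child: None,
            data,
        }])
    }

    /// The root is always the first element pushed.
    pub fn root(&self) -> NodeId {
        NodeId(0)
    }

    /// Append `data` as the last child of `parent` (hypothetical helper).
    /// Returns `None` if `parent` is out of range for this tree.
    pub fn add_child(&mut self, parent: NodeId, data: T) -> Option<NodeId> {
        let previous = self.0.get(parent.0)?.last_child;
        let id = NodeId(self.0.len());
        self.0.push(Node {
            parent: Some(parent),
            previous_sibling: previous,
            next_sibling: None,
            first_child: None,
            last_child: None,
            data,
        });
        // Link the new node into the parent's child list.
        match previous {
            Some(prev) => self.0[prev.0].next_sibling = Some(id),
            None => self.0[parent.0].first_child = Some(id),
        }
        self.0[parent.0].last_child = Some(id);
        Some(id)
    }

    /// Bounds-checked access to a node's data.
    pub fn get(&self, id: NodeId) -> Option<&T> {
        self.0.get(id.0).map(|node| &node.data)
    }

    /// Parent of `id`, or `None` for the root or an out-of-range id.
    pub fn parent(&self, id: NodeId) -> Option<NodeId> {
        self.0.get(id.0).and_then(|node| node.parent)
    }

    /// Children of `parent`, in insertion order.
    pub fn children(&self, parent: NodeId) -> Vec<NodeId> {
        let mut out = Vec::new();
        let mut next = self.0.get(parent.0).and_then(|n| n.first_child);
        while let Some(id) = next {
            out.push(id);
            next = self.0[id.0].next_sibling;
        }
        out
    }
}
```

Bounds-checked lookups only catch ids that are out of range. An id taken from a different tree of similar size would still resolve to an unrelated node, which is the risk the patch's comments describe.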