author | JJ | 2023-10-27 07:51:37 +0000 |
---|---|---|
committer | JJ | 2023-10-27 07:51:37 +0000 |
commit | 6017d62db7600af491592e4f0d78611f33dc6b5e (patch) | |
tree | c2f42e4264692f92b0eded551617bf8bd8b1c948 /docs | |
parent | 2ea4fd4c09ad71c4ac648cf3645426476dd7521f (diff) |
minor updates
Diffstat (limited to 'docs')
-rw-r--r-- | docs/MODULES.md | 6 |
-rw-r--r-- | docs/TYPES.md | 85 |
2 files changed, 49 insertions, 42 deletions
diff --git a/docs/MODULES.md b/docs/MODULES.md
index 911d207..4f5bb70 100644
--- a/docs/MODULES.md
+++ b/docs/MODULES.md
@@ -1,8 +1,8 @@
 # Modules and Namespacing
-Puck has a rich module system, inspired by such expressive systems in the ML family of languages, notably including Rust and OCaml. Unlike these systems, however, opening modules i.e. unqualified imports is *encouraged* - even in the global scope - which at first would appear to run contrary to the point of a module system. Puck cleans up such "namespace pollution" by **type-based disambiguation**. <!-- Puck allows for the usage of such unqualified identifiers by **type-based disambiguation**. -->
+Puck has a rich module system, inspired by such expressive systems in the ML family of languages, notably including Rust and OCaml. Unlike these systems, however, opening modules i.e. unqualified imports is *encouraged* - even in the global scope - which at first would appear to run contrary to the point of a module system. Such "namespace pollution" is made a non-issue by **type-based disambiguation** (and can be avoided regardless with qualified imports anyway).
-A major goal of Puck's module system is to allow the same level of expressiveness as the ML family - while cutting down on the extraneous syntax and boilerplate needed to do so. As such, modularity features are written directly inline with their declaration, and the file system structure is reused to form an implicit module system for internal use.
+A major goal of Puck's module system is to allow the same level of expressiveness as the ML family - while cutting down on the extraneous syntax and boilerplate needed to do so. As such, access modifiers are written directly inline with their declaration, and the file system structure is reused to form an implicit module system for internal use.
 ```puck
 import std/[ascii, unicode]
@@ -22,7 +22,7 @@ These unqualified imports by default may seem like madness to a Python or C prog
 ```puck
 ```
-Multiple modules can be imported in the same scope, of course, and so conflicts may arise on imports. In the pursuit of *some* explicitness, no attempt is made to guess the proper identifier from usage. These must be disambiguated by prefixing the module name, followed by a dot. This disambiguation breaks uniform function call syntax on functions: yet because functions only conflict when both their name and entire function signature overlap, this is a rare occurrence. If so desired, an import followed by `/[]` will force full qualification of all identifiers in the module - yet this is an antipattern and not recommended.
+Multiple modules can be imported in the same scope, and so conflicts may arise on imports. In the pursuit of *some* explicitness, no attempt is made to guess the proper identifier from usage. These must be disambiguated by prefixing the module name, followed by a dot. This disambiguation breaks uniform function call syntax on functions: yet because functions only conflict when both their name and entire function signature overlap, this is a rare occurrence. If so desired, an import followed by `/[]` will force full qualification of all identifiers in the module - yet this is an antipattern and not recommended.
 Unrelated to the module system - but note that functions only differing in return type are allowed, albeit discouraged. Extra type annotations may be needed for the compiler to properly infer in such cases.
diff --git a/docs/TYPES.md b/docs/TYPES.md
index b16c508..e674739 100644
--- a/docs/TYPES.md
+++ b/docs/TYPES.md
@@ -8,7 +8,7 @@ Basic types can be one-of:
 - `bool`: internally an enum.
 - `int`: integer number. x bits of precision by default.
 - `uint`: same as `int`, but unsigned for more precision.
- - `i8`, `i16`, `i32`, `i64`, `i28`: specified integer size
+ - `i8`, `i16`, `i32`, `i64`, `i128`: specified integer size
 - `u8`, `u16`, `u32`, `u64`, `u128`: specified integer size
 - overflows into bigints for safety and ease of cryptographical code.
 - `float`: floating-point number.
@@ -33,8 +33,14 @@ todo
 ### strings
-todo
+Strings are:
+- mutable
+- internally a byte (uint8) array
+- externally a char (uint32) array
+- prefixed with their length and capacity
+- automatically resize like a list
+They are also quite complicated. Puck has full support for Unicode and wishes to be intuitive, performant, and safe, as all languages wish to be. Strings present a problem that much effort has been spent on in Swift and Rust (primarily) to solve.
 ## Container types
 Container types, broadly speaking, are types that contain other types. These exclude the types in [advanced types](#advanced-types).
@@ -78,12 +84,12 @@ No variables may be assigned these types, nor may any function return them.
 These are monomorphized into more specific functions at compile-time if needed.
 Parameter types can be one-of:
-- mutable: `func foo(a: var str)`: Denotes the mutability of a parameter. Parameters are immutable by default.
+- mutable: `func foo(a: mut str)`: Denotes the mutability of a parameter. Parameters are immutable by default.
 - Passed as a `ref` if not one already, and marked mutable.
-- static: `func foo(a: static str)`: Denotes a parameter that's value must be known at compile-time. Useful with `when` for writing generic code.
+- static: `func foo(a: static str)`: Denotes a parameter whose value must be known at compile-time. Useful with `when` for writing generic code.
 - generic: `func foo[T](a: list[T], b: T)`: The standard implementation of generics, where a parameter's exact type is not listed, and instead statically dispatched based on usage.
 - constrained: `func foo(a: str | int | float)`: A basic implementation of generics, where a parameter can be one-of several listed types. Makes for particularly straightforward monomorphization.
- - Separated with the bitwise or operator `|` rather than the symbolic or `||` or a raw `or` to give the impression that there isn't a corresponding "and" operation (the `&` operator is preoccupied with strings).
+ <!-- - Separated with the bitwise or operator `|` rather than the symbolic or `||` or a raw `or` to give the impression that there isn't a corresponding "and" operation (the `&` operator is preoccupied with strings). -->
 - functions: `func foo(a: func (x, y: int): int)`: First-class functions.
 - Functions may be prefixed with modifiers: one-of `pure`, `yeet`, `async`.
 - Syntactic sugar is available: `func foo(a: (int, int) -> int)`. This is not usable with modifiers.
@@ -95,7 +101,7 @@ Parameter types can be one-of:
 These parameter types (except `static`) share a common trait: they are not *sized*. The exact type is not generally known until compilation - and in the case of interfaces, sometimes not even during compilation! As the size is not always rigorously known, problems arise when attempting to construct parameter types or compose them with other types: and so this is disallowed. Not all is lost, however, as they may still be used with *indirection* - detailed in the [section on reference types](#reference-types).
-### Generic types
+### generic types
 Functions can take a _generic_ type, that is, be defined for a number of types at once:
@@ -112,7 +118,7 @@ func length[T: str | list](a: T) =
 return a.len
 ```
-The syntax for generics is `func`, `ident`, followed by the names of the generic parameters in brackets `[T, U, V]`, followed by the function's parameters (which may refer to the generic types).
+The syntax for generics is `func`, `ident`, followed by the names of the generic parameters in brackets `[T, U, V]`, followed by the function's parameters (which may then refer to the generic types).
 Generics are replaced with concrete types at compile time (monomorphization) based on their usage in functions within the main function body. Constrained generics have two syntaxes: the constraint can be directly on a parameter, leaving off the `[T]` box, or it may be defined within the box as `[T: int | float]` for easy reuse in the parameters.
@@ -123,7 +129,7 @@ Types are typically constructed by value on the stack. That is, without any leve
 Reference types can be one-of:
 - `ref T`: An automatically-managed reference to type `T`. This is a pointer of size `uint` (native).
-- `ptr T`: A manually-managed pointer to type `T`. (very) unsafe.
+- `ptr T`: A manually-managed pointer to type `T`. (very) unsafe. The compiler will yell at you.
 ```puck
 type BinaryTree = ref struct
@@ -170,6 +176,8 @@ Tuples are an *ordered* collection of either named or unnamed types.
 They are declared with `tuple[Type, identifier: Type, ...]` and initialized with parentheses: `(413, "hello", value: 40000)`.
+They are exclusively ordered - named types within tuples are just syntax sugar for positional access. Passing a fully unnamed tuple into a context that expects a tuple with a named parameter is allowed so long as the types line up in order.
+
 ```puck
 ```
@@ -248,42 +256,34 @@ import std/tables
 func eval(expr: Expr, context: var HashTable[Ident, Value]): Result[Value]
 match expr
- of Literal(value):
- value
- of Variable(ident):
- context.get(ident)
- of Application{body, arg}:
- if body of Abstraction{param, body as inner_body}:
- context.set(param, eval(arg))
- inner_body.eval(context)
- else:
- Error(InvalidExpr)
- of Conditional{cond, then_case, else_case}:
- if eval(cond, context):
- then_case.eval(context)
- else:
- else_case.eval(context)
- of _:
+ of Literal(value): value
+ of Variable(ident): context.get(ident)
+ of Application{body, arg}:
+ if body of Abstraction{param, body as inner_body}:
+ context.set(param, eval(arg))
+ inner_body.eval(context)
+ else: Error(InvalidExpr)
+ of Conditional{cond, then_case, else_case}:
+ if eval(cond, context):
+ then_case.eval(context)
+ else:
+ else_case.eval(context)
+ of _:
+ Error(InvalidExpr)
 ```
-The match statement takes exclusively a list of `of` sub-expressions, and checks for exhaustivity; but the `variable of Type(x)` syntax can be reused as a conditional, in `if` statements and elsewhere.
-
-The `of` operator is similar to the `is` operator in that it queries type equality. However, unbound identifiers within `of` expressions are bound to appropriate values (if matched) and injected into the scope.
+The match statement takes exclusively a list of `of` sub-expressions, and checks for exhaustivity; but the `variable of Type(binding)` syntax can be reused as a conditional, in `if` statements and elsewhere. Each branch of a match expression can have a *guard*: an arbitrary conditional that must be met in order for it to match. Guards are written as `where cond` and immediately follow the last pattern in an `of` branch, preceding the colon.
-When matching unions with an inner product type (structs or tuples), external extraneous parenthesis are elided.
+The `of` *operator* is similar to the `is` operator in that it queries type equality, returning a boolean. However, unbound identifiers within `of` expressions are bound to appropriate values (if matched) and injected into the scope. This allows for succinct handling of `union` types in situations where `match` is overkill.
-todo: guards, either `where` or `if`
+`match` expressions and `of` operators have some special rules. When matching unions with an inner product type (structs or tuples), external extraneous parenthesis are elided. todo: others?
 ### interfaces
-Interfaces can be thought of as analogous to Rust's traits, without explicit `impl` blocks and without need for the `derive` macro.
+Interfaces can be thought of as analogous to Rust's traits, without explicit `impl` blocks and without need for the `derive` macro. Types that have functions fulfilling the interface requirements implicitly implement the associated interface.
-The `interface` type is composed of a list of function signatures that refer to the special type `Self` that must exist for a type to be valid.
-Interfaces cannot be constructed and so are only of use as parameter types and the like, and so are always used in a *type conversion*.
-The special type `Self` is replaced with the type being converted at compile time in order to typecheck.
-
-They are declared with `interface[signature, ...]`.
+The `interface` type is composed of a list of function signatures that refer to the special type `Self` that must exist for a type to be valid. The special type `Self` is replaced with the concrete type at compile time in order to typecheck. They are declared with `interface[signature, ...]`.
 ```puck
 type Stack[T] = interface
@@ -295,13 +295,20 @@ func takes_any_stack(stack: Stack[int]) =
 # only stack.push, stack.pop, and stack.peek are available methods
 ```
-Differing from Rust, Haskell, and others, there is no explicit `impl` block. If there exists a function that matches one of an interface's signatures, it is considered to match and the interface typechecks. This may seem strange and ambiguous - but again, static typing and uniform function call syntax help make this fine and idiomatic. The purpose of explicit `impl` blocks in ex. Rust is two-fold: to group together associated code, but primarily to provide a limited form of uniform function call syntax. As any function can be called as either `foo(a, b)` or `a.foo(b)`, `impl` blocks would only serve to group together associated code: which is better done with modules.
+Differing from Rust, Haskell, and many others, there is no explicit `impl` block. If there exists a function that matches one of an interface's signatures, it is considered to match and the interface typechecks. This may seem strange and ambiguous - but again, static typing and uniform function call syntax help make this reasonable. The purpose of explicit `impl` blocks in ex. Rust is two-fold: to explicitly group together associated code, but primarily to provide a limited form of uniform function call syntax. As any function can be called as either `foo(a, b)` or `a.foo(b)`, `impl` blocks would only serve to group together associated code: which is better done with modules.
+
+Interfaces cannot be constructed because they are not sized. They serve purely as a list of valid operations on a type within a context: no information about their memory layout is relevant. The concrete type fulfilling an interface is known at compile time, however, and so there are no issues surrounding interfaces as parameters, just when attempted to be used as (part of) a concrete type. They can be used as part of a concrete type with *indirection*, however: `type Foo = struct[a: int, b: ref interface[...]]` is perfectly valid.
-Interfaces cannot be constructed because they are not sized. They serve purely as a list of valid operations on a type within a context: no information about their memory layout is relevant. The concrete type fulfilling the interface is known at compile time, however, and so there are no issues surrounding interfaces as parameters, just when attempted to be used as (part of) a concrete type. They can be used as part of a concrete type with *indirection*, however: `type Foo = struct[a: int, b: ref interface[...]]` is perfectly valid.
+Interfaces compose with [modules](MODULES.md) to offer fine grained access control.
-While functions are the primary way of performing operations on types, they are not the only way, and listing all explicitly can be painful - instead, it can be desired to be able to *associate a type* and any field access or existing functions on that type with the interface. todo: i have not decided on the syntax for this yet.
+todo: I have not decided whether the names of parameters is / should be relevant, or enforcable, or present. I'm leaning towards them not being present. But if they are enforcable, it makes it harder to implicitly implement the wrong interface. Design notes to consider: https://blog.rust-lang.org/2015/05/11/traits.html
-Interfaces also compose with [the module system](MODULES.md) to offer fine grained crafting of types.
+### generic types
+
+Types, like functions, can be *generic*: defined for an unknown arbitrary type, monomorphized at compile time. Indeed, we have already seen them before: in the above interface example. The syntax much follows the syntax for generic functions.
+
+```puck
+```
 ### distinct types