From 9a5d96f78a13cb14aa89ea950d2c45ccb6c8f203 Mon Sep 17 00:00:00 2001 From: JJ Date: Tue, 23 May 2023 13:29:54 -0700 Subject: cleaner thoughts on types and create an overview-esque readme --- README.md | 66 +++++++++++++++++++++++++++++++++++++++ TYPES.md | 104 ++++++++++++++++++++++++++++++++++++-------------------------- 2 files changed, 126 insertions(+), 44 deletions(-) create mode 100644 README.md diff --git a/README.md b/README.md new file mode 100644 index 0000000..ee6d4a3 --- /dev/null +++ b/README.md @@ -0,0 +1,66 @@ +# 🧚 puck - an experimental programming language + +A place where I can make some bad decisions. + +Puck is an experimental, memory safe, strongly typed, multi-paradigm programming language. +It aims to be clean and succinct while performant: having the ease of use of [Python](https://www.python.org/) with the performance/safety guarantees of [Rust](https://www.rust-lang.org/) and the flexibility of [Nim](https://nim-lang.org/). + +You may judge for yourself if Puck meets these ideals. + +## Why Puck? + +Puck is primarily a testing ground and should not be used in any important capacity. +Don't use it. Everything is unimplemented and it will break underneath your feet. + +That said: in the future, once somewhat stabilized, reasons why you *would* use it would be for: +- The **interop system**, allowing foreign functions to be usable with native semantics +- The **effect system**, being one of few languages with a proper effects system based on handlers +- The **type system**, being modern and powerful with a strong emphasis on safety and algebraic data types +- The **memory management system**, +- The **metaprogramming**, providing macros more powerful than Rust and on the level of Nim. You may operate directly on the abstract syntax tree either before or after typechecking. +- The **syntax**, aiming to be flexible, predictable, and succinct, through the use of *uniform function call syntax* and significant whitespace + +## Why _not_ $LANG? + +Why not Python? +- Python is slow and weakly typed, but with an excellent ecosystem and ergonomics. +- Puck is fast and strongly typed, and Python libraries are usable from Puck! + +Why not Rust? +- Rust prioritizes performance and safety above all else. +- It then proceeds to prioritize ergonomics: but this can be *better*. +- Puck prioritizes safety, ergonomics, and performance, in that order. + +Why not Nim? +- Nim is excellent and does many things right (in the view of the author). +- But it's held back by some technical debt: primarily around the C interop, which is functional but clunky. Puck, instead, compiles to LLVM and provides language interop through type system interop. +- Nim also puts much less of an emphasis on safety in the standard library. Functions may panic and throw exceptions: optional types exist but are scarcely used. + +Why not Swift? +- Swift is functionally an Apple-exclusive language. + +Why not Koka? +- Koka's focus is effect systems: Puck's focus is language interop. + +## How do I learn more? + +- The [basic usage](BASIC.md) document lays out the fundamental grammar of Puck. +- The [syntax](SYNTAX.md) document provides a deeper look into the syntax choices. +- The [grammar](GRAMMAR.md) document provides a formal grammar for Puck, which the parser is based upon. +- The [type system](TYPES.md) document gives an in-depth analysis of Puck's extensive type system, and its relationship to classes and other abstractions. +- The [memory management](MEMORY_MANAGEMENT.md) document gives an overview of Puck's memory model: which is considered a mashup of the models pioneered by Lobster, Rust, and Nim. +- The [effect system](EFFECTS.md) document gives a description of Puck's effects system: and how it ties into asynchronous code and exceptions. +- The [interop](INTEROP.md) document gives an overview of how the first-class language interop system works. +- The [metaprogramming](METAPROGRAMMING.md) document explains how using metaprogramming to extend the language and write more powerful code works. +- The [syntax](SYNTAX.md) document provides a more detailed look at the syntax of the language. +- The [modules](MODULES.md) document provides a more detailed look at imports and how they relate to the type system. +- The [roadmap](ROADMAP.md) provides a clear view of the current state and future plans of the language's development. +- The [standard library](STDLIB.md) document provides an overview and examples of usage of the standard library. +- The [extended library](EXTLIB.md) document provides an overview and examples of usage of the _extended_ standard library. + + +Note that all of these documents (and parts of this README) are written as if everything already exists. Nothing already exists! You can see the [roadmap](ROADMAP.md) for an actual sense as to the state of the language. I just found writing in the present tense to be an easier way to collect my thoughts. + +## Acknowledgements + +First and foremost, this language is _heavily_ inspired by Nim. Many ideas - general syntax, UFCS, even the design of the compiler - were taken 1:1. The memory management algorithm comes from Lobster (which is also implemented by Nim). Performance annotations are somewhat inspired by Nim. The underlying type system and use of is mostly copied from Rust with minor changes. The effects system is unique, with inspiration from the few languages successfully implementing effects systems, namely Koka and Unison. diff --git a/TYPES.md b/TYPES.md index d017597..811c7b0 100644 --- a/TYPES.md +++ b/TYPES.md @@ -12,14 +12,15 @@ Basic types can be one-of: - `u8`, `u16`, `u32`, `u64`, `u128`: specified integer size - `float`: floating-point number. - `f32`, `f64`: specified float sizes -- `char`: distinct 0-127 character, for working with ascii. +- `char`: a distinct 0-127 character. For working with ascii. - `rune`: a Unicode character. - `str`: a string type. mutable. internally a char-array? must also support unicode. - `void`: an internal type designating the absence of a value. + - possibly, the empty tuple. then would `empty` be better? - `never`: a type that denotes functions that do not return. distinct from returning nothing. -- `empty`: possibly? in place of the empty tuple? the empty tuple has always bothered me + - the bottom type. -`bool`, `int`/`uint` and siblings, `float` and siblings, `char`, and `rune` are all considered **primitive types** and are _always_ [[copied]], unless specifically passed with `mut/ref` type {undecided}. +`bool`, `int`/`uint` and siblings, `float` and siblings, `char`, and `rune` are all considered **primitive types** and are _always_ [[copied]] (unless passed as `var`). Basic types as a whole include the primitive types, as well as `str`, `void`, and `never`. Basic types can further be broken down into the following categories: - boolean types: `bool` @@ -27,7 +28,7 @@ Basic types as a whole include the primitive types, as well as `str`, `void`, an - textual types: `char`, `rune`, `str` - funky types: `void`, `never` -Funky types will rarely be used by name: instead, the absence of a type typically denotes one or the other. Still, having a name is helpful in some situations: like with [[concepts]]. +Funky types will rarely be referenced by name: instead, the absence of a type typically implicitly denotes one or the other. Still, having a name is helpful in some situations: like with [[concepts]]. ## Function types @@ -35,75 +36,90 @@ Functions can also be types. - `func(T, U): V`: denotes a type that is a function taking arguments of type T and U and returning a value of type V. - The syntactical sugar `(T, U) -> (V)` is available, to consolidate type declarations and disambiguate when dealing with many `:`s. Is this a good idea? should i universally use `:` or `->`? -Aside: How should we handle pure functions? fold it into a more powerful Effects system? probably. +## Container types -## Container types / Sequence types / Iterable types +Container types, broadly speaking, are types that contain other types. These exclude the types in [[advanced types]]. + +### Iterable types Iterable types can be one-of: -- `array[T]`: Static arrays. Can only contain one type `T`. -- `list[T]`: Dynamic arrays. Can only contain one type `T`. -- `tuple[T, U, V...]`: n-tuples. May contain multiple types, however, any given position in the tuple must be of a known type. -- `slice[T]`, where `T` is a container type. This represents a "view" into some sequence of elements of type `T`. - - Tuples are a special case: when iterated, they expose themselves as a sequence of `TupleElement`s wrapping their inner types. This allows us to assert a single type for inner elements (which then must be matched if one is to do anything productive with them). +- `array[S, T]`: Static arrays. Can only contain one type `T`. Of size `S` and cannot grow/shrink. + - Initialize in-place with `array(a, b, c)`. Should we do this? otherwise `[a, b, c]`. +- `list[T]`: Dynamic arrays. Can only contain one type `T`. May grow/shrink dynamically. + - Initialize in-place with `list(a, b, c)`. Should we do this? otherwise `@[a, b, c]`. +- `slice[T]`: Slices. Used to represent a "view" into some sequence of elements of type `T`. + - Cannot be directly constructed. May be initialized from an array, list, or string, or may be used as a generic parameter on functions (more on that later). + - Slices cannot grow/shrink. Their elements may be accessed and mutated. As they are underlyingly a reference to an array or list, they **must not** outlive the data they reference. - `str`: Strings. Contain the `rune` type or alternatively `char`s or `bytes`?? {undecided} -All of these above types are some sort of sequence: and so have a length, and so can be _iterated_. For convenience, a special `iterable` generic type is defined for use in parameters: that abstracts over all of the container types. This `iterable` type is also extended to any collection with a length of a single type (and also tuples). It is functionally equivalent to the `openarray` type in Nim: but hopefully a bit more powerful? +All of these above types are some sort of sequence: and so have a length, and so can be _iterated_. +For convenience, a special `iterable` generic type is defined for use in parameters: that abstracts over all of the container types. This `iterable` type is also extended to any collection with a length of a single type (and also tuples). It is functionally equivalent to the `openarray` type in Nim: but hopefully a bit more powerful? - Aside: how do we implement this? rust-style (impl `iter()`), or monomorphize the hell out of it? i think compiler magic is the way to go for specifically this... - Aside: `iterable` may need a better name. it implies iterators right now which it is distinctly Unrelated to. unless i don't have iterators? that may be the way to go... -Aside: i really need to be consistent between sequence, collection, iterable, container. - Elements of container types can be accessed by the `container[index]` syntax. Slices of container types can be accessed by the `container[lowerbound..upperbound]` syntax. Slices of non-consecutive elements can be accessed by the `container[a,b,c..d]` syntax, and indeed, the previous example expands to these. They can also be combined: `container[a,b,c..d]`. - Aside: take inspiration from Rust here? they make it really safe if a _little_ inconvenient -Probably better as external data structures: -- `set[T]` / `bitset[T]` / `hashset[T]` -- `dict[T]` / `table[T]` / `hashmap[T]` +### Abstract data types -### Aside: laziness +There are an additional suite of related types: abstract data types. While falling under container types, these do not have a necessarily straightforward or best implementation, and so multiple implementations are provided. -Quite a few functions, mostly those related to functional programming of some sort, benefit from _laziness_: not computing the result of a function applied to a value instantly, and instead waiting until this is _collected_ by a special `collect` function (in Haskell, just being used will do: we're not quite that lazy). +Abstract data types can be one-of: +- `set[T]`: high-performance sets implemented as a bit array. + - These have a maximum data size, at which point the compiler will suggest using a `HashSet[T]` instead. +- `table[T, U]`: simple symbol tables implemented as an association list. + - These do not have a maximum size. However, at some point the compiler will suggest using a `HashTable[T, U]` instead. +- `HashSet[T]`: standard hash sets. +- `HashTable[T, U]`: standard hash tables. -Rust does laziness via types: i.e. a map function returns a Map wrapper, which then you can map again on to get a `Map>` and so on and so forth with a variety of types. Particularly for maps, this allows for computing all of the functions applied through a single iteration, rather than the 4-5-6-etc iterations that an eager implementation may need. (is this a monad? i think this is a monad) +Unlike iterable types, abstract data types are not iterable by default: as they are not ordered, and thus, it is not clear how they should be iterated. Despite this: for utility purposes, an `elems()` iterator based on a normalization of the elements is provided for `set` and `HashSet`, and `keys()`, `values()`, and `pairs()` iterators are provided for `table` and `HashTable` based on a normalization of the keys. This is deterministic to prevent user reliance on shoddy randomization, see Golang. -Nim _doesn't_ do laziness: to the detriment of its `sequtils` library, whose functions simply chain naively. External libraries overcome this, but have to resort to extensive syntax tree rewriting with macros (which isn't bad in and of itself! the point of macros is to be able to implement language-level features that are lacking, but wouldn't it be nice if we baked in those features beforehand?). I think that Rust's approach is Really Quite Good in comparison. +## Parameter types -I think Rust's approach is primarily made possible by traits (map implements the Iterable trait or the Collectable trait or whatever). It should be doable with concepts: and be a good baseline test for if concepts are a usable replacement for traits. +Some types are only valid when being passed to a function, or in similar contexts. +No variables may be assigned these types, nor may any function return them. +These are monomorphized into more specific functions at compile-time if needed. + +Parameter types can be one-of: +- generic: `func foo[T](a: list[T], b: T)`: The standard implementation of generics, where a parameter's exact type is not listed, and instead statically dispatched based on usage. +- constrained: `func foo(a: str | int | float)`: A basic implementation of generics, where a parameter can be one-of several listed types. Makes for particularly straightforward monomorphization. + - Separated with the bitwise or operator `|` rather than the symbolic or `||` or a raw `or` to give the impression that there isn't a corresponding "and" operation (the `&` operator is preoccupied with strings). +- mutable: `func foo(a: var str)`: Denotes the mutability of a parameter. Parameters are immutable by default. + - Passed as a `ref` if not one already, and marked mutable. +- a built-in typeclass: `func foo[T](a: slice[T])`: Included, special typeclasses for being generic over [[advanced types]]. + - Of note is how `slice[T]` functions: it is generic over `lists` and `arrays` of any length. + +### Generic types + +Functions can take a _generic_ type, that is, be defined for a number of types at once: ``` -type Iterable: concept = - Iterable.iter() is Iter[T] -# or depending on syntax -type Iterable: concept[T] = - Iterable.iter() is Iter[T] -type Iterable[T]: concept = - Iterable[T].iter() is Iter[T] -type Collectable[T]: concept = - Collectable[T].collect() is T -``` +func add[T](a: list[T], b: T) = + return a.add(b) -## Generic types / Parameter types +func length[T](a: T) = + return a.len # monomorphizes based on usage. + # lots of things use .len, but only a few called by this do. + # throws a warning if exported for lack of specitivity. -Some types are only valid when being passed to a function, or in similar contexts. -No variables may be assigned these types, nor may any function return them. -These are monomorphized into more specific functions at compile-time. +func length[T: str | list](a: T) = + return a.len +``` -Parameter types can be one-of: -- Generic: `func foo[T](a: list[T], b: T)`: The standard implementation of generics, where a parameter's exact type is not listed, and instead statically dispatched based on usage. -- Options: `func foo(a: str | int | float)`: A basic implementation of generics, where a parameter can be one-of several listed types. Makes for particularly straightforward monomorphization. - - Separated with the bitwise or operator `|` rather than the symbolic or `||` or a raw `or` to give the impression that there isn't a corresponding "and" operation (the `&` operator is occupied with strings). -- Iterable: `func foo(a: iterable)`: the special "iterable" type mentioned above. +The syntax for generics is `func`, `ident`, followed by the names of the generic parameters in brackets `[T, U, V]`, followed by the function's parameters (which may refer to the generic types). +Generics are replaced with concrete types at compile time (monomorphization) based on their usage in functions within the main function body. + +Constrained generics have two syntaxes: the constraint can be directly on a parameter, leaving off the `[T]` box, or it may be defined within the box as `[T: int | float]` for easy reuse in the parameters. ## Advanced Types The `type` keyword is used to declare custom types. Custom types can be one-of: -- `struct`/`object`: A collection of types, that may have default values - - potentially: `struct` vs. `object` denote types passed by reference/value? +- `tuple`: An ordered collection of types. Optionally named. +- `struct`: An unordered, named collection of types. May have default values. - `enum`: Powerful algebraic data types a la Rust. - undecided: ordinal by default, which sacrifice their ordinality when assigned static values? probably a bad idea. could be ordinal by ordering in declaration. i think rust is ordinal by alphabetical order, which is weird?? i wonder why -- `tuple`: See [[tuples]], earlier. - `concept`: typeclasses. they have some unique syntax - `distinct`: a type that must be explicitly converted - useful for preventing sql injections, ascii/unicode fuckups, etc -- cgit v1.2.3-70-g09d2