author Skyler Hawthorne 2023-07-19 13:07:36 +0000
committer Blaž Hrastnik 2023-08-10 21:22:22 +0000
commit 929eb0c39e34f8046b5ec9ecfede4ec80b5e0c8a
tree d436a004b54ce384000787606ece64d59c63b419 /book/src/guides
parent 7078e8400736dce923be44a4d26f136a22640f93
expand indents guide
Diffstat (limited to 'book/src/guides')
-rw-r--r-- book/src/guides/indent.md 347
1 files changed, 304 insertions, 43 deletions
diff --git a/book/src/guides/indent.md b/book/src/guides/indent.md
index b660d785..0b0e3938 100644
--- a/book/src/guides/indent.md
+++ b/book/src/guides/indent.md
@@ -1,76 +1,293 @@
# Adding indent queries
-Helix uses tree-sitter to correctly indent new lines. This requires
-a tree-sitter grammar and an `indent.scm` query file placed in
-`runtime/queries/{language}/indents.scm`. The indentation for a line
-is calculated by traversing the syntax tree from the lowest node at the
-beginning of the new line. Each of these nodes contributes to the total
-indent when it is captured by the query (in what way depends on the name
-of the capture).
+Helix uses tree-sitter to correctly indent new lines. This requires a
+tree-sitter grammar and an `indents.scm` query file placed in
+`runtime/queries/{language}/indents.scm`. The indentation for a line is
+calculated by traversing the syntax tree from the lowest node at the beginning
+of the new line (see [Indent queries](#indent-queries)). Each of these nodes
+contributes to the total indent when it is captured by the query (in what way
+depends on the name of the capture).
Note that it matters where these added indents begin. For example,
multiple indent level increases that start on the same line only increase
-the total indent level by 1.
+the total indent level by 1. See [Capture types](#capture-types).
-## Scopes
+## Indent queries
-Added indents don't always apply to the whole node. For example, in most
-cases when a node should be indented, we actually only want everything
-except for its first line to be indented. For this, there are several
-scopes (more scopes may be added in the future if required):
+When Helix is inserting a new line through `o`, `O`, or `<ret>`, to determine
+the indent level for the new line, the query in `indents.scm` is run on the
+document. The starting position of the query is the end of the line above where
+a new line will be inserted.
-- `all`:
-This scope applies to the whole captured node. This is only different from
-`tail` when the captured node is the first node on its line.
+For `o`, the inserted line is the line below the cursor, so the starting
+position of the query is the end of the current line.
-- `tail`:
-This scope applies to everything except for the first line of the
-captured node.
+```rust
+fn need_hero(some_hero: Hero, life: Life) -> bool {
+ matches!(some_hero, Hero { // ←─────────────────╮
+ strong: true,//←╮ ↑ ↑ │
+ fast: true, // │ │ ╰── query start │
+ sure: true, // │ ╰───── cursor ├─ traversal
+ soon: true, // ╰──────── new line inserted │ start node
+ }) && // │
+// ↑ │
+// ╰───────────────────────────────────────────────╯
+ some_hero > life
+}
+```
-Every capture type has a default scope which should do the right thing
-in most situations. When a different scope is required, this can be
-changed by using a `#set!` declaration anywhere in the pattern:
-```scm
-(assignment_expression
- right: (_) @indent
- (#set! "scope" "all"))
+For `O`, the newly inserted line is the *current* line, so the starting position
+of the query is the end of the line above the cursor.
+
+```rust
+fn need_hero(some_hero: Hero, life: Life) -> bool { // ←─╮
+ matches!(some_hero, Hero { // ←╮ ↑ │
+ strong: true,// ↑ ╭───╯ │ │
+ fast: true, // │ │ query start ─╯ │
+ sure: true, // ╰───┼ cursor ├─ traversal
+ soon: true, // ╰ new line inserted │ start node
+ }) && // │
+ some_hero > life // │
+} // ←──────────────────────────────────────────────╯
```
-## Capture types
+From this starting node, the syntax tree is traversed upward to the root node.
+Each indent capture is collected along the way, and the captures are then
+combined according to their [capture types](#capture-types) and
+[scopes](#scopes) into a final indent level for the line.
-- `@indent` (default scope `tail`):
-Increase the indent level by 1. Multiple occurrences in the same line
-don't stack. If there is at least one `@indent` and one `@outdent`
-capture on the same line, the indent level isn't changed at all.
+### Capture types
+- `@indent` (default scope `tail`):
+ Increase the indent level by 1. Multiple occurrences in the same line *do not*
+ stack. If there is at least one `@indent` and one `@outdent` capture on the
+ same line, the indent level isn't changed at all.
- `@outdent` (default scope `all`):
-Decrease the indent level by 1. The same rules as for `@indent` apply.
-
+ Decrease the indent level by 1. The same rules as for `@indent` apply.
+- `@indent.always` (default scope `tail`):
+ Increase the indent level by 1. Multiple occurrences on the same line *do*
+ stack. The final indent level is the number of `@indent.always` captures minus
+ the number of `@outdent.always` captures. If an `@indent` and an
+ `@indent.always` are on the same line, the `@indent` is ignored.
+- `@outdent.always` (default scope `all`):
+ Decrease the indent level by 1. The same rules as for `@indent.always` apply.
- `@extend`:
-Extend the range of this node to the end of the line and to lines that
-are indented more than the line that this node starts on. This is useful
-for languages like Python, where for the purpose of indentation some nodes
-(like functions or classes) should also contain indented lines that follow them.
-
+ Extend the range of this node to the end of the line and to lines that are
+ indented more than the line that this node starts on. This is useful for
+ languages like Python, where for the purpose of indentation some nodes (like
+ functions or classes) should also contain indented lines that follow them.
- `@extend.prevent-once`:
-Prevents the first extension of an ancestor of this node. For example, in Python
-a return expression always ends the block that it is in. Note that this only stops the
-extension of the next `@extend` capture. If multiple ancestors are captured,
-only the extension of the innermost one is prevented. All other ancestors are unaffected
-(regardless of whether the innermost ancestor would actually have been extended).
+ Prevents the first extension of an ancestor of this node. For example, in Python
+ a return expression always ends the block that it is in. Note that this only
+ stops the extension of the next `@extend` capture. If multiple ancestors are
+ captured, only the extension of the innermost one is prevented. All other
+ ancestors are unaffected (regardless of whether the innermost ancestor would
+ actually have been extended).
+
+#### `@indent` / `@outdent`
+
+Consider this example:
+
+```rust
+fn shout(things: Vec<Thing>) {
+ // ↑
+ // ├───────────────────────╮ indent level
+ // @indent ├┄┄┄┄┄┄┄┄┄┄┄┄┄┄
+ // │
+ let it_all = |out| { things.filter(|thing| { // │ 1
+ // ↑ ↑ │
+ // ├───────────────────────┼─────┼┄┄┄┄┄┄┄┄┄┄┄┄┄┄
+ // @indent @indent │
+ // │ 2
+ thing.can_do_with(out) // │
+ })}; // ├┄┄┄┄┄┄┄┄┄┄┄┄┄┄
+ //↑↑↑ │ 1
+} //╰┼┴──────────────────────────────────────────────┴┄┄┄┄┄┄┄┄┄┄┄┄┄┄
+// 3x @outdent
+```
+
+```scm
+((block) @indent)
+["}" ")"] @outdent
+```
+
+Note how on the second line, two blocks begin on the same line. In this
+case, since both captures occur on the same line, they are combined and only
+result in a net increase of 1. Also note that the closing `}`s are part of the
+`@indent` captures, but the 3 `@outdent`s also combine into 1 and result in that
+line losing one indent level.
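The walk and the same-line combination described above can be sketched in Rust. This is a simplified model under stated assumptions, not Helix's actual implementation: `Node`, `Capture`, and `indent_level` are hypothetical names, nodes are stored in a flat slice with parent indices, and scopes are ignored.

```rust
use std::collections::BTreeMap;

/// Hypothetical model of the two non-stacking capture types.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Capture {
    Indent,
    Outdent,
}

/// Hypothetical, simplified syntax node holding only what the walk needs.
struct Node {
    parent: Option<usize>,    // index of the parent node, `None` for the root
    start_line: usize,        // line on which the node begins
    capture: Option<Capture>, // capture assigned by the query, if any
}

/// Walk from `start` up to the root and combine the captures found.
/// Assumption: scopes are ignored in this sketch.
fn indent_level(nodes: &[Node], start: usize) -> i32 {
    // For each line, remember whether an indent and/or an outdent begins there.
    let mut by_line: BTreeMap<usize, (bool, bool)> = BTreeMap::new();
    let mut current = Some(start);
    while let Some(idx) = current {
        let node = &nodes[idx];
        if let Some(capture) = node.capture {
            let entry = by_line.entry(node.start_line).or_insert((false, false));
            match capture {
                Capture::Indent => entry.0 = true,
                Capture::Outdent => entry.1 = true,
            }
        }
        current = node.parent;
    }
    // Each line contributes at most one level, and an indent plus an
    // outdent on the same line cancel out.
    let mut level = 0;
    for &(indent, outdent) in by_line.values() {
        level += match (indent, outdent) {
            (true, false) => 1,
            (false, true) => -1,
            _ => 0,
        };
    }
    level
}
```

Modelling the `shout` example with one block on the first line and two blocks on the second, a node inside the inner closure ends up at level 2, and the closing line with its `@outdent` captures ends up at level 1.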
+
+#### `@extend` / `@extend.prevent-once`
+
+For an example of where `@extend` can be useful, consider Python, which is
+whitespace-sensitive.
+
+```scm
+[
+ (parenthesized_expression)
+ (function_definition)
+ (class_definition)
+] @indent
+
+```
+
+```python
+class Hero:
+ def __init__(self, strong, fast, sure, soon):# ←─╮
+ self.is_strong = strong # │
+ self.is_fast = fast # ╭─── query start │
+ self.is_sure = sure # │ ╭─ cursor │
+ self.is_soon = soon # │ │ │
+ # ↑ ↑ │ │ │
+ # │ ╰──────╯ │ │
+ # ╰─────────────────────╯ │
+ # ├─ traversal
+ def need_hero(self, life): # │ start node
+ return ( # │
+ self.is_strong # │
+ and self.is_fast # │
+ and self.is_sure # │
+ and self.is_soon # │
+ and self > life # │
+ ) # ←─────────────────────────────────────────╯
+```
+
+Without braces to delimit the function, the smallest node containing the cursor
+at the line feed ends up being the entire body of the class. Because of this,
+the traversal misses the function node and its indent capture, leading to an
+indent level one too small.
+
+To address this case, `@extend` tells Helix to "extend" the captured node's span
+to the line feed and every consecutive line that has a greater indent level than
+the line of the node.
+
+```scm
+(parenthesized_expression) @indent
+
+[
+ (function_definition)
+ (class_definition)
+] @indent @extend
+
+```
+
+```python
+class Hero:
+ def __init__(self, strong, fast, sure, soon):# ←─╮
+ self.is_strong = strong # │
+ self.is_fast = fast # ╭─── query start ├─ traversal
+ self.is_sure = sure # │ ╭─ cursor │ start node
+ self.is_soon = soon # │ │ ←───────────────╯
+ # ↑ ↑ │ │
+ # │ ╰──────╯ │
+ # ╰─────────────────────╯
+ def need_hero(self, life):
+ return (
+ self.is_strong
+ and self.is_fast
+ and self.is_sure
+ and self.is_soon
+ and self > life
+ )
+```
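The extension rule can be sketched at line granularity in Rust. This is an illustrative model only: `leading_spaces` and `extended_end_line` are hypothetical helpers, indentation is assumed to use spaces, and a blank line is assumed to stop the extension, which the text above does not specify.

```rust
/// Number of leading spaces on a line (tabs are not handled in this sketch).
fn leading_spaces(line: &str) -> usize {
    line.chars().take_while(|c| *c == ' ').count()
}

/// Last line covered by a node after a hypothetical `@extend`.
/// `start_line` and `end_line` are the node's original first and last lines.
fn extended_end_line(lines: &[&str], start_line: usize, end_line: usize) -> usize {
    let base = leading_spaces(lines[start_line]);
    let mut end = end_line;
    // Absorb following lines while they are indented deeper than the
    // line the node starts on.
    while end + 1 < lines.len() {
        let next = lines[end + 1];
        // Assumption: a blank line or a line at the same or a shallower
        // indent ends the extension.
        if next.trim().is_empty() || leading_spaces(next) <= base {
            break;
        }
        end += 1;
    }
    end
}
```

For a method that starts at an indent of four spaces, every following line indented by more than four spaces is absorbed, and the next `def` at four spaces ends the extension.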
+
+Furthermore, there are some cases where extending to everything with a greater
+indent level may not be desirable. Consider the `need_hero` function above, with
+our cursor on the last line of the returned expression.
+```python
+class Hero:
+ def __init__(self, strong, fast, sure, soon):
+ self.is_strong = strong
+ self.is_fast = fast
+ self.is_sure = sure
+ self.is_soon = soon
+
+ def need_hero(self, life):
+ return (
+ self.is_strong
+ and self.is_fast
+ and self.is_sure
+ and self.is_soon
+ and self > life
+ ) # ←─── cursor
+ #←────────── where cursor should go on new line
+```
+
+In Python, there are a few tokens that will always end a scope, such as a return
+statement. Since the scope ends, so should the indent level. But because the
+function span is extended to every line with a greater indent level, a new line
+would just continue on the same level. And an `@outdent` would not help us here
+either, since it would cause everything in the parentheses to become outdented
+as well.
+
+To help, we need to signal an end to the extension. We can do this with
+`@extend.prevent-once`.
+
+```scm
+(parenthesized_expression) @indent
+
+[
+ (function_definition)
+ (class_definition)
+] @indent @extend
+
+(return_statement) @extend.prevent-once
+```
+
+#### `@indent.always` / `@outdent.always`
+
+As mentioned before, normally if there is more than one `@indent` or `@outdent`
+capture on the same line, they are combined.
+
+Sometimes, there are cases when you may want to ensure that every indent capture
+is additive, regardless of how many occur on the same line. Consider this
+example in YAML.
+
+```yaml
+ - foo: bar
+# ↑ ↑
+# │ ╰─────────────── start of map
+# ╰───────────────── start of list element
+ baz: quux # ←─── cursor
+ # ←───────────── where the cursor should go on a new line
+ garply: waldo
+ - quux:
+ bar: baz
+ xyzzy: thud
+ fred: plugh
+```
+
+In YAML, you often have lists of maps. In these cases, the syntax is such that
+the list element and the map both start on the same line. But we really do want
+to start an indentation for each of these so that subsequent keys in the map
+hang over the list and align properly. This is where `@indent.always` helps.
+
+```scm
+((block_sequence_item) @item @indent.always @extend
+ (#not-one-line? @item))
+
+((block_mapping_pair
+ key: (_) @key
+ value: (_) @val
+ (#not-same-line? @key @val)
+ ) @indent.always @extend
+)
+```
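The difference between the stacking and non-stacking captures can be sketched as a per-line tally in Rust. The `LineCaptures` struct and its `net_indent` method are hypothetical names for illustration, and the treatment of a plain `@outdent` next to an `@outdent.always` is an assumption based on "the same rules apply".

```rust
/// Hypothetical tally of the captures that begin on a single line.
#[derive(Default, Debug, Clone, Copy)]
struct LineCaptures {
    indent: u32,
    outdent: u32,
    indent_always: u32,
    outdent_always: u32,
}

impl LineCaptures {
    /// Net indent change contributed by this line.
    fn net_indent(&self) -> i32 {
        // `.always` captures stack; plain captures count at most once and
        // are ignored when an `.always` capture of the same direction exists.
        let indent = if self.indent_always > 0 {
            self.indent_always
        } else {
            self.indent.min(1)
        };
        let outdent = if self.outdent_always > 0 {
            self.outdent_always
        } else {
            self.outdent.min(1)
        };
        indent as i32 - outdent as i32
    }
}
```

In the YAML example, the list item and the map both start on the `- foo: bar` line, so two `@indent.always` captures give a net change of 2, whereas two plain `@indent` captures would give only 1.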
## Predicates
In some cases, an S-expression cannot express exactly what pattern should be matched.
For that, tree-sitter allows for predicates to appear anywhere within a pattern,
similar to how `#set!` declarations work:
+
```scm
(some_kind
(child_kind) @indent
(#predicate? arg1 arg2 ...)
)
```
+
The number of arguments depends on the predicate that's used.
Each argument is either a capture (`@name`) or a string (`"some string"`).
The following predicates are supported by tree-sitter:
@@ -91,3 +308,47 @@ argument (a string).
- `#same-line?`/`#not-same-line?`:
The captures given by the 2 arguments must/must not start on the same line.
+
+- `#one-line?`/`#not-one-line?`:
+The capture given by the first argument must/must not span a total of one line.
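As a rough illustration of what the line-based predicates check, here is a small Rust sketch. The `Span` type and both functions are hypothetical and only model the row comparison, with rows assumed to be 0-based and inclusive.

```rust
/// Hypothetical row span of a captured node (0-based rows, inclusive).
#[derive(Clone, Copy, Debug)]
struct Span {
    start_row: usize,
    end_row: usize,
}

/// Models `#same-line?`: both captures start on the same row.
fn same_line(a: Span, b: Span) -> bool {
    a.start_row == b.start_row
}

/// Models `#one-line?`: the capture starts and ends on the same row.
fn one_line(a: Span) -> bool {
    a.start_row == a.end_row
}
```

The `#not-same-line?` and `#not-one-line?` forms are simply the negations of these checks.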
+
+### Scopes
+
+Added indents don't always apply to the whole node. For example, in most
+cases when a node should be indented, we actually only want everything
+except for its first line to be indented. For this, there are several
+scopes (more scopes may be added in the future if required):
+
+- `tail`:
+This scope applies to everything except for the first line of the
+captured node.
+- `all`:
+This scope applies to the whole captured node. This is only different from
+`tail` when the captured node is the first node on its line.
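Read at line granularity, the two scopes can be sketched in Rust as follows. The `Scope` enum and `applies_to_line` are hypothetical names, and the sketch deliberately ignores the finer point about whether the captured node is the first node on its line.

```rust
/// Hypothetical model of the two scopes.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Scope {
    Tail,
    All,
}

/// Whether a capture on a node spanning `start_line..=end_line`
/// affects the given `line`.
fn applies_to_line(scope: Scope, start_line: usize, end_line: usize, line: usize) -> bool {
    let first = match scope {
        // `tail` skips the node's first line.
        Scope::Tail => start_line + 1,
        // `all` covers the node from its first line on.
        Scope::All => start_line,
    };
    (first..=end_line).contains(&line)
}
```

For a node spanning lines 0 to 4, `Scope::Tail` covers lines 1 to 4 while `Scope::All` also covers line 0. A single-line node is covered only under `Scope::All`.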
+
+For example, imagine we have the following function:
+
+```rust
+fn aha() { // ←─────────────────────────────────────╮
+ let take = "on me"; // ←──────────────╮ scope: │
+ let take = "me on"; // ├─ "tail" ├─ (block) @indent
+ let ill = be_gone_days(1 || 2); // │ │
+} // ←───────────────────────────────────┴──────────┴─ "}" @outdent
+ // scope: "all"
+```
+
+We can write the following query with the `#set!` declaration:
+
+ ```scm
+ ((block) @indent
+ (#set! "scope" "tail"))
+ ("}" @outdent
+ (#set! "scope" "all"))
+ ```
+
+As we can see, the "tail" scope covers the node, except for the first line.
+Everything up to and including the closing brace gets an indent level of 1.
+Then, on the closing brace, we encounter an outdent with a scope of "all", which
+means the first line is included, and the indent level is cancelled out on this
+line. (Note these scopes are the defaults for `@indent` and `@outdent`; they are
+written explicitly for demonstration.)
\ No newline at end of file