# Adding indent queries

Helix uses tree-sitter to correctly indent new lines. This requires a
tree-sitter grammar and an `indents.scm` query file placed in
`runtime/queries/{language}/indents.scm`. The indentation for a line is
calculated by traversing the syntax tree from the lowest node at the beginning
of the new line (see [Indent queries](#indent-queries)). Each of these nodes
contributes to the total indent when it is captured by the query (in what way
depends on the name of the capture).

Note that it matters where these added indents begin. For example,
multiple indent level increases that start on the same line only increase
the total indent level by 1. See [Capture types](#capture-types).

By default, Helix uses the `hybrid` indentation heuristic. This means that
indent queries are not used to compute the expected absolute indentation of a
line but rather the expected difference in indentation between the new and an
already existing line. This difference is then added to the actual indentation
of the already existing line. Since this makes errors in the indent queries
harder to find, it is recommended to disable it when testing via
`:set indent-heuristic tree-sitter`. The rest of this guide assumes that
the `tree-sitter` heuristic is used.
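
The difference between the two heuristics can be illustrated with a small
Python sketch. The function names and the clamp at zero are assumptions made
for illustration; they are not taken from Helix's source.

```python
def tree_sitter_indent(query_new: int) -> int:
    """`tree-sitter` heuristic: use the level computed by the queries as-is."""
    return query_new


def hybrid_indent(actual_existing: int, query_existing: int, query_new: int) -> int:
    """`hybrid` heuristic: apply the query-computed *difference* to the
    indentation that an existing reference line really has."""
    difference = query_new - query_existing
    # Clamping at zero is an assumption of this sketch.
    return max(0, actual_existing + difference)


# The reference line is really indented 3 levels, although the queries
# would have put it at level 1. The queries put the new line at level 2.
print(tree_sitter_indent(2))   # absolute level from the queries
print(hybrid_indent(3, 1, 2))  # reference level plus the difference
```

A query error that misjudges both lines by the same amount cancels out under
`hybrid`, which is why the `tree-sitter` heuristic is the more revealing choice
while debugging queries.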

## Indent queries

When Helix is inserting a new line through `o`, `O`, or `<ret>`, to determine
the indent level for the new line, the query in `indents.scm` is run on the
document. The starting position of the query is the end of the line above where
a new line will be inserted.

For `o`, the inserted line is the line below the cursor, so the starting
position of the query is the end of the current line.

```rust
fn need_hero(some_hero: Hero, life: Life) -> {
    matches!(some_hero, Hero { // ←─────────────────╮
        strong: true,//←╮  ↑  ↑                     │
        fast: true,  // │  │  ╰── query start       │
        sure: true,  // │  ╰───── cursor            ├─ traversal 
        soon: true,  // ╰──────── new line inserted │  start node
    }) &&            //                             │
//  ↑                                               │
//  ╰───────────────────────────────────────────────╯
    some_hero > life
}
```

For `O`, the newly inserted line is the *current* line, so the starting position
of the query is the end of the line above the cursor.

```rust
fn need_hero(some_hero: Hero, life: Life) -> { // ←─╮
    matches!(some_hero, Hero { // ←╮          ↑     │
        strong: true,//    ↑   ╭───╯          │     │
        fast: true,  //    │   │ query start ─╯     │
        sure: true,  //    ╰───┼ cursor             ├─ traversal
        soon: true,  //        ╰ new line inserted  │  start node
    }) &&            //                             │
    some_hero > life //                             │
} // ←──────────────────────────────────────────────╯
```
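
The two cases above can be summarized in a short Python sketch.
`query_start_line` is a hypothetical helper, lines are numbered from 0, and
returning `None` for `O` on the first line is an assumption, since this guide
does not describe that case.

```python
from typing import Optional


def query_start_line(key: str, cursor_line: int) -> Optional[int]:
    """Return the line whose end is used as the query start position."""
    if key == "o":
        # The new line goes below the cursor: start at the current line.
        return cursor_line
    if key == "O":
        # The new line takes the cursor's place: start at the line above.
        return cursor_line - 1 if cursor_line > 0 else None
    raise ValueError(f"unsupported key: {key!r}")


print(query_start_line("o", 3))
print(query_start_line("O", 3))
```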

From this starting node, the syntax tree is traversed up until the root node.
The indent captures are collected along the way and then combined, according to
their [capture types](#capture-types) and [scopes](#scopes), into a final indent
level for the line.

### Capture types

- `@indent` (default scope `tail`):
  Increase the indent level by 1. Multiple occurrences on the same line *do not*
  stack. If there is at least one `@indent` and one `@outdent` capture on the
  same line, the indent level isn't changed at all.
- `@outdent` (default scope `all`):
  Decrease the indent level by 1. The same rules as for `@indent` apply.
- `@indent.always` (default scope `tail`):
  Increase the indent level by 1. Multiple occurrences on the same line *do*
  stack. The final indent level is the number of `@indent.always` captures
  minus the number of `@outdent.always` captures. If an `@indent` and an
  `@indent.always` are on the same line, the `@indent` is ignored.
- `@outdent.always` (default scope `all`):
  Decrease the indent level by 1. The same rules as for `@indent.always` apply.
- `@align` (default scope `all`):
  Align everything inside this node to some anchor. The anchor is given
  by the start of the node captured by `@anchor` in the same pattern.
  Every pattern with an `@align` should contain exactly one `@anchor`.
  Indent (and outdent) for nodes below (in terms of their starting line)
  the `@align` node is added to the indentation required for alignment.
- `@extend`:
  Extend the range of this node to the end of the line and to lines that are
  indented more than the line that this node starts on. This is useful for
  languages like Python, where for the purpose of indentation some nodes (like
  functions or classes) should also contain indented lines that follow them.
- `@extend.prevent-once`:
  Prevents the first extension of an ancestor of this node. For example, in Python
  a return expression always ends the block that it is in. Note that this only
  stops the extension of the next `@extend` capture. If multiple ancestors are
  captured, only the extension of the innermost one is prevented. All other
  ancestors are unaffected (regardless of whether the innermost ancestor would
  actually have been extended).
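
As a rough illustration, the combination rules for the four indent/outdent
captures can be modelled in Python. This is a minimal sketch of the rules as
listed above, not Helix's implementation: `line_delta` is a hypothetical name,
captures are written without the leading `@`, and `@align` and `@extend` are
not modelled.

```python
from collections import Counter


def line_delta(captures):
    """Net indent change from all captures that apply to the same line."""
    n = Counter(captures)
    # `.always` captures stack. A plain capture counts at most once and is
    # ignored when its `.always` counterpart is present on the same line.
    indent = n["indent.always"] or min(n["indent"], 1)
    outdent = n["outdent.always"] or min(n["outdent"], 1)
    return indent - outdent


print(line_delta(["indent", "indent"]))                # plain captures do not stack
print(line_delta(["indent", "outdent"]))               # they cancel each other
print(line_delta(["indent.always", "indent.always"]))  # `.always` stacks
print(line_delta(["indent", "indent.always"]))         # plain `@indent` ignored
```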

#### `@indent` / `@outdent`

Consider this example:

```rust
fn shout(things: Vec<Thing>) {
    //                       ↑
    //                       ├───────────────────────╮ indent level
    //                    @indent                    ├┄┄┄┄┄┄┄┄┄┄┄┄┄┄
    //                                               │
    let it_all = |out| { things.filter(|thing| { //  │      1
    //                 ↑                       ↑     │
    //                 ├───────────────────────┼─────┼┄┄┄┄┄┄┄┄┄┄┄┄┄┄
    //              @indent                 @indent  │
    //                                               │      2
        thing.can_do_with(out) //                    │
    })}; //                                          ├┄┄┄┄┄┄┄┄┄┄┄┄┄┄
  //↑↑↑                                              │      1
} //╰┼┴──────────────────────────────────────────────┴┄┄┄┄┄┄┄┄┄┄┄┄┄┄
// 3x @outdent
```

```scm
((block) @indent)
["}" ")"] @outdent
```

Note how two blocks begin on the second line. Since both captures occur on the
same line, they are combined and only result in a net increase of 1. Also note
that the closing `}`s are part of the `@indent` captures, but the 3 `@outdent`s
also combine into 1 and result in that line losing one indent level.
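
The indent levels in the diagram can be reproduced by grouping the captures by
the line they start on and summing one contribution per line. The following
Python sketch does this for the example; `net_level` and the dictionary layout
are illustrative assumptions, and only plain `@indent`/`@outdent` captures are
handled.

```python
def net_level(captures_by_line):
    """Sum per-line contributions; plain captures on one line count once."""
    level = 0
    for captures in captures_by_line.values():
        level += min(captures.count("indent"), 1)
        level -= min(captures.count("outdent"), 1)
    return level


# Captures that apply to the line `thing.can_do_with(out)`, keyed by the
# line on which each captured block starts.
body_line = {
    "fn shout": ["indent"],
    "let it_all": ["indent", "indent"],
}

# The closing line `})};` additionally receives its own three `@outdent`s.
closing_line = {
    "fn shout": ["indent"],
    "let it_all": ["indent", "indent"],
    "closing": ["outdent", "outdent", "outdent"],
}

print(net_level(body_line))     # 2
print(net_level(closing_line))  # 1
```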

#### `@extend` / `@extend.prevent-once`

For an example of where `@extend` can be useful, consider Python, which is
whitespace-sensitive.

```scm
[
  (parenthesized_expression)
  (function_definition)
  (class_definition)
] @indent

```

```python
class Hero:
    def __init__(self, strong, fast, sure, soon):#  ←─╮
        self.is_strong = strong #                     │
        self.is_fast = fast     # ╭─── query start    │
        self.is_sure = sure     # │ ╭─ cursor         │
        self.is_soon = soon     # │ │                 │
        #     ↑            ↑      │ │                 │
        #     │            ╰──────╯ │                 │
        #     ╰─────────────────────╯                 │
        #                                             ├─ traversal
    def need_hero(self, life):         #              │  start node
        return (                       #              │
            self.is_strong             #              │
            and self.is_fast           #              │
            and self.is_sure           #              │
            and self.is_soon           #              │
            and self > life            #              │
        ) # ←─────────────────────────────────────────╯
```

Without braces to catch the scope of the function, the smallest descendant of
the cursor on a line feed ends up being the entire inside of the class. Because
of this, it will miss the entire function node and its indent capture, leading
to an indent level one too small.

To address this case, `@extend` tells Helix to "extend" the captured node's span
to the line feed and every consecutive line that has a greater indent level than
the line of the node.
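
The extension rule can be approximated in a few lines of Python. This is a
simplified model of the rule as worded above, not Helix's implementation: the
helper names are hypothetical, indentation is assumed to use spaces only, and
stopping at a blank line is an assumption of the sketch.

```python
def indent_width(line: str) -> int:
    """Number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def extended_end_line(lines, start_line, end_line):
    """Last line (0-based) covered by a node once `@extend` is applied."""
    base = indent_width(lines[start_line])
    last = end_line
    for i in range(end_line + 1, len(lines)):
        # Stop at the first line that is blank or not indented more than
        # the line the node starts on.
        if not lines[i].strip() or indent_width(lines[i]) <= base:
            break
        last = i
    return last


source = [
    "class Hero:",
    "    def __init__(self, strong):",
    "        self.is_strong = strong",
    "        self.is_fast = True",
    "    def need_hero(self, life):",
    "        return self > life",
]

# Suppose, hypothetically, that the parser ends the `__init__` node on line 2.
print(extended_end_line(source, 1, 2))  # 3
```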

```scm
(parenthesized_expression) @indent

[
  (function_definition)
  (class_definition)
] @indent @extend

```

```python
class Hero:
    def __init__(self, strong, fast, sure, soon):#  ←─╮
        self.is_strong = strong #                     │
        self.is_fast = fast     # ╭─── query start    ├─ traversal
        self.is_sure = sure     # │ ╭─ cursor         │  start node
        self.is_soon = soon     # │ │ ←───────────────╯
        #     ↑            ↑      │ │                 
        #     │            ╰──────╯ │
        #     ╰─────────────────────╯
    def need_hero(self, life):
        return (
            self.is_strong
            and self.is_fast
            and self.is_sure
            and self.is_soon
            and self > life
        )
```

Furthermore, there are some cases where extending to everything with a greater
indent level may not be desirable. Consider the `need_hero` function above, with
the cursor on the last line of the returned expression:

```python
class Hero:
    def __init__(self, strong, fast, sure, soon):
        self.is_strong = strong
        self.is_fast = fast
        self.is_sure = sure
        self.is_soon = soon

    def need_hero(self, life):
        return (
            self.is_strong
            and self.is_fast
            and self.is_sure
            and self.is_soon
            and self > life
        ) # ←─── cursor
    #←────────── where cursor should go on new line
```

In Python, there are a few tokens that will always end a scope, such as a return
statement. Since the scope ends, so should the indent level. But because the
function span is extended to every line with a greater indent level, a new line
would just continue on the same level. And an `@outdent` would not help us here
either, since it would cause everything in the parentheses to become outdented
as well.

To help, we need to signal an end to the extension. We can do this with
`@extend.prevent-once`.

```scm
(parenthesized_expression) @indent

[
  (function_definition)
  (class_definition)
] @indent @extend

(return_statement) @extend.prevent-once
```
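
The effect of `@extend.prevent-once` on a chain of ancestors can be sketched in
Python as follows. `extended_nodes` is a hypothetical helper that only models
the "innermost extension is prevented" rule from the capture list; whether an
extension would actually take place is not modelled.

```python
def extended_nodes(ancestors):
    """Return the node kinds that are still extended.

    `ancestors` is a list of (kind, captures) pairs, innermost node first.
    """
    extended = []
    prevent_next = False
    for kind, captures in ancestors:
        if "extend" in captures:
            if prevent_next:
                # The prevention is used up; outer ancestors are unaffected.
                prevent_next = False
            else:
                extended.append(kind)
        if "extend.prevent-once" in captures:
            prevent_next = True
    return extended


after_return = [
    ("return_statement", {"extend.prevent-once"}),
    ("function_definition", {"indent", "extend"}),
    ("class_definition", {"indent", "extend"}),
]
print(extended_nodes(after_return))  # ['class_definition']
```

With only the class extended, a new line after the return statement picks up
one indent level from the class and lands at the level of the method
definitions.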

#### `@indent.always` / `@outdent.always`

As mentioned before, normally if there is more than one `@indent` or `@outdent`
capture on the same line, they are combined.

Sometimes, there are cases when you may want to ensure that every indent capture
is additive, regardless of how many occur on the same line. Consider this
example in YAML.

```yaml
  - foo: bar
# ↑ ↑
# │ ╰─────────────── start of map
# ╰───────────────── start of list element
    baz: quux # ←─── cursor
    # ←───────────── where the cursor should go on a new line
    garply: waldo
  - quux:
      bar: baz
    xyzzy: thud
    fred: plugh
```

In YAML, you often have lists of maps. In these cases, the syntax is such that
the list element and the map both start on the same line. But we really do want
to start an indentation for each of these so that subsequent keys in the map
hang over the list and align properly. This is where `@indent.always` helps.

```scm
((block_sequence_item) @item @indent.always @extend
  (#not-one-line? @item))

((block_mapping_pair
    key: (_) @key
    value: (_) @val
    (#not-same-line? @key @val)
  ) @indent.always @extend
)
```

## Predicates

In some cases, an S-expression cannot express exactly what pattern should be matched.
For that, tree-sitter allows for predicates to appear anywhere within a pattern,
similar to how `#set!` declarations work:

```scm
(some_kind
  (child_kind) @indent
  (#predicate? arg1 arg2 ...)
)
```

The number of arguments depends on the predicate that's used.
Each argument is either a capture (`@name`) or a string (`"some string"`).
The following predicates are supported by tree-sitter:

- `#eq?`/`#not-eq?`:
The first argument (a capture) must/must not be equal to the second argument
(a capture or a string).

- `#match?`/`#not-match?`:
The first argument (a capture) must/must not match the regex given in the
second argument (a string).

- `#any-of?`/`#not-any-of?`:
The first argument (a capture) must/must not be one of the other arguments
(strings).

Additionally, we support some custom predicates for indent queries:

- `#not-kind-eq?`:
The kind of the first argument (a capture) must not be equal to the second
argument (a string).

- `#same-line?`/`#not-same-line?`:
The captures given by the 2 arguments must/must not start on the same line.

- `#one-line?`/`#not-one-line?`:
The capture given by the first argument must/must not span a total of one line.
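
The line-based custom predicates only compare line numbers of the captured
nodes. A minimal Python sketch, assuming a hypothetical `Span` type that stores
the first and last line of a capture:

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Hypothetical stand-in for a captured node's line range."""
    start_line: int
    end_line: int


def same_line(a: Span, b: Span) -> bool:
    """`#same-line?`: both captures start on the same line."""
    return a.start_line == b.start_line


def one_line(a: Span) -> bool:
    """`#one-line?`: the capture starts and ends on the same line."""
    return a.start_line == a.end_line


key = Span(start_line=4, end_line=4)
value = Span(start_line=5, end_line=7)
print(same_line(key, value))  # False, so `#not-same-line?` would match
print(one_line(value))        # False, so `#not-one-line?` would match
```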

### Scopes

Added indents don't always apply to the whole node. For example, in most
cases when a node should be indented, we actually only want everything
except for its first line to be indented. For this, there are several
scopes (more scopes may be added in the future if required):

- `tail`:
This scope applies to everything except for the first line of the
captured node.
- `all`:
This scope applies to the whole captured node. This is only different from
`tail` when the captured node is the first node on its line.

For example, imagine we have the following function:

```rust
fn aha() { // ←─────────────────────────────────────╮
  let take = "on me";  // ←──────────────╮  scope:  │
  let take = "me on";             //     ├─ "tail"  ├─ (block) @indent
  let ill = be_gone_days(1 || 2); //     │          │
} // ←───────────────────────────────────┴──────────┴─ "}" @outdent
                                         //                scope: "all"
```

We can write the following query with the `#set!` declaration:

  ```scm
  ((block) @indent
   (#set! "scope" "tail"))
  ("}" @outdent
   (#set! "scope" "all"))
  ```

As we can see, the "tail" scope covers the node, except for the first line.
Everything up to and including the closing brace gets an indent level of 1.
Then, on the closing brace, we encounter an outdent with a scope of "all", which
means the first line is included, and the indent level is cancelled out on this
line. (Note that these scopes are the defaults for `@indent` and `@outdent`;
they are written explicitly for demonstration.)
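
The same reasoning can be checked line by line with a small Python sketch. The
`applies` helper and its parameters are assumptions made for illustration;
lines are numbered from 0, so the function above spans lines 0 to 4.

```python
def applies(scope, line, node_start, node_end, first_on_line=True):
    """Does a capture with the given scope affect `line`?"""
    if not node_start <= line <= node_end:
        return False
    if line > node_start:
        return True  # both scopes cover every line after the first
    # On the node's first line, only `all` applies, and only when the
    # node is the first node on that line.
    return scope == "all" and first_on_line


def level(line):
    # (block) @indent, scope "tail": lines 0-4, not first on its line.
    indent = applies("tail", line, 0, 4, first_on_line=False)
    # "}" @outdent, scope "all": line 4 only, first on its line.
    outdent = applies("all", line, 4, 4, first_on_line=True)
    return int(indent) - int(outdent)


print([level(line) for line in range(5)])  # [0, 1, 1, 1, 0]
```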