path: root/helix-core/src/syntax.rs
Commit message, author, age
* Softwrapping improvements (#5893)Clément Delafargue2023-03-08
| * use max_line_width + 1 during softwrap to account for newline char
|
|   Helix's softwrap implementation always wraps lines so that the
|   newline character doesn't get cut off, so the line wraps one char
|   earlier than in other editors. This is necessary because newline
|   chars are always selectable in Helix and must never be hidden.
|
|   However, that means that `max_line_width` currently wraps one char
|   earlier than expected. The typical definition of line width does not
|   include the newline character, and other Helix commands like
|   `:reflow` also don't count the newline character here.
|
|   This commit makes softwrap use `max_line_width + 1` instead of
|   `max_line_width` to correct the impedance mismatch.
|
| * fix typos
|
|   Co-authored-by: Jonathan Lebon <jonathan@jlebon.com>
|
| * Add text-width to config.toml
| * text-width: update setting documentation
| * rename leftover config item
| * remove leftover max-line-length occurrences
| * Make `text-width` optional in editor config
|
|   When it was only used for `:reflow` it made sense to have a default
|   value set to `80`, but now that soft-wrapping uses this setting,
|   keeping a default set to `80` would make soft-wrapping behave more
|   aggressively.
|
| * Allow softwrapping to ignore `text-width`
|
|   Softwrapping wraps by default to the viewport width or a configured
|   `text-width` (whichever is smaller). In some cases we only want to
|   set `text-width` to use for hard-wrapping and let longer lines flow
|   if they have enough space. This setting allows that.
|
| * Revert "Make `text-width` optional in editor config"
|
|   This reverts commit b247d526d69adf41434b6fd9c4983369c785aa22.
|
| * soft-wrap: allow per-language overrides
| * Update book/src/configuration.md
|
|   Co-authored-by: Pascal Kuthe <pascal.kuthe@semimod.de>
|
| * Update book/src/languages.md
|
|   Co-authored-by: Pascal Kuthe <pascal.kuthe@semimod.de>
|
| * Update book/src/configuration.md
|
|   Co-authored-by: Pascal Kuthe <pascal.kuthe@semimod.de>
|
| ---------
|
| Co-authored-by: Pascal Kuthe <pascal.kuthe@semimod.de>
| Co-authored-by: Jonathan Lebon <jonathan@jlebon.com>
| Co-authored-by: Alex Boehm <alexb@ozrunways.com>
| Co-authored-by: Blaž Hrastnik <blaz@mxxn.io>
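The width rule described in the softwrap commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `softwrap_width` and the `ignore_text_width` flag are hypothetical, and it assumes the extra column for the newline is applied only to the `text-width` limit.

```rust
/// Column at which soft-wrapping would break a line.
///
/// Hypothetical helper: the name, parameters and the `ignore_text_width`
/// flag are illustrative and not taken from the Helix source.
fn softwrap_width(viewport_width: u16, text_width: u16, ignore_text_width: bool) -> u16 {
    if ignore_text_width {
        // Only the viewport limits the line; `text-width` is left to hard-wrapping.
        return viewport_width;
    }
    // Reserve one extra column so the always-selectable newline character
    // does not count against the configured width.
    let limit = text_width.saturating_add(1);
    // Wrap at whichever limit is smaller.
    viewport_width.min(limit)
}
```

With an 80 column `text-width` and a wide viewport, the sketch wraps after 81 columns, so 80 visible characters plus the newline fit on one row.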
* Fix new clippy lints (#5892)Pascal Kuthe2023-02-09
|
* Fix initial highlight layer sort order (#5196)Michael Davis2023-02-01
| The purpose of this change is to remove the mutable self borrow on
| `HighlightIterLayer::sort_key` so that we can sort layers with the
| correct ordering using the `Vec::sort` function family.
|
| `HighlightIterLayer::sort_key` needs `&mut self` since it calls
| `Peekable::peek` which needs `&mut self`. `Vec::sort` functions only
| give immutable borrows of the elements to ensure the correctness of
| the sort.
|
| We could instead approach this by creating an eager Peekable and using
| that instead of `std::iter::Peekable` to wrap `QueryCaptures`:
|
| ```rust
| struct EagerPeekable<I: Iterator> {
|     iter: I,
|     peeked: Option<I::Item>,
| }
|
| impl<I: Iterator> EagerPeekable<I> {
|     fn new(mut iter: I) -> Self {
|         let peeked = iter.next();
|         Self { iter, peeked }
|     }
|
|     fn peek(&self) -> Option<&I::Item> {
|         self.peeked.as_ref()
|     }
| }
|
| impl<I: Iterator> Iterator for EagerPeekable<I> {
|     type Item = I::Item;
|
|     fn next(&mut self) -> Option<Self::Item> {
|         std::mem::replace(&mut self.peeked, self.iter.next())
|     }
| }
| ```
|
| This would be a cleaner approach (notice how `EagerPeekable::peek`
| takes `&self` rather than `&mut self`), however this doesn't work in
| practice because the Items emitted by the `tree_sitter::QueryCaptures`
| Iterator must be consumed before the next Item is returned.
| `Iterator::next` on `tree_sitter::QueryCaptures` modifies the
| `QueryMatch` returned by the last call of `next`. This behavior is not
| currently reflected in the lifetimes/structure of `QueryCaptures`.
|
| This fixes an issue with layers being out of order when using combined
| injections since the old code only checked the first range in the
| layer. Layers being out of order could cause missing highlights for
| combined-injections content.
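To illustrate why an immutable `peek` matters for sorting, here is a small sketch that uses the `EagerPeekable` type from the commit message above with ordinary iterators. The `sort_layers` helper is hypothetical, and, as the message itself explains, this approach does not carry over to `tree_sitter::QueryCaptures`.

```rust
// `EagerPeekable` as given in the commit message above.
struct EagerPeekable<I: Iterator> {
    iter: I,
    peeked: Option<I::Item>,
}

impl<I: Iterator> EagerPeekable<I> {
    fn new(mut iter: I) -> Self {
        // Pull the first item eagerly so `peek` never needs `&mut self`.
        let peeked = iter.next();
        Self { iter, peeked }
    }

    fn peek(&self) -> Option<&I::Item> {
        self.peeked.as_ref()
    }
}

impl<I: Iterator> Iterator for EagerPeekable<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        // Hand out the buffered item and buffer the following one.
        std::mem::replace(&mut self.peeked, self.iter.next())
    }
}

/// Hypothetical helper: order layers by their next pending item.
/// `sort_by_key` only passes `&T` to the key closure, which is enough
/// here because `peek` takes `&self`.
fn sort_layers<I>(layers: &mut [EagerPeekable<I>])
where
    I: Iterator,
    I::Item: Ord + Clone,
{
    layers.sort_by_key(|layer| layer.peek().cloned());
}
```

With `std::iter::Peekable` the same key closure would not compile, since `Peekable::peek` requires `&mut self`.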
* allow specifying environment for language servers in language.toml (#4004)TotalKrill2022-12-09
| Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
| Co-authored-by: Stephen Wakely <fungus.humungus@gmail.com>
| Co-authored-by: Blaž Hrastnik <blaz@mxxn.io>
* significantly improve treesitter performance while editing large files (#4716)Pascal Kuthe2022-11-22
| * significantly improve treesitter performance while editing large files
| * Apply stylistic suggestions from code review
|
|   Co-authored-by: Michael Davis <mcarsondavis@gmail.com>
|
| * use PartialEq and Hash instead of a freestanding function
|
|   Co-authored-by: Michael Davis <mcarsondavis@gmail.com>
* Bump TREE_SITTER_MATCH_LIMIT to 256 (#4830)Michael Davis2022-11-21
| The limit of 64 breaks some highlighting in Erlang files with
| complicated record definitions. Bumping to 256 seems to work on all
| files I have seen.
* Use TreeCursor to pretty-print :tree-sitter-subtree (#4606)Michael Davis2022-11-17
| The current `:tree-sitter-subtree` has a bug for field-names when the
| field name belongs to an unnamed child node. Take this ruby example:
|
|     def self.method_name
|       true
|     end
|
| The subtree given by tree-sitter-cli is:
|
|     (singleton_method [2, 0] - [4, 3]
|       object: (self [2, 4] - [2, 8])
|       name: (identifier [2, 9] - [2, 20])
|       body: (body_statement [3, 2] - [3, 6]
|         (true [3, 2] - [3, 6])))
|
| But the `:tree-sitter-subtree` output was
|
|     (singleton_method
|       object: (self)
|       body: (identifier)
|       (body_statement (true)))
|
| The `singleton_method` rule defines the `name` and `body` fields in an
| unnamed helper rule `_method_rest` and the old implementation of
| `pretty_print_tree_impl` would pass the `field_name` down from the
| named `singleton_method` node.
|
| To fix it we switch to the [TreeCursor] API which is recommended by
| the tree-sitter docs for traversing the tree. `TreeCursor::field_name`
| accurately determines the field name for the current cursor position
| even when the node is unnamed.
|
| [TreeCursor]: https://docs.rs/tree-sitter/0.20.9/tree_sitter/struct.TreeCursor.html
* improve performance of tree sitter query captures (for text object motions in particular) (#4707)Pascal Kuthe2022-11-11
| * add tree sitter match limit to avoid slowdowns for larger files
|
|   Affects all tree sitter queries and should speed up both syntax
|   highlighting and text object queries. This has been shown to fix
|   significant slowdowns with textobjects for rust files as small as
|   3k loc.
|
| * Apply suggestions from code review
|
|   Co-authored-by: Blaž Hrastnik <blaz@mxxn.io>
|
| Co-authored-by: Blaž Hrastnik <blaz@mxxn.io>
* Resolve a bunch of upcoming clippy lintsBlaž Hrastnik2022-11-04
|
* Change syntax for suffix file-types configurations (#4414)Michael Davis2022-10-22
| The change in d801a6693c3d475b3942f705d3ef48d7966bdf65 to search for
| suffixes in `file-types` is too permissive: files like the tutor or
| `*.txt` files are now mistakenly interpreted as R or perl,
| respectively.
|
| This commit changes the syntax for specifying a file-types entry that
| matches by suffix:
|
| ```toml
| file-types = [{ suffix = ".git/config" }]
| ```
|
| It also changes the file-type detection to first search for any
| non-suffix patterns and then search for suffixes only with the
| file-types entries marked explicitly as suffixes.
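The two-pass detection order described in the commit above might look something like the sketch below. The `FileType`, `Language` and `detect` names are hypothetical stand-ins, not the types used in `syntax.rs`, and the plain-entry pass is simplified to an exact file-name or extension comparison.

```rust
use std::path::Path;

/// Hypothetical model of a `file-types` entry.
enum FileType {
    /// A plain entry, matched against the file name or the extension.
    Extension(String),
    /// An entry written as `{ suffix = "..." }`, matched against the path end.
    Suffix(String),
}

/// Hypothetical, stripped-down language entry.
struct Language {
    name: &'static str,
    file_types: Vec<FileType>,
}

fn detect(languages: &[Language], path: &Path) -> Option<&'static str> {
    let file_name = path.file_name().and_then(|n| n.to_str());
    let extension = path.extension().and_then(|e| e.to_str());

    // Pass 1: only non-suffix entries are considered.
    let plain = languages.iter().find(|lang| {
        lang.file_types.iter().any(|ft| match ft {
            FileType::Extension(ext) => {
                file_name == Some(ext.as_str()) || extension == Some(ext.as_str())
            }
            FileType::Suffix(_) => false,
        })
    });
    if let Some(lang) = plain {
        return Some(lang.name);
    }

    // Pass 2: fall back to entries explicitly marked as suffixes.
    let path_str = path.to_str()?;
    languages
        .iter()
        .find(|lang| {
            lang.file_types
                .iter()
                .any(|ft| matches!(ft, FileType::Suffix(s) if path_str.ends_with(s.as_str())))
        })
        .map(|lang| lang.name)
}
```

Because plain entries are never compared with `ends_with`, a file named `tutor` no longer matches an `r` entry in this sketch.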
* syntax: Don't force lower-case for filenames (#4346)Christian Speich2022-10-21
| Just like for grammars we currently force a lower-case of the name for
| some actions (like filesystem lookup). To make this consistent and
| less surprising for users, we remove this lower-casing here.
|
| Note: it is still the preferred way to name both language and grammar
| in lower-case
|
| Signed-off-by: Christian Speich <cspeich@emlix.com>
* Allow using path suffixes to associate language file-types (#2455)midnightexigent2022-10-20
| * feat(syntax): add strategy to associate file to language through pattern
|
|   File path will match if it ends with any of the file types provided
|   in the config. Also used this feature to add support for the
|   .git/config and .ssh/config files
|
| * Add /etc/ssh/ssh_config to languages.toml
| * cargo xtask docgen
| * Update languages.md
| * Update languages.md
| * Update book/src/languages.md
|
|   Co-authored-by: Ivan Tham <pickfire@riseup.net>
|
| * Update book/src/languages.md
|
|   Co-authored-by: Ivan Tham <pickfire@riseup.net>
|
| Co-authored-by: Ivan Tham <pickfire@riseup.net>
* Merge pull request #2267 from dead10ck/fix-write-failBlaž Hrastnik2022-10-20
|\
| | Write path fixes
| * document should save even if formatter failsSkyler Hawthorne2022-10-19
| |
* | Pretty print `tree-sitter-subtree` expression (#4295)Fisher Darling2022-10-19
|/
* Log failures to load tree-sitter parsers as error (#4315)Michael Davis2022-10-16
| Info logs don't show up in the log file by default, but this line
| should: failures to load tree-sitter parser objects are useful errors.
| A parser might fail to load if it is misconfigured
| (https://github.com/helix-editor/helix/pull/4303#discussion_r996448543)
| or if the file does not exist.
* do not reparse unmodified injections (#4146)Pascal Kuthe2022-10-11
|
* fix: map_err()? instead of unwrap (#3826)Alexander Brevig2022-09-13
|
* Add query-check xtaskMichael Davis2022-08-31
|
* tree-sitter: Prevent panic on loading queriesMichael Davis2022-08-31
|
* tree-sitter: Refactor lazy query loadingMichael Davis2022-08-31
| The code for loading queries can be shared between indent and
| textobjects queries. In both cases we want to kick an error message
| out to the logs.
* Fix nondeterministic highlighting (#3275)A-Walrus2022-08-05
| * Fix nondeterministic highlighting
|
|   This is done by preferring matches at the beginning, i.e. for
|   `keyword.function`, `keyword` is a better match than `function`.
|
| * Use all positions and not just leftmost
|
|   Fixes a possible edge case with something like
|   `function.method.builtin` and the queries `function.builtin` and
|   `function.method`
|
| * Switch to bitmask for slightly better performance
| * Make matches from the start of string
|
|   Also change comments to match new behaviour
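The final behaviour listed above ("make matches from the start of string") can be sketched as a longest-prefix match over dot-separated scope parts. This is an illustrative simplification: `best_scope` is a hypothetical name, and the bitmask variant mentioned in the intermediate step is not reproduced here.

```rust
/// Pick the recognized scope that best matches a capture name.
///
/// Hypothetical helper: a recognized scope matches only if all of its
/// dot-separated parts equal the leading parts of the capture name, and
/// the scope with the most parts wins. Ties go to the earliest entry,
/// which keeps the result deterministic.
fn best_scope(capture_name: &str, recognized: &[&str]) -> Option<usize> {
    let capture_parts: Vec<&str> = capture_name.split('.').collect();
    let mut best: Option<usize> = None;
    let mut best_len = 0;

    for (index, scope) in recognized.iter().enumerate() {
        let mut len = 0;
        let mut matches = true;
        for (position, part) in scope.split('.').enumerate() {
            match capture_parts.get(position) {
                // Same part at the same position: the prefix still matches.
                Some(capture_part) if *capture_part == part => len += 1,
                // Different part, or the scope is longer than the capture.
                _ => {
                    matches = false;
                    break;
                }
            }
        }
        if matches && len > best_len {
            best = Some(index);
            best_len = len;
        }
    }
    best
}
```

Under this rule `keyword.function` resolves to `keyword` rather than `function`, and `function.method.builtin` resolves to `function.method` rather than `function.builtin`, independent of the order in which the scopes are listed.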
* Change default formatter for any language (#2942)PiergiorgioZagaria2022-08-04
| * Change default formatter for any language
| * Fix clippy error
| * Close stdin for Stdio formatters
| * Better indentation and pattern matching
| * Return Result<Option<...>> for fn format instead of Option
| * Remove unwrap for stdin
| * Handle FormatterErrors instead of Result<Option<...>>
| * Use Transaction instead of LspFormatting
| * Use Transaction directly in Document::format
| * Perform stdin type formatting asynchronously
| * Rename formatter.type values to kebab-case
| * Debug format for displaying io::ErrorKind (msrv fix)
| * Solve conflict?
| * Use only stdio type formatters
| * Remove FormatterType enum
| * Remove old comment
| * Check if the formatter exited correctly
| * Add formatter configuration to the book
| * Avoid allocations when writing to stdin and formatting errors
| * Remove unused import
|
| Co-authored-by: Gokul Soumya <gokulps15@gmail.com>
* Exclude only named children without injection.include-children (#3129)Matthias Deiml2022-08-03
| * Exclude only named children without injection.include-children
| * Add injection.include-unnamed-children parameter
* Replace '; inherits <language>' in treesitter queries with <language> queries instead of appending them (#2470)Philipp Mildenberger2022-07-22
| Co-authored-by: Blaž Hrastnik <blaz@mxxn.io>
* Refactor textobject node capture (#2741)Gokul Soumya2022-06-11
|
* Passing extra formatting options to LSPs (#2635)farwyler2022-06-05
| * allows passing extra formatting options to LSPs
|   - adds optional field 'format' to [[language]] sections in
|     'languages.toml'
|   - passes specified options to the LSPs via FormattingOptions
| * cleaner conversion of formatting properties
| * move formatting options inside lsp::Client
| * cleans up formatting properties merge
* Include macro attributes to impls, structs, enums, functions etc. textobjects (#2494)Andrey Tkachenko2022-05-20
|
* configurable lsp request timeout (#2405)EmmChriss2022-05-11
|
* feat(languages): git-ignore and git-attributes (#2397)Matthew Toohey2022-05-05
|
* add reflow command (#2128)Vince Mutolo2022-05-02
| * add reflow command
|
|   Users need to be able to hard-wrap text for many applications,
|   including comments in code, git commit messages, plaintext
|   documentation, etc. It often falls to the user to manually insert
|   line breaks where appropriate in order to hard-wrap text.
|
|   This commit introduces the "reflow" command (both in the TUI and
|   core library) to automatically hard-wrap selected text to a given
|   number of characters (defined by Unicode "extended grapheme
|   clusters"). It handles lines with a repeated prefix, such as
|   comments ("//") and indentation.
|
| * reflow: consider newlines to be word separators
| * replace custom reflow impl with textwrap crate
| * Sync reflow command docs with book
| * reflow: add default max_line_len language setting
|
| Co-authored-by: Vince Mutolo <vince@mutolo.org>
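As a rough illustration of what hard-wrapping to a given width involves, here is a greedy word-wrap sketch. It is not the commit's implementation (which ends up delegating to the textwrap crate): the `reflow` function below is a simplified stand-in that counts `char`s rather than extended grapheme clusters and ignores repeated line prefixes such as `//`.

```rust
/// Greedy hard-wrap of `text` to at most `max_width` characters per line.
///
/// Simplified stand-in: widths are counted in `char`s, newlines are
/// treated as word separators, and a single word longer than `max_width`
/// is kept on its own line rather than split.
fn reflow(text: &str, max_width: usize) -> String {
    let mut lines: Vec<String> = Vec::new();
    let mut current = String::new();
    let mut current_width = 0;

    // `split_whitespace` splits on spaces and newlines alike.
    for word in text.split_whitespace() {
        let word_width = word.chars().count();
        // Start a new line if the word (plus a separating space) won't fit.
        if !current.is_empty() && current_width + 1 + word_width > max_width {
            lines.push(std::mem::take(&mut current));
            current_width = 0;
        }
        if !current.is_empty() {
            current.push(' ');
            current_width += 1;
        }
        current.push_str(word);
        current_width += word_width;
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines.join("\n")
}
```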
* fix typos (#2304)chunghha2022-04-27
|
* Add rulers option (#2060)Thomas2022-04-20
| * Add color_column option
| * Rename to ruler
|
| Co-authored-by: DeviousStoat <devious@stoat.com>
* Fix Golang textobject queries (#2153)Michael Davis2022-04-18
| * log textobject query construction errors
|
|   The current behavior is that invalid queries are discarded silently
|   which makes it difficult to debug invalid textobjects (either
|   invalid syntax or an update may have come through that changed the
|   valid set of nodes).
|
| * fix golang textobject query
|
|   `method_spec_list` used to be a named node but was removed (I think
|   for Helix, it was when updated to pull in the support for generics).
|   Instead of a named node for the list of method specs we have a bunch
|   of `method_spec` children nodes now. We can match on the set of them
|   with a `+` quantifier.
|
|   Example go for this query:
|
|       type Shape interface {
|           area() float64
|           perimeter() float64
|       }
|
|   Which is parsed as:
|
|       (source_file
|         (type_declaration
|           (type_spec
|             name: (type_identifier)
|             type: (interface_type
|               (method_spec
|                 name: (field_identifier)
|                 parameters: (parameter_list)
|                 result: (type_identifier))
|               (method_spec
|                 name: (field_identifier)
|                 parameters: (parameter_list)
|                 result: (type_identifier))))))
* Remove usage of format ident feature from tests (#2028)Michael Davis2022-04-08
|
* Add runtime language configuration (#1794) (#1866)Roland Kovacs2022-04-05
| * Add runtime language configuration (#1794)
| * Add set-language typable command to change the language of current
|   buffer.
| * Add completer for available language options.
| * Update set-language to refresh language server as well
| * Add language id based config lookup on `syntax::Loader`.
| * Add `Document::set_language3` to set programming language based on
|   language id.
| * Update `Editor::refresh_language_server` to try language detection
|   only if language is not already set.
| * Remove language detection from Editor::refresh_language_server
| * Move document language detection to where the scratch buffer is
|   saved.
| * Rename Document::set_language3 to
|   Document::set_language_by_language_id.
| * Remove unnecessary clone in completers::language
* Fix an issue that caused an empty indentation query to be used instead of using the fallback method of copying the indentation from the current line. (#1908)Triton1712022-04-01
| Co-authored-by: Triton171 <triton0171@gmail.com>
* Indentation rework (#1562)Triton1712022-03-30
| * WIP: Rework indentation system
| * Add ComplexNode for context-aware indentation (including a proof of
|   concept for assignment statements in rust)
| * Add switch statements to Go indents.toml (fixes the second half of
|   issue #1523)
|   Remove commented-out code
| * Migrate all existing indentation queries.
|   Add more options to ComplexNode and use them to improve C/C++
|   indentation.
| * Add comments & replace Option<Vec<_>> with Vec<_>
| * Add more detailed documentation for tree-sitter indentation
| * Improve code style in indent.rs
| * Use tree-sitter queries for indentation instead of TOML config.
|   Migrate existing indent queries.
| * Add documentation for the new indent queries.
|   Change xtask docgen to look for indents.scm instead of indents.toml
| * Improve code style in indent.rs.
|   Fix an issue with the rust indent query.
| * Move indentation test sources to separate files.
|   Add `#not-kind-eq?`, `#same-line?` and `#not-same-line` custom
|   predicates.
|   Improve the rust and c indent queries.
| * Fix indent test.
|   Improve rust indent queries.
| * Move indentation tests to integration test folder.
| * Improve code style in indent.rs.
|   Reuse tree-sitter cursors for indentation queries.
| * Migrate HCL indent query
| * Replace custom loading in indent tests with a designated
|   languages.toml
| * Update indent query file name for --health command.
| * Fix single-space formatting in indent queries.
| * Add explanation for unwrapping.
|
| Co-authored-by: Triton171 <triton0171@gmail.com>
* Fix typo in query parsing error message (#1856)Slin Lee2022-03-22
|
* Optimize rendering by using Ropey::byte_sliceBlaž Hrastnik2022-03-17
| This avoids costly conversions via byte_to_char (which are then
| reversed back into bytes internally in Ropey).
|
| Reduces time spent in slice/byte_to_char from ~24% to ~5%.
* Refactor :set to parse by deserializing values (#1799)Gokul Soumya2022-03-15
| * Refactor :set to parse by deserializing values
| * Implement serialize for idle_timeout config
* rename '--fetch/build-grammars' flags into '--grammar fetch/build'Michael Davis2022-03-10
| The old flags were a bit long. --grammar is also aliased to -g to make
| it even easier.
* migrate grammar fetching/building code into helix-loader crateMichael Davis2022-03-10
| This is a rather large refactor that moves most of the code for
| loading, fetching, and building grammars into a new helix-loader
| module. This works well with the [[grammars]] syntax for
| languages.toml defined earlier: we only have to depend on the types
| for GrammarConfiguration in helix-loader and can leave all the
| [[language]] entries for helix-core.
* add 'use-grammars' to languages.tomlMichael Davis2022-03-10
| The vision with 'use-grammars' is to allow the long-requested feature
| of being able to declare your own set of grammars that you would like.
| A simple schema with only/except grammar names controls the list of
| grammars that is fetched and built. It does not (yet) control which
| grammars may be loaded at runtime if they already exist.
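The only/except schema described above could be modelled along these lines. The `GrammarSelection` enum and `select_grammars` function are hypothetical names chosen for illustration and are not confirmed by the commit message; the sketch only shows how such a filter would narrow the list of grammars to fetch and build.

```rust
use std::collections::HashSet;

/// Hypothetical model of a `use-grammars` setting.
enum GrammarSelection {
    /// Fetch and build only the named grammars.
    Only(HashSet<String>),
    /// Fetch and build every grammar except the named ones.
    Except(HashSet<String>),
}

/// Filter the configured grammar names by an optional selection.
/// With no selection, every grammar is kept.
fn select_grammars<'a>(all: &[&'a str], selection: Option<&GrammarSelection>) -> Vec<&'a str> {
    all.iter()
        .copied()
        .filter(|name| match selection {
            None => true,
            Some(GrammarSelection::Only(names)) => names.contains(*name),
            Some(GrammarSelection::Except(names)) => !names.contains(*name),
        })
        .collect()
}
```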
* ensure rust grammar is available in CIMichael Davis2022-03-10
|
* rename tree_sitter_library in LanguageConfig to 'grammar'Michael Davis2022-03-10
| This is not strictly speaking necessary. tree_sitter_library was used
| by just one grammar: llvm-mir-yaml, which uses the yaml grammar. This
| will make the language more consistent, though. Each language can
| explicitly say that they use Some(grammar), defaulting when None to
| the grammar that has a grammar_id matching the language's language_id.
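The defaulting rule in the commit above amounts to a one-line fallback. In the sketch below, the struct is a stripped-down, hypothetical stand-in for the real language configuration, and the `grammar_id` method name is an assumption.

```rust
/// Hypothetical, stripped-down language configuration.
struct LanguageConfig {
    language_id: String,
    /// Explicit grammar name; `None` means "same as `language_id`".
    grammar: Option<String>,
}

impl LanguageConfig {
    /// Grammar to load for this language, falling back to the language id.
    fn grammar_id(&self) -> &str {
        self.grammar.as_deref().unwrap_or(&self.language_id)
    }
}
```

In this model, llvm-mir-yaml would set `grammar` to `yaml`, while most languages would leave it unset.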
* migrate helix-syntax crate into helix-core and helix-termMichael Davis2022-03-10
| helix-syntax mostly existed for the sake of the build task which
| checks and compiles the submodules. Since we won't be relying on that
| process anymore, it doesn't end up making much sense to have a very
| thin crate just for some functions that we could port to helix-core.
|
| The remaining build-related code is moved to helix-term which will be
| able to provide grammar builds through the --build-grammars CLI flag.
* add tree-sitter sources to languages.tomlMichael Davis2022-03-10
| Here we add syntax to languages.toml:
|
|     [[grammar]]
|     name = "<name>"
|     source = { .. }
|
| This can be used to specify a tree-sitter grammar separately from the
| language that defines it, and we make this distinction for two
| reasons:
|
| * In later commits, we will separate this code from helix-core and
|   bring it to a new helix-loader crate. Using separate schemas for
|   language and grammar configurations allows for a nice divide between
|   the types needed to be declared in helix-loader and in
|   helix-core/syntax
| * Two different languages may use the same grammar. This is currently
|   the case with llvm-mir-yaml and yaml. We could accomplish a config
|   that works for this with just `[[languages]]`, but it gets a bit
|   dicey with languages depending on one another. If you enable
|   llvm-mir-yaml and disable yaml, does helix still need to fetch and
|   build tree-sitter-yaml? It could be a matter of interpretation.
* Add --health command for troubleshooting (#1669)Gokul Soumya2022-03-08
| * Move runtime file location definitions to core
| * Add basic --health command
| * Add language specific --health
| * Show summary for all langs with bare --health
| * Use TsFeature from xtask for --health
| * cargo fmt
|
| Co-authored-by: Blaž Hrastnik <blaz@mxxn.io>
* Ensure non empty grouped nodes in textobject queriesGokul Soumya2022-03-01
|