Diffstat (limited to 'book')
-rw-r--r-- | book/src/guides/adding_languages.md | 67 |
-rw-r--r-- | book/src/languages.md | 106 |
2 files changed, 97 insertions, 76 deletions
diff --git a/book/src/guides/adding_languages.md b/book/src/guides/adding_languages.md
index 0cd6c27b..5be7a264 100644
--- a/book/src/guides/adding_languages.md
+++ b/book/src/guides/adding_languages.md
@@ -2,40 +2,8 @@
 
 ## Language configuration
 
-To add a new language, you need to add a `language` entry to the
-[`languages.toml`][languages.toml] found in the root of the repository;
-this `languages.toml` file is included at compilation time, and is
-distinct from the `languages.toml` file in the user's [configuration
-directory](../configuration.md).
-
-```toml
-[[language]]
-name = "mylang"
-scope = "scope.mylang"
-injection-regex = "^mylang$"
-file-types = ["mylang", "myl"]
-comment-token = "#"
-indent = { tab-width = 2, unit = " " }
-language-server = { command = "mylang-lsp", args = ["--stdio"] }
-```
-
-These are the available keys and descriptions for the file.
-
-| Key | Description |
-| ---- | ----------- |
-| `name` | The name of the language |
-| `scope` | A string like `source.js` that identifies the language. Currently, we strive to match the scope names used by popular TextMate grammars and by the Linguist library. Usually `source.<name>` or `text.<name>` in case of markup languages |
-| `injection-regex` | regex pattern that will be tested against a language name in order to determine whether this language should be used for a potential [language injection][treesitter-language-injection] site. |
-| `file-types` | The filetypes of the language, for example `["yml", "yaml"]`. Extensions and full file names are supported. |
-| `shebangs` | The interpreters from the shebang line, for example `["sh", "bash"]` |
-| `roots` | A set of marker files to look for when trying to find the workspace root. For example `Cargo.lock`, `yarn.lock` |
-| `auto-format` | Whether to autoformat this language when saving |
-| `diagnostic-severity` | Minimal severity of diagnostic for it to be displayed. (Allowed values: `Error`, `Warning`, `Info`, `Hint`) |
-| `comment-token` | The token to use as a comment-token |
-| `indent` | The indent to use. Has sub keys `tab-width` and `unit` |
-| `language-server` | The Language Server to run. Has sub keys `command` and `args` |
-| `config` | Language Server configuration |
-| `grammar` | The tree-sitter grammar to use (defaults to the value of `name`) |
+To add a new language, you need to add a `[[language]]` entry to the
+`languages.toml` (see the [language configuration section]).
 
 When adding a new language or Language Server configuration for an existing
 language, run `cargo xtask docgen` to add the new configuration to the
@@ -45,32 +13,12 @@ When adding a Language Server configuration, be sure to update the
 
 ## Grammar configuration
 
-If a tree-sitter grammar is available for the language, add a new `grammar`
+If a tree-sitter grammar is available for the language, add a new `[[grammar]]`
 entry to `languages.toml`.
 
-```toml
-[[grammar]]
-name = "mylang"
-source = { git = "https://github.com/example/mylang", rev = "a250c4582510ff34767ec3b7dcdd3c24e8c8aa68" }
-```
-
-Grammar configuration takes these keys:
-
-| Key | Description |
-| --- | ----------- |
-| `name` | The name of the tree-sitter grammar |
-| `source` | The method of fetching the grammar - a table with a schema defined below |
-
-Where `source` is a table with either these keys when using a grammar from a
-git repository:
-
-| Key | Description |
-| --- | ----------- |
-| `git` | A git remote URL from which the grammar should be cloned |
-| `rev` | The revision (commit hash or tag) which should be fetched |
-| `subpath` | A path within the grammar directory which should be built. Some grammar repositories host multiple grammars (for example `tree-sitter-typescript` and `tree-sitter-ocaml`) in subdirectories. This key is used to point `hx --grammar build` to the correct path for compilation. When omitted, the root of repository is used |
-
-Or a `path` key with an absolute path to a locally available grammar directory.
+You may use the `source.path` key rather than `source.git` with an absolute path
+to a locally available grammar for testing, but switch to `source.git` before
+submitting a pull request.
 
 ## Queries
 
@@ -91,8 +39,7 @@ the last matching query supersedes the ones before it. See
 
 - If a parser is segfaulting or you want to remove the parser, make sure to remove the compiled parser in `runtime/grammar/<name>.so`
 
-[treesitter-language-injection]: https://tree-sitter.github.io/tree-sitter/syntax-highlighting#language-injection
-[languages.toml]: https://github.com/helix-editor/helix/blob/master/languages.toml
+[language configuration section]: ../languages.md
 [neovim-query-precedence]: https://github.com/helix-editor/helix/pull/1170#issuecomment-997294090
 [install-lsp-wiki]: https://github.com/helix-editor/helix/wiki/How-to-install-the-default-language-servers
 [lang-support]: ../lang-support.md
diff --git a/book/src/languages.md b/book/src/languages.md
index 8c27785e..a9d5bea8 100644
--- a/book/src/languages.md
+++ b/book/src/languages.md
@@ -1,10 +1,17 @@
 # Languages
 
-Language-specific settings and settings for particular language servers can be configured in a `languages.toml` file placed in your [configuration directory](./configuration.md). Helix actually uses two `languages.toml` files, the [first one](https://github.com/helix-editor/helix/blob/master/languages.toml) is in the main helix repository; it contains the default settings for each language and is included in the helix binary at compile time. Users who want to see the available settings and options can either reference the helix repo's `languages.toml` file, or consult the table in the [adding languages](./guides/adding_languages.md) section.
+Language-specific settings and settings for language servers are configured
+in `languages.toml` files.
 
-A local `languages.toml` can be created within a `.helix` directory. Its settings will be merged with both the global and default configs.
+## `languages.toml` files
 
-Changes made to the `languages.toml` file in a user's [configuration directory](./configuration.md) are merged with helix's defaults on start-up, such that a user's settings will take precedence over defaults in the event of a collision. For example, the default `languages.toml` sets rust's `auto-format` to `true`. If a user wants to disable auto-format, they can change the `languages.toml` in their [configuration directory](./configuration.md) to make the rust entry read like the example below; the new key/value pair `auto-format = false` will override the default when the two sets of settings are merged on start-up:
+There are three possible `languages.toml` files. The first is compiled into
+Helix and lives in the [Helix repository](https://github.com/helix-editor/helix/blob/master/languages.toml).
+This provides the default configurations for languages and language servers.
+
+You may define a `languages.toml` in your [configuration directory](./configuration.md)
+which overrides values from the built-in language configuration. For example
+to disable auto-LSP-formatting in Rust:
 
 ```toml
 # in <config_dir>/helix/languages.toml
@@ -14,9 +21,60 @@ name = "rust"
 auto-format = false
 ```
 
-## LSP formatting options
+Language configuration may also be overridden local to a project by creating
+a `languages.toml` file under a `.helix` directory. Its settings will be merged
+with the language configuration in the configuration directory and the built-in
+configuration.
+
+## Language configuration
 
-Use `format` field to pass extra formatting options to [Document Formatting Requests](https://github.com/microsoft/language-server-protocol/blob/gh-pages/_specifications/specification-3-16.md#document-formatting-request--leftwards_arrow_with_hook).
+Each language is configured by adding a `[[language]]` section to a
+`languages.toml` file. For example:
+
+```toml
+[[language]]
+name = "mylang"
+scope = "source.mylang"
+injection-regex = "^mylang$"
+file-types = ["mylang", "myl"]
+comment-token = "#"
+indent = { tab-width = 2, unit = " " }
+language-server = { command = "mylang-lsp", args = ["--stdio"] }
+```
+
+These configuration keys are available:
+
+| Key | Description |
+| ---- | ----------- |
+| `name` | The name of the language |
+| `scope` | A string like `source.js` that identifies the language. Currently, we strive to match the scope names used by popular TextMate grammars and by the Linguist library. Usually `source.<name>` or `text.<name>` in case of markup languages |
+| `injection-regex` | regex pattern that will be tested against a language name in order to determine whether this language should be used for a potential [language injection][treesitter-language-injection] site. |
+| `file-types` | The filetypes of the language, for example `["yml", "yaml"]`. Extensions and full file names are supported. |
+| `shebangs` | The interpreters from the shebang line, for example `["sh", "bash"]` |
+| `roots` | A set of marker files to look for when trying to find the workspace root. For example `Cargo.lock`, `yarn.lock` |
+| `auto-format` | Whether to autoformat this language when saving |
+| `diagnostic-severity` | Minimal severity of diagnostic for it to be displayed. (Allowed values: `Error`, `Warning`, `Info`, `Hint`) |
+| `comment-token` | The token to use as a comment-token |
+| `indent` | The indent to use. Has sub keys `tab-width` and `unit` |
+| `language-server` | The Language Server to run. See the Language Server configuration section below. |
+| `config` | Language Server configuration |
+| `grammar` | The tree-sitter grammar to use (defaults to the value of `name`) |
+
+### Language Server configuration
+
+The `language-server` field takes the following keys:
+
+| Key | Description |
+| --- | ----------- |
+| `command` | The name of the language server binary to execute. Binaries must be in `$PATH` |
+| `args` | A list of arguments to pass to the language server binary |
+| `timeout` | The maximum time a request to the language server may take, in seconds. Defaults to `20` |
+| `language-id` | The language name to pass to the language server. Some language servers support multiple languages and use this field to determine which one is being served in a buffer |
+
+The top-level `config` field is used to configure the LSP initialization options. A `format`
+sub-table within `config` can be used to pass extra formatting options to
+[Document Formatting Requests](https://github.com/microsoft/language-server-protocol/blob/gh-pages/_specifications/specification-3-16.md#document-formatting-request--leftwards_arrow_with_hook).
+For example with typescript:
 
 ```toml
 [[language]]
@@ -26,23 +84,37 @@ auto-format = true
 config = { format = { "semicolons" = "insert", "insertSpaceBeforeFunctionParenthesis" = true } }
 ```
 
-## Tree-sitter grammars
+## Tree-sitter grammar configuration
 
-Tree-sitter grammars can also be configured in `languages.toml`:
+The source for a language's tree-sitter grammar is specified in a `[[grammar]]`
+section in `languages.toml`. For example:
 
 ```toml
-# in <config_dir>/helix/languages.toml
-
 [[grammar]]
-name = "rust"
-source = { git = "https://github.com/tree-sitter/tree-sitter-rust", rev = "a250c4582510ff34767ec3b7dcdd3c24e8c8aa68" }
-
-[[grammar]]
-name = "c"
-source = { path = "/path/to/tree-sitter-c" }
+name = "mylang"
+source = { git = "https://github.com/example/mylang", rev = "a250c4582510ff34767ec3b7dcdd3c24e8c8aa68" }
 ```
 
-You may use a top-level `use-grammars` key to control which grammars are fetched and built.
+Grammar configuration takes these keys:
+
+| Key | Description |
+| --- | ----------- |
+| `name` | The name of the tree-sitter grammar |
+| `source` | The method of fetching the grammar - a table with a schema defined below |
+
+Where `source` is a table with either these keys when using a grammar from a
+git repository:
+
+| Key | Description |
+| --- | ----------- |
+| `git` | A git remote URL from which the grammar should be cloned |
+| `rev` | The revision (commit hash or tag) which should be fetched |
+| `subpath` | A path within the grammar directory which should be built. Some grammar repositories host multiple grammars (for example `tree-sitter-typescript` and `tree-sitter-ocaml`) in subdirectories. This key is used to point `hx --grammar build` to the correct path for compilation. When omitted, the root of repository is used |
+
+### Choosing grammars
+
+You may use a top-level `use-grammars` key to control which grammars are
+fetched and built when using `hx --grammar fetch` and `hx --grammar build`.
 
 ```toml
 # Note: this key must come **before** the [[language]] and [[grammar]] sections
@@ -52,3 +124,5 @@ use-grammars = { except = [ "yaml", "json" ] }
 ```
 
 When omitted, all grammars are fetched and built.
+
+[treesitter-language-injection]: https://tree-sitter.github.io/tree-sitter/syntax-highlighting#language-injection