Support LSP 3.17 `positionEncoding` negotiation by ahogappa · Pull Request #442 · ruby/typeprof

ahogappa · 2026-04-25T13:59:30Z

Summary

Implements LSP 3.17 positionEncoding negotiation. Today TypeProf hardcodes UTF-16LE columns and omits positionEncoding from its capabilities response, so modern clients that prefer UTF-8 (Helix, Zed, Neovim) all fall back to UTF-16.

This PR also exposes the encoding as Service.new(position_encoding:) for embedders.
The motivating use case is ruby-minify, which cross-references TypeProf nodes against Prism re-parses to drive variable/method renaming.
Prism's native columns are byte-based — equivalent to LSP utf-8 — so Service.new(position_encoding: Encoding::UTF_8) makes the two coordinate systems line up.
Without this option, ruby-minify reaches into TypeProf via node.instance_variable_get(:@raw_node).location to bypass UTF-16 columns.

Behavior change

None by default. Clients that don't send general.positionEncodings, and embedders that don't pass position_encoding, still get UTF-16LE.

Implementation note

The server picks the first mutually-supported encoding from the client's preference-ordered list (utf-8 / utf-16 / utf-32), falling back to utf-16 per spec.
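For readers who want the selection rule spelled out, here is a minimal sketch of it in plain Ruby. The constant `SUPPORTED_POSITION_ENCODINGS` and the method `negotiate_position_encoding` are hypothetical names chosen for illustration and are not taken from this PR's diff. The mapping of `utf-32` to `Encoding::UTF_32LE` is likewise an assumption.

```ruby
# Hypothetical helper, not TypeProf's actual code.
# Maps LSP PositionEncodingKind strings to Ruby encodings.
SUPPORTED_POSITION_ENCODINGS = {
  "utf-8"  => Encoding::UTF_8,
  "utf-16" => Encoding::UTF_16LE,
  "utf-32" => Encoding::UTF_32LE, # assumed mapping for illustration
}.freeze

# Returns [lsp_name, ruby_encoding] for the first entry in the client's
# preference-ordered list that the server supports. Falls back to utf-16,
# which the spec makes mandatory, when the list is missing or has no match.
def negotiate_position_encoding(client_encodings)
  name = Array(client_encodings).find { |e| SUPPORTED_POSITION_ENCODINGS.key?(e) } || "utf-16"
  [name, SUPPORTED_POSITION_ENCODINGS.fetch(name)]
end

negotiate_position_encoding(["utf-8", "utf-32", "utf-16"]) # => ["utf-8", Encoding::UTF_8]
negotiate_position_encoding(nil)                           # => ["utf-16", Encoding::UTF_16LE]
```

The client's order wins: a client that lists `utf-32` before `utf-8` gets `utf-32`, even though the server supports both.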

FileContext#column_offsets_for returns columns in the configured encoding. For UTF-8 it uses Prism's native start_column / end_column (byte offsets) instead of Prism's code_units_cache(Encoding::UTF_8), which reports code points rather than bytes and would therefore violate the LSP spec's definition of utf-8 columns.
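To make the byte / code unit / code point distinction concrete, the following sketch converts a byte column on a single line into an LSP column for each encoding. It uses only core String methods; `lsp_column` is a hypothetical helper and does not reproduce how FileContext#column_offsets_for is implemented. The sample line uses `"\u3042"`, which is あ.

```ruby
# Hypothetical helper: convert a byte column within `line` (a UTF-8 string)
# into an LSP column measured in the given position encoding.
def lsp_column(line, byte_column, encoding)
  prefix = line.byteslice(0, byte_column)
  case encoding
  when Encoding::UTF_8
    prefix.bytesize                                 # utf-8: count bytes
  when Encoding::UTF_16LE
    prefix.encode(Encoding::UTF_16LE).bytesize / 2  # utf-16: count 2-byte code units
  when Encoding::UTF_32LE
    prefix.length                                   # utf-32: count code points
  else
    raise ArgumentError, "unsupported position encoding: #{encoding}"
  end
end

line = "\u3042foo(1, 2)"          # "\u3042" is 3 bytes in UTF-8
end_byte = "\u3042foo".bytesize   # byte offset just past the identifier

lsp_column(line, end_byte, Encoding::UTF_8)    # => 6
lsp_column(line, end_byte, Encoding::UTF_16LE) # => 4
lsp_column(line, end_byte, Encoding::UTF_32LE) # => 4
```

Counting code points for `utf-8` would give 4 here instead of 6, which is the mismatch the note above describes.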

Verification

End-to-end against Helix, which proposes general.positionEncodings: ["utf-8", "utf-32", "utf-16"]. In the source below, line 4 (0-based, as LSP counts lines) references undefined あfoo, where あ is 3 bytes in UTF-8 and 1 code unit in UTF-16:

def foo(n)
  n
end

あfoo(1, 2)

Helix `helix_lsp::transport` log — `InitializeResult`

Master (ruby/typeprof 0.31.1):

typeprof <- {"id":0,"result":{"capabilities":{"textDocumentSync":{...},"hoverProvider":true,...,"referencesProvider":true},"serverInfo":{"name":"typeprof","version":"0.31.1"}},"jsonrpc":"2.0"}

This PR:

typeprof <- {"id":0,"result":{"capabilities":{"positionEncoding":"utf-8","textDocumentSync":{...},"hoverProvider":true,...,"referencesProvider":true},"serverInfo":{"name":"typeprof","version":"0.31.1"}},"jsonrpc":"2.0"}

The only difference is the new "positionEncoding":"utf-8" field, which signals that subsequent column values use UTF-8 byte offsets.
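As a sketch of how a client might interpret the two responses, the snippet below reads `positionEncoding` from an `InitializeResult` and applies the spec's default of `utf-16` when the field is absent. `position_encoding_from` is a hypothetical helper, and the JSON strings are trimmed-down stand-ins for the logged messages above, keeping only the fields that matter here.

```ruby
require "json"

# Hypothetical client-side helper: returns the position encoding announced
# by the server, or "utf-16" when the server omits the field.
def position_encoding_from(initialize_result_json)
  message = JSON.parse(initialize_result_json)
  message.dig("result", "capabilities", "positionEncoding") || "utf-16"
end

# Trimmed-down stand-ins for the two logged responses.
master  = '{"id":0,"result":{"capabilities":{"hoverProvider":true}},"jsonrpc":"2.0"}'
this_pr = '{"id":0,"result":{"capabilities":{"positionEncoding":"utf-8","hoverProvider":true}},"jsonrpc":"2.0"}'

position_encoding_from(master)  # => "utf-16"
position_encoding_from(this_pr) # => "utf-8"
```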

Unit tests in test/core/position_encoding_test.rb (UTF-8 / UTF-16 / UTF-32 columns).
LSP tests in test/lsp/lsp_test.rb (negotiation matrix).
End-to-end against Helix: confirmed positionEncoding: "utf-8" response and correct diagnostic range on a non-ASCII identifier.

Spec: https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#textDocuments

Adds `position_encoding:` option to `TypeProf::Core::Service.new`. FileContext stores the encoding and computes Prism::Location columns accordingly. Default is `Encoding::UTF_16LE` (preserves existing behavior).

UTF-8 uses Prism's native `start_column`/`end_column` (= byte offsets), because Prism's `code_units_cache(Encoding::UTF_8)` reports code points rather than bytes, and bytes are what the LSP 3.17 spec defines for `utf-8`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Implements `general.positionEncodings` negotiation per LSP 3.17 spec. The server picks the first encoding from the client's preference list that it supports (`utf-8` / `utf-16` / `utf-32`) and reports it back via `capabilities.positionEncoding`. Falls back to UTF-16 (mandatory per spec) if the client doesn't propose any supported encoding.

The negotiated value flows into per-workspace Services through `core_options.merge(position_encoding: ...)`, so each Service computes column positions in the agreed encoding.

See: https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#textDocuments

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ahogappa · 2026-04-25T14:31:54Z

+    attr_reader :path, :comments, :position_encoding
+    def initialize(path, position_encoding = nil, prism_source = nil, comments = nil)
      @path = path
+      @position_encoding = position_encoding || Encoding::UTF_16LE


FYI: given that Ruby's default source encoding is UTF-8, UTF-8 would arguably be a more natural default.
However, switching from UTF-16LE would be a breaking change for existing users, so this PR keeps UTF-16LE.

mame

Thanks, I would like to give it a try!

ahogappa and others added 2 commits April 25, 2026 22:14


mame approved these changes Apr 28, 2026


mame merged commit 1aedffb into ruby:master Apr 28, 2026
6 checks passed

mame mentioned this pull request Apr 28, 2026

Lazily convert Prism/RBS locations to CodeRange objects #436

Open

mame merged 2 commits into ruby:master from ahogappa:feature/position-encoding
