text/0000-symbol-name-mangling-v2.md
41 additions & 6 deletions
@@ -543,8 +543,8 @@ Given these definitions, compression is defined as follows.
543
543
- Initialize the substitution dictionary to be empty.
544
544
- Traverse and modify the AST as follows:
545
545
- When encountering a substitutable node `N`, there are two cases:
546
-
1. If the substitution dictionary already contains an *equivalent* node, replace the current node`N` with a `<substitution>` that encodes the substitution index taken from the dictionary.
547
-
2. Else, continue traversing through the child nodes of the current node. After the child nodes have been traversed, and if the dictionary does not yet contain an *equivalent* node, then allocate the next unused substitution index and add it to the substitution dictionary with `N` as its key.
546
+
1. If the substitution dictionary already contains an *equivalent* node, replace the children of `N` with a `<substitution>` that encodes the substitution index taken from the dictionary.
547
+
2. Else, continue traversing through the child nodes of `N`. After the child nodes have been traversed, if the dictionary does not yet contain an *equivalent* node, allocate the next unused substitution index and add it to the substitution dictionary with `N` as its key.
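The two cases can be sketched in Rust as follows. This is a minimal illustration, not the RFC's reference implementation: the `Kind` and `Node` types, the `render` helper, and the choice of which kinds are substitutable are all hypothetical, and *equivalence* is approximated here by "renders to the same demangled text", which is an assumption standing in for the RFC's actual definition.

```rust
use std::collections::HashMap;

// Hypothetical, simplified AST that only covers the node kinds of the example.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind { SymbolName, AbsolutePath, PathPrefix, GenericArguments, Type }

impl Kind {
    // Assumption: only these three kinds are substitutable.
    fn is_substitutable(self) -> bool {
        matches!(self, Kind::AbsolutePath | Kind::PathPrefix | Kind::Type)
    }
}

#[derive(Clone, Debug, PartialEq)]
enum Node {
    Ident(String),
    Subst(usize),
    Inner { kind: Kind, children: Vec<Node> },
}

// Renders a node as demangled text. Assumption: two nodes count as
// *equivalent* when they render to the same text.
fn render(node: &Node) -> String {
    match node {
        Node::Ident(name) => name.clone(),
        Node::Subst(index) => format!("#{}", index),
        Node::Inner { kind, children } => {
            let parts: Vec<String> = children.iter().map(render).collect();
            match kind {
                Kind::PathPrefix => parts.join("::"),
                Kind::GenericArguments => format!("<{}>", parts.join(",")),
                _ => parts.concat(),
            }
        }
    }
}

// Compresses `node` in place; `dict` maps equivalence keys to indices.
fn compress(node: &mut Node, dict: &mut HashMap<String, usize>) {
    // The key is computed before any child is rewritten.
    let key = render(node);
    if let Node::Inner { kind, children } = node {
        let substitutable = kind.is_substitutable();
        if substitutable {
            if let Some(&index) = dict.get(&key) {
                // Case 1: an equivalent node is known; replace the children.
                *children = vec![Node::Subst(index)];
                return;
            }
        }
        // Case 2: visit the children first, then allocate the next index.
        for child in children.iter_mut() {
            compress(child, dict);
        }
        if substitutable && !dict.contains_key(&key) {
            let next = dict.len();
            dict.insert(key, next);
        }
    }
}

fn inner(kind: Kind, children: Vec<Node>) -> Node { Node::Inner { kind, children } }
fn ident(name: &str) -> Node { Node::Ident(name.to_string()) }

// Builds `<type> -> <absolute-path> -> <path-prefix "foo::Bar">`.
fn foo_bar_type() -> Node {
    let foo = inner(Kind::PathPrefix, vec![ident("foo")]);
    let foo_bar = inner(Kind::PathPrefix, vec![foo, ident("Bar")]);
    inner(Kind::Type, vec![inner(Kind::AbsolutePath, vec![foo_bar])])
}

// Uncompressed tree for `foo::Bar::quux<foo::Bar>`.
fn example_tree() -> Node {
    let self_prefix = inner(Kind::PathPrefix, vec![foo_bar_type()]);
    let method = inner(Kind::PathPrefix, vec![self_prefix, ident("quux")]);
    let generics = inner(Kind::GenericArguments, vec![foo_bar_type()]);
    inner(Kind::SymbolName, vec![inner(Kind::AbsolutePath, vec![method, generics])])
}
```

Calling `compress` on `example_tree()` with an empty dictionary assigns indices to `foo`, `foo::Bar`, `foo::Bar::quux`, and the full path in that order, and rewrites the `<type>` under `<generic-arguments>` to hold a single substitution child. Rendering every node on entry is quadratic in the worst case; the sketch favors brevity over speed.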
548
548
549
549
The following gives an example of substitution index assignment and node replacements for `foo::Bar::quux<foo::Bar>` (with `quux` being an inherent method of `foo::Bar`). `#n` designates that the substitution index `n` was assigned to the given node and `:= #n` designates that it is replaced with a `<substitution>`:
550
550
@@ -573,26 +573,61 @@ Some interesting things to note in this example:
573
573
574
574
- There are substitutable nodes that are not replaced, nor added to the dictionary. This falls out of the equivalence rule. The node marked with `#1` is equivalent to its three immediate ancestors, so no dictionary entries are generated for those.
575
575
576
-
- The `<type>` node marked with `:= #1` is replaced by `#1`, which is not a `<type>` but a (equivalent) `<path-prefix>`. This is OK and prescribed by the algorithm. The definition of equivalence ensures that there is only one valid way to construct a `<type>` node from a `<path-prefix>` node.
576
+
- The `<type>` node marked with `:= #1` is replaced by `#1`, which is not a `<type>` but an (equivalent) `<path-prefix>`. This is OK and prescribed by the algorithm. The definition of equivalence ensures that there is only one valid way to construct a `<type>` node from a `<path-prefix>` node.
577
577
578
578
579
+
## Decompression
580
+
581
+
Decompression works analogously to compression:
579
582
583
+
- Initialize the substitution dictionary to be empty.
584
+
- Traverse and modify the AST as follows:
585
+
- When encountering a substitutable node `N`, there are two cases:
586
+
1. If the node has a single `<substitution>` child, extract the substitution index from it and replace the node with the corresponding entry from the substitution dictionary.
587
+
2. Else, continue traversing the child nodes of the current node. After the child nodes have been traversed, and if the dictionary does not yet contain an *equivalent* node, then allocate the next unused substitution index and add it to the substitution dictionary with `N` as its key.
580
588
589
+
This is what the example from above looks like for decompression:
581
590
591
+
```
                     <symbol-name>
                           |
                   <absolute-path> #3
                    /              \
          <path-prefix> #2     <generic-arguments>
           /           \                |
  <path-prefix>  <identifier "quux">  <type> := #1
       /                                |
    <type>                      <substitution #1>
       |
<absolute-path>
       |
<path-prefix> #1
    /        \
<path-prefix> #0  <identifier "Bar">
       |
<identifier "foo">
```
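The decompression steps can be sketched in Rust along the same lines. As before, this is an illustration under stated assumptions rather than the RFC's implementation: `Kind`, `Node`, `render`, and the set of substitutable kinds are hypothetical, *equivalence* is approximated by "renders to the same text", and the dictionary is a plain `Vec<Node>` indexed by substitution index.

```rust
// Hypothetical, simplified AST that only covers the node kinds of the example.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind { SymbolName, AbsolutePath, PathPrefix, GenericArguments, Type }

impl Kind {
    // Assumption: only these three kinds are substitutable.
    fn is_substitutable(self) -> bool {
        matches!(self, Kind::AbsolutePath | Kind::PathPrefix | Kind::Type)
    }
}

#[derive(Clone, Debug, PartialEq)]
enum Node {
    Ident(String),
    Subst(usize),
    Inner { kind: Kind, children: Vec<Node> },
}

// Renders a node as demangled text; used as the equivalence key (assumption).
fn render(node: &Node) -> String {
    match node {
        Node::Ident(name) => name.clone(),
        Node::Subst(index) => format!("#{}", index),
        Node::Inner { kind, children } => {
            let parts: Vec<String> = children.iter().map(render).collect();
            match kind {
                Kind::PathPrefix => parts.join("::"),
                Kind::GenericArguments => format!("<{}>", parts.join(",")),
                _ => parts.concat(),
            }
        }
    }
}

// Decompresses `node` in place; `dict[i]` is the node with substitution index `i`.
// Panics on an index that has not been assigned yet (no error handling here).
fn decompress(node: &mut Node, dict: &mut Vec<Node>) {
    // Case 1: a substitutable node whose only child is a <substitution>.
    let subst_index = match &*node {
        Node::Inner { kind, children } if kind.is_substitutable() => {
            match children.as_slice() {
                [Node::Subst(index)] => Some(*index),
                _ => None,
            }
        }
        _ => None,
    };
    if let Some(index) = subst_index {
        *node = dict[index].clone();
        return;
    }
    // Case 2: visit the children first, then register the node if it is new.
    let substitutable = match &mut *node {
        Node::Inner { kind, children } => {
            for child in children.iter_mut() {
                decompress(child, dict);
            }
            kind.is_substitutable()
        }
        _ => false,
    };
    if substitutable {
        let key = render(node);
        if !dict.iter().any(|entry| render(entry) == key) {
            dict.push(node.clone());
        }
    }
}

fn inner(kind: Kind, children: Vec<Node>) -> Node { Node::Inner { kind, children } }
fn ident(name: &str) -> Node { Node::Ident(name.to_string()) }

// The compressed tree from the example: the generic argument is `:= #1`.
fn compressed_example() -> Node {
    let foo = inner(Kind::PathPrefix, vec![ident("foo")]);
    let foo_bar = inner(Kind::PathPrefix, vec![foo, ident("Bar")]);
    let self_type = inner(Kind::Type, vec![inner(Kind::AbsolutePath, vec![foo_bar])]);
    let self_prefix = inner(Kind::PathPrefix, vec![self_type]);
    let method = inner(Kind::PathPrefix, vec![self_prefix, ident("quux")]);
    let arg = inner(Kind::Type, vec![Node::Subst(1)]);
    let generics = inner(Kind::GenericArguments, vec![arg]);
    inner(Kind::SymbolName, vec![inner(Kind::AbsolutePath, vec![method, generics])])
}
```

Running `decompress` on `compressed_example()` with an empty dictionary rebuilds the same four entries as compression and leaves no substitution nodes in the tree. Note that the `<type>` node is replaced by the stored `<path-prefix>` entry, as in the example above.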
582
610
611
+
### A Note On Implementing Efficient Demangling
583
612
613
+
The mangling syntax is constructed in a way that allows for implementing an efficient demangler:
584
614
615
+
- Mangled names contain information in the same order as unmangled names are expected to contain it. Therefore, a demangler can directly generate its output while parsing the mangled form. There is no need to explicitly instantiate the AST in memory.
585
616
586
-
## Decompression
617
+
- The same is true for decompression. The demangler can keep a simple array that maps substitution indices to ranges in the already generated output. When it encounters a `<substitution>` in need of expansion, it can just look up the corresponding range and do a simple `memcpy`.
587
618
619
+
Parsing, decompression, and demangling can thus be done in a single pass over the mangled name without dynamic allocation, except for the dictionary array.
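The range-based approach might look roughly like this in Rust. The `Output` type and its methods are hypothetical names chosen for illustration, and the parser is replaced by a hand-written sequence of calls because the sketch does not depend on the concrete mangled syntax. For simplicity the output buffer is a growable `Vec<u8>`; a demangler that wants to avoid that allocation would write into a caller-provided buffer instead.

```rust
use std::ops::Range;

// Hypothetical demangler output: generated bytes plus the substitution array.
struct Output {
    bytes: Vec<u8>,
    // ranges[i] is the span of `bytes` produced for substitution index i.
    ranges: Vec<Range<usize>>,
}

impl Output {
    fn new() -> Self {
        Output { bytes: Vec::new(), ranges: Vec::new() }
    }

    // Appends freshly demangled text.
    fn push_str(&mut self, text: &str) {
        self.bytes.extend_from_slice(text.as_bytes());
    }

    // Records everything from `start` to the current end as the next
    // substitution index and returns that index.
    fn record_from(&mut self, start: usize) -> usize {
        self.ranges.push(start..self.bytes.len());
        self.ranges.len() - 1
    }

    // Expands a <substitution> by copying an earlier range of the output.
    // This is the `memcpy` step; it panics on an unknown index.
    fn expand(&mut self, index: usize) {
        let range = self.ranges[index].clone();
        self.bytes.extend_from_within(range);
    }

    fn as_str(&self) -> &str {
        std::str::from_utf8(&self.bytes).expect("output is built from valid UTF-8")
    }
}

// Replays the events a parser would emit for `foo::Bar::quux<foo::Bar>`.
fn demangle_example() -> Output {
    let mut out = Output::new();
    let start = out.bytes.len();
    out.push_str("foo");
    out.record_from(start); // #0 = "foo"
    out.push_str("::Bar");
    out.record_from(start); // #1 = "foo::Bar"
    out.push_str("::quux");
    out.record_from(start); // #2 = "foo::Bar::quux"
    out.push_str("<");
    out.expand(1); // <substitution #1>
    out.push_str(">");
    out.record_from(start); // #3 = the whole path
    out
}
```

No AST is built at any point: the only state besides the output itself is the array of ranges, which matches the single-pass claim above.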
588
620
589
-
### Note on Efficient Demangling
590
621
622
+
## Mapping Rust Language Entities to Symbol Names
591
623
592
-
## Mapping Rust Items to Mangled Names
624
+
This RFC suggests the following mapping of Rust entities to mangled names:
593
625
626
+
- Free-standing named functions and types shall be represented by an `<absolute-path>` production.
594
627
628
+
- Absolute paths should be rooted at the innermost entity that can act as a path root. Roots can be crate-ids, types (for entities with an inherent impl in their path), and trait impls (for entities with trait impls in their path).
595
629
630
+
- The compiler is free to choose disambiguation indices for identifiers and trait impls that need disambiguation. The disambiguation index `0` is represented by omitting the `<disambiguator>` production (which should be the common case). Disambiguation indices do not need to be densely packed. In particular, the compiler can use arbitrary hashes to disambiguate items (which is useful for supporting specializing trait impls).