The scheme needs to be able to generate symbol names for the function containing the code of a closure and it needs to be able to refer to the type of a closure if it occurs as a type argument. As closures don't have a name, we need to generate one. The scheme proposes to use the namespace and disambiguation mechanisms already introduced above for this purpose. Closures get their own "namespace" (i.e. they are neither in the type nor the value namespace), and each closure has an empty name with a disambiguation index (like for macro hygiene) identifying them within their parent. The full name of a closure is then constructed like for any other named item:
@@ -180,6 +181,7 @@ mod foo {
In the above example we have two closures, the one assigned to `a` and the one assigned to `b`. The first one would get the local name `0C` and the second one the name `0Cs_`. The `0` signifies the length of their (empty) name. The `C` is the namespace tag, analogous to the `V` tag for the value namespace. The `s_` for the second closure is the disambiguation index (index `0` is, again, encoded by not appending a suffix). Their full names would then be `N15mycrate_4a3b56d3foo3barV0CE` and `N15mycrate_4a3b56d3foo3barV0Cs_E` respectively.
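As a rough illustration of how such names could be assembled, the following Rust sketch rebuilds the two closure names quoted above. The helper names (`to_base62`, `disambiguator`, `local_name`, `full_name`) are hypothetical and not part of any compiler API. The encoding of indices above `1` (base-62 digits `0-9`, `a-z`, `A-Z`, offset by two, possibly more than one digit) is an assumption that this section does not spell out.

```rust
/// Digit alphabet ASSUMED for base-62 numbers: 0-9, then a-z, then A-Z.
const BASE62: &[u8] = b"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

/// Encodes `n` as a base-62 string, most significant digit first.
fn to_base62(mut n: u64) -> String {
    let mut digits = Vec::new();
    loop {
        digits.push(BASE62[(n % 62) as usize]);
        n /= 62;
        if n == 0 {
            break;
        }
    }
    digits.reverse();
    String::from_utf8(digits).expect("BASE62 contains only ASCII")
}

/// Index 0 gets no suffix and index 1 gets `s_`, as in the closure example.
/// The form used for larger indices is an ASSUMPTION of this sketch.
fn disambiguator(index: u64) -> String {
    match index {
        0 => String::new(),
        1 => "s_".to_string(),
        n => format!("s{}_", to_base62(n - 2)),
    }
}

/// Builds `<length><identifier><namespace tag><disambiguator>`.
/// `ns_tag` is "V" for values, "C" for closures, and "" where the example
/// shows no tag (the crate and module components).
fn local_name(ident: &str, ns_tag: &str, index: u64) -> String {
    format!("{}{}{}{}", ident.len(), ident, ns_tag, disambiguator(index))
}

/// Wraps the concatenated path components in `N` ... `E`.
fn full_name(components: &[String]) -> String {
    format!("N{}E", components.concat())
}
```

With these helpers, `local_name("", "C", 0)` yields `0C` and `local_name("", "C", 1)` yields `0Cs_`. Prefixing the components for `mycrate_4a3b56d`, `foo`, and `bar` (with the `V` tag) reproduces the two full names given above.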
+
### Methods
Methods are nested within `impl` or `trait` items. As such, it would be possible to construct their symbol names as paths like `my_crate::foo::{{impl}}::some_method`, where `{{impl}}` somehow identifies the `impl` in question. Since `impl`s don't have names, we'd have to use an indexing scheme like the one used for closures (and indeed, this is what the compiler does internally). Adding in generic arguments too, this would lead to symbol names looking like `my_crate::foo::impl'17::<u32, char>::some_method`.
@@ -265,6 +267,7 @@ impl<T: Default> Foo<T> for Bar<T> {
Notice that both `MSG` statics have the path `<Bar as Foo>::foo::MSG` if you just leave off the type arguments. However, we also don't have any concrete types to substitute the arguments for. Therefore, we have to disambiguate the `impls`. Since trait specialization is an unstable feature of Rust and the details are in flux, this RFC does not try to provide a mangling based on the `where` clauses of the specialized `impls`. Instead it proposes a scheme that re-uses the numeric disambiguator form already introduced for macro hygiene and closures. Thus, conflicting `impls` would be disambiguated via an implementation-defined suffix, as in `<Bar as Foo>'1::foo::MSG` and `<Bar as Foo>'2::foo::MSG`. This encoding introduces minimal additional syntax and can be replaced with something more human-readable once the definition of trait specialization is final.
+
### Unicode Identifiers
Rust allows Unicode identifiers, but our character set is restricted to ASCII alphanumerics and `_`. In order to transcode the former to the latter, we use the same approach as Swift, which is: encode all non-ASCII identifiers via [Punycode][punycode], a standardized and efficient encoding that keeps encoded strings in a rather human-readable format. So for example, the string
@@ -378,9 +381,7 @@ The reference-level explanation consists of three parts:
2. A specification of the compression scheme.
3. A mapping of Rust entities to the mangling syntax.
-For implementing a demangler, only the first to sections are needed, that is, a
-demangler only needs to understand syntax and compression of names, but it does
-not have to care how the compiler generates mangled names.
+For implementing a demangler, only the first two sections are of interest, that is, a demangler only needs to understand syntax and compression of names, but it does not have to care about how the compiler generates mangled names.
## Syntax Of Mangled Names
@@ -471,11 +472,11 @@ Mangled names conform to the following grammar:
"u" // Unadjusted
)
-<disambiguator> = "s" [<hex-digit>] "_"
+<disambiguator> = "s" [<base-62-digit>] "_"
<generic-arguments> = "I" {<type>} "E"
-<substitution> = "S" [<hex-digit>] "_"
+<substitution> = "S" [<base-62-digit>] "_"
// We use <path-prefix> here, so that we don't have to add a special rule for
// compression. In practice, only <identifier> is expected.
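
To make the `<disambiguator>` and `<substitution>` rules a little more concrete, here is a minimal sketch of how a demangler might read one of these tokens. The function names `base62_digit` and `parse_indexed` are hypothetical. The digit alphabet (`0-9`, `a-z`, `A-Z`), the acceptance of more than one digit, and the "no digits means 0, otherwise value plus one" convention are assumptions of this sketch, since the grammar as written only shows a single optional digit.

```rust
/// Maps one ASCII byte to its base-62 value.
/// ASSUMED digit order: 0-9, then a-z, then A-Z.
fn base62_digit(b: u8) -> Option<u64> {
    match b {
        b'0'..=b'9' => Some((b - b'0') as u64),
        b'a'..=b'z' => Some((b - b'a') as u64 + 10),
        b'A'..=b'Z' => Some((b - b'A') as u64 + 36),
        _ => None,
    }
}

/// Parses `tag {digit} "_"` at the start of `input`, where `tag` is `b's'`
/// for a disambiguator or `b'S'` for a substitution.
/// Returns the decoded number and the number of bytes consumed, or `None`
/// if the input does not start with a well-formed token.
fn parse_indexed(input: &str, tag: u8) -> Option<(u64, usize)> {
    let bytes = input.as_bytes();
    if bytes.first() != Some(&tag) {
        return None;
    }
    let mut pos = 1;
    let mut value: Option<u64> = None;
    while let Some(&b) = bytes.get(pos) {
        if b == b'_' {
            // ASSUMPTION: an empty digit string decodes to 0 and any other
            // digit string decodes to its base-62 value plus one.
            let number = match value {
                None => 0,
                Some(v) => v.checked_add(1)?,
            };
            return Some((number, pos + 1));
        }
        let digit = base62_digit(b)?;
        value = Some(value.unwrap_or(0).checked_mul(62)?.checked_add(digit)?);
        pos += 1;
    }
    // Ran out of input before the terminating `_`.
    None
}
```

Under these assumptions, `parse_indexed("s_", b's')` returns the number `0` with two bytes consumed. Because a missing disambiguator already stands for index `0`, a caller would add one to the decoded number to obtain the disambiguation index, so `s_` corresponds to index `1`, matching the closure example.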