Commit 82ddbbf

Add doc about context and scope
1 parent 83b924f commit 82ddbbf

1 file changed

+230
-0
lines changed

context-and-scope.md

Lines changed: 230 additions & 0 deletions
@@ -0,0 +1,230 @@
1+
Context and Scope
2+
---
3+
We currently have a very complicated way of booting the Puppet runtime, and once booted
4+
we have a complex and much overloaded implementation that revolves around:
5+
6+
* Settings
7+
* Node
8+
* Environment
9+
* Scope
10+
* Compiler
11+
* Parser / Evaluator
12+
* "bindings"
13+
* "known_resource_types"
14+
* "loading"
15+
* "functions"
16+
17+
We have been discussing how we can clean up the implementation with the following goals:
18+
19+
* make it easier to understand what is going on
20+
* make it possible to specify real APIs that are smaller than "everything"
21+
* simplify the complex bootstrapping, where things appear to be done multiple times (unclear sequencing)
22+
* reduce bloat (almost the same behavior implemented in multiple places)
23+
* decrease coupling through increased use of composition and dependency injection
24+
* make implementation of new features possible:
25+
  * real closures
26+
  * pass arguments in a more sane way to functions
27+
  * provide support for "private" resource types, classes and variables
28+
  * name-spaced functions and possibly name-spaced types
29+
  * restrict modules to only see what they specify in their dependencies
30+
31+
As you can see, there is a mix of internal goals ("reduce cost of maintenance", "increase speed
32+
of innovation", "reduce risk of introducing new bugs", "make it easier to understand and work with") and features that are actually wanted.
33+
34+
## Idea 1 - Context
35+
36+
A Context has already been introduced in the code base to solve a particular problem. The idea
37+
is to formalize and extend the capabilities around the Context.
38+
39+
The idea is that a Context represents the configuration of the runtime for the purpose of
40+
servicing a particular request. A very large percentage of the code base needs this context,
41+
and it is unreasonable to pass it around in every call everywhere, especially when a call passes through
42+
third-party implementations and comes out on the other side without a reference to the context.
43+
44+
Instead, there is always a reference to the current context in a thread-local variable.
45+
46+
A Context refers to its parent context, and is grounded at the RootContext (or BootContext, SystemContext or some-such name). As the system progresses in processing a particular request
47+
it creates a more specialized context parented by the current context, configures it, sets it as the current context, and then continues with the processing of the request.
48+
49+
At the edge, this could mean that there is a RootContext <- HttpContext when the system
50+
is performing a request that came in over HTTP. If the request originated from the command line,
51+
there may be some other kind of context. As the request continues to be serviced, say a request for
52+
a catalog - the logic is aware that a configuration is needed to deal with parsing, evaluation etc.
53+
An even more specialized context is then created, configured, set as current, and so on.
54+
55+
This creates a context stack where the top of the stack is always available via the thread-local
56+
variable. Logic that needs to know if the request being processed is within a particular context
57+
can query for it by asking for "context of type". (This is better than blindly calling methods
58+
on the current context because it enables catching system configuration errors).
59+
60+
Say, the logic needs to find the Compiler:
61+
62+
    compiler = context.get(Puppet::Context::CompilingContext).compiler
63+
64+
If there is no CompilingContext the runtime stops with a ConfigurationError.
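
The context stack and the "context of type" query could be sketched roughly as follows. This is only an illustration under assumptions: the class names, the `enter` helper and the `:sketch_context` key are made up for the example and are not existing Puppet API. Only the shape of the `get(...)` call is taken from the text.

```ruby
# Hypothetical sketch; none of these classes exist in Puppet under these names.
class ConfigurationError < StandardError; end

class Context
  attr_reader :parent

  def initialize(parent = nil)
    @parent = parent
  end

  # The innermost context of the running thread (nil before anything is entered).
  def self.current
    Thread.current.thread_variable_get(:sketch_context)
  end

  # Creates a context of this class parented by the current one, makes it
  # current while the block runs, then restores the previous context.
  def self.enter(*args)
    previous = current
    Thread.current.thread_variable_set(:sketch_context, new(previous, *args))
    yield current
  ensure
    Thread.current.thread_variable_set(:sketch_context, previous)
  end

  # Walks towards the root looking for a context of the requested type.
  def get(type)
    ctx = self
    ctx = ctx.parent until ctx.nil? || ctx.is_a?(type)
    raise ConfigurationError, "no #{type} in the context stack" if ctx.nil?
    ctx
  end
end

class RootContext < Context; end
class HttpContext < Context; end

class CompilingContext < Context
  attr_reader :compiler

  def initialize(parent, compiler)
    super(parent)
    @compiler = compiler
  end
end

# RootContext <- HttpContext <- CompilingContext; the symbol stands in for a compiler.
compiler = RootContext.enter do
  HttpContext.enter do
    CompilingContext.enter(:stand_in_compiler) do
      Context.current.get(CompilingContext).compiler
    end
  end
end
```

Asking for a `CompilingContext` from inside a bare `HttpContext` raises the `ConfigurationError` of this sketch, which is the point of querying by type instead of calling methods on whatever context happens to be current.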
65+
66+
This is the basic idea - there is more to say about the different kinds of contexts.
67+
68+
## Idea 2 - Scope
69+
70+
The current Puppet::Parser::Scope is a very overloaded implementation. Its major flaw is
71+
that the scope stack is internal to the scope and it changes over time. This makes it impossible
72+
to support a proper closure for lambdas (they may not be remembered and used later since the context
73+
they refer to may be/is lost).
74+
75+
The implementation itself is also quite convoluted with multiple ways of asking - "is set", "is
76+
it there", "if not where" etc. Scope is also to some degree responsible for the policy on what should
77+
happen if "not there", or "elsewhere".
78+
79+
Scope internally has an implementation of EphemeralScope that acts more like a traditional scope
80+
implementation. The idea is to break all of this logic apart.
81+
82+
Scope should be a very simple construct. There are subclasses of scope used for various purposes.
83+
84+
* `NamedScope` - a top-level scope that can be found by using its name as a reference. Its
85+
content is visible to anyone that finds it.
86+
* `InnerScope` - a scope that is parented by a named scope, it contains private variables
87+
only visible
88+
to the interior of the construct that created the NamedScope. This scope cannot be looked up
89+
externally, it is only visible while evaluating the logic in the named scope. When setting non
90+
private variables, they are set in the parent scope (the named scope).
91+
* `LocalScope` - a scope that cannot be looked up externally; all variables are by definition
92+
private (unless we also want private variables to be invisible for inner/nested scopes; languages
93+
typically do not work that way).
94+
* `MatchScope` - a scope that refers to a MatchData - the result of the last regexp match. (See below
95+
for more information).
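
How the first three kinds of scope relate could look something like the sketch below. Only the class names come from the list above. The constructor signatures, the options hash and the storage in a plain Hash are assumptions made for the example.

```ruby
# Hypothetical sketch; only the class names are taken from the text.
class Scope
  attr_reader :parent

  def initialize(parent = nil)
    @parent = parent
    @vars = {}
  end

  def set_variable(name, value, options = {})
    @vars[name] = value
  end

  def exist_locally?(name)
    @vars.key?(name)
  end

  # Looks in this scope first, then in the parents.
  def get(name)
    return @vars[name] if exist_locally?(name)
    parent && parent.get(name)
  end
end

# Found by name from the outside; holds the externally visible variables.
class NamedScope < Scope
  attr_reader :name

  def initialize(name, parent = nil)
    super(parent)
    @name = name
  end
end

# Holds the private variables of a named scope; everything else is
# stored in the parent NamedScope.
class InnerScope < Scope
  def set_variable(name, value, options = {})
    return super if options[:private]
    parent.set_variable(name, value, options)
  end
end

# Never looked up externally, so everything set here stays local.
class LocalScope < Scope; end

named = NamedScope.new('apache')
inner = InnerScope.new(named)
inner.set_variable('port', 80)                    # ends up in the NamedScope
inner.set_variable('secret', 'x', private: true)  # stays in the InnerScope
```

An external lookup that starts at `named` finds `port` but not `secret`, while logic evaluated in `inner` sees both.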
96+
97+
### Scope API
98+
99+
    # name String, value Object, options :private, :final
    set_variable(name, value, options)

    # key !String, value Object, options :private, :final
    set_data(key, value, options)

    # is key found in this or parents
    exist?(key_or_name)

    # is key found in this
    exist_locally?(key_or_name)

    # get value associated with 'key_or_name' or nil if not found
    get(key_or_name)

    # get entry (a ScopedObject) or nil if not found in this or parents
    get_entry(key_or_name)

    # Enumerator over local keys if no block is given
    each_local_key &block

    # Enumerator over local keys and parent keys if no block given
    each_key &block

    # MatchData or nil
    current_match()

    # Sets match data (the only mutable variable)
    current_match=(MatchData)

    # the current context
    context()

    # the Puppet Programming Language injector
    injector()
134+
135+
* MatchData variables are never included in the enumeration
136+
* String and Numeric keys are variable names; a number in string form and the corresponding
137+
number are automatically coerced to the same key
138+
* Handling of inherited scope remains to be designed (the inherited scope could be
139+
handled as a parent scope - i.e. the `NamedScope` if its private variables are not visible,
140+
and its `InnerScope` if they should be visible (probably not)).
141+
142+
#### ScopedObject
143+
144+
    private?()
    final?()
    value()
    key()
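
A minimal sketch of the lookup part of the Scope API and of `ScopedObject` might look like this. It makes several assumptions that the text does not settle: that `:final` means "cannot be set again in the same scope", that a violation raises `ArgumentError`, and that entries live in a Hash. Match data, `set_data`, `context` and `injector` are left out.

```ruby
# Hypothetical sketch; the storage and the error handling are assumptions.
ScopedObject = Struct.new(:key, :value, :is_private, :is_final) do
  def private?
    is_private ? true : false
  end

  def final?
    is_final ? true : false
  end
end

class Scope
  attr_reader :parent

  def initialize(parent = nil)
    @parent = parent
    @entries = {}
  end

  # options may contain :private and/or :final.
  def set_variable(name, value, options = {})
    key = normalize(name)
    existing = @entries[key]
    raise ArgumentError, "#{name} is final" if existing && existing.final?
    @entries[key] = ScopedObject.new(key, value, options[:private], options[:final])
    value
  end

  def exist_locally?(key_or_name)
    @entries.key?(normalize(key_or_name))
  end

  def exist?(key_or_name)
    !get_entry(key_or_name).nil?
  end

  # The entry from this scope, or from the nearest parent that has one.
  def get_entry(key_or_name)
    @entries[normalize(key_or_name)] || (parent && parent.get_entry(key_or_name))
  end

  def get(key_or_name)
    entry = get_entry(key_or_name)
    entry && entry.value
  end

  # Returns an Enumerator when no block is given.
  def each_local_key(&block)
    @entries.keys.each(&block)
  end

  private

  # "1" and 1 name the same variable; other names are left alone.
  def normalize(key_or_name)
    key_or_name.is_a?(String) && key_or_name =~ /\A\d+\z/ ? key_or_name.to_i : key_or_name
  end
end

top = Scope.new
top.set_variable('a', 1, final: true)
child = Scope.new(top)
child.set_variable('b', 2)
child.get('a')   # found via the parent
```

Keeping `get_entry` as the single place that walks the parents means `get` and `exist?` cannot disagree about where a variable is found.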
148+
149+
#### MatchData
150+
151+
The current implementation has a mechanism where a stack point into the ephemeral stack
152+
can be obtained and later used to reset the scope's internal scope stack. This is done
153+
to provide match data that is specific to certain nested structures (if, unless, case, selector),
154+
and to reset when the structure goes out of scope. The implementation has a problem in that
155+
a sequence of expressions like
156+
157+
    $a = 'x' =~ /.*/
    $b = 'x' =~ /.*/
    # ...
160+
161+
will push a new match scope for each match (since there is no defined end of the match
162+
scope).
163+
164+
This is because the logic sits deep down in the match operation, and does not know what to do
165+
except to push a new match scope on the internal stack (hoping that it is reset by someone else
166+
later). Naturally, this is one of the problems with supporting real closures.
167+
168+
What we want is that the next match in the same scope should override the match, not push
169+
a new scope.
170+
171+
We can implement this by making "match data" be the only mutable variable. In a NamedScope,
172+
the match data is set on the InnerScope. This guarantees that numeric match variables
173+
are invisible when doing an "external lookup". In all other scopes it is simply set.
174+
175+
This leaves one problem, the scopes created for if, unless, case, selector cannot be a
176+
`LocalScope` since if it sets variables, they should be set in the outer scope. The solution is
177+
simple: we just push a new `InnerScope` onto the scope stack at the start of evaluation of
178+
such an expression (in contrast to getting the now internal stack pointer and doing a reset).
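
The intended behavior of match data could be sketched like this. The sketch assumes that a scope without match data of its own falls back to its parent, which the text does not spell out, and the `match` helper is invented for the example.

```ruby
# Hypothetical sketch; the fallback to the parent's match data is an assumption.
class Scope
  attr_reader :parent
  attr_writer :current_match

  def initialize(parent = nil)
    @parent = parent
    @current_match = nil
  end

  # The nearest match data, looking outwards through the parents.
  def current_match
    @current_match || (parent && parent.current_match)
  end

  # Evaluates a regexp match and records the result in this scope,
  # replacing whatever was recorded here before (no new scope is pushed).
  def match(string, regexp)
    data = regexp.match(string)
    self.current_match = data
    !data.nil?
  end
end

outer = Scope.new
outer.match('abc', /a(.)c/)
outer.match('xyz', /x(.)z/)   # overrides the first match

inner = Scope.new(outer)      # pushed when evaluating an if/unless/case/selector
inner.match('123', /1(.)3/)   # only visible while inner is on the stack
```

Once `inner` is dropped, `outer` still holds the match from `'xyz'`, so nothing needs to be reset by hand.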
179+
180+
181+
## Back to Contexts
182+
183+
All Contexts support access to an Injector. This is the system injector which contains
184+
bindings for the runtime (as opposed to bindings for the Puppet Programming Language).
185+
186+
There is a Context implementation that supports compilation (there may also be intermediate contexts:
187+
one that knows how to parse code, and one that allows evaluation of code sans catalog/compilation).
188+
The details will be settled during implementation. The concepts that some implementation of
189+
Context needs to handle are:
190+
191+
* Loader - which loader to use if a specialized loader is not known (the global default loader).
192+
(More about loaders below).
193+
* (TBD) Finding/creating/configuring loaders for modules etc.
194+
* NamedScope lookup
195+
* Reference to Compiler
196+
197+
## Loaders
198+
199+
If we want modules to see only what they have declared dependencies on, the loading
200+
of code cannot simply use one and only one global loader (like it does now). Instead, each
201+
module should have its own resolved "module path" that together with higher level loaders (global,
202+
per environment, etc.) defines the visibility and loading scope.
203+
204+
This means that:
205+
206+
* whenever a "type" is needed it is requested from a loader
207+
* the loader caches what it has loaded
208+
* the visibility of what is loaded is composed
209+
* loading is constrained to look in a particular set of places
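
The four points above can be illustrated with a toy loader. Everything in it is hypothetical: the `Loader` class, the `load_type` method and the module names are made up, and "loading" is reduced to building a string. It also assumes that visibility is transitive through the loaders a module depends on.

```ruby
# Hypothetical sketch; not the loader API of Puppet.
class Loader
  def initialize(name, definitions, visible = [])
    @name = name
    @definitions = definitions  # what this loader can load itself: name => source
    @visible = visible          # the loaders this one is allowed to delegate to
    @cache = {}
  end

  # Returns the loaded entity, or nil if it is not visible from this loader.
  def load_type(type_name)
    return @cache[type_name] if @cache.key?(type_name)
    found = load_locally(type_name)
    @visible.each { |loader| found ||= loader.load_type(type_name) }
    @cache[type_name] = found
  end

  private

  def load_locally(type_name)
    source = @definitions[type_name]
    source && "#{source} (loaded by #{@name})"
  end
end

global = Loader.new('global', { 'notify' => 'builtin' })
stdlib = Loader.new('stdlib', { 'stdlib::thing' => 'stdlib/thing.pp' }, [global])
apache = Loader.new('apache', { 'apache::vhost' => 'apache/vhost.pp' }, [stdlib, global])
mysql  = Loader.new('mysql',  { 'mysql::db' => 'mysql/db.pp' }, [global])

apache.load_type('stdlib::thing')  # visible: apache declares a dependency on stdlib
mysql.load_type('stdlib::thing')   # nil: mysql has not declared that dependency
```

The composition is only a list of other loaders, which keeps the per-module "module path" explicit and easy to inspect.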
210+
211+
It also means that:
212+
213+
* We must know which loader loaded the code that is being evaluated
214+
215+
The last part is simple. When the loader loads something it uses a LoaderAdapter to decorate
216+
the root of what has been loaded with a reference to the loading loader. (For PuppetLogic in
217+
the "future parser" this is a Puppet::Pops::Model::Program, and all other expressions under it
218+
can navigate to this root node when the loader is required - the Program is usually only
219+
a few hops away, but could potentially be cached at lower levels in the tree in case excessive
220+
loading from a deeply nested expression is found to be a real world problem).
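
As a rough illustration of the decoration idea, assuming a model where every node knows its container: the `Node` class and the `adapt` and `loader_for` methods below are invented for the example and do not mirror the adapters in the Puppet code base.

```ruby
# Hypothetical sketch; the model node and the adapter methods are assumptions.
class Node
  attr_reader :label, :container

  def initialize(label, container = nil)
    @label = label
    @container = container
  end
end

# Associates a loader with a model object without changing the object itself.
class LoaderAdapter
  @adapted = {}.compare_by_identity

  def self.adapt(node, loader)
    @adapted[node] = loader
  end

  # Walks up the containers until a decorated node is found, which is
  # normally the root but could be a node closer by.
  def self.loader_for(node)
    node = node.container until node.nil? || @adapted.key?(node)
    node && @adapted[node]
  end
end

program = Node.new('Program')
block   = Node.new('BlockExpression', program)
call    = Node.new('CallExpression', block)

LoaderAdapter.adapt(program, :module_loader)  # done by the loader after loading
LoaderAdapter.loader_for(call)                # found two hops up, at the Program
```

Decorating an intermediate node with the same loader is all it would take to shorten the walk for deeply nested expressions.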
221+
222+
The loader domain relates to the book-keeping of "in which module is this logic" which is currently
223+
done with variables stored in the scope. This is also where "caller_module_name" is stored.
224+
225+
226+
## Further work
227+
228+
The relationship between Node, Class, Define and scope is also subject to entangled logic.
229+
It would be much clearer if they were real objects instead of a magic combination of a Resource
230+
and a Scope.
