From 2a01b7e002b3f413ab480204d4cb9d65cb5b2da8 Mon Sep 17 00:00:00 2001 From: Leo White Date: Thu, 10 Oct 2019 13:21:04 +0100 Subject: [PATCH 1/5] Add notes from previous meetings --- notes/meetings/2019-09-05/notes.txt | 122 ++++++++ notes/meetings/2019-09-16/notes.txt | 60 ++++ notes/meetings/2019-10-07/inferno-roadmap.txt | 127 ++++++++ notes/meetings/2019-10-07/low-road-notes.txt | 31 ++ notes/meetings/2019-10-07/notes.txt | 274 ++++++++++++++++++ notes/meetings/2019-10-07/participants.md | 40 +++ notes/meetings/2019-10-07/schedule.md | 26 ++ 7 files changed, 680 insertions(+) create mode 100644 notes/meetings/2019-09-05/notes.txt create mode 100644 notes/meetings/2019-09-16/notes.txt create mode 100644 notes/meetings/2019-10-07/inferno-roadmap.txt create mode 100644 notes/meetings/2019-10-07/low-road-notes.txt create mode 100644 notes/meetings/2019-10-07/notes.txt create mode 100644 notes/meetings/2019-10-07/participants.md create mode 100644 notes/meetings/2019-10-07/schedule.md diff --git a/notes/meetings/2019-09-05/notes.txt b/notes/meetings/2019-09-05/notes.txt new file mode 100644 index 0000000..4ce0317 --- /dev/null +++ b/notes/meetings/2019-09-05/notes.txt @@ -0,0 +1,122 @@ +Gabriel: my main proposal is, for the "rebuild the type-checker from +scratch" part, to create a repository on top of Inferno¹ that +implements a ML language (MidML, MoyenML?). We start from the nano-ML +example used in Inferno itself (as a typical example of Inferno client +program). We add important features from the OCaml type system, one +after the other, which requires adding new features to Inferno +itself -- as clear and general as possible to express the desired ML +type-inference features. This is useful to get a clean theory (of the +ML feature, of the inference support for it), looking at interactions +with other features, with a reference implementation that keeps +growing. And for some of those features we can publish papers. + +The notes below include at the end a list of features to which we +could think of during the discussion -- a decomposition of OCaml's +type system in smaller. + +¹ Inferno is François Pottier's type-inference library. I (Gabriel) +would recommend reading its implementation -- I did it from the +high-level layers to the low-level layers. It is hosted on François's +favorite open-source closed-development platform: +https://gitlab.inria.fr/fpottier/inferno . + +---- + +# Global approach + +We should do "incremental" and "from scratch" ("big bang") in +parallel, with communication between both. + + +Incrémental + +- Start from the TODO list and Thomas' and Leo's ideas + + Jacques: I'm less interested in that part. + + Gabriel: but we will ask you to review what people have done. + + Jacques: I was mostly speaking of the refactoring. Conceptual + changes (like adding scopes) are more interesting to me. + + Gabriel: I think sometimes conceptual changes emerge + from refactoring, when done carefully. + + +Big-bang + +- Use Inferno + MoyenML as an experimental vehicle to study the essence + of OCaml type-system features, layer by layer. + + How to bridge the gap between MoyenML-gone-big and OCaml? One + possibly, if required (it's not pretty), would be to replace the + lowest layers of Inferno (Union-Find, levels/generalization etc.) by + the OCaml type representation and its manipulation code + (atype/btype/ctype). + + (Didier: the interface of Inferno is nice, but in the implementation + I'm not convinced by the very-effectful approach of François. Maybe + we don't want that on the long term, it may not scale to more + advanced constraints.) + +Core language: what's an explicitly-typed representation that the +OCaml type-inference engine could produce as result? + + Jacques: about that, I was thinking of using CIC (Calculus of + Inductive Constructions), to represent the values of OCaml's + intermediate representation with dependent types. -- use Gallina as + intermediate representation. + + +## Feature map to add to Inferno + MidML + +### Term language + +- [data]: algebraic datatypes + + pattern-matching + +- type abbreviations + +- label overloading (constructors and fields) + (depends on [data]) + + Gabriel: http://gallium.inria.fr/~scherer/topics/delayed-constraints-for-disambiguation.txt + +- scoped type variables + +- first-class polymorphism + +- GADTs + (depends on [data]) + +- structural rows + (objects, polymorphic variants) + +- checking .mli + (the inclusion check between type declarations) + +- value restriction? + + partial generalization + + relaxed value restriction (variance) + +### Error report (entry later added by DR) + +- minimal error report needed + +- explore other solutions + + Nathanael proposal: report unification errors in terms of typing + constraints from the soruce. + +### Propagation of source typing annotations (later added by DR) + +- Preprocessing step? + +- During constraint generation? + +- Other: new form of constraints? + +### Modules + +- modules +- first-class modules + diff --git a/notes/meetings/2019-09-16/notes.txt b/notes/meetings/2019-09-16/notes.txt new file mode 100644 index 0000000..41d7321 --- /dev/null +++ b/notes/meetings/2019-09-16/notes.txt @@ -0,0 +1,60 @@ +Discussion of the "high road" (Xavier's term): +* rebuilding something like OCaml (possibly simpler, cleaner) + from scratch, starting from Inferno. + +Discussion of the "low road": +* refactoring the existing type-checker +* problem: the reviewing load will be high (especially for Jacques) +* Jacques says it's ok to discuss the high-level ideas, + but painful/impossible to review the code itself +* Xavier wants discussion on high-level ideas, more than one the code + +Didier asks whether the two roads are supposed to meet eventually. + +Jacques says that Thomas Refis is the one who is currently working most +on the "low road" (the type-checker). E.g. Thomas has clarified the notion +of "scope", etc. Also, Leo White has done some work on the type-checking +of objects and classes (not yet reviewed and merged; it was discussed in +March 2019 but the requested follow-up work was not done). + +François asks whether it would be worthwhile to work on strengthening +the test suite. +* Xavier mentions fuzzing; + it could be very useful if the type-checker contains assertions +* Florian & Sébastien had an Outreachy intern (Oxana) + who worked on checking the coverage of the test suite. + It would be good to continue and finalize this work. + (Gabriel had an Outreachy intern (Ulugbek) working on quickchecking + the compiler by randomly generating well-typed programs. Could also + be of interest.) + +Didier asks whether the code should be commented (either inside the code, +or in a separate file containing explanations). Damien notes that some PRs +begin with a long explanation. But it is not visible in a cloned repository! + +Sébastien asks where we should invest our energy and expertise. (Who should +work on what?) E.g. Sébastien might be interested in working on refactoring +the type-checker. + +What are our forces? +* Gabriel Radanne +* Leo White? +* Thomas Refis??? (will be doing a PhD with Didier, hence *scientific work*) +* Sébastien could have some time after March 2020 + +Someone mentions the technology that allows rapidly rebuilding all opam packages so as to test a change. Do we master it? Florian knows about "opam-check" (Damien's tool); works, but not yet parallel. (It should be!) Armaël & Gabriel have worked on another approach (also on top of opam) which is parallel. Still experimental, to be finished. And what about the OCaml Labs tools? Damien notes that many opam packages have broken dependencies, which complicates things. (Would it be easy to automatically check that no dependencies are missing in a package description?) (Damien notes that it would be useful to know which modules are installed by which packages.) + +Didier asks to what extent it is permitted to change the type system (intentionally). E.g. we might wish to change the treatment of first-class polymorphism. (Didier has the "high road" in mind.) There seems to be +consensus that in the high road it is permitted & desirable to do so. + +Some discussion of -principal, which is needed to infer principal types in the presence of first-class polymorphism, but is costly because it imposes copying instead of sharing. + +Sébastien notes the need to organize another meeting, including developers outside Inria, and to keep everyone in the loop. Need to do so soon, and decide (during that meeting) who will be working on each road. + +Some discussion of the idea of introducing one or more explicitly-typed intermediate languages. (Integrate the work by Pierrick Couderc?) It or they should be easy to type-check; the code of the checker should be close to a declarative specification. Jacques would like this language to be formally defined in Coq. Armaël notes that one could even have a type-checker for it implemented in Coq. (Analogous to what the F* people did?) Jacques and Xavier would like to do a proof of type soundness (preferably for a low-level calculus). A model of OCaml types in Iris could be interesting. Xavier wants types to have meaning. (This is the "high high road".) + +TODO: + Send a proposal for an open meeting. + Scheduled together with a regular developers meeting? + (Include Pierrick + Gabriel.) + Propose a 2-day meeting. Starting late, finishing early. Research + dev. diff --git a/notes/meetings/2019-10-07/inferno-roadmap.txt b/notes/meetings/2019-10-07/inferno-roadmap.txt new file mode 100644 index 0000000..365b759 --- /dev/null +++ b/notes/meetings/2019-10-07/inferno-roadmap.txt @@ -0,0 +1,127 @@ +An Inferno roadmap (also known as "the high road"). +I will also commit a version of this to the Inferno repository. +-- FP + +### Universal quantification in constraints + +### Type annotations + +This includes "scoped type variables", that is, the ability (or necessity?) +to explicitly bind type variables. + +It should be possible to bind a type variable either existentially or universally. + +As a side product, we get the ability of comparing an inferred type +with a declared type. + +### Algebraic data types and pattern-matching + +### GADTs + +Need to reason under local equations between types. +Need to detect ambivalent types. +In the presence of modules and abstract types, + need a notion of "compatibility" (an approximation of equality). + +### Type abbreviations + +Toplevel or local? +Recursive? +The subject of Carine Morel's recent internship. + +### Principality, sharing + +Understand how type information is shared internally and is propagated. +Related to GADTs, first-class polymorphism, type abbreviations. + +### Propagation of type information (synchronization) + +Can it be viewed as a preprocessing step? +Can it be performed at the same time as constraint generation? +Do we need new forms of constraints? + (somehow allowing feedback from constraint resolution + into constraint generation) +What is the role of type annotations in this setting? + +### Overloading of data constructors and field labels + +See Gabriel's proposal: + http://gallium.inria.fr/~scherer/topics/delayed-constraints-for-disambiguation.txt +Related to implicit arguments. +Related to name resolution. +Related to synchronization. + +### Name resolution + +Make it easier for tools (smart editor, refactoring tool) + to understand the binding structure of the code. + +### First-class polymorphism + +For records and objects. Do we want to redesign this feature? + Can we avoid the syntactic overhead of boxing them in a record? + +### Rows + +With applications to objects and polymorphic variants and effects. + (Currently, objects and variants use local constraints.) + +### Core language subtyping + +Implicit, explicit, what form? + Are coercions erased? (yes) +Related to rows. + +Implicit (pervasive) subtyping would require bi-unification (Stephen's algorithm). + +### Value restriction + +The relaxed value restriction, based on variance. +Also, see Jacques's more recent proposal? + +### Basic support for linear types, regions, borrows + +Based on recent work by Gabriel Radanne and Peter Thiemann. + +### Type error reports + +Can we produce error reports that are minimal in a certain sense? + What is the algorithmic cost? + +Can we report multiple errors? + +### Incrementality, type-inferencer API + +Can we support incremental type inference? +Can we avoid global state in the implementation? + or avoid side effects altogether? +Should we expose an abstract type of type-inferencer states? +Recomputing as little as possible when the source code changes. + +The "toplevel" (REPL) interface. The debugger. + How do they interact with the type-checker? + +### Modules + +Including first-class modules. +Should the module language and the core language be separate? + +### Implicit arguments + +### Labeled and optional arguments + +Needs a form of multi-application. +Related to synchronization (propagation). + +### Classes + +What kind of support does this require in Inferno? + +### Kinds + +Do we need some kind of support for kinds? +How would it impact the solver's API? +Type application in the syntax of kinds? + How are equations between type applications solved? +Subkinding? +Should type variables and type constructors be a single category? diff --git a/notes/meetings/2019-10-07/low-road-notes.txt b/notes/meetings/2019-10-07/low-road-notes.txt new file mode 100644 index 0000000..54ee7de --- /dev/null +++ b/notes/meetings/2019-10-07/low-road-notes.txt @@ -0,0 +1,31 @@ +Gabriel Radanne will start doing more code reviews now. + +Reviews: hmm. + + It would be nice if all Inferno-working people could help on code + reviews, not only within their subgroups. + + (the idea of in-person presentations for big PRs: Leo is not in + favor, unconvinced by the idea of travelling to Paris so often (too + many big PRs?)) + +Low road: the point is that making the code easier to understand + is a compiler-distribution feature by itself. + +Leo: larger rewrites should get RFCs just as big high-road features. + +Testing? + - Leo: Jane Street will keep building a branch of their codebase in trunk + - opamcheck: if we suspect there is a regression risk, we try to convince + Florian to run opamcheck on it + +High-road testing? Can we collect important examples of tricky OCaml +code that relies on sharing in deep ways? + +Pierre Chambart: if we had a shared typed IR it would be easier for testing. + +Testing: collect examples of things that do not work in the testsuite, +corresponding to known defects. "known-to-break" testsuite directory? + + Policy: Everytime there is a new bug report with a small repro-test, + it should go into this known-defects directory. diff --git a/notes/meetings/2019-10-07/notes.txt b/notes/meetings/2019-10-07/notes.txt new file mode 100644 index 0000000..d36d319 --- /dev/null +++ b/notes/meetings/2019-10-07/notes.txt @@ -0,0 +1,274 @@ +Redesign of the typechecker, +Monday October 7, in Paris +Minutes of the meeting + +### Participants + +See the file participants.md, + +Gabriel Scherer (GS), François Pottier (FP), Didier Remy (DR), Jacques +Garrigue (JG), Thomas Refis (RF), Xavier Leroy (XL), Leo White (LW), Armaël +Guéneau (AG), Florian Angeletti (FA), Alain Frisch (AF), Nicolás Ojeda Bär +(NOB), Pierrick Couderc PC), Frédéric Bour (FB), Gabriel Radanne (GR), +Damien Doligez (DD), Anil Madhavapeddy (AM). + + +### Why do we want to do this? + +This should be kept in mind while doing the redesign! + +# Robustness + +# For easier maintenance + +# To be more suitable for extensions + +- Including immediate extensions in mind: implicits and algebraic effects +- But also future extensions + +# More precise specification + +- including a reference implementation + +# Formalization of the kernel + +- On paper, or perhaps in Coq + +# Tooling + +- Including better support for errors + +# Better performance + +- Typechecker is actually slow (for the module system) +- There might be opportunities for optimizations +- Which are hard to do in the current state of the implementation + + +### Risks + +AF: is worried about backward compatibility, and ask whether this +should be the same language or another language. + +DD: there is a continuum between the two extremes: exactly the same language +or another language. + +LW: keeping it exactly the same would prevent us from fixing bugs. + +Someone mentions getting rid of the object system altogether... + +XL: A quote from Guy Steele: when the community wanted to simplify + JavaScript, they observed that it had already too many users. + + XL would not like OCaml to turn into JavaScript. + +XL thinks that OCaml has too many bug reports. If it were to be redone, some + features would not be there (e.g. recursive modules) or would have been + done otherwise. So he supports a clean up during the redesign. + +In conclusion, we somehow all wish it to be the next version of the same +language, and not another language, but for the majority it is clear that we +will also do some changes and that it is too early to say which ones. + +We discuss the time scale. DR suggests a target of 3 years. Even though we +understand that it could take longer to make the change more continuous for +users. AM reminds that for Multicore OCaml it has been 7 years, where most +of the ideas and design was done in the first 3 years. He recommends that we +do not communicate on the redesign of the typechecker in the community, so +as not to create expectations. + +### How to work together + +# GitHub + +LW proposes to use git for everything in the model of RFC (see for example +the discussed on unboxed types), where a proposal can be described and +commented. + +For more research-like topics, we could also open sub-directories and push +drafts of the research project and updates them at significant steps. + +There is an agreement that GitHub is a good tool to keep each other informed +of the actions going on and of their progress. + +We should do this in a private branch, since we do not want to communicate +publicly on the redesign of the typechecker... + +# Meetings + +How often should we plan meetings? This is unclear: we are probably going to +work and perhaps meet in small groups around specific precise tasks. Hence, +we would not need to meet altogether too often, perhaps twice a year to keep +us informed of the advances of the different projects. + +We could also use video conf, especially for some small group meetings. + +We should plan meeting more longer ahead (2 months) to make it easier to +find a slot that is convenient for almost all of us. We are invited to host +one of the meetings in London. + + +### Typed Intermediate Language + +We all agree on the need for a Surface Typed Intermediate Language (S-TIL) +that is closed to the Source language and the output of type inference. + +There should also be a smaller Kernel Typed Intermediate Language, but it +is unclear at this point what how far from the surface it should be: + + - It should at least remove all syntactic sugar. + + - But it should not complicate the type system just for the sake of + reducing the size of kernel. + +We should be aware of over previous experiments that ended up in using Coq +as the internal language to be able to encode everything else (cf. Zhong +Shao). Is this what we want? He encourages us to look at previous works. + +JG mentions that perhaps a compilation to Coq could be used +as a way to prove type soundness. + +Xavier wants to avoid type preserving compilation, which is difficult---even +for inlining it's not easy. + +The goal of having a Typed Intermediate Language is not for compiling but +for having a smaller language to formalize (soundness proof). We may then +remove types (or move to simpler types) for the rest of the compilation. + + +### High-level roadmap + +This describes the informal discussion we had before lunch, See the list of +high-level tasks below that we agreed on after lunch. + +This discussion was a little disorganized and is difficult to summarize. +The main points are the following. + +# Concurrency (later called synchronization) + +- Inference will have to be "concurrent" for implicits, because of the + interaction between the core level and the module level. + Just for HO unification, ie. pattern unification, this is the way it + works. + +- FP reminds that the constraints will need a clear specification. + +- Type inference is also needed for disambiguation + +- It is unclear that we could separate constraint generation and constraint + resolution, because of name disambiguation. + +# principal mode? + +- In OCaml, type inference is not principal, even in principal mode! + It is only maximal, i.e. the typechecker never take an arbitrary step. + +- There is a form of "local type inference" (e.g. for labeled arguments) + +# GADTs + +- Nested patterns with GADTs are quite tricky. (Later, at lunch, we could + not even agree on whether compiling GADT with nested patterns + into flat patterns would preserve well-typedness) + +# Difficult parts are + +- sharing + +- propagation of annotations/types + + + +### High-level tasks + +# S-TIL (Surface Typed Intermediate Language) + +Participants: DR, JG, TR. + +==> Looking for a leader + +(We forgot to come back at this task at the end of the discussion...) + +- One argument was whether we could define S-TIL now or have to wait for + advances on other tasks... + + +# Sharing + +Participants: DR (leading), DP, GR, JG, LW + +- This refers to all forms of sharing in types: type abbreviations, sharing + for object types, sharing for polytypes, type equalities for module + types. + +- Subtopics are: principality, polytypes. + + +# Synchronization + +Participants: DR (leading), DP, GR, JG, LW, GS (name resolution) + +- This refers to the propagation of information learned during type + inference: propagation of source type annotations, but also using the type + inferred for a function to help type the arguments or propagate the type a + a let-binding as we would propagate its type annotation if it had one. + +- This will introduce a form of synchronization during constraint solving. + +- We also need to treat constraints concurrently, to solve them in an order + that cannot be determined in advance. + +- Subtopics are also: principality, polytypes. + +- Name resolution should fit there as well. + + +# Integration between Core & Module languages + +Participants: GR (leading), DR, JG, LW (type inference only) + +Two subtopics that can be looked at separately are + + - type inference + + - type checking + + +# Tooling & Incrementality + +Participants: GR (leading), FB, TR, FA, FP, DR, GS + +- Type errors + +- Keeping enough information in the S-TIL for better tooling + +- A lot of engineering here + + + Isolating the sate of the typechecker (i.e. we should be able to run + multiple instances of the typechecker) + + + toplevel interface + + + loading of global modules + + + debugging tools + + +# Rows, Subtyping & Effects: LW (leading), DR, FP, JG + + +# GADTs: JG (leading), DR, FP + + +# Labeled arguments: JG (leading), DR, LW + + +# Classes & Objects: JG, LW, DR + +==> Who is leading? JG? + +# Kind system: DR, FP, GR, LW, SD + +- Including subkinding + +==> Who is leading? diff --git a/notes/meetings/2019-10-07/participants.md b/notes/meetings/2019-10-07/participants.md new file mode 100644 index 0000000..0c0c96c --- /dev/null +++ b/notes/meetings/2019-10-07/participants.md @@ -0,0 +1,40 @@ +List of participants + +* Sébastien, with vegetarian lunch + +* Gabriel Scherer, skipping lunch. + +* François Pottier, present at lunch (no requirements). + I will have to attend another meeting between 10am and 11:30am, + but otherwise will be available. + +* Didier Remy, present at lunch if at Il goto (or with a no sodium option) + +* Jacques Garrigue, with lunch + +* Thomas Refis, present at lunch if at Il goto (vegetarian options have been + disappointing at other places) + +* Xavier Leroy, with lunch (no strong requirements). + +* Leo White, with lunch + +* Armaël Guéneau, with lunch (preference for vegetarian lunch unless at + The Touareg) + +* Florian Angeletti, with lunch (no requirements) + +* Alain Frisch, with lunch (preference for vegetarian lunch unless at + The Touareg) + +* Nicolás Ojeda Bär, with lunch (no strong requirements). + +* Pierrick Couderc, with lunch (no requirements) + +* Frédéric Bour, without lunch + +* Gabriel Radanne, with lunch (no requirements) + +* Damien Doligez, with lunch (no requirements) + +* Anil Madhavapeddy, with lunch (no requirements) diff --git a/notes/meetings/2019-10-07/schedule.md b/notes/meetings/2019-10-07/schedule.md new file mode 100644 index 0000000..6c81a69 --- /dev/null +++ b/notes/meetings/2019-10-07/schedule.md @@ -0,0 +1,26 @@ +Tentative schedule for this research meeting + +Suggestions by Jacques: +* discuss the framework for changes in the typechecker / type system, particularly whether to go for a low road / high road approach (see notes of previous meeting) +* technical discussions on the type system and its implementation + +Suggestions by François: + +* discuss the low road (trying to set up a concrete lists of things + to do and people who can do them) + +* discuss the high road (with similar aims) + +* discuss the idea of introducing one or more explicitly typed + languages (equipped with reference type-checkers that perform + no type inference) + +Suggestions by Leo: + +* Discuss the aims of the rewriting project + +* Discuss how we want to collaborate on the typechecker, i.e. how should we communicate and how should we make decisions + +* Discuss how we are going to track/report progress on the various lines of work + +* Try to write down some kind of plan/schedule of work From 7a77d88d1efd6d1479db58c3a3d31a1e6c63ea34 Mon Sep 17 00:00:00 2001 From: Leo White Date: Thu, 24 Oct 2019 10:40:57 +0100 Subject: [PATCH 2/5] Split out the research parts into their own repo --- Readme.md | 42 +-- designs/Readme.md | 3 - notes/Readme.md | 4 - notes/meetings/2019-09-05/notes.txt | 122 -------- notes/meetings/2019-09-16/notes.txt | 60 ---- notes/meetings/2019-10-07/inferno-roadmap.txt | 127 -------- notes/meetings/2019-10-07/low-road-notes.txt | 31 -- notes/meetings/2019-10-07/notes.txt | 274 ------------------ notes/meetings/2019-10-07/participants.md | 40 --- notes/meetings/2019-10-07/schedule.md | 26 -- research/Readme.md | 4 - rfcs/Readme.md | 2 +- 12 files changed, 9 insertions(+), 726 deletions(-) delete mode 100644 designs/Readme.md delete mode 100644 notes/Readme.md delete mode 100644 notes/meetings/2019-09-05/notes.txt delete mode 100644 notes/meetings/2019-09-16/notes.txt delete mode 100644 notes/meetings/2019-10-07/inferno-roadmap.txt delete mode 100644 notes/meetings/2019-10-07/low-road-notes.txt delete mode 100644 notes/meetings/2019-10-07/notes.txt delete mode 100644 notes/meetings/2019-10-07/participants.md delete mode 100644 notes/meetings/2019-10-07/schedule.md delete mode 100644 research/Readme.md diff --git a/Readme.md b/Readme.md index 029dff9..4a21712 100644 --- a/Readme.md +++ b/Readme.md @@ -1,19 +1,16 @@ -# OCaml RFCs, notes and paper drafts +# OCaml RFCs -This repository is for work collaborating on the OCaml -type-checker. It is split into four folders: +This repository is for proposals to change the OCaml language +or the internals of its compiler. It currently has a single folder: - - rfcs - - designs - - research - - notes +- rfcs ## RFCs -The rfcs folder is for accepted proposals for changes to the language -or (existing) type-checker. Files should be created in this directory -via pull requests, and the merging of those pull requests should only -happen once there is consensus to accept the change in principle. +The rfcs folder is for accepted proposals for changes to the +language. Files should be created in this directory via pull requests, +and the merging of those pull requests should only happen once there +is consensus to accept the change in principle. We'll adjust and adapt the process as we go, but as a starting point RFCs should provide: @@ -27,26 +24,3 @@ RFCs should provide: Then we can discuss the proposal in the comments of the pull request and, once consensus is reached, either merge or close the pull request. - -## Designs - -The designs folder is intended for design documents related to the -"high road" type-checker. Such documents can be added and evolved via -pull request to enable people to easily collaborate and discuss these -designs. - -Merging a document to this directory has no particular meaning so -people can make up their own minds about when to merge such pull -requests. - -## Research - -The research folder is for draft papers and other research materials -(e.g. formalisations) related to the "high road" type-checker. As with -the designs directory, the aim of this folder is to aid collaboration -and people should feel free to use the folder as they see fit. - -## Notes - -The notes folder is for meeting minutes, reports, and other -administrative notes related to working on the type-checker. diff --git a/designs/Readme.md b/designs/Readme.md deleted file mode 100644 index 9e821ff..0000000 --- a/designs/Readme.md +++ /dev/null @@ -1,3 +0,0 @@ -# Designs - -This folder is for design documents for the "high road" type-checker diff --git a/notes/Readme.md b/notes/Readme.md deleted file mode 100644 index 6ad4f2d..0000000 --- a/notes/Readme.md +++ /dev/null @@ -1,4 +0,0 @@ -# Notes - -This directory is for agenda, minutes, notes and reports related to -the type-checker. diff --git a/notes/meetings/2019-09-05/notes.txt b/notes/meetings/2019-09-05/notes.txt deleted file mode 100644 index 4ce0317..0000000 --- a/notes/meetings/2019-09-05/notes.txt +++ /dev/null @@ -1,122 +0,0 @@ -Gabriel: my main proposal is, for the "rebuild the type-checker from -scratch" part, to create a repository on top of Inferno¹ that -implements a ML language (MidML, MoyenML?). We start from the nano-ML -example used in Inferno itself (as a typical example of Inferno client -program). We add important features from the OCaml type system, one -after the other, which requires adding new features to Inferno -itself -- as clear and general as possible to express the desired ML -type-inference features. This is useful to get a clean theory (of the -ML feature, of the inference support for it), looking at interactions -with other features, with a reference implementation that keeps -growing. And for some of those features we can publish papers. - -The notes below include at the end a list of features to which we -could think of during the discussion -- a decomposition of OCaml's -type system in smaller. - -¹ Inferno is François Pottier's type-inference library. I (Gabriel) -would recommend reading its implementation -- I did it from the -high-level layers to the low-level layers. It is hosted on François's -favorite open-source closed-development platform: -https://gitlab.inria.fr/fpottier/inferno . - ----- - -# Global approach - -We should do "incremental" and "from scratch" ("big bang") in -parallel, with communication between both. - - -Incrémental - -- Start from the TODO list and Thomas' and Leo's ideas - - Jacques: I'm less interested in that part. - - Gabriel: but we will ask you to review what people have done. - - Jacques: I was mostly speaking of the refactoring. Conceptual - changes (like adding scopes) are more interesting to me. - - Gabriel: I think sometimes conceptual changes emerge - from refactoring, when done carefully. - - -Big-bang - -- Use Inferno + MoyenML as an experimental vehicle to study the essence - of OCaml type-system features, layer by layer. - - How to bridge the gap between MoyenML-gone-big and OCaml? One - possibly, if required (it's not pretty), would be to replace the - lowest layers of Inferno (Union-Find, levels/generalization etc.) by - the OCaml type representation and its manipulation code - (atype/btype/ctype). - - (Didier: the interface of Inferno is nice, but in the implementation - I'm not convinced by the very-effectful approach of François. Maybe - we don't want that on the long term, it may not scale to more - advanced constraints.) - -Core language: what's an explicitly-typed representation that the -OCaml type-inference engine could produce as result? - - Jacques: about that, I was thinking of using CIC (Calculus of - Inductive Constructions), to represent the values of OCaml's - intermediate representation with dependent types. -- use Gallina as - intermediate representation. - - -## Feature map to add to Inferno + MidML - -### Term language - -- [data]: algebraic datatypes - + pattern-matching - -- type abbreviations - -- label overloading (constructors and fields) - (depends on [data]) - - Gabriel: http://gallium.inria.fr/~scherer/topics/delayed-constraints-for-disambiguation.txt - -- scoped type variables - -- first-class polymorphism - -- GADTs - (depends on [data]) - -- structural rows - (objects, polymorphic variants) - -- checking .mli - (the inclusion check between type declarations) - -- value restriction? - + partial generalization - + relaxed value restriction (variance) - -### Error report (entry later added by DR) - -- minimal error report needed - -- explore other solutions - + Nathanael proposal: report unification errors in terms of typing - constraints from the soruce. - -### Propagation of source typing annotations (later added by DR) - -- Preprocessing step? - -- During constraint generation? - -- Other: new form of constraints? - -### Modules - -- modules -- first-class modules - diff --git a/notes/meetings/2019-09-16/notes.txt b/notes/meetings/2019-09-16/notes.txt deleted file mode 100644 index 41d7321..0000000 --- a/notes/meetings/2019-09-16/notes.txt +++ /dev/null @@ -1,60 +0,0 @@ -Discussion of the "high road" (Xavier's term): -* rebuilding something like OCaml (possibly simpler, cleaner) - from scratch, starting from Inferno. - -Discussion of the "low road": -* refactoring the existing type-checker -* problem: the reviewing load will be high (especially for Jacques) -* Jacques says it's ok to discuss the high-level ideas, - but painful/impossible to review the code itself -* Xavier wants discussion on high-level ideas, more than one the code - -Didier asks whether the two roads are supposed to meet eventually. - -Jacques says that Thomas Refis is the one who is currently working most -on the "low road" (the type-checker). E.g. Thomas has clarified the notion -of "scope", etc. Also, Leo White has done some work on the type-checking -of objects and classes (not yet reviewed and merged; it was discussed in -March 2019 but the requested follow-up work was not done). - -François asks whether it would be worthwhile to work on strengthening -the test suite. -* Xavier mentions fuzzing; - it could be very useful if the type-checker contains assertions -* Florian & Sébastien had an Outreachy intern (Oxana) - who worked on checking the coverage of the test suite. - It would be good to continue and finalize this work. - (Gabriel had an Outreachy intern (Ulugbek) working on quickchecking - the compiler by randomly generating well-typed programs. Could also - be of interest.) - -Didier asks whether the code should be commented (either inside the code, -or in a separate file containing explanations). Damien notes that some PRs -begin with a long explanation. But it is not visible in a cloned repository! - -Sébastien asks where we should invest our energy and expertise. (Who should -work on what?) E.g. Sébastien might be interested in working on refactoring -the type-checker. - -What are our forces? -* Gabriel Radanne -* Leo White? -* Thomas Refis??? (will be doing a PhD with Didier, hence *scientific work*) -* Sébastien could have some time after March 2020 - -Someone mentions the technology that allows rapidly rebuilding all opam packages so as to test a change. Do we master it? Florian knows about "opam-check" (Damien's tool); works, but not yet parallel. (It should be!) Armaël & Gabriel have worked on another approach (also on top of opam) which is parallel. Still experimental, to be finished. And what about the OCaml Labs tools? Damien notes that many opam packages have broken dependencies, which complicates things. (Would it be easy to automatically check that no dependencies are missing in a package description?) (Damien notes that it would be useful to know which modules are installed by which packages.) - -Didier asks to what extent it is permitted to change the type system (intentionally). E.g. we might wish to change the treatment of first-class polymorphism. (Didier has the "high road" in mind.) There seems to be -consensus that in the high road it is permitted & desirable to do so. - -Some discussion of -principal, which is needed to infer principal types in the presence of first-class polymorphism, but is costly because it imposes copying instead of sharing. - -Sébastien notes the need to organize another meeting, including developers outside Inria, and to keep everyone in the loop. Need to do so soon, and decide (during that meeting) who will be working on each road. - -Some discussion of the idea of introducing one or more explicitly-typed intermediate languages. (Integrate the work by Pierrick Couderc?) It or they should be easy to type-check; the code of the checker should be close to a declarative specification. Jacques would like this language to be formally defined in Coq. Armaël notes that one could even have a type-checker for it implemented in Coq. (Analogous to what the F* people did?) Jacques and Xavier would like to do a proof of type soundness (preferably for a low-level calculus). A model of OCaml types in Iris could be interesting. Xavier wants types to have meaning. (This is the "high high road".) - -TODO: - Send a proposal for an open meeting. - Scheduled together with a regular developers meeting? - (Include Pierrick + Gabriel.) - Propose a 2-day meeting. Starting late, finishing early. Research + dev. diff --git a/notes/meetings/2019-10-07/inferno-roadmap.txt b/notes/meetings/2019-10-07/inferno-roadmap.txt deleted file mode 100644 index 365b759..0000000 --- a/notes/meetings/2019-10-07/inferno-roadmap.txt +++ /dev/null @@ -1,127 +0,0 @@ -An Inferno roadmap (also known as "the high road"). -I will also commit a version of this to the Inferno repository. --- FP - -### Universal quantification in constraints - -### Type annotations - -This includes "scoped type variables", that is, the ability (or necessity?) -to explicitly bind type variables. - -It should be possible to bind a type variable either existentially or universally. - -As a side product, we get the ability of comparing an inferred type -with a declared type. - -### Algebraic data types and pattern-matching - -### GADTs - -Need to reason under local equations between types. -Need to detect ambivalent types. -In the presence of modules and abstract types, - need a notion of "compatibility" (an approximation of equality). - -### Type abbreviations - -Toplevel or local? -Recursive? -The subject of Carine Morel's recent internship. - -### Principality, sharing - -Understand how type information is shared internally and is propagated. -Related to GADTs, first-class polymorphism, type abbreviations. - -### Propagation of type information (synchronization) - -Can it be viewed as a preprocessing step? -Can it be performed at the same time as constraint generation? -Do we need new forms of constraints? - (somehow allowing feedback from constraint resolution - into constraint generation) -What is the role of type annotations in this setting? - -### Overloading of data constructors and field labels - -See Gabriel's proposal: - http://gallium.inria.fr/~scherer/topics/delayed-constraints-for-disambiguation.txt -Related to implicit arguments. -Related to name resolution. -Related to synchronization. - -### Name resolution - -Make it easier for tools (smart editor, refactoring tool) - to understand the binding structure of the code. - -### First-class polymorphism - -For records and objects. Do we want to redesign this feature? - Can we avoid the syntactic overhead of boxing them in a record? - -### Rows - -With applications to objects and polymorphic variants and effects. - (Currently, objects and variants use local constraints.) - -### Core language subtyping - -Implicit, explicit, what form? - Are coercions erased? (yes) -Related to rows. - -Implicit (pervasive) subtyping would require bi-unification (Stephen's algorithm). - -### Value restriction - -The relaxed value restriction, based on variance. -Also, see Jacques's more recent proposal? - -### Basic support for linear types, regions, borrows - -Based on recent work by Gabriel Radanne and Peter Thiemann. - -### Type error reports - -Can we produce error reports that are minimal in a certain sense? - What is the algorithmic cost? - -Can we report multiple errors? - -### Incrementality, type-inferencer API - -Can we support incremental type inference? -Can we avoid global state in the implementation? - or avoid side effects altogether? -Should we expose an abstract type of type-inferencer states? -Recomputing as little as possible when the source code changes. - -The "toplevel" (REPL) interface. The debugger. - How do they interact with the type-checker? - -### Modules - -Including first-class modules. -Should the module language and the core language be separate? - -### Implicit arguments - -### Labeled and optional arguments - -Needs a form of multi-application. -Related to synchronization (propagation). - -### Classes - -What kind of support does this require in Inferno? - -### Kinds - -Do we need some kind of support for kinds? -How would it impact the solver's API? -Type application in the syntax of kinds? - How are equations between type applications solved? -Subkinding? -Should type variables and type constructors be a single category? diff --git a/notes/meetings/2019-10-07/low-road-notes.txt b/notes/meetings/2019-10-07/low-road-notes.txt deleted file mode 100644 index 54ee7de..0000000 --- a/notes/meetings/2019-10-07/low-road-notes.txt +++ /dev/null @@ -1,31 +0,0 @@ -Gabriel Radanne will start doing more code reviews now. - -Reviews: hmm. - - It would be nice if all Inferno-working people could help on code - reviews, not only within their subgroups. - - (the idea of in-person presentations for big PRs: Leo is not in - favor, unconvinced by the idea of travelling to Paris so often (too - many big PRs?)) - -Low road: the point is that making the code easier to understand - is a compiler-distribution feature by itself. - -Leo: larger rewrites should get RFCs just as big high-road features. - -Testing? - - Leo: Jane Street will keep building a branch of their codebase in trunk - - opamcheck: if we suspect there is a regression risk, we try to convince - Florian to run opamcheck on it - -High-road testing? Can we collect important examples of tricky OCaml -code that relies on sharing in deep ways? - -Pierre Chambart: if we had a shared typed IR it would be easier for testing. - -Testing: collect examples of things that do not work in the testsuite, -corresponding to known defects. "known-to-break" testsuite directory? - - Policy: Everytime there is a new bug report with a small repro-test, - it should go into this known-defects directory. diff --git a/notes/meetings/2019-10-07/notes.txt b/notes/meetings/2019-10-07/notes.txt deleted file mode 100644 index d36d319..0000000 --- a/notes/meetings/2019-10-07/notes.txt +++ /dev/null @@ -1,274 +0,0 @@ -Redesign of the typechecker, -Monday October 7, in Paris -Minutes of the meeting - -### Participants - -See the file participants.md, - -Gabriel Scherer (GS), François Pottier (FP), Didier Remy (DR), Jacques -Garrigue (JG), Thomas Refis (RF), Xavier Leroy (XL), Leo White (LW), Armaël -Guéneau (AG), Florian Angeletti (FA), Alain Frisch (AF), Nicolás Ojeda Bär -(NOB), Pierrick Couderc PC), Frédéric Bour (FB), Gabriel Radanne (GR), -Damien Doligez (DD), Anil Madhavapeddy (AM). - - -### Why do we want to do this? - -This should be kept in mind while doing the redesign! - -# Robustness - -# For easier maintenance - -# To be more suitable for extensions - -- Including immediate extensions in mind: implicits and algebraic effects -- But also future extensions - -# More precise specification - -- including a reference implementation - -# Formalization of the kernel - -- On paper, or perhaps in Coq - -# Tooling - -- Including better support for errors - -# Better performance - -- Typechecker is actually slow (for the module system) -- There might be opportunities for optimizations -- Which are hard to do in the current state of the implementation - - -### Risks - -AF: is worried about backward compatibility, and ask whether this -should be the same language or another language. - -DD: there is a continuum between the two extremes: exactly the same language -or another language. - -LW: keeping it exactly the same would prevent us from fixing bugs. - -Someone mentions getting rid of the object system altogether... - -XL: A quote from Guy Steele: when the community wanted to simplify - JavaScript, they observed that it had already too many users. - - XL would not like OCaml to turn into JavaScript. - -XL thinks that OCaml has too many bug reports. If it were to be redone, some - features would not be there (e.g. recursive modules) or would have been - done otherwise. So he supports a clean up during the redesign. - -In conclusion, we somehow all wish it to be the next version of the same -language, and not another language, but for the majority it is clear that we -will also do some changes and that it is too early to say which ones. - -We discuss the time scale. DR suggests a target of 3 years. Even though we -understand that it could take longer to make the change more continuous for -users. AM reminds that for Multicore OCaml it has been 7 years, where most -of the ideas and design was done in the first 3 years. He recommends that we -do not communicate on the redesign of the typechecker in the community, so -as not to create expectations. - -### How to work together - -# GitHub - -LW proposes to use git for everything in the model of RFC (see for example -the discussed on unboxed types), where a proposal can be described and -commented. - -For more research-like topics, we could also open sub-directories and push -drafts of the research project and updates them at significant steps. - -There is an agreement that GitHub is a good tool to keep each other informed -of the actions going on and of their progress. - -We should do this in a private branch, since we do not want to communicate -publicly on the redesign of the typechecker... - -# Meetings - -How often should we plan meetings? This is unclear: we are probably going to -work and perhaps meet in small groups around specific precise tasks. Hence, -we would not need to meet altogether too often, perhaps twice a year to keep -us informed of the advances of the different projects. - -We could also use video conf, especially for some small group meetings. - -We should plan meeting more longer ahead (2 months) to make it easier to -find a slot that is convenient for almost all of us. We are invited to host -one of the meetings in London. - - -### Typed Intermediate Language - -We all agree on the need for a Surface Typed Intermediate Language (S-TIL) -that is closed to the Source language and the output of type inference. - -There should also be a smaller Kernel Typed Intermediate Language, but it -is unclear at this point what how far from the surface it should be: - - - It should at least remove all syntactic sugar. - - - But it should not complicate the type system just for the sake of - reducing the size of kernel. - -We should be aware of over previous experiments that ended up in using Coq -as the internal language to be able to encode everything else (cf. Zhong -Shao). Is this what we want? He encourages us to look at previous works. - -JG mentions that perhaps a compilation to Coq could be used -as a way to prove type soundness. - -Xavier wants to avoid type preserving compilation, which is difficult---even -for inlining it's not easy. - -The goal of having a Typed Intermediate Language is not for compiling but -for having a smaller language to formalize (soundness proof). We may then -remove types (or move to simpler types) for the rest of the compilation. - - -### High-level roadmap - -This describes the informal discussion we had before lunch, See the list of -high-level tasks below that we agreed on after lunch. - -This discussion was a little disorganized and is difficult to summarize. -The main points are the following. - -# Concurrency (later called synchronization) - -- Inference will have to be "concurrent" for implicits, because of the - interaction between the core level and the module level. - Just for HO unification, ie. pattern unification, this is the way it - works. - -- FP reminds that the constraints will need a clear specification. - -- Type inference is also needed for disambiguation - -- It is unclear that we could separate constraint generation and constraint - resolution, because of name disambiguation. - -# principal mode? - -- In OCaml, type inference is not principal, even in principal mode! - It is only maximal, i.e. the typechecker never take an arbitrary step. - -- There is a form of "local type inference" (e.g. for labeled arguments) - -# GADTs - -- Nested patterns with GADTs are quite tricky. (Later, at lunch, we could - not even agree on whether compiling GADT with nested patterns - into flat patterns would preserve well-typedness) - -# Difficult parts are - -- sharing - -- propagation of annotations/types - - - -### High-level tasks - -# S-TIL (Surface Typed Intermediate Language) - -Participants: DR, JG, TR. - -==> Looking for a leader - -(We forgot to come back at this task at the end of the discussion...) - -- One argument was whether we could define S-TIL now or have to wait for - advances on other tasks... - - -# Sharing - -Participants: DR (leading), DP, GR, JG, LW - -- This refers to all forms of sharing in types: type abbreviations, sharing - for object types, sharing for polytypes, type equalities for module - types. - -- Subtopics are: principality, polytypes. - - -# Synchronization - -Participants: DR (leading), DP, GR, JG, LW, GS (name resolution) - -- This refers to the propagation of information learned during type - inference: propagation of source type annotations, but also using the type - inferred for a function to help type the arguments or propagate the type a - a let-binding as we would propagate its type annotation if it had one. - -- This will introduce a form of synchronization during constraint solving. - -- We also need to treat constraints concurrently, to solve them in an order - that cannot be determined in advance. - -- Subtopics are also: principality, polytypes. - -- Name resolution should fit there as well. - - -# Integration between Core & Module languages - -Participants: GR (leading), DR, JG, LW (type inference only) - -Two subtopics that can be looked at separately are - - - type inference - - - type checking - - -# Tooling & Incrementality - -Participants: GR (leading), FB, TR, FA, FP, DR, GS - -- Type errors - -- Keeping enough information in the S-TIL for better tooling - -- A lot of engineering here - - + Isolating the sate of the typechecker (i.e. we should be able to run - multiple instances of the typechecker) - - + toplevel interface - - + loading of global modules - - + debugging tools - - -# Rows, Subtyping & Effects: LW (leading), DR, FP, JG - - -# GADTs: JG (leading), DR, FP - - -# Labeled arguments: JG (leading), DR, LW - - -# Classes & Objects: JG, LW, DR - -==> Who is leading? JG? - -# Kind system: DR, FP, GR, LW, SD - -- Including subkinding - -==> Who is leading? diff --git a/notes/meetings/2019-10-07/participants.md b/notes/meetings/2019-10-07/participants.md deleted file mode 100644 index 0c0c96c..0000000 --- a/notes/meetings/2019-10-07/participants.md +++ /dev/null @@ -1,40 +0,0 @@ -List of participants - -* Sébastien, with vegetarian lunch - -* Gabriel Scherer, skipping lunch. - -* François Pottier, present at lunch (no requirements). - I will have to attend another meeting between 10am and 11:30am, - but otherwise will be available. - -* Didier Remy, present at lunch if at Il goto (or with a no sodium option) - -* Jacques Garrigue, with lunch - -* Thomas Refis, present at lunch if at Il goto (vegetarian options have been - disappointing at other places) - -* Xavier Leroy, with lunch (no strong requirements). - -* Leo White, with lunch - -* Armaël Guéneau, with lunch (preference for vegetarian lunch unless at - The Touareg) - -* Florian Angeletti, with lunch (no requirements) - -* Alain Frisch, with lunch (preference for vegetarian lunch unless at - The Touareg) - -* Nicolás Ojeda Bär, with lunch (no strong requirements). - -* Pierrick Couderc, with lunch (no requirements) - -* Frédéric Bour, without lunch - -* Gabriel Radanne, with lunch (no requirements) - -* Damien Doligez, with lunch (no requirements) - -* Anil Madhavapeddy, with lunch (no requirements) diff --git a/notes/meetings/2019-10-07/schedule.md b/notes/meetings/2019-10-07/schedule.md deleted file mode 100644 index 6c81a69..0000000 --- a/notes/meetings/2019-10-07/schedule.md +++ /dev/null @@ -1,26 +0,0 @@ -Tentative schedule for this research meeting - -Suggestions by Jacques: -* discuss the framework for changes in the typechecker / type system, particularly whether to go for a low road / high road approach (see notes of previous meeting) -* technical discussions on the type system and its implementation - -Suggestions by François: - -* discuss the low road (trying to set up a concrete lists of things - to do and people who can do them) - -* discuss the high road (with similar aims) - -* discuss the idea of introducing one or more explicitly typed - languages (equipped with reference type-checkers that perform - no type inference) - -Suggestions by Leo: - -* Discuss the aims of the rewriting project - -* Discuss how we want to collaborate on the typechecker, i.e. how should we communicate and how should we make decisions - -* Discuss how we are going to track/report progress on the various lines of work - -* Try to write down some kind of plan/schedule of work diff --git a/research/Readme.md b/research/Readme.md deleted file mode 100644 index 1bf2cd7..0000000 --- a/research/Readme.md +++ /dev/null @@ -1,4 +0,0 @@ -# Research - -This folder is for collaborating on draft papers and other research -material related to the type-checker. diff --git a/rfcs/Readme.md b/rfcs/Readme.md index bad73c4..2bd1aa8 100644 --- a/rfcs/Readme.md +++ b/rfcs/Readme.md @@ -1,3 +1,3 @@ # RFCs -This directory is for accepted proposals related to the type-checker +This directory is for accepted proposals From f42b34878f5ab118ac429b18ebfe2d4116710698 Mon Sep 17 00:00:00 2001 From: Leo White Date: Thu, 24 Oct 2019 16:24:01 +0100 Subject: [PATCH 3/5] Add functor units, functor packs and recursive packs RFC --- ...units-functor-packs-and-recursive-packs.md | 415 ++++++++++++++++++ 1 file changed, 415 insertions(+) create mode 100644 rfcs/functor-units-functor-packs-and-recursive-packs.md diff --git a/rfcs/functor-units-functor-packs-and-recursive-packs.md b/rfcs/functor-units-functor-packs-and-recursive-packs.md new file mode 100644 index 0000000..7e88a92 --- /dev/null +++ b/rfcs/functor-units-functor-packs-and-recursive-packs.md @@ -0,0 +1,415 @@ +# Functor units, functor packs and recursive packs + +## Summary + +OCaml's module system has great facilities for structuring your code +in the large. It supports encapsulation, parameterisation and even +recursion. Unfortunately not all of these facilities are available +across multiple files. The `-pack` feature allows you to group +multiple files together under an interface, but there isn't really +a way to parameterise a group of files all at once, or to allow +recursion amongst a group of files. + +This proposal aims to support parameterisation of groups of files +and recursion amongst groups of files by extending functors and +recursive modules to work on groups of files via the `-pack` +mechanism. + +## Motivation + +Parameterising a group of top-level modules by another module is a +commonly requested feature. Doing it by hand can be quite painful +because each module must be parameterised by both the main parameter +and the applied forms of all the other parameterised modules that +it depends on. For example, if `B` depends on `A` and they are both +parameterised by `Param` then you get the equivalent of: + +```ocaml +module A (Param : S) : T with module Param := S + +module B (Param : S) (A : T with module Param := S) : + R with module Param := S and module A := A +``` + +which gets more awkward the more modules are included. + +Recursion between modules is useful in a number of situations. One +particular use case is constructing mutually recursive types without +exposing the definitions of each type to the other. Recursive modules +allow code of the form: + +```ocaml +module rec A : sig + type t + val of_b : B.t -> t +end = struct + type t = B.t + let of_b b = b +end + +and B : sig + type t + val of_as : A.t list -> t +end = struct + type t = A.t list + let of_as as = as +end +``` + +This kind of code is currently unavailable between top-level modules. + +Whilst there is general uncertainty over what the final form of +recursive modules in OCaml should look like, I think that this use +case is one that should always be supported, and so it is worthwhile +to extend it to top-level modules. + +## Details + +This proposal is split into 4 sub-proposals: + +1. Functor units - allowing compilation units to be functors +2. Functor packs - allowing packed units to be functors +3. Recursive interfaces - allowing groups of module interfaces to + refer to each other +4. Recursive packs - allowing packed units to be mutually recursive. + +Each of these parts can be upstreamed before the later parts are +completed. + +### Supporting ml files as functors + +The aim for this part is to allow an .ml file to be compiled to a functor +rather than a structure. The idea is that the user: + +1. Writes ".mli" files for the parameters of the functor and compiles them + with a command-line option like "-parameter-of Foo" where "Foo" is the + name of the functor. + +2. Compiles the ".mli" and ".ml" files for the `Foo` module with options + like "-parameter Arg" where "Arg" is the name of one of the + parameters. "Arg" will be available as a module in these files, just + as it would be for the body of a functor. The module type in the + resulting ".cmi" file is that of a functor taking the given + parameters. The code in the resulting ".cmo"/".cmx" file is also that + of a functor taking the given parameters. + +Note that the restrictions on what can appear in the body of a functor, +e.g. unwrapping first-class modules must be applied to such modules. + +Any attempt to use an interface compiled with "-parameter-of Foo" +outside of `Foo` should give a clear error message. + +### Supporting packed functors + +The aim for this part is to extend the support in the previous section +to allow packing to be used for creating a functor out of multiple +files. This is achieved by allowing the `-parameter` command-line argument +to be passed when packing a file. + +The idea is that the user: + +1. Writes ".mli" files for the parameters of the functor and compiles + them with a command-line option like "-parameter-of Pck". + +2. Compiles the ".mli" and ".ml" files for the modules in the body of the + functor with an option like "-for-pack Pck(Arg)", where `Pck` is the + name of the packed functor and `Arg` is the name of its parameter. + +3. (optional) Compiles the ".mli" of the intended interface for the + packed functor with "-parameter Arg" as an option. This file + describes the return type of the functor, and so it has access to the + parameter modules, but not the component modules of the pack. + +3. Runs "ocamlc -pack" passing it "-parameter" options for each + parameter, and the ".cmi" and ".cmo" files of the pack's + sub-modules. This then creates a ".cmi" and ".cmo" file for a module + `Pck` that is a functor from the parameters modules to a module + containing all the sub-modules. + +So if we have a parameter `Arg`, and body modules `Foo` and `Bar`, where +`Bar` depends on `Foo`, then we essentially compile `Foo` to: + +```ocaml +module Foo (Arg : ...arg.mli...) : sig + ...foo.mli... +end = struct + ...foo.ml... +end +``` + +and `Bar` to: + +```ocaml +module Bar (Arg : ...arg.mli...) (Foo : ...foo.mli...) : sig + ...bar.mli... +end = struct + ...bar.ml... +end +``` + +and finally create `Pck` equivalent to: + +```ocaml +module Pck (Arg : ...arg.mli...) : sig + module Foo : sig ...foo.mli... end + module Bar : sig ...bar.mli... end +end = struct + module Foo = Foo(Arg) + module Bar = Bar(Arg)(Foo) +end +``` + +Although this will all be implemented at the level of `Lambda` code rather +than elaborating to actual OCaml syntax. + +### Supporting mutually recursive interfaces + +The aim of this part is to support modules having mutually recursive +interfaces. The idea is for users to compile these mutually recursive +".mli" files with a single call to ocamlc passing an option "-recursive" +along with all the ".mli" files at once. This command will essentially +use `Typemod.transl_recmodule_modtypes` on all the files. When comparing +an implementation with one of these interfaces the compiler would +essentially use `Typemod.check_recmodule_inclusion`. + +### Supporting packed recursive modules + +The aim for this part is to allow packing to be used for creating a set +of mutually recursive modules out of multiple files. The idea is that +the user: + +1. Compiles all the ".mli" files using "-for-recursive-pack Pck". If the + interfaces are mutually recursive then they will also need to use + "-recursive" from the previous section. For now they will also need + to use "-opaque" to avoid warnings about missing cmx files. + +2. Compiles each "ml" file for the sub-modules. The cmo/cmx file + optionally contains a "module shape" for the module, as computed by + `Translmod.init_shape`. + +3. Runs "ocamlc -pack-recursive" passing it the ".cmo" files for all the + modules. This should construct the resulting module in a manner + similar to `Translmod.compile_recmodule`. + +### Interface in dune + +These features provide more options for how files are compiled. It is +worth considering how such features might be presented to users +through their build systems. I'll describe a potential interface for +dune, but I would expect the presentation of these features in other +build systems to be similar. + +#### Exposing `-pack` + +Packing could be exposed through dune using a `(pack)` construct +in the list of modules of a library or executable. For example: + +``` +(library ( + (name foo) + (modules + (pack ((name bar) (modules (a b))))))) +``` + +The user can provide a `bar.mli` file that is the interface of the pack, +but they cannot provide a `bar.ml`. + +#### Exposing `-parameter` + +Adding a parameter to a module would be exposed using a `parameter` +field in an entry in the `modules` list: + +``` +(library ( + (name foo) + (modules ( + a + ((name b) (parameters (arg))))))) +``` + +This would expect an `arg.mli` to describe the parameter, which would be +built with `-parameter-of Bar`. `b.mli` would be built with `-parameter +Arg`. + +#### Exposing `-pack` with `-parameter` + +Packed functors could be exposed using a `(parameters ...)` field in +the pack construct. For example: + +``` +(library ( + (name foo) + (modules + (pack ((name bar) (parameters (arg)) (modules (a b))))))) +``` + +This would expect an `arg.mli` to describe the parameter, which would be +built with `-parameter-of Bar`. + +#### Exposing `-pack-recursive` + +Recursive packs could be exposed using a `(recursive true)` field in +the pack construct. For example: + +``` +(library ( + (name foo) + (modules + (pack ((name Foo) (modules (A B)) (recursive true)))))) +``` + +#### Exposing these features directly on libraries + +In the long run we might also want to expose these features as part +of the library stanza, for creating a library module as a pack. We +could do that with a `(packed true)` field on libraries as well +as `(parameters ...)` and `(recursive true)` fields -- both of which +would imply `(packed true)`: + +``` +(library ( + (name foo) + (parameters (arg)))) +``` + +### Interaction with other tooling + +#### Merlin support + +For the functor packs I don't think merlin needs any special support +to work fine. The only thing is that the restrictions on first-class +modules when used in the body of a functor would not be reported as +errors. Adding a flag to `.merlin` files would allow that to be +addressed. + +For recursive packs merlin would need a couple of small changes so +that it would allow a module to refer to its own .cmi file if that +file is flagged as recursive. + +#### Odoc support + +This proposal will require some small additions to odoc but they are +pretty easy to handle. + +### Full example + +As a demonstration, here's the complete source for a library `Lib` +containing a packed functor `Pck` with two submodules `Foo` and `Bar`: + +dune +``` +(library ( + (name lib) + (modules ( + (pack ((name pck) (parameters (arg)) (modules foo bar))))))) +``` + +arg.mli + +``` +type t + +val a : t + +val b : t + +val print : t -> unit +``` + +foo.mli + +``` +type t = Arg.t list + +val mk : t -> t + +val iter : t -> (Arg.t -> unit) -> unit +``` + +foo.ml + +``` +type t = Arg.t list + +let mk t = t + +let iter = List.iter +``` + +bar.mli + +``` +val run : Foo.t -> unit +``` + +bar.ml + +``` +let run t = + Arg.print Arg.a; + Foo.iter (fun t -> Arg.print t); + Arg.print Arg.b +``` + +pck.mli (optional) + +``` +module Foo : sig + + type t + + val mk : Arg.t list -> t + +end + +module Bar : sig + + val run : Foo.t -> unit + +end +``` + +lib.ml (optional) + +``` +module Pck = Pck +``` + +Here is some code using the functor: + +``` +module Int = struct + type t = int + let a = 1 + let b = 5 + let print = print_int +end + +module L = Lib.Pck(Int) + +let () = L.Bar.run (L.Foo.mk [2; 3; 4]) +``` + +## Drawbacks + +- This builds on the `-pack` mechanism, which has somewhat gone out + of fashion. + +- Recursive modules are basically still an experimental feature. There + is some consensus that their current form should be replaced by + something else at some point even if it breaks + backwards-compatibility. That might break some uses of the + `-recursive-pack` and `-recursive` options. + +## Alternatives + +- A common alternative to this proposal at the moment is `sed`. That + has many obvious drawbacks, but does avoid adding new compilation + flags to the compiler. + +- Only providing some of the sub-proposals -- the functor and recursion parts + are fairly independent and could be considered separately. + +- Some use cases of recursive packs could be addressed by RFC #1. If + the mutual recursion is all at the type level then that should be + sufficient. From 693c0efa15aaa3e2adca705ba4b1460bd8bb24e6 Mon Sep 17 00:00:00 2001 From: Leo White Date: Fri, 25 Oct 2019 09:40:46 +0100 Subject: [PATCH 4/5] Update rfcs/functor-units-functor-packs-and-recursive-packs.md Co-Authored-By: Alain Frisch --- rfcs/functor-units-functor-packs-and-recursive-packs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rfcs/functor-units-functor-packs-and-recursive-packs.md b/rfcs/functor-units-functor-packs-and-recursive-packs.md index 7e88a92..048450a 100644 --- a/rfcs/functor-units-functor-packs-and-recursive-packs.md +++ b/rfcs/functor-units-functor-packs-and-recursive-packs.md @@ -410,6 +410,6 @@ let () = L.Bar.run (L.Foo.mk [2; 3; 4]) - Only providing some of the sub-proposals -- the functor and recursion parts are fairly independent and could be considered separately. -- Some use cases of recursive packs could be addressed by RFC #1. If +- Some use cases of recursive packs could be addressed by RFC #2. If the mutual recursion is all at the type level then that should be sufficient. From f16f2657a7743c103125dec734ebd71d0f470de5 Mon Sep 17 00:00:00 2001 From: Leo White Date: Fri, 25 Oct 2019 15:34:34 +0100 Subject: [PATCH 5/5] Add more exposition to drawbacks --- ...units-functor-packs-and-recursive-packs.md | 21 +++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/rfcs/functor-units-functor-packs-and-recursive-packs.md b/rfcs/functor-units-functor-packs-and-recursive-packs.md index 048450a..b36e08a 100644 --- a/rfcs/functor-units-functor-packs-and-recursive-packs.md +++ b/rfcs/functor-units-functor-packs-and-recursive-packs.md @@ -392,8 +392,25 @@ let () = L.Bar.run (L.Foo.mk [2; 3; 4]) ## Drawbacks -- This builds on the `-pack` mechanism, which has somewhat gone out - of fashion. +- This builds on the `-pack` mechanism, which has some drawbacks: + + 1. Touching any component triggers recompilation of all clients of + the packed module. + + 2. Linking part of a packed module forces linking the whole packed + module. + + For these reasons most existing uses of `-pack` have been replaced + by approaches based on module aliases. + + These issues are probably unavoidable for the recursive use case + anyway since the sub-modules are mutually recursive. + + For the functor use case these issues might be avoidable with some + other approach. Although it couldn't be based on module aliases, + since those approaches rely on being able to refer to *the* module + `M` but thanks to side-effects you cannot talk about *the* module + `F(X)` only *a* module `F(X)`. - Recursive modules are basically still an experimental feature. There is some consensus that their current form should be replaced by