diff --git a/notes/meetings/2019-09-05/notes.txt b/notes/meetings/2019-09-05/notes.txt new file mode 100644 index 0000000..4ce0317 --- /dev/null +++ b/notes/meetings/2019-09-05/notes.txt @@ -0,0 +1,122 @@ +Gabriel: my main proposal is, for the "rebuild the type-checker from +scratch" part, to create a repository on top of Inferno¹ that +implements a ML language (MidML, MoyenML?). We start from the nano-ML +example used in Inferno itself (as a typical example of Inferno client +program). We add important features from the OCaml type system, one +after the other, which requires adding new features to Inferno +itself -- as clear and general as possible to express the desired ML +type-inference features. This is useful to get a clean theory (of the +ML feature, of the inference support for it), looking at interactions +with other features, with a reference implementation that keeps +growing. And for some of those features we can publish papers. + +The notes below include at the end a list of features to which we +could think of during the discussion -- a decomposition of OCaml's +type system in smaller. + +¹ Inferno is François Pottier's type-inference library. I (Gabriel) +would recommend reading its implementation -- I did it from the +high-level layers to the low-level layers. It is hosted on François's +favorite open-source closed-development platform: +https://gitlab.inria.fr/fpottier/inferno . + +---- + +# Global approach + +We should do "incremental" and "from scratch" ("big bang") in +parallel, with communication between both. + + +Incrémental + +- Start from the TODO list and Thomas' and Leo's ideas + + Jacques: I'm less interested in that part. + + Gabriel: but we will ask you to review what people have done. + + Jacques: I was mostly speaking of the refactoring. Conceptual + changes (like adding scopes) are more interesting to me. + + Gabriel: I think sometimes conceptual changes emerge + from refactoring, when done carefully. + + +Big-bang + +- Use Inferno + MoyenML as an experimental vehicle to study the essence + of OCaml type-system features, layer by layer. + + How to bridge the gap between MoyenML-gone-big and OCaml? One + possibly, if required (it's not pretty), would be to replace the + lowest layers of Inferno (Union-Find, levels/generalization etc.) by + the OCaml type representation and its manipulation code + (atype/btype/ctype). + + (Didier: the interface of Inferno is nice, but in the implementation + I'm not convinced by the very-effectful approach of François. Maybe + we don't want that on the long term, it may not scale to more + advanced constraints.) + +Core language: what's an explicitly-typed representation that the +OCaml type-inference engine could produce as result? + + Jacques: about that, I was thinking of using CIC (Calculus of + Inductive Constructions), to represent the values of OCaml's + intermediate representation with dependent types. -- use Gallina as + intermediate representation. + + +## Feature map to add to Inferno + MidML + +### Term language + +- [data]: algebraic datatypes + + pattern-matching + +- type abbreviations + +- label overloading (constructors and fields) + (depends on [data]) + + Gabriel: http://gallium.inria.fr/~scherer/topics/delayed-constraints-for-disambiguation.txt + +- scoped type variables + +- first-class polymorphism + +- GADTs + (depends on [data]) + +- structural rows + (objects, polymorphic variants) + +- checking .mli + (the inclusion check between type declarations) + +- value restriction? + + partial generalization + + relaxed value restriction (variance) + +### Error report (entry later added by DR) + +- minimal error report needed + +- explore other solutions + + Nathanael proposal: report unification errors in terms of typing + constraints from the soruce. + +### Propagation of source typing annotations (later added by DR) + +- Preprocessing step? + +- During constraint generation? + +- Other: new form of constraints? + +### Modules + +- modules +- first-class modules + diff --git a/notes/meetings/2019-09-16/notes.txt b/notes/meetings/2019-09-16/notes.txt new file mode 100644 index 0000000..41d7321 --- /dev/null +++ b/notes/meetings/2019-09-16/notes.txt @@ -0,0 +1,60 @@ +Discussion of the "high road" (Xavier's term): +* rebuilding something like OCaml (possibly simpler, cleaner) + from scratch, starting from Inferno. + +Discussion of the "low road": +* refactoring the existing type-checker +* problem: the reviewing load will be high (especially for Jacques) +* Jacques says it's ok to discuss the high-level ideas, + but painful/impossible to review the code itself +* Xavier wants discussion on high-level ideas, more than one the code + +Didier asks whether the two roads are supposed to meet eventually. + +Jacques says that Thomas Refis is the one who is currently working most +on the "low road" (the type-checker). E.g. Thomas has clarified the notion +of "scope", etc. Also, Leo White has done some work on the type-checking +of objects and classes (not yet reviewed and merged; it was discussed in +March 2019 but the requested follow-up work was not done). + +François asks whether it would be worthwhile to work on strengthening +the test suite. +* Xavier mentions fuzzing; + it could be very useful if the type-checker contains assertions +* Florian & Sébastien had an Outreachy intern (Oxana) + who worked on checking the coverage of the test suite. + It would be good to continue and finalize this work. + (Gabriel had an Outreachy intern (Ulugbek) working on quickchecking + the compiler by randomly generating well-typed programs. Could also + be of interest.) + +Didier asks whether the code should be commented (either inside the code, +or in a separate file containing explanations). Damien notes that some PRs +begin with a long explanation. But it is not visible in a cloned repository! + +Sébastien asks where we should invest our energy and expertise. (Who should +work on what?) E.g. Sébastien might be interested in working on refactoring +the type-checker. + +What are our forces? +* Gabriel Radanne +* Leo White? +* Thomas Refis??? (will be doing a PhD with Didier, hence *scientific work*) +* Sébastien could have some time after March 2020 + +Someone mentions the technology that allows rapidly rebuilding all opam packages so as to test a change. Do we master it? Florian knows about "opam-check" (Damien's tool); works, but not yet parallel. (It should be!) Armaël & Gabriel have worked on another approach (also on top of opam) which is parallel. Still experimental, to be finished. And what about the OCaml Labs tools? Damien notes that many opam packages have broken dependencies, which complicates things. (Would it be easy to automatically check that no dependencies are missing in a package description?) (Damien notes that it would be useful to know which modules are installed by which packages.) + +Didier asks to what extent it is permitted to change the type system (intentionally). E.g. we might wish to change the treatment of first-class polymorphism. (Didier has the "high road" in mind.) There seems to be +consensus that in the high road it is permitted & desirable to do so. + +Some discussion of -principal, which is needed to infer principal types in the presence of first-class polymorphism, but is costly because it imposes copying instead of sharing. + +Sébastien notes the need to organize another meeting, including developers outside Inria, and to keep everyone in the loop. Need to do so soon, and decide (during that meeting) who will be working on each road. + +Some discussion of the idea of introducing one or more explicitly-typed intermediate languages. (Integrate the work by Pierrick Couderc?) It or they should be easy to type-check; the code of the checker should be close to a declarative specification. Jacques would like this language to be formally defined in Coq. Armaël notes that one could even have a type-checker for it implemented in Coq. (Analogous to what the F* people did?) Jacques and Xavier would like to do a proof of type soundness (preferably for a low-level calculus). A model of OCaml types in Iris could be interesting. Xavier wants types to have meaning. (This is the "high high road".) + +TODO: + Send a proposal for an open meeting. + Scheduled together with a regular developers meeting? + (Include Pierrick + Gabriel.) + Propose a 2-day meeting. Starting late, finishing early. Research + dev. diff --git a/notes/meetings/2019-10-07/inferno-roadmap.txt b/notes/meetings/2019-10-07/inferno-roadmap.txt new file mode 100644 index 0000000..365b759 --- /dev/null +++ b/notes/meetings/2019-10-07/inferno-roadmap.txt @@ -0,0 +1,127 @@ +An Inferno roadmap (also known as "the high road"). +I will also commit a version of this to the Inferno repository. +-- FP + +### Universal quantification in constraints + +### Type annotations + +This includes "scoped type variables", that is, the ability (or necessity?) +to explicitly bind type variables. + +It should be possible to bind a type variable either existentially or universally. + +As a side product, we get the ability of comparing an inferred type +with a declared type. + +### Algebraic data types and pattern-matching + +### GADTs + +Need to reason under local equations between types. +Need to detect ambivalent types. +In the presence of modules and abstract types, + need a notion of "compatibility" (an approximation of equality). + +### Type abbreviations + +Toplevel or local? +Recursive? +The subject of Carine Morel's recent internship. + +### Principality, sharing + +Understand how type information is shared internally and is propagated. +Related to GADTs, first-class polymorphism, type abbreviations. + +### Propagation of type information (synchronization) + +Can it be viewed as a preprocessing step? +Can it be performed at the same time as constraint generation? +Do we need new forms of constraints? + (somehow allowing feedback from constraint resolution + into constraint generation) +What is the role of type annotations in this setting? + +### Overloading of data constructors and field labels + +See Gabriel's proposal: + http://gallium.inria.fr/~scherer/topics/delayed-constraints-for-disambiguation.txt +Related to implicit arguments. +Related to name resolution. +Related to synchronization. + +### Name resolution + +Make it easier for tools (smart editor, refactoring tool) + to understand the binding structure of the code. + +### First-class polymorphism + +For records and objects. Do we want to redesign this feature? + Can we avoid the syntactic overhead of boxing them in a record? + +### Rows + +With applications to objects and polymorphic variants and effects. + (Currently, objects and variants use local constraints.) + +### Core language subtyping + +Implicit, explicit, what form? + Are coercions erased? (yes) +Related to rows. + +Implicit (pervasive) subtyping would require bi-unification (Stephen's algorithm). + +### Value restriction + +The relaxed value restriction, based on variance. +Also, see Jacques's more recent proposal? + +### Basic support for linear types, regions, borrows + +Based on recent work by Gabriel Radanne and Peter Thiemann. + +### Type error reports + +Can we produce error reports that are minimal in a certain sense? + What is the algorithmic cost? + +Can we report multiple errors? + +### Incrementality, type-inferencer API + +Can we support incremental type inference? +Can we avoid global state in the implementation? + or avoid side effects altogether? +Should we expose an abstract type of type-inferencer states? +Recomputing as little as possible when the source code changes. + +The "toplevel" (REPL) interface. The debugger. + How do they interact with the type-checker? + +### Modules + +Including first-class modules. +Should the module language and the core language be separate? + +### Implicit arguments + +### Labeled and optional arguments + +Needs a form of multi-application. +Related to synchronization (propagation). + +### Classes + +What kind of support does this require in Inferno? + +### Kinds + +Do we need some kind of support for kinds? +How would it impact the solver's API? +Type application in the syntax of kinds? + How are equations between type applications solved? +Subkinding? +Should type variables and type constructors be a single category? diff --git a/notes/meetings/2019-10-07/low-road-notes.txt b/notes/meetings/2019-10-07/low-road-notes.txt new file mode 100644 index 0000000..54ee7de --- /dev/null +++ b/notes/meetings/2019-10-07/low-road-notes.txt @@ -0,0 +1,31 @@ +Gabriel Radanne will start doing more code reviews now. + +Reviews: hmm. + + It would be nice if all Inferno-working people could help on code + reviews, not only within their subgroups. + + (the idea of in-person presentations for big PRs: Leo is not in + favor, unconvinced by the idea of travelling to Paris so often (too + many big PRs?)) + +Low road: the point is that making the code easier to understand + is a compiler-distribution feature by itself. + +Leo: larger rewrites should get RFCs just as big high-road features. + +Testing? + - Leo: Jane Street will keep building a branch of their codebase in trunk + - opamcheck: if we suspect there is a regression risk, we try to convince + Florian to run opamcheck on it + +High-road testing? Can we collect important examples of tricky OCaml +code that relies on sharing in deep ways? + +Pierre Chambart: if we had a shared typed IR it would be easier for testing. + +Testing: collect examples of things that do not work in the testsuite, +corresponding to known defects. "known-to-break" testsuite directory? + + Policy: Everytime there is a new bug report with a small repro-test, + it should go into this known-defects directory. diff --git a/notes/meetings/2019-10-07/notes.txt b/notes/meetings/2019-10-07/notes.txt new file mode 100644 index 0000000..d36d319 --- /dev/null +++ b/notes/meetings/2019-10-07/notes.txt @@ -0,0 +1,274 @@ +Redesign of the typechecker, +Monday October 7, in Paris +Minutes of the meeting + +### Participants + +See the file participants.md, + +Gabriel Scherer (GS), François Pottier (FP), Didier Remy (DR), Jacques +Garrigue (JG), Thomas Refis (RF), Xavier Leroy (XL), Leo White (LW), Armaël +Guéneau (AG), Florian Angeletti (FA), Alain Frisch (AF), Nicolás Ojeda Bär +(NOB), Pierrick Couderc PC), Frédéric Bour (FB), Gabriel Radanne (GR), +Damien Doligez (DD), Anil Madhavapeddy (AM). + + +### Why do we want to do this? + +This should be kept in mind while doing the redesign! + +# Robustness + +# For easier maintenance + +# To be more suitable for extensions + +- Including immediate extensions in mind: implicits and algebraic effects +- But also future extensions + +# More precise specification + +- including a reference implementation + +# Formalization of the kernel + +- On paper, or perhaps in Coq + +# Tooling + +- Including better support for errors + +# Better performance + +- Typechecker is actually slow (for the module system) +- There might be opportunities for optimizations +- Which are hard to do in the current state of the implementation + + +### Risks + +AF: is worried about backward compatibility, and ask whether this +should be the same language or another language. + +DD: there is a continuum between the two extremes: exactly the same language +or another language. + +LW: keeping it exactly the same would prevent us from fixing bugs. + +Someone mentions getting rid of the object system altogether... + +XL: A quote from Guy Steele: when the community wanted to simplify + JavaScript, they observed that it had already too many users. + + XL would not like OCaml to turn into JavaScript. + +XL thinks that OCaml has too many bug reports. If it were to be redone, some + features would not be there (e.g. recursive modules) or would have been + done otherwise. So he supports a clean up during the redesign. + +In conclusion, we somehow all wish it to be the next version of the same +language, and not another language, but for the majority it is clear that we +will also do some changes and that it is too early to say which ones. + +We discuss the time scale. DR suggests a target of 3 years. Even though we +understand that it could take longer to make the change more continuous for +users. AM reminds that for Multicore OCaml it has been 7 years, where most +of the ideas and design was done in the first 3 years. He recommends that we +do not communicate on the redesign of the typechecker in the community, so +as not to create expectations. + +### How to work together + +# GitHub + +LW proposes to use git for everything in the model of RFC (see for example +the discussed on unboxed types), where a proposal can be described and +commented. + +For more research-like topics, we could also open sub-directories and push +drafts of the research project and updates them at significant steps. + +There is an agreement that GitHub is a good tool to keep each other informed +of the actions going on and of their progress. + +We should do this in a private branch, since we do not want to communicate +publicly on the redesign of the typechecker... + +# Meetings + +How often should we plan meetings? This is unclear: we are probably going to +work and perhaps meet in small groups around specific precise tasks. Hence, +we would not need to meet altogether too often, perhaps twice a year to keep +us informed of the advances of the different projects. + +We could also use video conf, especially for some small group meetings. + +We should plan meeting more longer ahead (2 months) to make it easier to +find a slot that is convenient for almost all of us. We are invited to host +one of the meetings in London. + + +### Typed Intermediate Language + +We all agree on the need for a Surface Typed Intermediate Language (S-TIL) +that is closed to the Source language and the output of type inference. + +There should also be a smaller Kernel Typed Intermediate Language, but it +is unclear at this point what how far from the surface it should be: + + - It should at least remove all syntactic sugar. + + - But it should not complicate the type system just for the sake of + reducing the size of kernel. + +We should be aware of over previous experiments that ended up in using Coq +as the internal language to be able to encode everything else (cf. Zhong +Shao). Is this what we want? He encourages us to look at previous works. + +JG mentions that perhaps a compilation to Coq could be used +as a way to prove type soundness. + +Xavier wants to avoid type preserving compilation, which is difficult---even +for inlining it's not easy. + +The goal of having a Typed Intermediate Language is not for compiling but +for having a smaller language to formalize (soundness proof). We may then +remove types (or move to simpler types) for the rest of the compilation. + + +### High-level roadmap + +This describes the informal discussion we had before lunch, See the list of +high-level tasks below that we agreed on after lunch. + +This discussion was a little disorganized and is difficult to summarize. +The main points are the following. + +# Concurrency (later called synchronization) + +- Inference will have to be "concurrent" for implicits, because of the + interaction between the core level and the module level. + Just for HO unification, ie. pattern unification, this is the way it + works. + +- FP reminds that the constraints will need a clear specification. + +- Type inference is also needed for disambiguation + +- It is unclear that we could separate constraint generation and constraint + resolution, because of name disambiguation. + +# principal mode? + +- In OCaml, type inference is not principal, even in principal mode! + It is only maximal, i.e. the typechecker never take an arbitrary step. + +- There is a form of "local type inference" (e.g. for labeled arguments) + +# GADTs + +- Nested patterns with GADTs are quite tricky. (Later, at lunch, we could + not even agree on whether compiling GADT with nested patterns + into flat patterns would preserve well-typedness) + +# Difficult parts are + +- sharing + +- propagation of annotations/types + + + +### High-level tasks + +# S-TIL (Surface Typed Intermediate Language) + +Participants: DR, JG, TR. + +==> Looking for a leader + +(We forgot to come back at this task at the end of the discussion...) + +- One argument was whether we could define S-TIL now or have to wait for + advances on other tasks... + + +# Sharing + +Participants: DR (leading), DP, GR, JG, LW + +- This refers to all forms of sharing in types: type abbreviations, sharing + for object types, sharing for polytypes, type equalities for module + types. + +- Subtopics are: principality, polytypes. + + +# Synchronization + +Participants: DR (leading), DP, GR, JG, LW, GS (name resolution) + +- This refers to the propagation of information learned during type + inference: propagation of source type annotations, but also using the type + inferred for a function to help type the arguments or propagate the type a + a let-binding as we would propagate its type annotation if it had one. + +- This will introduce a form of synchronization during constraint solving. + +- We also need to treat constraints concurrently, to solve them in an order + that cannot be determined in advance. + +- Subtopics are also: principality, polytypes. + +- Name resolution should fit there as well. + + +# Integration between Core & Module languages + +Participants: GR (leading), DR, JG, LW (type inference only) + +Two subtopics that can be looked at separately are + + - type inference + + - type checking + + +# Tooling & Incrementality + +Participants: GR (leading), FB, TR, FA, FP, DR, GS + +- Type errors + +- Keeping enough information in the S-TIL for better tooling + +- A lot of engineering here + + + Isolating the sate of the typechecker (i.e. we should be able to run + multiple instances of the typechecker) + + + toplevel interface + + + loading of global modules + + + debugging tools + + +# Rows, Subtyping & Effects: LW (leading), DR, FP, JG + + +# GADTs: JG (leading), DR, FP + + +# Labeled arguments: JG (leading), DR, LW + + +# Classes & Objects: JG, LW, DR + +==> Who is leading? JG? + +# Kind system: DR, FP, GR, LW, SD + +- Including subkinding + +==> Who is leading? diff --git a/notes/meetings/2019-10-07/participants.md b/notes/meetings/2019-10-07/participants.md new file mode 100644 index 0000000..0c0c96c --- /dev/null +++ b/notes/meetings/2019-10-07/participants.md @@ -0,0 +1,40 @@ +List of participants + +* Sébastien, with vegetarian lunch + +* Gabriel Scherer, skipping lunch. + +* François Pottier, present at lunch (no requirements). + I will have to attend another meeting between 10am and 11:30am, + but otherwise will be available. + +* Didier Remy, present at lunch if at Il goto (or with a no sodium option) + +* Jacques Garrigue, with lunch + +* Thomas Refis, present at lunch if at Il goto (vegetarian options have been + disappointing at other places) + +* Xavier Leroy, with lunch (no strong requirements). + +* Leo White, with lunch + +* Armaël Guéneau, with lunch (preference for vegetarian lunch unless at + The Touareg) + +* Florian Angeletti, with lunch (no requirements) + +* Alain Frisch, with lunch (preference for vegetarian lunch unless at + The Touareg) + +* Nicolás Ojeda Bär, with lunch (no strong requirements). + +* Pierrick Couderc, with lunch (no requirements) + +* Frédéric Bour, without lunch + +* Gabriel Radanne, with lunch (no requirements) + +* Damien Doligez, with lunch (no requirements) + +* Anil Madhavapeddy, with lunch (no requirements) diff --git a/notes/meetings/2019-10-07/schedule.md b/notes/meetings/2019-10-07/schedule.md new file mode 100644 index 0000000..6c81a69 --- /dev/null +++ b/notes/meetings/2019-10-07/schedule.md @@ -0,0 +1,26 @@ +Tentative schedule for this research meeting + +Suggestions by Jacques: +* discuss the framework for changes in the typechecker / type system, particularly whether to go for a low road / high road approach (see notes of previous meeting) +* technical discussions on the type system and its implementation + +Suggestions by François: + +* discuss the low road (trying to set up a concrete lists of things + to do and people who can do them) + +* discuss the high road (with similar aims) + +* discuss the idea of introducing one or more explicitly typed + languages (equipped with reference type-checkers that perform + no type inference) + +Suggestions by Leo: + +* Discuss the aims of the rewriting project + +* Discuss how we want to collaborate on the typechecker, i.e. how should we communicate and how should we make decisions + +* Discuss how we are going to track/report progress on the various lines of work + +* Try to write down some kind of plan/schedule of work diff --git a/rfcs/forward_type_definitions.md b/rfcs/forward_type_definitions.md new file mode 100755 index 0000000..a092e41 --- /dev/null +++ b/rfcs/forward_type_definitions.md @@ -0,0 +1,54 @@ +# The problem + +OCaml's module system does not allow splitting mutually recursive type +definitions into several compilation units. This encourage to put all +definitions in the same place, exposing all structural +definitions. One would sometimes prefer to have multiple modules, each +exposing one type, perhaps in an abstract or private from (together +with functions operating on that type). + +This RFC attempts to propose a pragmatic work-around for this +situation, taking inspiration from the widely used work-around for the +similar problem in the core language (references to implement forward +references to functions across compiltation units). + + +# Proposed solution + +- Extend type expressions with "global type-constructors": + + ('a1, ..., 'an) "foobar" + + where "foobar" is an arbitrary string literal (or restricted in some way). + +- Such type constructors behave as purely abstract ones (unification/equality + on arguments; equality on the name). + +- The type-checker checks that in a given compilation unit, a given + name is never used with two different arities, and that mapping + from name to arity is stored in .cmi files. Whenever a .cmi file + is loaded, a check is performed on arities (between two .cmi files, + and between an external .cmi file and the current unit). + +- An implementation can define a global type-constructor with a concrete + definition (a type expression), with a syntax to be determined, let's say: + + type ('a1, ..., 'an) "foobar" := + + This is not allowed in local modules or inside a functor's body, + only at toplevel and in nested modules. + + The rest of the compilation unit is type-checked with an environment + that remember how to unroll the "foobar" global type-constructor. + +- To be discussed: can we expose such definitions in the module interface + and "import" them from external units? + +- .cmo and .cmx files remember which global type-constructors are defined + in the corresponding implementation, and the linker checks that at most + one implementation linked into the program provides the definition + for each name. + +- We arrange so that Dynlink enforces a similar invariant: the main program + could use an undefined global type-constructor, whose definition is + provided at runtime by some dynlinked unit (but only one!).