172 changes: 172 additions & 0 deletions docs/concepts/context.rst
@@ -0,0 +1,172 @@
``ProtocolUnit`` execution ``Context``
======================================

:class:`.Context` instances carry the execution environment for individual :class:`.ProtocolUnit` executions.
They are created by the execution engine just before a unit is executed and discarded once the unit returns.
The class acts as a thin wrapper around two :class:`.StorageManager` objects (shared and permanent) and a scratch directory.


Why Context exists
------------------

``ProtocolUnit`` code frequently needs a few shared facilities:

``scratch``
    A temporary directory that the unit can freely write to while it runs.
    Files written here are considered ephemeral; the engine may delete them as
    soon as the unit finishes.

``shared``
    A :class:`.StorageManager` backed by short-lived :class:`.ExternalStorage`.
    Use this to hand large files to downstream units without serializing the
    payloads through Python return values.

``permanent``
    Another :class:`.StorageManager` targeting long-term storage. Results saved
    here survive beyond the life of the :class:`.ProtocolDAG` run (for example
    for inspection or for reuse in future extensions).

``stdout`` / ``stderr``
    Optional directories where the engine captures subprocess output triggered
    by the ``ProtocolUnit``. The directories are removed automatically when the
    context closes.

Keeping these handles bundled together and managed by a context manager lets
``ProtocolUnit`` implementers focus on domain logic while the engine ensures
storage gets flushed and temporary directories are cleaned.


Lifecycle
---------

``Context`` implements a `Python context manager`_.
When the context is exited, the ``shared`` and ``permanent`` storage managers flush tracked files back to their underlying :class:`.ExternalStorage`.
Any ``stdout`` or ``stderr`` capture directories are also removed.

.. _Python context manager: https://docs.python.org/3/reference/datamodel.html#context-managers

This means each ``ProtocolUnit``'s ``shared`` and ``permanent`` objects are not paths, and should not be treated as such.
Both are registries that track whether a file should be transferred from its location in ``scratch`` to its final destination once the unit completes.
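To make the registry-not-path distinction concrete, the sketch below uses a toy stand-in class (hypothetical, not gufe's actual ``StorageManager``) to show that registering a file yields a namespaced string key rather than a filesystem path:

```python
from pathlib import Path

class ToyRegistry:
    """Toy stand-in for ctx.shared / ctx.permanent (not gufe's real class)."""

    def __init__(self, namespace: str):
        self.namespace = namespace
        self.registered: list[str] = []  # files queued for transfer on exit

    def register(self, filename: str) -> str:
        # Track the file for later transfer and hand back a namespaced
        # string key, deliberately NOT a Path.
        self.registered.append(filename)
        return f"{self.namespace}/{filename}"

permanent = ToyRegistry("my_dag/my_unit")
key = permanent.register("results.json")

assert key == "my_dag/my_unit/results.json"
assert not isinstance(key, Path)  # a key to pass between units, not a path
```

The key can be returned from one unit and handed to another, which resolves it through its own storage manager.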

To access data from ``shared`` or ``permanent``, you can use ``ctx.shared.load`` or ``ctx.permanent.load``.
This will allow your unit to fetch those objects from their storage for use.


Using Context inside ProtocolUnits
----------------------------------

Every :meth:`.ProtocolUnit._execute` definition must accept ``ctx: Context`` as its
first argument. Typical usage looks like the example below.

.. code-block:: python

    from gufe import ProtocolUnit, Context

    class SimulationUnit(ProtocolUnit):

        @staticmethod
        def _execute(ctx: Context, *, setup_result, lambda_window, settings):
            scratch_path = ctx.scratch / f"lambda_{lambda_window}"
            scratch_path.mkdir(exist_ok=True)

            # Read upstream artifacts from ctx.shared
            system_file = ctx.shared.load(setup_result.outputs["system_file"])
            topology_file = ctx.shared.load(setup_result.outputs["topology_file"])

            result_path = ctx.scratch / "some_output.pdb"
            # The point at which you register doesn't matter, as long as
            # it happens before the unit returns
            result_path_final_location = ctx.permanent.register(result_path)
            # Run the simulation that writes the file registered above
            simulate(output=result_path)

            # Return only lightweight metadata
            return {
                "lambda_window": lambda_window,
                # The registered key is already namespaced and can be
                # passed between units.
                "result_path": result_path_final_location,
            }

The example above showcases how ``scratch``, ``shared``, and ``permanent`` work together inside a single unit: intermediate files are written to ``scratch``, artifacts meant to outlive the unit are registered for transfer, and only lightweight metadata is returned.

Choosing between shared and permanent storage
---------------------------------------------

Both ``ctx.shared`` and ``ctx.permanent`` expose the same :class:`.StorageManager`
API but they serve different audiences:

``ctx.shared``
    Optimized for communication between units in the same DAG execution. The
    execution backend is free to prune these assets once no downstream unit
    references them.

``ctx.permanent``
    Intended for outputs that should survive beyond the immediate DAG, such as
    user-facing reports or artifacts that will seed future runs.

As a rule of thumb, prefer ``ctx.shared`` unless you have a clear requirement
to keep the data after the ``Protocol`` run concludes. Small scalar values or
lightweight metadata should still be returned directly from ``_execute`` so
they become part of the ``ProtocolUnitResult`` record.
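As an illustration of that rule of thumb, a unit's return value might look like the toy dictionary below. The keys and values here are hypothetical; the shape is the point — scalars returned inline, bulky artifacts referenced only by their registered storage keys (plain strings):

```python
# Hypothetical _execute return value: lightweight metadata inline,
# heavy artifacts referenced by registered storage keys (strings).
unit_output = {
    "lambda_window": 3,                       # small scalar: return directly
    "free_energy_estimate": -4.2,             # small scalar: return directly
    "trajectory": "my_dag/my_unit/traj.dcd",  # hypothetical ctx.shared key
    "report": "my_dag/my_unit/report.html",   # hypothetical ctx.permanent key
}

# Everything in the record is cheap to serialize; no bulk payloads inside.
assert all(isinstance(v, (int, float, str)) for v in unit_output.values())
```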


Interaction with Protocols
--------------------------

``Protocol`` instances do not instantiate ``Context`` directly; they declare ``ProtocolUnit`` objects via ``Protocol._create``.
When an execution backend walks the resulting
``ProtocolDAG`` it constructs a ``Context`` for each unit using the DAG label, unit label, scratch directory, and the configured ``ExternalStorage`` implementations.
The backend might provide different ``ExternalStorage`` implementations (e.g., local filesystem, object store, cluster scratch) depending on where the work runs, but the ``Context`` API seen by ``ProtocolUnit`` authors stays consistent.

Because the execution backend is in charge of creating the contexts, protocol authors can rely on ``ctx`` always being populated with valid storage managers and paths that are safe to write to from distributed workers.


Migrating from legacy Context usage
-----------------------------------

Before ``Context`` was rewritten, it was a simple data class with two ``pathlib.Path`` handles: ``scratch`` and ``shared``.
Existing protocols adopted a variety of implicit conventions around those attributes.
Follow this checklist when migrating old protocols:

1. **Swap file paths to StorageManager APIs.** Calls like ``ctx.shared /
   "filename"`` should be replaced with the helper methods offered by
   :class:`.StorageManager`. For example:

   .. code-block:: python

       path = ctx.shared / "myfile.dat"

   becomes:

   .. code-block:: python

       ctx.shared.register("myfile.dat")

2. **Avoid storing heavy objects in Python outputs.** Older protocols often
   returned raw ``Path`` objects pointing at scratch files. Instead, register
   the file with the storage manager and return the storage key (a string) from
   ``_execute`` as shown in the example above. Downstream units can then call
   :meth:`.StorageManager.load`.

3. **Handle ctx.permanent.** There was no equivalent in the legacy API.
   Decide which results must persist between DAG executions and write them via
   ``ctx.permanent``. For migration you can start by mirroring whatever used
   to live in ``ctx.shared`` and refine later.

4. **Expect automatic cleanup.** Old contexts typically left stdout/stderr
   directories around. The new context removes these when the unit finishes.
   If your code tried to re-read the capture directories after ``_execute``
   returned, move that logic earlier or rely on the logged data captured in
   the ``ProtocolUnitResult``.

5. **Stop constructing Context manually.** Some bespoke execution scripts once
   instantiated ``Context(scratch=..., shared=...)`` by hand. That pattern is
   obsolete because the constructor now requires ``ExternalStorage`` objects.
   Instead, rely on the execution backend to build contexts. For unit tests
   use the helpers in ``gufe.storage.externalresource`` (e.g.,
   ``MemoryStorage``) to create the necessary storage instances.

If you need more information on how to use these concepts, check out :doc:`../how-tos/protocol`.
2 changes: 2 additions & 0 deletions docs/concepts/index.rst
@@ -12,4 +12,6 @@ utilize **gufe** APIs.
tokenizables
included_models
serialization
storage
context
logging
213 changes: 213 additions & 0 deletions docs/concepts/storage.rst
@@ -0,0 +1,213 @@
.. _concepts-storage:

How storage is handled in **gufe**
==================================

**gufe** abstracts storage into a reusable interface using the :class:`.ExternalStorage` abstract base class.
This abstraction enables the storage of any file or byte stream using various storage backends without changing application code.

Overview
--------

The storage system is designed to handle files (or byte data) that need to be stored in some location.
Instead of embedding the data, objects can store a reference (a unique string indicating the object's location, such as a path) to where the data is stored externally. This approach provides several benefits:

* **Efficiency**: Large objects don't need to be serialized multiple times
* **Flexibility**: Different storage backends (local filesystem, cloud storage, in-memory) can be used interchangeably
* **Deduplication**: The same data can be referenced by multiple objects
* **Lazy Loading**: Data is only loaded when needed
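The reference-based design behind these benefits can be illustrated with a toy key-value store (plain Python, not gufe's API): the payload is stored once, and any number of records point at it via its location string.

```python
# Toy illustration of storing by reference rather than embedding.
store: dict[str, bytes] = {}
store["data/traj-001"] = b"...large trajectory bytes..."

# Two records reference the same payload by its location string.
record_a = {"name": "complex_leg", "trajectory": "data/traj-001"}
record_b = {"name": "solvent_leg", "trajectory": "data/traj-001"}

# The bytes are stored once (deduplication) and only fetched on demand
# (lazy loading); the records themselves stay small.
assert store[record_a["trajectory"]] is store[record_b["trajectory"]]
```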

The Storage Architecture
-------------------------

The storage system consists of several key components:

``ExternalStorage`` Base Class
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The :class:`.ExternalStorage` abstract base class defines the interface that all storage implementations must provide. This class provides:

* **Store operations**: ``store_bytes()`` and ``store_path()`` to store data
* **Load operations**: ``load_stream()`` to retrieve data as a stream
* **Management**: ``exists()``, ``delete()``, and ``iter_contents()`` for managing stored data

All storage operations use a *location* string as an identifier for the stored data.

Storage Implementations
-----------------------

**gufe** provides several built-in implementations of :class:`.ExternalStorage`:

``FileStorage``
~~~~~~~~~~~~~~~

The :class:`.FileStorage` implementation stores data on the local filesystem. It requires a root directory path and organizes stored files using the location string as a relative path:

.. code-block:: python

    from pathlib import Path
    from gufe.storage.externalresource import FileStorage

    # Create a file storage backend
    storage = FileStorage(root_dir=Path("/path/to/storage"))

    # Store some data
    data = b"Hello, World!"
    storage.store_bytes("datasets/sample1.txt", data)

    # Check if data exists
    if storage.exists("datasets/sample1.txt"):
        # Load the data
        with storage.load_stream("datasets/sample1.txt") as stream:
            loaded_data = stream.read()
        assert loaded_data == data

    # Delete the data
    storage.delete("datasets/sample1.txt")

``FileStorage`` automatically creates any necessary parent directories when storing files.

``MemoryStorage``
~~~~~~~~~~~~~~~~~

The :class:`.MemoryStorage` implementation stores data in a Python dictionary. This is primarily useful for testing and prototyping:

.. code-block:: python

    from gufe.storage.externalresource import MemoryStorage

    # Create an in-memory storage backend
    storage = MemoryStorage()

    # Store some data
    data = b"Hello, World!"
    storage.store_bytes("datasets/sample1.txt", data)

    # Load the data back
    with storage.load_stream("datasets/sample1.txt") as stream:
        loaded_data = stream.read()

.. warning::

    ``MemoryStorage`` is not intended for production use and all data is lost when the Python process exits.

Implementing Custom Storage Backends
-------------------------------------

To create a custom storage backend, subclass :class:`.ExternalStorage` and implement all the abstract methods:

.. code-block:: python

    from gufe.storage.externalresource.base import ExternalStorage
    from typing import ContextManager

    class MyCustomStorage(ExternalStorage):
        """A custom storage implementation."""

        def _store_bytes(self, location: str, byte_data: bytes):
            """Store bytes at the given location."""
            # Implement storage logic
            pass

        def _store_path(self, location: str, path):
            """Store a file at the given path."""
            # Implement storage logic
            pass

        def _load_stream(self, location: str) -> ContextManager:
            """Return a context manager that yields a bytes-like object."""
            # Implement loading logic
            pass

        def _exists(self, location: str) -> bool:
            """Check if data exists at the location."""
            # Implement existence check
            pass

        def _delete(self, location: str):
            """Delete data at the location."""
            # Implement deletion logic
            pass

        def _get_filename(self, location: str) -> str:
            """Return a filename for the location."""
            # Implement filename generation
            pass

        def _iter_contents(self, prefix: str = ""):
            """Iterate over stored locations matching the prefix."""
            # Implement iteration logic
            pass

        def _get_hexdigest(self, location: str) -> str:
            """Return MD5 hexdigest of the data (optional override)."""
            # Can override for performance improvements
            pass

.. note::

    All storage methods should be blocking operations, even if the underlying storage backend supports asynchronous operations.
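To see the skeleton above filled in, here is a minimal dict-backed sketch of the same interface. It is written standalone (it does not subclass gufe's ``ExternalStorage``, and the method names drop the leading underscore) so the behaviour of each operation is easy to follow:

```python
import hashlib
import io
from contextlib import contextmanager

class DictBackedStorage:
    """Standalone sketch of the ExternalStorage interface (not gufe's class)."""

    def __init__(self):
        self._data: dict[str, bytes] = {}

    def store_bytes(self, location: str, byte_data: bytes):
        self._data[location] = byte_data

    def store_path(self, location: str, path):
        with open(path, "rb") as f:
            self._data[location] = f.read()

    @contextmanager
    def load_stream(self, location: str):
        # Yield a bytes-like stream, mirroring load_stream's contract
        yield io.BytesIO(self._data[location])

    def exists(self, location: str) -> bool:
        return location in self._data

    def delete(self, location: str):
        del self._data[location]

    def get_filename(self, location: str) -> str:
        return location.split("/")[-1]

    def iter_contents(self, prefix: str = ""):
        yield from (loc for loc in self._data if loc.startswith(prefix))

    def get_hexdigest(self, location: str) -> str:
        return hashlib.md5(self._data[location]).hexdigest()

storage = DictBackedStorage()
storage.store_bytes("dag/unit/out.txt", b"payload")
assert storage.exists("dag/unit/out.txt")
with storage.load_stream("dag/unit/out.txt") as stream:
    assert stream.read() == b"payload"
assert list(storage.iter_contents("dag/")) == ["dag/unit/out.txt"]
```

The same blocking, location-string-driven shape carries over to real backends, whatever the storage medium.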

StorageManager
--------------

The :class:`.StorageManager` class provides a higher-level interface for managing storage operations within a computational workflow.

.. note::

    ``StorageManager`` is largely used by the :class:`.Context` class and should not be instantiated in protocols.
    In general, protocol developers will only use the ``register`` and ``load`` functions.

It handles the transfer of files between a scratch directory and external storage (such as shared or permanent storage):

.. code-block:: python

    from pathlib import Path
    from gufe.storage import StorageManager
    from gufe.storage.externalresource import FileStorage

    # Set up storage
    storage = FileStorage(root_dir=Path("/path/to/storage"))
    scratch_dir = Path("/path/to/scratch")

    # Create a storage manager for a specific DAG and unit
    manager = StorageManager(
        scratch_dir=scratch_dir,
        storage=storage,
        dag_label="my_experiment",
        unit_label="transformation_1",
    )

    # Register files for later transfer
    out = manager.register("trajectory.dcd")
    out2 = manager.register("results.json")
    # Note: out and out2 are pre-namespaced values that allow storage
    # items to be passed around

    # Transfer all registered files to external storage
    manager._transfer()

    # Load files from external storage
    trajectory_data = manager.load(out)
    results_json = manager.load(out2)

The ``StorageManager`` uses a namespace combining the ``dag_label`` and ``unit_label`` to organize files in the external storage backend.
To see how these work in practice see our documentation on :doc:`context`.


Error Handling
--------------

The storage system defines several exceptions in :mod:`gufe.storage.errors`:

* :class:`.ExternalResourceError`: Base class for storage-related errors
* :class:`.MissingExternalResourceError`: Raised when attempting to access non-existent data
* :class:`.ChangedExternalResourceError`: Raised when metadata verification fails

These exceptions can be caught and handled appropriately in application code:

.. code-block:: python

    from gufe.storage.errors import MissingExternalResourceError

    try:
        with storage.load_stream("nonexistent_file.txt") as stream:
            data = stream.read()
    except MissingExternalResourceError:
        print("File not found in storage")