Refactor the `services` module

The `services` module was intended to loosely follow the **Dependency Injection** pattern: the `services` module itself serves as the container and providers for services can be registered, so that the framework's dependencies can always be imported neatly and only resolve themselves from the registered providers once they are actually used. The code is now rather messy, I don't think the intended use cases are entirely covered, and the semantics are quite unclear.

* Why does it even exist? Because it wants to make sure the framework and user code can safely import all their dependencies at import time. There are 2 tricky cases where that was usually impossible:
  * Optional dependencies brought in by sub-features that not all users have installed. This is mostly resolved by the plugin pattern we have now: for example, `nest` is no longer an optional dependency of the framework (`bsb-core`), but is used only within the plugin `bsb-nest`. One case remains, which is `MPI`.
  * Dependencies that could differ based on user needs, for which a user might want to choose their own provider. For example, we parallelize job distribution over `mpipool`, but one might want to use `multiprocessing` or something else (although at this point I think the `JobPool` relies on too many implementation details of the `MPIExecutor` to unravel those); the lock service could also use something other than `MPILock`, etc.

In order to have a stable import but lazy-load the implementation based on user choices like this, the entire `services` module tries to proxy everything, and can't be used at import time. I'm not sure this has been a great design choice; it's how the dependency injection pattern works, but those patterns are always bound to their container and do not work globally at import time. So to address our need I suggest we break away from that part of the pattern. The reason it exists in DI is that it makes it easy to swap out providers in different contexts, like testing. Since we're dealing with things that are only environment specific, and not application-context specific, it doesn't add any benefit here (whether we need to use `mpi4py` depends only on what's installed on the machine, and doesn't differ between compiling and testing). All our providers can therefore be resolved very easily at import time, and we don't need a complicated DI provider hierarchy.

So if we drop that part of what inspired the `bsb.services` system, we can simplify it a lot.

## The general idea

The framework defines a set of services: services are submodules of the `bsb.services` module, and each represents an abstraction around a package dependency. When the `bsb.services` module is constructed, it resolves the providers the user configured (or the framework default) for each service. Then, still at import time, we swap out the submodules before any consumer can touch any of the submodule items (because importing `bsb.services.*` first imports `bsb.services`).

## The proposal

The `bsb.services.*` submodules will serve as "reference modules". They can provide all the necessary stubs and/or type hints for IDE tools etc. to work; for example, `from bsb.services.mpi import MPI` would work and know which elements exist on the singleton. And developers would know what they have to implement.

The user can configure an ordered list of providers to use for each service. If none of the providers are available for a service, the framework raises an error.

Service providers can be registered through the typical plugin package metadata entry point group `bsb.providers`, by advertising a **module** according to the Python entry points specification, with the following special naming convention:

```
<bsb_service_name>_<provider_name> = "my.module"
```

with an optional 2nd paired entry point:

```
<bsb_service_name>_<provider_name>_loader = "other.module:my_loader"
```

The first entry point specifies the provider module, which will be loaded if the provider is used. The service name is the name of the submodule. The provider name is what the user uses to choose providers, in either env or project options (script and CLI options are unavailable because they cannot be determined at import time). Examples:

```
BSB_PROVIDE_MPI=mpi4py bsb compile  # Run the framework with the service provider `mpi4py` or error
BSB_PROVIDE_MPI=mpi4py,default bsb compile  # Run the framework with the service provider `mpi4py` or use the framework default (which noops in serial and errors in parallel)
```
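A minimal sketch of how such a value could be turned into the ordered provider list. It reads the environment directly, whereas the proposal goes through the options system; `requested_providers` is a hypothetical helper, and falling back to `["default"]` when nothing is configured is an assumption.

```python
import os


def requested_providers(service, environ=None, default=("default",)):
    """Return the ordered provider names configured for ``service``."""
    environ = os.environ if environ is None else environ
    raw = environ.get(f"BSB_PROVIDE_{service.upper()}", "")
    # Split on commas, ignoring stray whitespace and empty items.
    names = [part.strip() for part in raw.split(",") if part.strip()]
    # Assumption: with nothing configured, only the framework default is tried.
    return names or list(default)


providers = requested_providers("mpi", {"BSB_PROVIDE_MPI": "mpi4py,default"})
```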

Then we need to find the "loader". The loader can do the things that the provider needs done once it is actually going to be used, such as setting up or configuring things. The loader can also raise a `ProviderUnavailableError`, in which case we skip the provider and go to the next one. The loader is resolved in this order:

* Check for a 2nd paired entry point; if it exists, use the advertised object as loader.
* Import the module and check for a `_bsb_load_provider` function; if it exists, use it as loader.

If no loader exists, we unconditionally and immediately try to use the module as provider for the service. A service provider that expects to be used conditionally and that has long import times, complicated initialization logic, or impure side effects (I'm basically talking about `import mpi4py.MPI` for every point in this list) should therefore provide a loader via the paired entry point, so that it doesn't cause import errors or side effects when we're only trying to check whether the provider is available.

I think this system is quite simple to implement, in `bsb/services/__init__.py`:

```
import pkgutil

from bsb.options import get_module_option

# `resolve_provider_module` still has to be written: it implements the
# provider and loader resolution described above.
for _finder, service, _ispkg in pkgutil.iter_modules(__path__):
    resolve_provider_module(service, get_module_option(f"provide_{service}"))

The logic described above can then be implemented using `importlib.metadata` and `importlib`. If we place a module in `sys.modules[f"bsb.services.{service}"]`, the import system should never actually begin importing that file, and will use the module object we put there instead. If that doesn't work because the import machinery is already running, we can still add a call in each submodule to replace itself in `sys.modules`, which is supported and endorsed.
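The `sys.modules` swap can be tried out in isolation. The sketch below uses a throwaway in-memory package instead of `bsb.services`, so it runs without the framework installed; `install_provider` and the `demo_*` names are hypothetical.

```python
import importlib
import sys
import types


def install_provider(service, provider_module, package="bsb.services"):
    """Make ``{package}.{service}`` resolve to ``provider_module``."""
    sys.modules[f"{package}.{service}"] = provider_module
    parent = sys.modules.get(package)
    if parent is not None:
        # Keep attribute access (`from package import service`) consistent.
        setattr(parent, service, provider_module)


# Throwaway package standing in for `bsb.services`.
package = types.ModuleType("demo_services")
package.__path__ = []  # an empty search path marks the module as a package
sys.modules["demo_services"] = package

provider = types.ModuleType("demo_provider")
provider.MPI = "stand-in for the MPI singleton"
install_provider("mpi", provider, package="demo_services")

# Served straight from sys.modules: no `mpi` file is searched for or executed.
from demo_services.mpi import MPI
```

Because the import system checks `sys.modules` before it looks for a file, the swap has to happen while the parent package is being imported, which is what running it from `bsb/services/__init__.py` would achieve.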
