Description
I observed some very interesting behavior while using this crate in a few services of mine: running multiple tasks on the tokio runtime along with a nitox client will cause the other tasks to never be executed, or to receive very minimal execution, while nitox is running.
Of course, first I suspected that it was my code, but I reviewed it thoroughly and did not find anything obvious which would block the runtime. After reviewing the networking code in this crate, it looks like parking_lot::RwLock is used as a fairly fundamental building block for the nats client and the connection multiplexer. This is what is causing the blocking.
The parking_lot crate does provide smaller and more efficient locking mechanisms than the std::sync primitives, but they still block the calling thread while waiting to acquire read/write access, and the documentation states this explicitly. Take write for example: parking_lot's RwLock type def is built on lock_api::RwLock under the hood, and its docs state:
Locks this RwLock with exclusive write access, blocking the current thread until it can be acquired. [...]
The blocking locks are used primarily in the client, net/connection & net/connection_inner.
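For illustration, here is a minimal, hypothetical sketch (not code from nitox) of the failure mode: a blocking parking_lot write lock taken inside a future parks the executor thread until the lock is acquired, and while that thread is parked no other task scheduled on it can be polled.

```rust
// Hypothetical sketch, not nitox code: a blocking parking_lot lock taken
// from inside a future parks the executor thread until it is acquired.
use parking_lot::RwLock;
use std::sync::Arc;

fn update_state(state: &Arc<RwLock<Vec<u8>>>) {
    // If another thread holds the lock, `write()` blocks (parks) the
    // current thread. Called from a future's poll(), that means the
    // event loop thread stops polling every other task until the lock
    // is released.
    let mut guard = state.write();
    guard.push(1);
}
```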
what do we do about this?
First, I think that this is not a good thing. We definitely do not want nitox to block the event loop, because then it essentially defeats the purpose of using an event loop. Pretty sure we can all agree with that.
As a path forward, I would propose that we use a non-blocking, message-passing based approach, a pattern which I have used quite successfully in the past. Details below.
message passing instead of locks
- the public nats client type will continue to present roughly the same interface.
- when the nats client is instantiated/connected, it will spawn a new private task onto the runtime which owns the connection. Just for reference here, we can call that spawned task the daemon (a rough sketch of this pattern follows the list).
- when the daemon is first spawned, it will be given a futures mpsc receiver. This mpsc channel will communicate oneshot channels of Result<NatsClient, NatsError> or the like.
- the public nats client will create a oneshot channel per request, send its sender half over the mpsc channel to request a clone of the nats connection which the daemon is managing, and await the reply on the receiver half.
- because the daemon owns the connection, the memory does not need to be shared and therefore no locks are needed. This applies to the multiplexer as well. The daemon's communication channel (or channels) will be able to receive commands to update the multiplexer for new subscriptions &c.
- we can also set up the public nats client in such a way that when it is dropped, it will issue a command to the daemon which will cause it to shut down and clean up its resources. This will resolve #22 (Clients never disconnect, even after manual drop) & #6 (How to Disconnect Client, which is not actually resolved).
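To make the proposal concrete, here is a rough, hypothetical sketch of the daemon pattern. It is written against modern tokio::sync channel types purely for brevity (nitox itself currently targets futures 0.1), and every name in it (Command, daemon, NatsClientHandle, Connection) is illustrative rather than an actual nitox API.

```rust
// Hypothetical sketch of the proposed daemon pattern; uses tokio::sync
// for brevity. None of these names are actual nitox APIs.
use tokio::sync::{mpsc, oneshot};

// Stand-in for the connection + multiplexer state the daemon would own.
#[derive(Clone, Debug)]
struct Connection {
    addr: String,
}

// Commands the public client handle can send to the daemon.
enum Command {
    // Request a clone of the managed connection; the daemon replies on
    // the provided oneshot sender.
    GetConnection(oneshot::Sender<Result<Connection, String>>),
    // Ask the daemon to shut down and clean up its resources.
    Shutdown,
}

// The daemon task exclusively owns the connection, so no locks are needed.
async fn daemon(mut rx: mpsc::Receiver<Command>, conn: Connection) {
    while let Some(cmd) = rx.recv().await {
        match cmd {
            Command::GetConnection(reply) => {
                // Ignore the error if the requester stopped waiting.
                let _ = reply.send(Ok(conn.clone()));
            }
            Command::Shutdown => break,
        }
    }
    // `conn` is dropped here, closing the underlying resources.
}

// Public handle presenting roughly the same interface as today's client.
struct NatsClientHandle {
    tx: mpsc::Sender<Command>,
}

impl NatsClientHandle {
    fn connect(addr: &str) -> Self {
        let (tx, rx) = mpsc::channel(16);
        let conn = Connection { addr: addr.to_owned() };
        // Spawn the privately owned daemon task onto the runtime.
        tokio::spawn(daemon(rx, conn));
        Self { tx }
    }

    // Ask the daemon for a clone of the connection it manages.
    async fn connection(&self) -> Result<Connection, String> {
        let (reply_tx, reply_rx) = oneshot::channel();
        self.tx
            .send(Command::GetConnection(reply_tx))
            .await
            .map_err(|_| "daemon has shut down".to_string())?;
        reply_rx
            .await
            .map_err(|_| "daemon dropped the request".to_string())?
    }
}

impl Drop for NatsClientHandle {
    fn drop(&mut self) {
        // Best-effort: tell the daemon to shut down and free its resources.
        let _ = self.tx.try_send(Command::Shutdown);
    }
}
```

Because only the daemon task ever touches the connection and multiplexer state, all coordination happens over channels and no RwLock is needed anywhere on the hot path, and the Drop impl gives us the disconnect-on-drop behavior mentioned above for #22 / #6.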