Skip to content

Conversation

@MortronMeymo
Copy link

If close happened immediately after disconnect the following happened:

  1. disconnect sets comm state to DISCONNECTING and starts a new thread to actually do the disconnect.
  2. close sees this state and only sets closePending to true.
  3. The new thread from 1 does some cleanup and sends the MQTT disconnect message and waits for it to be sent.
  4. The disconnect thread calls shutdownConnection(...) which simply returns and does nothing if closePending is true.

So if 2 happens before 4 no one executes the shutdownConnection logic and the client is left in a unusable but also not cleaned up state.

Triggering this race condition is extremly easy if someone is just trying to properly cleanup a connected client with the following code:
client.disconnect();
client.close();

Simply removing the check for closePending in the start of shutdownConnection fixes this.
Removing this check should also not introduce new issues, since the close() itself does not cause a call to shutdownConnection. Which means that if there is another race condition here then such a condition can already be triggered by other means as well. There is a small gap where in theory the user code could attempt to create a new connection, but that will fail,
since the connect call also checks the closePending flag.

Closes: 1093

Please make sure that the following boxes are checked before submitting your Pull Request, thank you!

  • This change is against the develop branch, not master.
  • You have signed the Eclipse ECA
  • All of your commits have been signed-off with the correct email address (the same one that you
    used to sign the CLA) Hint: use the -s argument when committing.
  • If This PR fixes an issue, that you reference the issue below. OR if this is a new issue that
    you are fixing straight away that you add some Description about the bug and how this will fix it.
  • If this is new functionality, You have added the appropriate Unit tests.

If close happened immediately after disconnect the following happened:
1. disconnect sets comm state to DISCONNECTING
   and starts a new thread to actually do the disconnect.
2. close sees this state and only sets closePending to true.
3. The new thread from 1 does some cleanup
   and sends the MQTT disconnect message and waits for it to be sent.
4. The disconnect thread calls shutdownConnection(...)
   which simply returns and does nothing if closePending is true.

So if 2 happens before 4 no one executes the shutdownConnection logic
and the client is left in a unusable but also not cleaned up state.

Triggering this race condition is extremly easy if someone is just
trying to properly cleanup a connected client with the following code:
  client.disconnect();
  client.close();

Simply removing the check for closePending in the start of
shutdownConnection fixes this.
Removing this check should also not introduce new issues,
since the close() itself does not cause a call to shutdownConnection.
Which means that if there is another race condition here
then such a condition can already be triggered by other means as well.
There is a small gap where in theory the user code could attempt
to create a new connection, but that will fail,
since the connect call also checks the closePending flag.

Closes: eclipse-paho#1093
Signed-off-by: Morten Mey <morten.mey@esrlabs.com>
@MortronMeymo
Copy link
Author

Closes #1093

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant