Skip to content

examples/dtls-sock: fix timeout msg stays in mbox#13495

Merged
MichelRottleuthner merged 2 commits intoRIOT-OS:masterfrom
pokgak:pr/sock_dtls/fix_timeout_when_waiting
Mar 9, 2020
Merged

examples/dtls-sock: fix timeout msg stays in mbox#13495
MichelRottleuthner merged 2 commits intoRIOT-OS:masterfrom
pokgak:pr/sock_dtls/fix_timeout_when_waiting

Conversation

@pokgak
Copy link
Contributor

@pokgak pokgak commented Feb 27, 2020

Contribution description

This PR fix the timeout issue mentioned in #12907 (comment).

When executing dtlss start, then waiting for ~20 seconds and then running dtlsc I get the same as you. [timeout]

10 seconds is the default timeout used for a handshake in sock_dtls. The problem is that the timeout triggers after a successful decrypt, sending a msg to the mbox but this message is not read/removed from the mbox because of the successful decryption. Because of that the next time mbox is checked, it will immediately return the timeout msg from last time.

Testing procedure

  1. start a dtls server and wait at least 10 seconds.
  2. send a message from a dtls client to the server after the 10 seconds.

With this PR, the client should not return a timeout error (110).

@benpicco benpicco added Area: pkg Area: External package ports Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Feb 27, 2020
Copy link
Contributor

@MichelRottleuthner MichelRottleuthner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested it on native and it solves the lockup/timing issue. Also see the comments below.

Copy link
Contributor

@MichelRottleuthner MichelRottleuthner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested with examples/dtls-sock on native and iotlab-m3 and confirm it works.
Code looks fine too -> ACK.

Just for the record: I also tried it on pba-d-01-kw2x but it failed with Error creating session: zd on the client side. Also the %z string format seems to be unsupported at least on iotlab-m3 and pba-d-01-kw2x so the print messages don't show proper numbers.
But that is also the case on master so no blocker for this PR.
@pokgak can you look into these issues?

@cgundogan
Copy link
Member

Does this PR obsoletes #12959 ?

@MichelRottleuthner MichelRottleuthner added the CI: needs squashing Commits in this PR need to be squashed; If set, CI systems will mark this PR as unmergable label Feb 28, 2020
@MichelRottleuthner
Copy link
Contributor

Hmm after looking at #12959 - as it is implemented now you could actually just get rid of the timer. Is there still a reason to keep it?

@pokgak
Copy link
Contributor Author

pokgak commented Feb 28, 2020

Hmm after looking at #12959 - as it is implemented now you could actually just get rid of the timer. Is there still a reason to keep it?

yeah we could get rid of xtimer_set, but timeout should still be recalculated after sock_udp_recv.

Does this PR obsoletes #12959 ?

@cgundogan because this PR removes sending timeout message, there will be no timeout message to fill the mbox anymore. So I would say, yes. If you have a minimal setup to reproduce the issue we should test this, just to be sure.

@pokgak
Copy link
Contributor Author

pokgak commented Feb 28, 2020

I missed the timers in sock_dtls_send before. Added the changes in 629607f.

I also tried it on pba-d-01-kw2x but it failed with Error creating session: zd on the client side.

Hmm this should not be the case. Can you enable ENABLE_DEBUG in sock_dtls.c and paste the output here?

Also the %z string format seems to be unsupported at least on iotlab-m3 and pba-d-01-kw2x so the print messages don't show proper numbers.

I found this in the RIOT coding conventions:

  • When printing a size_t
    * use %u and cast the variable to (unsigned) because newlib-nano does
    not support %zu

I put all %zd to %d changes in a45744d. I don't have the board to test this if you could confirm that this work maybe we could include this too.

@MichelRottleuthner
Copy link
Contributor

I tried again on pba-d-01-kw2x. The string format is good now. Also the problem with the board comes form misbehaving AUTO_CCA of the radio and thus is unrelated to this PR / tinydtls.

I put all %zd to %d changes in a45744d. I don't have the board to test this if you could confirm that this work maybe we could include this too.

I'm fine with including it, just leave it as a separate commit.

yeah we could get rid of xtimer_set, but timeout should still be recalculated after sock_udp_recv.

Removing the timer (timeout callback) would be best I think. sock_udp_recv already has a timeout function and the handle_message call cannot be interrupted anyway. Updating the timeout is then still needed of course.

@pokgak
Copy link
Contributor Author

pokgak commented Mar 3, 2020

I removed the timeout callback in sock_dtls_recv.

For sock_dtls_send, I replaced the timeout callback with xtimer_msg_receive_timeout. To test that the timeout here is working, I use the following patch:

diff --git a/examples/dtls-sock/dtls-client.c b/examples/dtls-sock/dtls-client.c
index 78691d69a..41a031130 100644
--- a/examples/dtls-sock/dtls-client.c
+++ b/examples/dtls-sock/dtls-client.c
@@ -122,13 +122,13 @@ static int client_send(char *addr_str, char *data, size_t datalen)
         return -1;
     }

-    res = sock_dtls_session_create(&dtls_sock, &remote, &session);
-    if (res < 0) {
-        printf("Error creating session: %d\n", (int)res);
-        sock_dtls_close(&dtls_sock);
-        sock_udp_close(&udp_sock);
-        return -1;
-    }
+    // res = sock_dtls_session_create(&dtls_sock, &remote, &session);
+    // if (res < 0) {
+    //     printf("Error creating session: %d\n", (int)res);
+    //     sock_dtls_close(&dtls_sock);
+    //     sock_udp_close(&udp_sock);
+    //     return -1;
+    // }

     if (sock_dtls_send(&dtls_sock, &session, data, datalen) < 0) {
         puts("Error sending data");
diff --git a/pkg/tinydtls/contrib/sock_dtls.c b/pkg/tinydtls/contrib/sock_dtls.c
index 07629d96a..5e3941b91 100644
--- a/pkg/tinydtls/contrib/sock_dtls.c
+++ b/pkg/tinydtls/contrib/sock_dtls.c
@@ -20,7 +20,7 @@
 #include "net/sock/dtls.h"
 #include "net/credman.h"

-#define ENABLE_DEBUG (0)
+#define ENABLE_DEBUG (1)
 #include "debug.h"
 #include "dtls_debug.h"

and send a message from the client to the server like usual. You should see the client timed out on the shell:

dtlsc fe80::4444:79ff:fe39:b840 heyy
sock_dtls: event connect
sock_dtls: handshake process timed out
Error sending data
Terminating

With that there are no more timeout message put into the mbox. :)

@aabadie aabadie added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Mar 3, 2020
@MichelRottleuthner
Copy link
Contributor

@pokgak please rebase and squash

@pokgak pokgak force-pushed the pr/sock_dtls/fix_timeout_when_waiting branch from b0f3614 to e0a0270 Compare March 9, 2020 14:35
@pokgak
Copy link
Contributor Author

pokgak commented Mar 9, 2020

Rebased and squashed. I tested again with the test case described in OP and normal send between client and server on native and samr21-xpro. All seems okay.

@MichelRottleuthner MichelRottleuthner added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: needs squashing Commits in this PR need to be squashed; If set, CI systems will mark this PR as unmergable CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Mar 9, 2020
Copy link
Contributor

@MichelRottleuthner MichelRottleuthner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested this again -> works as expected, (re-)ACK

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: pkg Area: External package ports CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants