Skip to content

gnrc_netif: only use prefix matching as tie-breaker in source selection#12408

Merged
fjmolinas merged 6 commits intoRIOT-OS:masterfrom
miri64:gnrc_netif/fix/src-addr-selection
Oct 24, 2019
Merged

gnrc_netif: only use prefix matching as tie-breaker in source selection#12408
fjmolinas merged 6 commits intoRIOT-OS:masterfrom
miri64:gnrc_netif/fix/src-addr-selection

Conversation

@miri64
Copy link
Member

@miri64 miri64 commented Oct 9, 2019

Contribution description

When source address selection is done, both RFC and comments in the code state, that a longest prefix match should only be used as a tie-breaker between more than one viable candidate. If there is only one address, there is

a) no need for a tie-breaker
b) in the case of either the destination address or the single remaining address being ULAs (which are considered to be of global scope) possibly not matching, as fd00::/7 and e.g. 2001::/8 do not have a common prefix.

(b) in fact causes the match function to return -1, causing the source address selection to return -1, causing the outer function to return the first address it found (which most often is the link-local address), causing e.g. a ping to an ULA to fail, even is there is a global address.

I also piggy-backed a reduction of code duplication (so I can set idx just at one point) and added some doc on candidate creation, because some things were unclear to me when I tried to wrap my head around the problem.

Testing procedure

  • Pinging the link-local address between two native instances should still select the link-local address of the sender (I tested with examples/gnrc_networking on native and sniffed tapbr0 while pinging):
    ifconfig 6 add 2001:db8::1
    ping6 fe80::d826:95ff:fe26:f1d5
    
  • When merging gnrc/netif: don't treat link local differently in prefix matching #12404, pinging fd00:dead:beef::1 from a examples/gnrc_networking node via an examples/gnrc_border_router node should still work (after the global addresses are set up on all nodes).
  • make -C tests/gnrc_netif flash test should still work for a board of your choice and should not work if you revert the fix to the prefix matching:
    • git revert HEAD~2; git revert HEAD~4; make -C tests/gnrc_netif

Issues/PRs references

Fixes an issue in source address selection with ULAs becoming obvious in #12404. Required to make #12404 mergable. (#12404 now is included here)

Requires #12425 for address resolution to still work fully (and tests/gnrc_ipv6_nib to pass)

@miri64 miri64 added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Area: network Area: Networking CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Oct 9, 2019
@miri64 miri64 added this to the Release 2019.10 milestone Oct 9, 2019
@miri64 miri64 requested review from kaspar030 and waehlisch October 9, 2019 16:17
@miri64
Copy link
Member Author

miri64 commented Oct 9, 2019

After thinking about this some more, I think using _match_to_idx() in this function is completely the wrong approach, as if we have e.g. two global addresses starting with 0000::/1 assigned to the interface that win the source address selection so far and the destination address starts with 1000::/1 it will return -1 as none of the addresses selected matches the destination address by more than 1 bit, causing the source address selection to pick first_candidate (which is typically is the auto-generated link-local address). Rather, we should compare the best matches of the remaining addresses with the destination, as rule 8 states, and pick the best (or either one of them if they match equally best, even if it is 0).

@miri64
Copy link
Member Author

miri64 commented Oct 9, 2019

Done.

unsigned match;

if ((netif->ipv6.addrs_flags[i] == 0) ||
((filter != NULL) && _addr_anycast(netif, i)) ||
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See RFC 6724, page 30, point 3: This can be removed without alternative.

@miri64
Copy link
Member Author

miri64 commented Oct 10, 2019

I added a test for the corner case, that popped up due to the fix in #12404.

@miri64 miri64 force-pushed the gnrc_netif/fix/src-addr-selection branch from 279a079 to 9d98410 Compare October 10, 2019 11:12
@miri64
Copy link
Member Author

miri64 commented Oct 10, 2019

Squashed, rebased, added the commit from #12404 and added some more test cases.

@miri64
Copy link
Member Author

miri64 commented Oct 10, 2019

@fjmolinas @kb2ma can one of you maybe give this a look?

@miri64
Copy link
Member Author

miri64 commented Oct 10, 2019

@kaspar030 made me aware off-line of section 2.2 in RFC 6724 regarding the function "commonPrefixLen()" that is used in rule 8. I adapted the code accordingly.

@kb2ma
Copy link
Member

kb2ma commented Oct 10, 2019

Looking into this now. The decision to merge this PR is the last issue before generation of the 2019.10 release branch.

@miri64
Copy link
Member Author

miri64 commented Oct 10, 2019

Pushed a bit too early and added some fixes to that function now.

@miri64
Copy link
Member Author

miri64 commented Oct 10, 2019

Sorry, yet another issue... I'll just say when I'm done... :(

@kb2ma
Copy link
Member

kb2ma commented Oct 10, 2019

No worries, @miri64. I need a little time to understand the issues anyway.

@miri64
Copy link
Member Author

miri64 commented Oct 10, 2019

Section 2.2 in RFC 6724 actually makes one of the test cases invalid, so I removed it. The tests are now passing again. Sorry for these last minute changes.

@miri64
Copy link
Member Author

miri64 commented Oct 10, 2019

@kb2ma do you mind if I squash? I somehow messed up my fixup commits, I just noticed...

@kb2ma
Copy link
Member

kb2ma commented Oct 10, 2019

Go right ahead. I would likely skip over the intermediate results anyway.

@miri64 miri64 force-pushed the gnrc_netif/fix/src-addr-selection branch from 64fbe48 to be51b00 Compare October 10, 2019 13:16
@miri64
Copy link
Member Author

miri64 commented Oct 10, 2019

Squashed

Go right ahead. I would likely skip over the intermediate results anyway.

At least for the review I think it helps if you go through it commit-wise (not all tests work for all commits though), as I tried to make my thinking process clearer with the history. Additionally, the PR contains some clean-up of now unnecessary code, that resulted from fixing the bugs and adapting it more in line with RFC 6724.

@miri64 miri64 force-pushed the gnrc_netif/fix/src-addr-selection branch from be51b00 to 2ad840e Compare October 10, 2019 13:21
@miri64
Copy link
Member Author

miri64 commented Oct 10, 2019

(I'm now done, btw)

@kb2ma
Copy link
Member

kb2ma commented Oct 10, 2019

Sounds good. I've got a basic handle on the issues. Is the testing procedure in the PR description still valid?

@miri64
Copy link
Member Author

miri64 commented Oct 10, 2019

Yepp

@miri64
Copy link
Member Author

miri64 commented Oct 10, 2019

Except for the merging of #12404 part in (2), as that one is part of this PR now ;-).

@miri64 miri64 removed the State: waiting for other PR State: The PR requires another PR to be merged first label Oct 21, 2019
@miri64 miri64 force-pushed the gnrc_netif/fix/src-addr-selection branch from 9c81c73 to d7c4eec Compare October 21, 2019 14:10
@miri64
Copy link
Member Author

miri64 commented Oct 21, 2019

Rebased to current master, no longer waiting for other PR.

@miri64 miri64 requested a review from a team October 21, 2019 14:12
* address.
* (so don't consider tentative addresses for source address
* selection) */
gnrc_netif_ipv6_addr_dad_trans(netif, i)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this line is not part of this PR, but could you clarify why addresses are not considered that have gnrc_netif_ipv6_addr_dad_trans() != 0? According to the doc of this function, the return value is:

the number of duplicate address detection transmissions already performed

How can we deduce that the address is tentative from that result?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number of DAD transmissions is encoded in the tentative state in our DAD implementation (that's why you see TNT[1..3] with the ifconfig command, not just TNT). So a value of 0 for already performed DAD transmissions means the address is not tentative. As the NS for DAD is sent as soon as the address is added, this is also true if you don't have the knowledge of such GNRC internals ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also doc for tentative state:

/**
* @brief Tentative states (with encoded DAD retransmissions)
*
* The retransmissions of DAD transmits can be decoded from this state by
* applying it as a mask to the [flags](gnrc_netif_ipv6_t::addrs_flags) of the
* address.
*/
#define GNRC_NETIF_IPV6_ADDRS_FLAGS_STATE_TENTATIVE (0x07U)

Copy link
Contributor

@fjmolinas fjmolinas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing:

  • Pinging the link-local address between two native instances should still select the link-local address of the sender

image

  • pinging fd00:dead:beef::1 from a examples/gnrc_networking node via an examples/gnrc_border_router node should still work
ping6 fd00:dead:beef::1
2019-10-22 13:59:44,921 #  ping6 fd00:dead:beef::1
2019-10-22 13:59:45,987 # 12 bytes from fd00:dead:beef::1: icmp_seq=1 ttl=63 rssi=-33 dBm time=59.327 ms
2019-10-22 13:59:46,963 # 12 bytes from fd00:dead:beef::1: icmp_seq=2 ttl=63 rssi=-32 dBm time=36.240 ms
2019-10-22 13:59:47,920 # 
2019-10-22 13:59:47,924 # --- fd00:dead:beef::1 PING statistics ---
2019-10-22 13:59:47,929 # 3 packets transmitted, 2 packets received, 33% packet loss
2019-10-22 13:59:47,933 # round-trip min/avg/max = 36.240/47.783/59.327 ms
  • make -C tests/gnrc_netif flash test should still work for a board of your choice
.......................................................................
OK (71 tests)

Also this seems to fix the issue pointed out in #11818, it is always selecting the first lla EUI64 address.

@fjmolinas
Copy link
Contributor

git revert HEAD~2; git revert HEAD~4; make -C tests/gnrc_netif

@miri64 just to be sure after new commits have been added, can you point out the hash of the two commits to revert?

@miri64
Copy link
Member Author

miri64 commented Oct 24, 2019

git revert HEAD~2; git revert HEAD~4; make -C tests/gnrc_netif

@miri64 just to be sure after new commits have been added, can you point out the hash of the two commits to revert?

Sorry only noticed the activity in this PR now. I'll have a look at your review and then update :-).

@miri64
Copy link
Member Author

miri64 commented Oct 24, 2019

git revert HEAD~2; git revert HEAD~4; make -C tests/gnrc_netif

@miri64 just to be sure after new commits have been added, can you point out the hash of the two commits to revert?

Currently HEAD~2 would be fb1754e and HEAD~4 d8f6d82.

@fjmolinas
Copy link
Contributor

not work if you revert the fix to the prefix matching:

  • git revert HEAD2; git revert HEAD4; make -C tests/gnrc_netif

completed test procedure...

flags: 0x0
src_l2addr: 3E:E6:B5:22:FD:0B
dst_l2addr: 3E:E6:B5:22:FD:0A
~~ PKT    -  2 snips, total size:  61 byte
Timeout in expect script at "child.expect(r"OK \(\d+ tests\)")" (tests/gnrc_netif/tests/01-run.py:16)
  File "/home/francisco/workspace/RIOT/dist/pythonlibs/testrunner/__init__.py", line 29, in run
    testfunc(child)
  File "/home/francisco/workspace/RIOT/tests/gnrc_netif/tests/01-run.py", line 16, in testfunc
    child.expect(r"OK \(\d+ tests\)")
  File "/home/francisco/.local/lib/python3.6/site-packages/pexpect/spawnbase.py", line 341, in expect
    timeout, searchwindowsize, async_)
  File "/home/francisco/.local/lib/python3.6/site-packages/pexpect/spawnbase.py", line 369, in expect_list
    return exp.expect_loop(timeout)
  File "/home/francisco/.local/lib/python3.6/site-packages/pexpect/expect.py", line 119, in expect_loop
    return self.timeout(e)
  File "/home/francisco/.local/lib/python3.6/site-packages/pexpect/expect.py", line 82, in timeout
    raise TIMEOUT(msg)

/home/francisco/workspace/RIOT/tests/gnrc_netif/../../Makefile.include:710: recipe for target 'test' failed

@miri64
Copy link
Member Author

miri64 commented Oct 24, 2019

Note, because I was confused for a moment: The test is supposed to fail...

@fjmolinas
Copy link
Contributor

@miri64 please squash!

@fjmolinas
Copy link
Contributor

@miri64 please squash!

Lets trigger tests in murdock as well!

When source address selection is done, both RFC and comments in the code
state, that a longest prefix match should *only* be used as a
tie-breaker between more than one viable candidate. If there is only one
address, there is

a) no need for a tie-breaker
b) in the case of either the destination address or the single remaining
   address being ULAs ([which are considered to be of global scope]
   [RFC4193]) possibly not matching, as `fd00::/7` and e.g. `2001::/8`
   do not have a common prefix.

(b) in fact causes the match function to return -1, causing the source
address selection to return -1, causing the outer function to return the
first address it found (which most often is the link-local address),
causing e.g. a ping to an ULA to fail, even is there is a global
address.

[RFC4193]: https://tools.ietf.org/html/rfc4193
`_match_to_idx()` was removed from source address selection (which was
the only one setting the filter parameter to a non-NULL value), so it
is the parameter is not needed anymore.
- ULA destination with global address on interface
- Deprecated addresses
@miri64
Copy link
Member Author

miri64 commented Oct 24, 2019

Squashed.

@miri64 miri64 force-pushed the gnrc_netif/fix/src-addr-selection branch from eea3cf1 to fa14eea Compare October 24, 2019 14:04
@miri64 miri64 added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR CI: run tests If set, CI server will run tests on hardware for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Oct 24, 2019
@miri64 miri64 force-pushed the gnrc_netif/fix/src-addr-selection branch from fa14eea to 020af41 Compare October 24, 2019 14:06
@miri64
Copy link
Member Author

miri64 commented Oct 24, 2019

Removed the empty commit that happened somehow due to squashing the revert commit on the original.

Copy link
Contributor

@fjmolinas fjmolinas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Murdock si happy! ACK!

@fjmolinas fjmolinas merged commit 0364c39 into RIOT-OS:master Oct 24, 2019
@miri64 miri64 deleted the gnrc_netif/fix/src-addr-selection branch October 24, 2019 15:21
@miri64
Copy link
Member Author

miri64 commented Oct 24, 2019

Thanks for the review!

@fjmolinas
Copy link
Contributor

Thanks for the review!

Thanks for the fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: network Area: Networking CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR CI: run tests If set, CI server will run tests on hardware for the labeled PR Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants