Comments (11)

T-X commented on September 24, 2024

When hard-coding the ifindex to the one from ip6gre1 (the interface the PIM bootstrap message was received on) like this:

$ git diff
diff --git a/src/netlink.c b/src/netlink.c
index a1b2f1c..5d4fbd9 100644
--- a/src/netlink.c
+++ b/src/netlink.c
@@ -214,7 +214,7 @@ getmsg(struct rtmsg *rtm, int msglen, struct rpfctl *rpf)
     parse_rtattr(rta, RTA_MAX, RTM_RTA(rtm), msglen - sizeof(*rtm));
     
     if (rta[RTA_OIF]) {
-       int ifindex = *(int *) RTA_DATA(rta[RTA_OIF]);
+       int ifindex = 134;
        
        for (vifi = 0, v = uvifs; vifi < numvifs; ++vifi, ++v) {

Then the RP-Set is refreshed successfully:

---------------------------RP-Set----------------------------
Current BSR address: fd5c:725:2841::1 Prio: 0 Timeout: 100
RP-address(Upstream)/Group prefix             Prio Hold Age
fd5c:725:2841::1(fe80::1%ip6gre1)
     ff00::/8                                 0    150  90

02:44:00.280 NETLINK: ask path to fd5c:725:2841::1
02:44:00.280 NETLINK: vif 0, ifindex=134
02:44:00.280 NETLINK: gateway is fe80::5054:ff:fe0c:bbeb
---------------------------RP-Set----------------------------
Current BSR address: fd5c:725:2841::1 Prio: 0 Timeout: 155
RP-address(Upstream)/Group prefix             Prio Hold Age
fd5c:725:2841::1(fe80::1%ip6gre1)
     ff00::/8                                 0    150  145

Edit: The "NETLINK: gateway is fe80::5054:ff:fe0c:bbeb" line is still bogus, though. That address is my default router:

$ ip -6 r s | grep default
default via fe80::5054:ff:fe0c:bbeb dev wlp61s0 proto ra metric 1024 expires 1787sec hoplimit 64 pref medium

There should be no need for a gateway to reach the bootstrap router fd5c:725:2841::1 as it's an on-link host on ip6gre1.

from pim6sd.

troglobit commented on September 24, 2024

That looks really odd, can't say I fully understand your setup, but I think I may have a similar problem in my limited setup with CORE. With a simple one-router setup I can get it to work¹, but with a simple three-routers-in-a-row setup I cannot ... and I also see the loss of BSR on the edge routers.

I'm trying out a different setup right now and will continue debugging from my end. Hopefully it's the same root cause.


¹ I have to add the following two lines from pim6sd.conf(5)

cand_bootstrap_router;
cand_rp;

T-X commented on September 24, 2024

Here's a script which should reproduce the problem within network namespaces:

https://gist.github.com/T-X/a1352078c24f5419c4566f0290f58dde

It's a simple two router setup, each of them has their own network namespace. And they each have one network namespace attached to them for downstream hosts:

[NS:client0] <---> [NS:router0] <---> [NS:router1] <---> [NS:client1]

As you can see in pimtest-router1.log, the BSR is first added correctly but later times out, and in between the bootstrap messages get rejected.

The output is slightly different, but seems like the same issue:

19:17:28.578 NETLINK: ask path to fd5c:725:2841::1
19:17:28.578 warning - netlink get_route: Network is unreachable
19:17:28.579 receive_pim6_bootstrap: can't find a route to the BSR(fd5c:725:2841::1)

It's maybe a PIM specific thing (still learning / reading about it), but I'm confused that pim6sd tries to retrieve a unicast route to the BSR in userspace. Shouldn't that be something the kernel should just do?

T-X commented on September 24, 2024

And adding a default route to the end of setup() in the pimtest-bootstrap.sh makes things work:

$NSR1 ip -6 route add default dev wan0

=>

---------------------------RP-Set----------------------------
Current BSR address: fd5c:725:2841::1 Prio: 0 Timeout: 120
RP-address(Upstream)/Group prefix             Prio Hold Age
fd5c:725:2841::1(fe80::5ca2:fff:feee:be08%wan0)
     ff00::/8                                 0    150  110

20:05:22.359 NETLINK: ask path to fd5c:725:2841::1
20:05:22.359 NETLINK: vif 0, ifindex=4
---------------------------RP-Set----------------------------
Current BSR address: fd5c:725:2841::1 Prio: 0 Timeout: 155
RP-address(Upstream)/Group prefix             Prio Hold Age
fd5c:725:2841::1(fe80::5ca2:fff:feee:be08%wan0)
     ff00::/8                                 0    150  145

However I believe that should not be necessary. Especially as the interface with the default route might be an interface which was not added to / was disabled for pim6sd.

troglobit commented on September 24, 2024

Awesome, thanks for the script that will come in handy! I believe CORE also uses namespaces to set things up, only using a GUI. It's really handy since it can start up OSPFv3 and other things if you want.

Anyway, I get the same type of netlink error/warning in my setup (same topology). From what I can gather, the code in netlink.c fails to query the IPv6 unicast routing table for the route to the neighbor in the BSR message. What actually seems to fail is retrieving the gateway for the route, which makes the code fall back to returning the source address. The pimd code seems to require this to figure out which neighbor (found via HELLO) to associate with the BSR.
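As a rough illustration of the reply handling described above, the sketch below walks the attributes of a route reply and records the outgoing interface (RTA_OIF) and, if present, the gateway (RTA_GATEWAY). It is not the code from netlink.c: `struct route_info` and `parse_route_attrs` are hypothetical names made up for this example. For an on-link destination one would expect an RTA_OIF attribute without an RTA_GATEWAY, in which case a caller has to fall back to some other address.

```c
#include <assert.h>
#include <netinet/in.h>
#include <linux/rtnetlink.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch (not a pim6sd structure). */
struct route_info {
    int ifindex;              /* from RTA_OIF, or -1 if the attribute is absent */
    int has_gateway;          /* 1 if an RTA_GATEWAY attribute was found */
    struct in6_addr gateway;  /* valid only when has_gateway is 1 */
};

/* Walk `len` bytes of route attributes starting at `rta`. */
static void parse_route_attrs(struct rtattr *rta, int len, struct route_info *out)
{
    assert(out != NULL);

    out->ifindex = -1;
    out->has_gateway = 0;
    memset(&out->gateway, 0, sizeof(out->gateway));

    for (; RTA_OK(rta, len); rta = RTA_NEXT(rta, len)) {
        size_t payload = (size_t)RTA_PAYLOAD(rta);

        if (rta->rta_type == RTA_OIF && payload >= sizeof(int)) {
            /* Outgoing interface index of the selected route. */
            memcpy(&out->ifindex, RTA_DATA(rta), sizeof(int));
        } else if (rta->rta_type == RTA_GATEWAY && payload >= sizeof(out->gateway)) {
            /* Next hop; typically missing for on-link destinations. */
            memcpy(&out->gateway, RTA_DATA(rta), sizeof(out->gateway));
            out->has_gateway = 1;
        }
    }
}
```

A caller would check `has_gateway` first and only then use `gateway`, and would match `ifindex` against its list of PIM-enabled interfaces.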

So yeah, you should definitely not need to set up default routes.

T-X commented on September 24, 2024

Ok, found the issue: The code tries to copy the 16-byte IPv6 address from a struct sockaddr_in6. However, it should be copied from the struct in6_addr sin6_addr within the sockaddr_in6. Otherwise it's off by the size of sin6_family, sin6_port and sin6_flowinfo.
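To make the offset concrete, here is a minimal sketch (not code from pim6sd; the helper names `make_sockaddr`, `copy_from_struct` and `copy_from_addr` are made up for this example). It assumes the usual layout where `sin6_addr` starts 8 bytes into `struct sockaddr_in6`, so copying 16 bytes from the start of the structure yields the family, port and flow info followed by only the first half of the address.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Fill `sa` from a textual IPv6 address; returns 0 on success, -1 on error. */
static int make_sockaddr(struct sockaddr_in6 *sa, const char *text)
{
    assert(sa != NULL && text != NULL);
    memset(sa, 0, sizeof(*sa));
    sa->sin6_family = AF_INET6;
    return inet_pton(AF_INET6, text, &sa->sin6_addr) == 1 ? 0 : -1;
}

/* Byte offset of the address inside struct sockaddr_in6. */
static size_t addr_offset(void)
{
    return offsetof(struct sockaddr_in6, sin6_addr);
}

/* Buggy pattern: copies the first 16 bytes of the whole structure. */
static void copy_from_struct(unsigned char out[16], const struct sockaddr_in6 *sa)
{
    memcpy(out, sa, 16);
}

/* Fixed pattern: copies the 16 bytes of the embedded in6_addr. */
static void copy_from_addr(unsigned char out[16], const struct sockaddr_in6 *sa)
{
    memcpy(out, &sa->sin6_addr, sizeof(sa->sin6_addr));
}
```

With the buggy pattern, the kernel is asked for a route to a shifted, meaningless destination, which would be consistent with the "Network is unreachable" warning and the bogus gateway seen earlier in this thread.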

When I change that then it returns the ifindex of the correct interface, ip6gre1! \o/ And no more vanishing bootstrap router in the RP-Set.

I'll clean up my debugging mess and make a pull-request in a minute.

troglobit commented on September 24, 2024

What, you found it?! Man, you're like a relentless bloodhound ... I was just going to bed, but now I have to see what you've come up with :-)

troglobit commented on September 24, 2024

Wow, yeah I see now! Suddenly my setup with two-routers works perfectly \o/

diff --git a/src/netlink.c b/src/netlink.c
index 63fcb82..5787d1f 100644
--- a/src/netlink.c
+++ b/src/netlink.c
@@ -64,7 +64,7 @@ static int addattr32(struct nlmsghdr *n, int maxlen, int type, struct sockaddr_i
        rta = (struct rtattr *)(((char *)n) + NLMSG_ALIGN(n->nlmsg_len));
        rta->rta_type = type;
        rta->rta_len = len;
-       memcpy(RTA_DATA(rta), &data, 16);
+       memcpy(RTA_DATA(rta), &data.sin6_addr, 16);
        n->nlmsg_len = NLMSG_ALIGN(n->nlmsg_len) + len;
 
        return 0;

I'm off to bed now, 😴 looking forward to merging your PR tomorrow, great work! 😃👍

T-X commented on September 24, 2024

Exactly this 😄. I've changed the passed parameter in my PR (I usually prefer just using pointers when the thing passed is larger than the word size), but change as you please :-).
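For illustration only, a pointer-taking variant of such a helper could look roughly like the sketch below. This is not the code from the PR: the name `add_in6_attr`, its signature and the bounds check are assumptions made for this example. It appends one attribute carrying the 16-byte `sin6_addr` to a netlink message without copying the whole `struct sockaddr_in6` onto the stack.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: append an IPv6 address attribute of the given type
 * to the netlink message `n`, whose buffer is `maxlen` bytes in total.
 * Returns 0 on success, -1 if the attribute does not fit.
 */
static int add_in6_attr(struct nlmsghdr *n, size_t maxlen, unsigned short type,
                        const struct sockaddr_in6 *sa)
{
    size_t len = RTA_LENGTH(sizeof(sa->sin6_addr));
    size_t off = NLMSG_ALIGN(n->nlmsg_len);
    struct rtattr *rta;

    assert(n != NULL && sa != NULL);

    if (off + RTA_ALIGN(len) > maxlen)
        return -1;

    rta = (struct rtattr *)((char *)n + off);
    rta->rta_type = type;
    rta->rta_len = (unsigned short)len;
    /* Copy from the embedded in6_addr, not from the start of *sa. */
    memcpy(RTA_DATA(rta), &sa->sin6_addr, sizeof(sa->sin6_addr));
    n->nlmsg_len = (unsigned int)(off + RTA_ALIGN(len));

    return 0;
}
```

Passing a `const` pointer also makes it harder to repeat the original mistake, since `&sa` would then be a pointer to a pointer and the intent to dereference a member becomes explicit.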

T-X commented on September 24, 2024

And yaiy, great that this works for you too! 🎉

troglobit commented on September 24, 2024

Yeah, calling conventions are important. I've started doing some cleanup of the code base to both simplify it and make it more readable. The first step was to make it compile, this current step was to make it work, and the next step is to build with different compilers, cross-compile, and then unleash Coverity Scan on it. It's slowly coming along :-)

Thank you again for your work and patience, both you and @mweinelt !
