babeld's Introduction

Babel
=====

Babel is a loop-avoiding distance-vector routing protocol roughly
based on HSDV and AODV, but with provisions for link cost estimation
and redistribution of routes from other routing protocols.


Installation
============

    $ make
    $ su -c 'make install'

If compiling for OpenWRT, you will probably want to say something like

    $ make CC=mipsel-linux-gcc PLATFORM_DEFINES='-march=mips32'

On Mac OS X, you'll need to do

    $ make LDLIBS=''


Setting up a network for use with Babel
=======================================

1. Set up every node's interface
--------------------------------

On every node, set up the wireless interface:

    # iwconfig eth1 mode ad-hoc channel 11 essid "my-mesh-network"
    # ip link set up dev eth1

2. Set up every node's IP addresses
-----------------------------------

You will need to make sure that all of your nodes have a unique IPv6
address, and/or a unique IPv4 address.

On every node, run something like:

    # ip addr add 192.168.13.33/32 dev eth1
    # ip -6 addr add $(generate-ipv6-address -r)/128 dev eth1

You will find the generate-ipv6-address utility, which can generate random
IPv6 addresses according to RFC 4193, on

      https://www.irif.fr/~jch/software/files/
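If the utility is not at hand, the idea behind a random unique local address can be sketched in a few lines of C. This is only an illustration, not the implementation of generate-ipv6-address: the function names random_ula and format_ula are made up here, the random bits are assumed to come from /dev/urandom, and the hash-based procedure suggested in RFC 4193 is not followed.

```c
#include <arpa/inet.h>
#include <stdio.h>

/* Fill addr with a random unique local address: the byte 0xfd (fc00::/7
   with the L bit set), followed by 120 random bits covering the global ID,
   the subnet ID and the interface ID.
   Returns 0 on success, -1 if no randomness could be read. */
int
random_ula(unsigned char addr[16])
{
    FILE *f = fopen("/dev/urandom", "rb");
    size_t n;

    if(f == NULL)
        return -1;
    addr[0] = 0xfd;
    n = fread(addr + 1, 1, 15, f);
    fclose(f);
    return n == 15 ? 0 : -1;
}

/* Format addr as text; buf should hold at least INET6_ADDRSTRLEN bytes.
   Returns buf, or NULL on error. */
const char *
format_ula(const unsigned char addr[16], char *buf, size_t buflen)
{
    return inet_ntop(AF_INET6, addr, buf, buflen);
}
```

A caller would invoke random_ula once per node, then pass the string produced by format_ula to ip -6 addr add.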


A note about tunnels and VPNs
-----------------------------

Some VPN implementations (notably OpenVPN and Linux GRE) do not
automatically add an IPv6 link-local address to the tunnel interface.
If you attempt to run Babel over such an interface, it will complain
that it ``couldn't allocate requested address''.

The solution is to manually add the link-local address to the
interface.  This can be done by running e.g.

    # ip -6 addr add $(generate-ipv6-address fe80::) dev gre0


3. Start the routing daemon
---------------------------

Run Babel on every node, specifying the set of interfaces that it
should consider:

    # babeld eth1

If your node has multiple interfaces which you want to participate in
the Babel network, just list them all:

    # babeld eth0 eth1 sit1


4. Setting up an Internet gateway
---------------------------------

If you have one or more Internet gateways on your mesh network, you
will want to set them up so that they redistribute the default route.
Babel will only redistribute routes with an explicit protocol
attached, so you must say something like:

    # ip route add 0.0.0.0/0 via 1.2.3.4 dev eth0 proto static

In order to redistribute all routes, you will say:

    # babeld -C 'redistribute metric 128' eth1

You may also be more selective in the routes you redistribute, for
instance by specifying the interface over which the route goes out:

    # babeld -C 'redistribute if eth0 metric 128' eth1

or by constraining the prefix length:

    # babeld -C 'redistribute ip ::/0 le 64 metric 128' \
             -C 'redistribute ip 0.0.0.0/0 le 28 metric 128' \
             eth1

You may also want to constrain which local routes (routes to local
interface addresses) you advertise:

    # babeld -C 'redistribute local if eth1' -C 'redistribute local deny' \
             -C 'redistribute metric 128' \
             eth1

-- Juliusz Chroboczek

babeld's Issues

Issues with the Makefile

Currently, the Makefile has two issues.

  1. Dependencies on header files are not tracked, which means that if a header is touched, files including this header won't be recompiled. This is fixed by my PR.

  2. The version.h file depends on git, but Make doesn't know when a new commit has been made, and so cannot regenerate the file. This leads to babeld -h reporting the wrong commit. The scenario is as follows: compile babeld once, make a new commit touching a file that does not depend on version.h, run make, and see ./babeld -h report the old commit id.
    A simple fix would be to force the regeneration of version.h, but that means that all the compilation units depending on it are recompiled at each run. To fix this new problem, we can add a new compilation unit (version.c and version.h) that would export the version string as a function.
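A minimal sketch of what that split could look like, shown as one listing for brevity. The function name babeld_version and the macro BABELD_VERSION are hypothetical; the assumption is that the Makefile would pass the commit id with a -D flag when compiling version.c only.

```c
/* version.h (hypothetical): only a declaration, so its contents never
   change and nothing that includes it needs to be rebuilt. */
const char *babeld_version(void);

/* version.c (hypothetical): the only compilation unit rebuilt when the
   commit id changes.  BABELD_VERSION is assumed to be supplied by the
   Makefile on the compiler command line. */
#ifndef BABELD_VERSION
#define BABELD_VERSION "unknown"
#endif

const char *
babeld_version(void)
{
    return BABELD_VERSION;
}
```

With this layout, forcing version.c to be recompiled on every run costs one small object file and a relink rather than a rebuild of every file that prints the version.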

Interface Regex?

Is it possible to specify a regex for interface names? I want to automatically speak babeld over all interfaces that are named wg_XXX.
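For illustration, matching interface names against a shell-style pattern such as wg_* can be sketched outside babeld with the POSIX fnmatch and if_nameindex functions. This is not a babeld feature; the helper names below are made up, and the output is only meant to show which names would be selected.

```c
#include <fnmatch.h>
#include <net/if.h>
#include <stdio.h>

/* Return 1 if the interface name matches the shell-style pattern. */
int
ifname_matches(const char *pattern, const char *ifname)
{
    return fnmatch(pattern, ifname, 0) == 0;
}

/* Print every interface currently known to the system whose name matches
   pattern.  Returns the number of matches, or -1 on error. */
int
print_matching_interfaces(const char *pattern)
{
    struct if_nameindex *list = if_nameindex();
    struct if_nameindex *p;
    int count = 0;

    if(list == NULL)
        return -1;
    /* The array is terminated by an entry with if_index == 0. */
    for(p = list; p->if_index != 0; p++) {
        if(ifname_matches(pattern, p->if_name)) {
            printf("%s\n", p->if_name);
            count++;
        }
    }
    if_freenameindex(list);
    return count;
}
```

A small wrapper built on this could print the matching names so that they can be listed on the babeld command line.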

babeld 1.12.1 build failure

👋 I'm trying to build the latest release, but I ran into a build issue. The error log is below:

build error
message.c:503:16: error: member reference base type 'const unsigned char' is not a structure or union
            if(IN6_IS_ADDR_MULTICAST(to))
               ^~~~~~~~~~~~~~~~~~~~~~~~~
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/usr/include/netinet6/in6.h:301:45: note: expanded from macro 'IN6_IS_ADDR_MULTICAST'
#define IN6_IS_ADDR_MULTICAST(a)        ((a)->s6_addr[0] == 0xff)
                                         ~~~^ ~~~~~~~
message.c:569:16: error: member reference base type 'const unsigned char' is not a structure or union
            if(IN6_IS_ADDR_MULTICAST(to))
               ^~~~~~~~~~~~~~~~~~~~~~~~~
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/usr/include/netinet6/in6.h:301:45: note: expanded from macro 'IN6_IS_ADDR_MULTICAST'
#define IN6_IS_ADDR_MULTICAST(a)        ((a)->s6_addr[0] == 0xff)
                                         ~~~^ ~~~~~~~
2 errors generated.

full build log, https://github.com/Homebrew/homebrew-core/actions/runs/3256497315/jobs/5347104359
relates to Homebrew/homebrew-core#102969
relates to Homebrew/homebrew-core#113206

README in markdown

Would you consider using a Markdown version of the README, by appending the .md extension to the README file?

Persist routes across restarts

Currently a restart of babeld will cause all routes in the kernel routing table to disappear, breaking routing from/across a node even though connectivity is still available.

If babeld could leave routes in the kernel intact on restarts this would allow for a more graceful migration during upgrades/rebuilds. That would be immensely helpful.

I'm noticing this a lot on my laptop, when my package manager rolls out a new version of babeld (often due to rebuilds, because dependencies were bumped) and does a restart. Each time I lose connectivity for about 15-20 seconds.

IPv4 redistribution broken in babeld 1.8.1

Hi,

it seems like the ipv4 routes aren't announced with babel 1.8.1. I did a git bisect and commit 4f4e3cb is the first bad one. After removing the line filter->src_plen_ge += 96; from this commit the ipv4 route gets announced again.

in the config I have the line
redistribute ip 10.222.0.0/16 allow

my routing table has the entry
10.222.192.0/21 dev batems proto static scope link

Denying routes in install filter doesn't work

When I have something like this:

install ip 2001:0db8::/48 allow
install deny

babeld still installs routes received from neighbours outside of the given prefix into the kernel. I'm sending routes to a different table as a workaround but this shouldn't be necessary. Right?

install ip 2001:0db8::/48 allow
install table 666

Incidentally the routes that should be going to table 666 don't seem to show up there which is weird but I don't care about:

# ip -6 route show table 666
Error: ipv6: FIB table does not exist.
Dump terminated

--Daniel

interface marked as down while it is up

After some run-time I saw interfaces being marked as down in babeld via the dump command.
Removing the interface and re-adding it did not make the interface usable again.
ifconfig showed the interface as UP.
Running ip link set device up did not affect the status displayed by dump on the babel socket.

I am not sure yet what is wrong, but the behavior is odd, so I am capturing it and the investigation here.

bugs in parse_hello_subtlv, parse_ihu_subtlv, parse_request_subtlv, parse_seqno_request_subtlv, and parse_other_subtlv

babeld/message.c

Lines 183 to 195 in 3b741cb

parse_hello_subtlv(const unsigned char *a, int alen,
                   unsigned int *timestamp_return, int *have_timestamp_return)
{
    int type, len, i = 0, have_timestamp = 0;
    unsigned int timestamp = 0;
    while(i < alen) {
        type = a[0];
        if(type == SUBTLV_PAD1) {
            i++;
            continue;
        }

Line 190: it should be a[i] instead of a[0]

babeld/message.c

Lines 234 to 248 in 3b741cb

parse_ihu_subtlv(const unsigned char *a, int alen,
                 unsigned int *timestamp1_return,
                 unsigned int *timestamp2_return,
                 int *have_timestamp_return)
{
    int type, len, i = 0;
    int have_timestamp = 0;
    unsigned int timestamp1 = 0, timestamp2 = 0;
    while(i < alen) {
        type = a[0];
        if(type == SUBTLV_PAD1) {
            i++;
            continue;
        }

Line 244: it should be a[i] instead of a[0]

babeld/message.c

Lines 292 to 303 in 3b741cb

parse_request_subtlv(int ae, const unsigned char *a, int alen,
                     unsigned char *src_prefix, unsigned char *src_plen)
{
    int type, len, i = 0;
    int have_src_prefix = 0;
    while(i < alen) {
        type = a[0];
        if(type == SUBTLV_PAD1) {
            i++;
            continue;
        }

Line 299: it should be a[i] instead of a[0]

babeld/message.c

Lines 348 to 358 in 3b741cb

parse_seqno_request_subtlv(int ae, const unsigned char *a, int alen,
                           unsigned char *src_prefix, unsigned char *src_plen)
{
    int type, len, i = 0;
    while(i < alen) {
        type = a[0];
        if(type == SUBTLV_PAD1) {
            i++;
            continue;
        }

Line 354: it should be a[i] instead of a[0]

babeld/message.c

Lines 395 to 404 in 3b741cb

parse_other_subtlv(const unsigned char *a, int alen)
{
    int type, len, i = 0;
    while(i < alen) {
        type = a[0];
        if(type == SUBTLV_PAD1) {
            i++;
            continue;
        }

Line 400: it should be a[i] instead of a[0]
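The pattern shared by all five reports can be sketched with a small self-contained walker that reads the type at the current offset. It assumes the usual sub-TLV layout (a lone zero byte for Pad1, otherwise one type byte, one length byte, then the body); count_subtlvs is a hypothetical helper written for this illustration, not a function from babeld.

```c
#define SUBTLV_PAD1 0   /* single-byte padding, carries no length field */

/* Walk a sequence of sub-TLVs and return how many were seen, or -1 if the
   buffer is truncated. */
int
count_subtlvs(const unsigned char *a, int alen)
{
    int type, len, i = 0, count = 0;

    while(i < alen) {
        type = a[i];            /* a[i], not a[0]: the offset advances */
        if(type == SUBTLV_PAD1) {
            i++;
            count++;
            continue;
        }
        if(i + 2 > alen)
            return -1;          /* no room for the length byte */
        len = a[i + 1];
        if(i + 2 + len > alen)
            return -1;          /* body runs past the end of the buffer */
        i += len + 2;
        count++;
    }
    return count;
}
```

With type = a[0], every iteration would classify the current sub-TLV by the first byte of the buffer, so a buffer that starts with Pad1 would be consumed one byte at a time regardless of its contents.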

Issue with 22+ devices in a mesh

Babeld has been working great for me with a small number of devices. When I add the 22nd device to a mesh network the amount of network chatter between machines went up tremendously; to the point that the management overhead prevents other data from getting transferred. It is reproducible in my environment.

I have tried reducing the hello interval and traced through the code to see what is happening. I have not spotted a hard limit on the number of devices or routes that can be supported. I am seeing evidence that a device gets dropped from the set of neighbors and then new packets from that device trigger the new-neighbor behavior.

Any suggestions on possible causes?

feature request: add support for specifying interface with glob

In my particular deployment of babeld together with wireguard, to work around the limitation of the wireguard routing model, namely AllowedIPs, I had to create an interface for each peer. Thus, to add or remove peers, I can only regenerate the configuration file for babeld and then restart it, leaving an annoying time window before the routes converge again. I think it would be nice if support for specifying interfaces with a glob could be added to babeld, especially given the growing number of wireguard deployments.

improve route updates from kernel

Taken from a mailing-list message by @jech:


The redistribution code in babeld is very primitive:

  1. the code makes a full route dump whenever something changes and
    computes the difference, rather than receiving incremental deltas
    from the kernel;
  2. it uses a quadratic algorithm to compute the difference.
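As a rough illustration of the second point, the difference between two full dumps can be computed in O(n log n) by sorting both and walking them in lock-step. The struct kroute below is a simplified stand-in keyed on prefix and prefix length only, not babeld's own route structure, and each dump is assumed to contain no duplicate keys.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical route key. */
struct kroute {
    unsigned char prefix[16];
    unsigned char plen;
};

static int
kroute_compare(const void *a, const void *b)
{
    const struct kroute *x = a, *y = b;
    int rc = memcmp(x->prefix, y->prefix, 16);

    if(rc != 0)
        return rc;
    return (int)x->plen - (int)y->plen;
}

/* Sort both dumps in place, then merge.  Routes present only in old_routes
   are counted in *removed, routes present only in new_routes in *added. */
void
diff_routes(struct kroute *old_routes, int nold,
            struct kroute *new_routes, int nnew,
            int *removed, int *added)
{
    int i = 0, j = 0;

    *removed = *added = 0;
    qsort(old_routes, nold, sizeof(struct kroute), kroute_compare);
    qsort(new_routes, nnew, sizeof(struct kroute), kroute_compare);
    while(i < nold && j < nnew) {
        int rc = kroute_compare(&old_routes[i], &new_routes[j]);
        if(rc < 0) {
            (*removed)++;
            i++;
        } else if(rc > 0) {
            (*added)++;
            j++;
        } else {
            i++;
            j++;
        }
    }
    *removed += nold - i;   /* leftovers exist only in the old dump */
    *added += nnew - j;     /* leftovers exist only in the new dump */
}
```

This only addresses the cost of computing the difference; receiving incremental deltas from the kernel, as in the first point, would avoid the full dump altogether.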

Inject routes with different mtu?

Is it possible to specify a different MTU for routes? For example, I want to inject the default route from a specific gateway with another MTU.

(Not sure if I'm allowed to ask here; if you feel that this is spamming your GitHub, just let me know.)

babeld advertises neighbours route instead of own xroute

Since commit d05ec6b, babeld announces its neighbour's route instead of its own xroute when the routes are identical.

For example:

Routers A, B and C are all connected over the same link.
Routers A and B announce 0.0.0.0/0.
Router C then sees:

  • The route from A (identified by router-id) over B
  • The route from B over A

Both routes seen by C therefore have a higher metric than they should have (both rxcost 96 -> 192).

It looks like A and B are announcing the wrong route to C.

This happens to all IPv4 routes with identical prefix.
This problem does not occur with IPv6 routes.

Check Interfaces in add_interface

Is there any reason why you don't call check_interfaces here? Would that not reduce latency when adding a new interface at runtime? A bit of experimentation on my part seems to show reduced latency when I make that change. That having been said, I'm new to this repo, so I might be misinterpreting expected behavior.

how to troubleshoot routes not getting installed

Hey there! Trying to get babeld working on LibreMesh :)

tcpdump shows me what looks like good babeling, but the routes I expect to see are never getting installed. Wondering what could be going wrong / where to look next. Is there a way to determine what table babeld tries to install to?

Here are the messages that look good to me:

Received router-id 02:27:22:ff:fe:3e:06:55 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received update for 100.64.32.66/32 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received router-id 02:27:22:ff:fe:52:43:3b from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received update for 100.65.98.2/32 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received router-id 02:27:22:ff:fe:5e:c0:9e from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received update for 100.65.38.130/32 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received router-id 02:27:22:ff:fe:5e:c1:dc from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received update for 100.65.42.2/32 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received router-id 02:90:a9:ff:fe:05:f4:c1 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received update for 100.65.130.192/26 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received router-id 02:90:a9:ff:fe:0b:4b:eb from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received update for 100.65.20.0/26 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received router-id 02:90:a9:ff:fe:0b:7c:1b from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received update for 100.64.32.64/26 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received router-id 02:90:a9:ff:fe:0b:ce:51 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received update for 100.65.38.128/26 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received router-id 02:90:a9:ff:fe:0b:dd:7b from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received update for 100.65.42.64/26 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received router-id 02:90:a9:ff:fe:0b:dd:a6 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received update for 100.65.42.0/26 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received router-id 02:90:a9:ff:fe:0b:dd:cb from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received update for 100.65.139.128/26 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received router-id 02:90:a9:ff:fe:0c:4a:7b from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received update for 100.65.137.192/26 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received router-id 02:90:a9:ff:fe:0c:4b:bb from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.
Received update for 100.65.95.64/26 from fe80::290:a9ff:fe0b:dd7b on wlan0-adhoc.

Unfortunately routes for those 100.x.x.x ips never show up in my route table. I'm using 1.8.2.

Babeld does not function properly

I configured the NIC following the steps given. However, the following prompt appears when executing the last command.

zhangmazi@zhangmazi-virtual-machine:~$ sudo iwconfig wlx90de8074df25 mode ad-hoc channel 1 essid "my-mesh"
zhangmazi@zhangmazi-virtual-machine:~$ sudo ip link set up dev wlx90de8074df25
zhangmazi@zhangmazi-virtual-machine:~$ sudo ip addr add 192.168.13.2 dev wlx90de8074df25
zhangmazi@zhangmazi-virtual-machine:~$ sudo ip -6 addr add $(generate-ipv6-address -r)/128 dev wlx90de8074df25
zhangmazi@zhangmazi-virtual-machine:~$ sudo babeld wlx90de8074df25
send: Cannot assign requested address
send: Cannot assign requested address
send: Cannot assign requested address
send: Cannot assign requested address
send: Cannot assign requested address
send: Cannot assign requested address
send: Cannot assign requested address
send: Cannot assign requested address
send: Cannot assign requested address
send: Cannot assign requested address

Can you give some advice?

Upper Bound on Interface Count?

Based on my reading of the get_old_if function, we can have at most 1024 interfaces total for the entire lifetime of the process because we never clean the old_if data structure until shut down. Is this the case or am I misreading the code?

Looks like we have a duplicated memset inside kernel_interface_operational

kernel_interface_operational(const char *ifname, int ifindex)
{
    struct ifreq req;
    int s, rc;
    int flags = link_detect ? (IFF_UP | IFF_RUNNING) : IFF_UP;

    s = socket(PF_INET, SOCK_DGRAM, 0);
    if(s < 0)
        return -1;

    memset(&req, 0, sizeof(req));
    memset(&req, 0, sizeof(req));
    strncpy(req.ifr_name, ifname, sizeof(req.ifr_name));
    rc = ioctl(s, SIOCGIFFLAGS, &req);
    close(s);
    if(rc < 0)
        return -1;
    return ((req.ifr_flags & flags) == flags);
}

Do not announce prefix?

If I want to not announce a prefix, I have to add those two config lines:

config filter
        option type 'redistribute'
        option ip '2001:xx:xx:xx::/64'
        option local 'true'
        option action 'deny'

config filter 
        option type 'redistribute' 
        option ip '2001:xx:xx:xx::/64' 
        option action 'deny' 

If either one is missing, the route shows up at other routers. Why are both filters necessary?

IPv6 neighbour discovery fails when babeld is running

I'm one of the maintainers of re6st (https://re6st.nexedi.com).

We are using the latest version of babeld with hmac (with master merged into hmac): https://lab.nexedi.com/nexedi/babeld/commits/master

Since we upgraded babeld to version 1.8.4 (we were using v1.6.2 previously), we occasionally experience a networking problem when babeld is running on machines of the same LAN: we can't ping the link-local IPv6 address of one machine. Debugging a bit further, we can see that the kernel of the faulty machine is not replying to the "who has" neighbor solicitation.

As soon as we kill babeld on the faulty machine, the kernel starts to reply again and we can ping the machine.

Note 1: this can also happen on virtual interfaces created by openvpn tunnels.
Note 2: this can happen on different versions of the kernel (we experienced it at least with an up-to-date Debian 10 kernel and an up-to-date Debian 9 kernel)
Note 3: we launch babel with the following command (option -X is not standard): babeld -h 15 -H 15 -L /var/log/re6stnet/babeld.log -S /var/lib/re6stnet/babeld.state -I /var/run/re6stnet/babeld.pid -s -C ipv6-subtrees true -C redistribute local deny -C redistribute ip 2001:67c:1254:e:49::1/80 eq 80 -C default max-rtt-penalty 5000 rtt-max 500 rtt-decay 125 -C redistribute deny -C install pref-src 2001:67c:1254:e:49::1 -X /var/run/re6stnet/babeld.sock re6stnet-tcp re6stnet1 re6stnet2 re6stnet3 re6stnet4 re6stnet5 re6stnet6 re6stnet7 re6stnet8 re6stnet9 re6stnet10 eth0
Note 4: the problem also happens with the latest babeld + hmac feature (we merged the master branch into the hmac branch)

On the neighbour of the failing machine, we can see:

root@hydro66-leopard-000:~# ip -6 r get 2001:67c:1254:47::1
2001:67c:1254:47::1 from :: via fe80::e61d:2dff:fe52:eae4 dev ens1 proto babel src 2001:67c:1254:16::1 metric 1024 pref medium
root@hydro66-leopard-000:~# ping6 fe80::e61d:2dff:fe52:eae4%ens1
PING fe80::e61d:2dff:fe52:eae4%ens1(fe80::e61d:2dff:fe52:eae4%ens1) 56 data bytes
From fe80::f652:14ff:fefc:a494%ens1: icmp_seq=1 Destination unreachable: Address unreachable
From fe80::f652:14ff:fefc:a494%ens1: icmp_seq=2 Destination unreachable: Address unreachable
From fe80::f652:14ff:fefc:a494%ens1: icmp_seq=3 Destination unreachable: Address unreachable
From fe80::f652:14ff:fefc:a494%ens1: icmp_seq=4 Destination unreachable: Address unreachable

Here is a tcpdump on the failing machine:

09:22:12.780331 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::f652:14ff:feb0:912c > ff02::1:ffa7:12c4: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::f652:14ff:fea7:12c4
          source link-address option (1), length 8 (1): f4:52:14:b0:91:2c
            0x0000:  f452 14b0 912c
09:22:12.871200 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::e61d:2dff:fe52:c17a > ff02::1:fffc:3e58: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::f652:14ff:fefc:3e58
          source link-address option (1), length 8 (1): e4:1d:2d:52:c1:7a
            0x0000:  e41d 2d52 c17a
09:22:12.907397 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::f652:14ff:fef8:d5d0 > ff02::1:fffc:3e58: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::f652:14ff:fefc:3e58
          source link-address option (1), length 8 (1): f4:52:14:f8:d5:d0
            0x0000:  f452 14f8 d5d0
09:22:12.967731 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::f652:14ff:fefc:a494 > ff02::1:ff52:eae4: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::e61d:2dff:fe52:eae4
          source link-address option (1), length 8 (1): f4:52:14:fc:a4:94
            0x0000:  f452 14fc a494
09:22:13.376019 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::f652:14ff:fea6:f3e4 > ff02::1:fffc:3e58: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::f652:14ff:fefc:3e58
          source link-address option (1), length 8 (1): f4:52:14:a6:f3:e4
            0x0000:  f452 14a6 f3e4
09:22:13.463601 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::f652:14ff:fefd:174 > ff02::1:fffc:3e58: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::f652:14ff:fefc:3e58
          source link-address option (1), length 8 (1): f4:52:14:fd:01:74
            0x0000:  f452 14fd 0174
09:22:13.506580 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::f652:14ff:fea8:8b34 > ff02::1:fffc:3e58: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::f652:14ff:fefc:3e58
          source link-address option (1), length 8 (1): f4:52:14:a8:8b:34
            0x0000:  f452 14a8 8b34
09:22:13.509189 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::f652:14ff:fefc:5bce > ff02::1:fffc:3e58: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::f652:14ff:fefc:3e58
          source link-address option (1), length 8 (1): f4:52:14:fc:5b:ce
            0x0000:  f452 14fc 5bce
09:22:13.522059 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::f652:14ff:fefa:1a06 > ff02::1:fffc:3e58: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::f652:14ff:fefc:3e58
          source link-address option (1), length 8 (1): f4:52:14:fa:1a:06
            0x0000:  f452 14fa 1a06
09:22:13.671159 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::f652:14ff:fea6:8b98 > ff02::1:fffc:3e58: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::f652:14ff:fefc:3e58
          source link-address option (1), length 8 (1): f4:52:14:a6:8b:98
            0x0000:  f452 14a6 8b98
09:22:13.803374 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::f652:14ff:feb0:912c > ff02::1:ffa7:12c4: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::f652:14ff:fea7:12c4
          source link-address option (1), length 8 (1): f4:52:14:b0:91:2c
            0x0000:  f452 14b0 912c
09:22:13.857245 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::f652:14ff:fefc:5fac > ff02::1:fffc:3e58: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::f652:14ff:fefc:3e58
          source link-address option (1), length 8 (1): f4:52:14:fc:5f:ac
            0x0000:  f452 14fc 5fac

After we kill babeld on the failing machine, we can ping it again:

root@hydro66-leopard-000:~# ping6 fe80::e61d:2dff:fe52:eae4%ens1
PING fe80::e61d:2dff:fe52:eae4%ens1(fe80::e61d:2dff:fe52:eae4%ens1) 56 data bytes
64 bytes from fe80::e61d:2dff:fe52:eae4%ens1: icmp_seq=1 ttl=64 time=2030 ms
64 bytes from fe80::e61d:2dff:fe52:eae4%ens1: icmp_seq=2 ttl=64 time=1024 ms
64 bytes from fe80::e61d:2dff:fe52:eae4%ens1: icmp_seq=3 ttl=64 time=0.176 ms
64 bytes from fe80::e61d:2dff:fe52:eae4%ens1: icmp_seq=4 ttl=64 time=0.108 ms
64 bytes from fe80::e61d:2dff:fe52:eae4%ens1: icmp_seq=5 ttl=64 time=0.129 ms
64 bytes from fe80::e61d:2dff:fe52:eae4%ens1: icmp_seq=6 ttl=64 time=0.127 ms

Provide 1.8.5 release

Hi,

The last 1.8.x release was quite a while ago, and the bug fixed in f7d8f67 forces me to patch manually in several projects.
Is there a chance of releasing a 1.8.5 with just the current state of the 1.8 stable branch?

Best

Adrian

assert failure

I got around to briefly testing jech head today. It blew up with an assert failure. I did not look into it harder, but can if needed.

...

My id ee:a8:6b:ff:fe:fe:09:a2 seqno 38769
Neighbour fe80::f6f2:6dff:feb6:a01c dev eno1 reach e000 ureach 0000 rxcost 96 txcost 96 rtt 3.868 rttcost 0 chan -2.
Neighbour fe80::46d9:e7ff:fe93:822e dev eno1 reach e000 ureach 0000 rxcost 96 txcost 96 rtt 1.634 rttcost 0 chan -2.
Neighbour fe80::e091:f5ff:febe:a353 dev eno1 reach e000 ureach 0000 rxcost 96 txcost 96 rtt 1.154 rttcost 0 chan -2.
Neighbour fe80::225:90ff:fec1:6252 dev eno1 reach e000 ureach 0000 rxcost 96 txcost 96 rtt 1.010 rttcost 0 chan -2.
Neighbour fe80::225:90ff:fec2:2aa3 dev eno1 reach e000 ureach 0000 rxcost 96 txcost 96 rtt 11.178 rttcost 0 chan -2.
::/0 metric 0 (exported)
2603:3024:1536:86f0::/64 metric 0 (exported)
172.22.0.0/24 from 0.0.0.0/0 metric 0 (exported)
::/0 from 2603:3024:1536:8600:f6f2:6dff:feb6:a01d/128 metric 192 (65535) refmetric 96 id f6:f2:6d:ff:fe:b6:a0:1d seqno 45242 age 1 via eno1 neigh fe80::225:90ff:fec1:6252 (installed)
::/0 from 2603:3024:1536:86f0::/60 metric 192 (65535) refmetric 96 id f6:f2:6d:ff:fe:b6:a0:1d seqno 45242 age 1 via eno1 neigh fe80::225:90ff:fec1:6252 (installed)
::/0 metric 2082 (65535) refmetric 1986 id f6:f2:6d:ff:fe:b6:a0:1d seqno 45242 age 1 via eno1 neigh fe80::225:90ff:fec2:2aa3 (feasible)
::/0 metric 2082 (65535) refmetric 1986 id f6:f2:6d:ff:fe:b6:a0:1d seqno 45242 age 1 via eno1 neigh fe80::225:90ff:fec1:6252 (feasible)
::/0 metric 2082 (65535) refmetric 1986 id f6:f2:6d:ff:fe:b6:a0:1d seqno 45242 age 1 via eno1 neigh fe80::46d9:e7ff:fe93:822e (feasible)
0.0.0.0/0 metric 192 (65535) refmetric 96 id f6:f2:6d:ff:fe:b6:a0:1d seqno 45242 age 1 via eno1 neigh fe80::225:90ff:fec1:6252 nexthop 172.22.0.172 (installed)
50.197.142.144/29 metric 809 (65535) refmetric 713 id 02:0d:b9:ff:fe:43:a0:6c seqno 51873 age 1 via eno1 neigh fe80::225:90ff:fec1:6252 nexthop 172.22.0.172 (installed)
172.20.0.0/14 metric 192 (65535) refmetric 96 id e2:91:f5:ff:fe:be:a3:54 seqno 38347 age 1 via eno1 neigh fe80::225:90ff:fec1:6252 nexthop 172.22.0.172 (installed)
172.22.220.0/22 metric 192 (65535) refmetric 96 id 46:d9:e7:ff:fe:93:82:2d seqno 27834 age 1 via eno1 neigh fe80::225:90ff:fec1:6252 nexthop 172.22.0.172 (installed)
2601:646:8301:6760::/60 metric 384 (65535) refmetric 288 id a2:21:b7:ff:fe:ac:e4:55 seqno 10074 age 1 via eno1 neigh fe80::225:90ff:fec1:6252 (installed)
2603:3024:1536:86f0::/60 metric 192 (65535) refmetric 96 id f6:f2:6d:ff:fe:b6:a0:1d seqno 45242 age 1 via eno1 neigh fe80::225:90ff:fec1:6252 (installed)
2603:3024:1536:86f0::/64 metric 192 (65535) refmetric 96 id f6:f2:6d:ff:fe:b6:a0:1d seqno 45242 age 1 via eno1 neigh fe80::225:90ff:fec1:6252 (feasible)
fd89:cb20:8854::/48 metric 192 (65535) refmetric 96 id 46:d9:e7:ff:fe:93:82:2d seqno 27834 age 1 via eno1 neigh fe80::225:90ff:fec1:6252 (installed)
babeld: message.c:983: end_message: Assertion `buf->len >= bytes + 2 && buf->buf[buf->len - bytes - 2] == type && buf->buf[buf->len - bytes - 1] == bytes' failed.
Aborted (core dumped)

Error : Generating IPV6 Address

Greetings everyone.

I have followed the steps to install babeld. However, when I reach the step that generates the IPv6 address, I get an error:

ip -6 addr add $(generate-ipv6-address -r)/128 dev wlan1

generate-ipv6-address: command not found
Error : inet6 prefix is expected rather than "/128"

Apologies, as I am still new to this area. Am I missing any steps here? Thank you.

Documentation needs improvement

Hey there. I've been trying to make use of this software over the years, and find that the man page documentation on how the routing works needs some clarification. There may also be obsolete options on there as well: the "wired" statement comes to mind. I also don't quite understand how the ge and le tags work on subnet masks; and it looks like "local" applies to the IPs, and not subnets. I'd be happy to help work on this effort, if I had some pointers. Thanks!

Feature request: exchange arbitrary strings

It would be a handy feature to exchange arbitrary strings between nodes to have a "sync" state of whatever downstream data could be of use. Some mesh communities exchange leases etc.

skip kernel_setup_interface when skip-kernel-setup is set

The skip-kernel-setup option does not skip setting up the rp_filter setting in kernel_setup_interface. This prevents babeld from working as an unprivileged user with CAP_NET_ADMIN only.

Please make setting the rp_filter per interface conditional on skip-kernel-setup.

Installed IPv6 routes are not always added to kernel routing table

I'm currently on babeld 1.8.2 (debian/1.8.1-1-3-gc48aa52) and I'm sometimes missing IPv6 routes in the kernel routing table, which babel tells me are installed.

# echo "dump" | timeout 1 nc :: 33123 | grep "installed yes"
add route 560fbc5dae20 prefix 172.23.42.1/32 from 0.0.0.0/0 installed yes id d4:57:67:c3:ff:0b:d5:f9 metric 96 refmetric 0 via fe80::2 if wg-glitch
add route 560fbc5d7180 prefix 172.23.42.2/32 from 0.0.0.0/0 installed yes id c4:34:36:e4:f6:42:71:a0 metric 96 refmetric 0 via fe80::2 if wg-snafu
add route 560fbc5dcba0 prefix 172.23.42.8/32 from 0.0.0.0/0 installed yes id 02:0d:b9:ff:fe:49:cc:f8 metric 96 refmetric 0 via fe80::1 if wg-io
add route 560fbc5dbd30 prefix 172.23.42.10/32 from 0.0.0.0/0 installed yes id 2a:d2:44:ff:fe:9d:d2:bf metric 116 refmetric 0 via fe80::2 if wg-eris
add route 560fbc5dcce0 prefix 172.23.42.64/26 from 0.0.0.0/0 installed yes id 02:0d:b9:ff:fe:49:cc:f8 metric 96 refmetric 0 via fe80::1 if wg-io
add route 560fbc5dcd80 prefix 172.23.42.128/26 from 0.0.0.0/0 installed yes id 02:0d:b9:ff:fe:49:cc:f8 metric 96 refmetric 0 via fe80::1 if wg-io
add route 560fbc5daf90 prefix 172.23.42.226/31 from 0.0.0.0/0 installed yes id d4:57:67:c3:ff:0b:d5:f9 metric 96 refmetric 0 via fe80::2 if wg-glitch
add route 560fbc5db030 prefix 172.23.42.238/31 from 0.0.0.0/0 installed yes id d4:57:67:c3:ff:0b:d5:f9 metric 96 refmetric 0 via fe80::2 if wg-glitch
add route 560fbc5db0d0 prefix 172.23.42.240/31 from 0.0.0.0/0 installed yes id d4:57:67:c3:ff:0b:d5:f9 metric 96 refmetric 0 via fe80::2 if wg-glitch
add route 560fbc5dc880 prefix fd42:23:42:100::/64 from ::/0 installed yes id 02:0d:b9:ff:fe:49:cc:f8 metric 96 refmetric 0 via fe80::1 if wg-io
add route 560fbc5dc920 prefix fd42:23:42:110::/64 from ::/0 installed yes id 02:0d:b9:ff:fe:49:cc:f8 metric 96 refmetric 0 via fe80::1 if wg-io
add route 560fbc5dad80 prefix fd42:23:42:b100::/56 from ::/0 installed yes id d4:57:67:c3:ff:0b:d5:f9 metric 96 refmetric 0 via fe80::2 if wg-glitch
add route 560fbc5dc010 prefix fd42:23:42:b200::/56 from ::/0 installed yes id c4:34:36:e4:f6:42:71:a0 metric 96 refmetric 0 via fe80::2 if wg-snafu
add route 560fbc5dcb40 prefix fd42:23:42:b800::/56 from ::/0 installed yes id 02:0d:b9:ff:fe:49:cc:f8 metric 96 refmetric 0 via fe80::1 if wg-io
add route 560fbc5dbcd0 prefix fd42:23:42:ba00::1/128 from ::/0 installed yes id 2a:d2:44:ff:fe:9d:d2:bf metric 116 refmetric 0 via fe80::2 if wg-eris
add route 560fbc5db670 prefix fd42:23:42:ff01::/64 from ::/0 installed yes id d4:57:67:c3:ff:0b:d5:f9 metric 96 refmetric 0 via fe80::2 if wg-glitch
add route 560fbc5db710 prefix fd42:23:42:ff07::/64 from ::/0 installed yes id d4:57:67:c3:ff:0b:d5:f9 metric 96 refmetric 0 via fe80::2 if wg-glitch
add route 560fbc5dc9c0 prefix fd42:23:42:ff08::/64 from ::/0 installed yes id d4:57:67:c3:ff:0b:d5:f9 metric 96 refmetric 0 via fe80::2 if wg-glitch

That is 9 IPv4 and 9 IPv6 routes which are supposed to be installed to the kernel routing table.

The export-table is configured as follows:

export-table 100  # igp

However only one of the IPv6 routes is actually installed.

# ip -6 r s t 100 
fd42:23:42:ff07::/64 via fe80::2 dev wg-glitch proto babel metric 1024 onlink pref medium

While all of the IPv4 routes are properly installed:

# ip -4 r s t 100 
172.23.42.1 via 172.23.42.231 dev wg-glitch proto babel onlink 
172.23.42.2 via 172.23.42.233 dev wg-snafu proto babel onlink 
172.23.42.8 via 172.23.42.224 dev wg-io proto babel onlink 
172.23.42.10 via 172.23.42.235 dev wg-eris proto babel onlink 
172.23.42.64/26 via 172.23.42.224 dev wg-io proto babel onlink 
172.23.42.128/26 via 172.23.42.224 dev wg-io proto babel onlink 
172.23.42.226/31 via 172.23.42.231 dev wg-glitch proto babel onlink 
172.23.42.238/31 via 172.23.42.231 dev wg-glitch proto babel onlink 
172.23.42.240/31 via 172.23.42.231 dev wg-glitch proto babel onlink 

Possibly related log output, summed up by process and message:

# journalctl -u babeld | cut -d' ' -f5- | sort | uniq -c
      1 Thu 2018-06-07 19:32:31 CEST, end at Tue 2018-06-12 21:19:33 CEST. --
      7 babeld[1976]: kernel_route(ADD): No such device
      2 babeld[1976]: kernel_route(FLUSH): No such process
     85 babeld[1976]: kernel_route(MODIFY metric): No such device
  27413 babeld[1976]: send: Destination address required
     66 babeld[21726]: kernel_route(ADD): No such device
     20 babeld[21726]: kernel_route(FLUSH): No such process
    589 babeld[21726]: kernel_route(MODIFY metric): No such device
  85511 babeld[21726]: send: Destination address required
      1 systemd[1]: Started Babel routing daemon.
      1 systemd[1]: Stopped Babel routing daemon.
      1 systemd[1]: Stopping Babel routing daemon...

This happens after babeld has been running for some hours. As soon as I restart babeld, all routes are back as they should be:

# ip -6 r s t 100
fd42:23:42:100::/64 via fe80::1 dev wg-io proto babel metric 1024 onlink pref medium
fd42:23:42:110::/64 via fe80::1 dev wg-io proto babel metric 1024 onlink pref medium
fd42:23:42:b100::/56 via fe80::2 dev wg-glitch proto babel metric 1024 onlink pref medium
fd42:23:42:b200::/56 via fe80::2 dev wg-snafu proto babel metric 1024 onlink pref medium
fd42:23:42:b800::/56 via fe80::1 dev wg-io proto babel metric 1024 onlink pref medium
fd42:23:42:ba00::1 via fe80::2 dev wg-eris proto babel metric 1024 onlink pref medium
fd42:23:42:ff01::/64 via fe80::1 dev wg-io proto babel metric 1024 onlink pref medium
fd42:23:42:ff07::/64 via fe80::2 dev wg-snafu proto babel metric 1024 onlink pref medium
fd42:23:42:ff08::/64 via fe80::2 dev wg-glitch proto babel metric 1024 onlink pref medium

feature request: get/set more information at runtime

I'm currently working on the OpenWrt ubus integration, and it is working very well for me. I would like to have more control at runtime over the filter rules and the prefixes I announce.

I would like to expose your global variables

static struct filter *input_filters = NULL;
static struct filter *output_filters = NULL;
static struct filter *redistribute_filters = NULL;
static struct filter *install_filters = NULL;

as output, since they contain the important information I would like to access. Maybe you want to expose them via your webserver, too?

The next step for me is to expose add_filter via our IPC.
What I am missing is a function that inserts a filter rule at a specific position.

Maybe you have similar interests and could take on part of the code? Otherwise I can write it myself, but I would like to get as much upstream as I can. ;) So I would like to hear your opinion.

feature request: smoothly update source's metric.

I am running babeld over the public internet in tunnel mode. Everything works fine, except that babeld sometimes ignores a better route. Here is an example:
image
The installed route's metric is much higher than that of the third one.

Then I found out that, apart from the directly connected routes (refmetric == 0), all other routes are unfeasible. Here is the debug log:
image

I believe this is caused by large swings in network latency. The latency between Beijing and Hong Kong might be 40 ms at noon and 120 ms at night. But babeld only records the minimal metric as the feasibility distance, so the better route will never be selected at night.

Would it be possible for the source's metric to grow slowly when the network becomes laggy?
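
For context, a minimal sketch of the feasibility condition as RFC 6126 describes it: an update is feasible if it is a retraction, if its seqno is newer than the one recorded for the source, or if the seqno is equal and the advertised metric is strictly smaller than the recorded one. The struct and function names below are illustrative, not babeld's own.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative feasibility distance: the best (seqno, metric) pair
   recorded for a source. */
struct fd_sketch {
    uint16_t seqno;
    uint16_t metric;
};

/* Sequence numbers are compared modulo 2^16.
   Returns 0 if equal, 1 if a is newer than b, -1 if a is older. */
static int
seqno_compare_sketch(uint16_t a, uint16_t b)
{
    if(a == b)
        return 0;
    return ((uint16_t)(b - a) & 0x8000) ? 1 : -1;
}

/* Returns 1 if an update carrying (seqno, refmetric) is feasible. */
static int
update_feasible_sketch(const struct fd_sketch *fd,
                       uint16_t seqno, uint16_t refmetric)
{
    int c;
    if(refmetric == 0xFFFF)
        return 1;               /* retractions are always feasible */
    if(fd == NULL)
        return 1;               /* nothing recorded for this source yet */
    c = seqno_compare_sketch(seqno, fd->seqno);
    return c > 0 || (c == 0 && refmetric < fd->metric);
}
```

With a feasibility distance of 200 recorded at noon, an evening advertisement of 600 under the same seqno is rejected; it becomes feasible again only once it carries a newer seqno.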

babeld incorrectly handles retraction updates when no next-hop TLV precedes the update TLV

This is a follow up to #66.

Babeld still logs quite a lot of "Couldn't parse packet" messages when connecting to a bird peer. This is because bird does not include next-hop information before retractions, while babeld requires it.
(Bird also does not evaluate next-hop information when retracting routes.)

This results in error messages here:

babeld/message.c

Lines 730 to 738 in 231b280

if(message[2] == 1) {
    if(!have_v4_nh)
        goto fail;
    nh = v4_nh;
} else if(have_v6_nh) {
    nh = v6_nh;
} else {
    nh = neigh->address;
}

But babeld does evaluate the next hop when looking up the route being retracted, so this check cannot simply be removed for retractions:

babeld/route.c

Line 901 in 231b280

route = find_route(prefix, plen, src_prefix, src_plen, neigh, nexthop);

babeld/route.c

Lines 143 to 147 in 231b280

while(route) {
    if(route->neigh == neigh && memcmp(route->nexthop, nexthop, 16) == 0)
        return route;
    route = route->next;
}

I've asked on bird-users whether they think ignoring the next hop is correct, because I couldn't find any hints about this in RFC 6126.
They replied that RFC6126bis does indeed say that the next hop is not used when an update is a retraction, so it looks like this was changed at some point.

Should next-hop information be ignored by babeld as well, or have there been significant changes to the Babel protocol in RFC6126bis that make it incompatible with RFC 6126?
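
A minimal sketch of what a relaxed next-hop selection could look like, assuming the parser knows the metric at that point. pick_nexthop_sketch is a hypothetical helper modelled on the message.c snippet quoted above, not an actual babeld function:

```c
#include <stddef.h>

/* Picks the next hop for an Update TLV. `ae` is the address encoding
   (1 = IPv4) and `metric` the advertised metric. Returns NULL when
   the update must be rejected. */
static const unsigned char *
pick_nexthop_sketch(int ae, unsigned metric,
                    int have_v4_nh, const unsigned char *v4_nh,
                    int have_v6_nh, const unsigned char *v6_nh,
                    const unsigned char *neigh_address)
{
    if(ae == 1) {
        if(have_v4_nh)
            return v4_nh;
        /* A retraction carries no usable next hop: fall back to the
           neighbour's address instead of failing. */
        return metric == 0xFFFF ? neigh_address : NULL;
    }
    return have_v6_nh ? v6_nh : neigh_address;
}
```

This only addresses the parse failure. Because find_route also compares the next hop, the lookup for a retracted route would have to match on the neighbour alone for the retraction to take effect.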

Issue with no_hmac_verify flag

When starting from a network that does not use the HMAC that is in development, we consider that a proper way to enable it is to do it in 2 steps, so that all nodes don't need to restart with the new configuration at the same time.

The intermediate state is a network made of 2 kinds of nodes: those with the initial configuration (no hmac settings at all), and others that sign packets but accept unsigned packets. For that, no_hmac_verify works.

Then, once all nodes use the intermediate configuration, we want nodes to restart without the no_hmac_verify setting. We started with a single node and found that it couldn't communicate with its neighbours. babeld.log is full of "Received wrong PC or failed the challenge." messages.

This whole scenario worked with #35. Here is a patch that adapts this PR for the current hmac branch:

fixup! Add no_hmac_verify flag

Commit a62b7c9b6dbab378d69a523a10a9de33c2624fbb was broken in that
2 nodes with same id/hmac settings could not communicate when
1 of the 2 has no_hmac_verify.

diff --git a/hmac.c b/hmac.c
index 769daf4..cb07416 100644
--- a/hmac.c
+++ b/hmac.c
@@ -259,6 +259,7 @@ check_hmac(const unsigned char *packet, int packetlen, int bodylen,
 {
     int i = bodylen + 4;
     int len;
+    int rc = -1;
 
     debugf("check_hmac %s -> %s\n",
           format_address(src), format_address(dst));
@@ -278,8 +279,9 @@ check_hmac(const unsigned char *packet, int packetlen, int bodylen,
                               packet + i + 2, len, ifp->key);
            if(ok)
                return 1;
+           rc = 0;
        }
        i += len + 2;
     }
-    return 0;
+    return rc;
 }
diff --git a/message.c b/message.c
index c57a0d2..4ea1aea 100644
--- a/message.c
+++ b/message.c
@@ -580,21 +580,24 @@ parse_packet(const unsigned char *from, struct interface *ifp,
         bodylen = packetlen - 4;
     }
 
-    if(ifp->key != NULL && !(ifp->flags & IF_NO_HMAC_VERIFY)) {
-        if(check_hmac(packet, packetlen, bodylen, from, to, ifp) != 1) {
-            fprintf(stderr, "Received wrong hmac.\n");
-            return;
-        }
-
-        neigh = find_neighbour(from, ifp);
-        if(neigh == NULL) {
-            fprintf(stderr, "Couldn't allocate neighbour.\n");
-            return;
-        }
-
-        if(preparse_packet(packet, bodylen, neigh, ifp) == 0) {
-            fprintf(stderr, "Received wrong PC or failed the challenge.\n");
-            return;
+    if(ifp->key != NULL) {
+        switch(check_hmac(packet, packetlen, bodylen, from, to, ifp)) {
+            case -1:
+                if(ifp->flags & IF_NO_HMAC_VERIFY)
+                    break; /* missing key ignored */
+            case 0:
+                fprintf(stderr, "Received wrong hmac.\n");
+                return;
+            case 1:
+                neigh = find_neighbour(from, ifp);
+                if(neigh == NULL) {
+                    fprintf(stderr, "Couldn't allocate neighbour.\n");
+                    return;
+                }
+                if(preparse_packet(packet, bodylen, neigh, ifp) == 0) {
+                    fprintf(stderr, "Received wrong PC or failed the challenge.\n");
+                    return;
+                }
         }
     }
 

But no_hmac_verify does not even seem to be a good name. I have a preference for the name we chose in #35: the goal is not to skip checking but rather to ignore the absence of an HMAC.

monitor events are sent twice?

The neighbour change event is often sent twice.
For example, you get the following back to back:

...
change neighbour fff38c3a80 address fe80::43 if wg_wilgu10 reach ffff ureach 0000 rxcost 1024 txcost 256 cost 256
change neighbour fff38c3a80 address fe80::43 if wg_wilgu10 reach ffff ureach 0000 rxcost 1024 txcost 256 cost 256
...

obscure error message: babeld: send: Destination address required

Hi Juliusz,

one of my babeld routers (running 1.12.1-1 from Debian) has started to print this obscure error every second or so

babeld[11278]: send: Destination address required

Not sure what is going on here, just opening this so maybe I remember to debug it. My guess is it's coming from flushbuf() and the error case there should probably be a bit more verbose.

--Daniel

segfault due to infinite loop in hmac branch

We are using babeld from the hmac branch with the following options: -C key type blake2s id sign value XXXXXXXXXXXXXXXXX -C default hmac sign no_hmac_verify true. From time to time, we see babeld crash with a segfault.

We managed to obtain a core dump, and the problem is that flushbuf (called from the main function) calls send_crypto_seqno, which calls start_message, which in turn calls flushbuf again, recursing without end.

Here is the information from gdb:

Core was generated by `babeld -h 15 -H 15 -L /var/log/re6stnet/babeld.log -S /var/lib/re6stnet/babeld.'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000556e5c0f8b57 in send_crypto_seqno (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680) at message.c:1147
(gdb) bt
#0  0x0000556e5c0f8b57 in send_crypto_seqno (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680) at message.c:1147
#1  0x0000556e5c0f8c2b in flushbuf (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680) at message.c:1038
#2  0x0000556e5c0f8d95 in start_message (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680, type=type@entry=17, len=len@entry=12) at message.c:1103
#3  0x0000556e5c0f8b84 in send_crypto_seqno (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680) at message.c:1154
#4  0x0000556e5c0f8c2b in flushbuf (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680) at message.c:1038
#5  0x0000556e5c0f8d95 in start_message (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680, type=type@entry=17, len=len@entry=12) at message.c:1103
#6  0x0000556e5c0f8b84 in send_crypto_seqno (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680) at message.c:1154
#7  0x0000556e5c0f8c2b in flushbuf (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680) at message.c:1038
#8  0x0000556e5c0f8d95 in start_message (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680, type=type@entry=17, len=len@entry=12) at message.c:1103
#9  0x0000556e5c0f8b84 in send_crypto_seqno (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680) at message.c:1154
#10 0x0000556e5c0f8c2b in flushbuf (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680) at message.c:1038
#11 0x0000556e5c0f8d95 in start_message (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680, type=type@entry=17, len=len@entry=12) at message.c:1103
[....]
(gdb) up 100000000
#261944 0x0000556e5c0f0645 in main (argc=<optimized out>, argv=<optimized out>) at babeld.c:826
826                         flushbuf(&neigh->buf, neigh->ifp);
(gdb) down
#261943 0x0000556e5c0f8c2b in flushbuf (buf=0x556e5d406e10, ifp=0x556e5d3ff680) at message.c:1038
1038                send_crypto_seqno(buf, ifp);
(gdb) p *buf
$12 = {
  sin6 = {
    sin6_family = 10, 
    sin6_port = 10266, 
    sin6_flowinfo = 0, 
    sin6_addr = {
      __in6_u = {
        __u6_addr8 = "\376\200\000\000\000\000\000\000Lf\214\377\376u\366\030", 
        __u6_addr16 = {33022, 0, 0, 0, 26188, 65420, 30206, 6390}, 
        __u6_addr32 = {33022, 0, 4287391308, 418805246}
      }
    }, 
    sin6_scope_id = 56
  }, 
  buf = 0x556e5d403740 "\n\026\002@i\330\177", 
  len = 1368, 
  size = 1436, 
  flush_interval = 7500, 
  timeout = {
    tv_sec = 10498367, 
    tv_usec = 127533
  }, 
  have_id = 0 '\000', 
  have_nh = 0 '\000', 
  have_prefix = 0 '\000', 
  id = "\000\000\000\000\000\000\000", 
  nh = "\000\000\000", 
  prefix = '\000' <repeats 15 times>, 
  hello = -1
}
(gdb) p *ifp
$13 = {
  next = 0x0, 
  conf = 0x556e5d3fe910, 
  ifindex = 56, 
  flags = 321, 
  cost = 96, 
  channel = -2, 
  hello_timeout = {
    tv_sec = 10498368, 
    tv_usec = 206016
  }, 
  update_timeout = {
    tv_sec = 10498411, 
    tv_usec = 673029
  }, 
  update_flush_timeout = {
    tv_sec = 0, 
    tv_usec = 0
  }, 
  name = "re6stnet10\000\000\000\000\000", 
  ipv4 = 0x0, 
  numll = 1, 
  ll = 0x556e5d401fa0, 
  buf = {
    sin6 = {
      sin6_family = 10, 
      sin6_port = 10266, 
      sin6_flowinfo = 0, 
      sin6_addr = {
        __in6_u = {
          __u6_addr8 = "\377\002", '\000' <repeats 11 times>, "\001\000\006", 
          __u6_addr16 = {767, 0, 0, 0, 0, 0, 256, 1536}, 
          __u6_addr32 = {767, 0, 0, 100663552}
        }
      }, 
      sin6_scope_id = 56
    }, 
    buf = 0x556e5d4019f0 "\006\n", 
    len = 0, 
    size = 1436, 
    flush_interval = 7500, 
    timeout = {
      tv_sec = 0, 
      tv_usec = 0
    }, 
    have_id = 0 '\000', 
    have_nh = 0 '\000', 
    have_prefix = 0 '\000', 
    id = "\362\336\361\377\376\300\001\070", 
    nh = "\000\000\000", 
    prefix = "$\001Q\200\000\000\000\275\000\000\000\000\000\000\000", 
    hello = -1
  }, 
  buffered_updates = 0x0, 
  num_buffered_updates = 0, 
  update_bufsize = 0, 
  last_update_time = 10498341, 
  hello_seqno = 47223, 
  hello_interval = 15000, 
  update_interval = 60000, 
  rtt_decay = 125, 
  rtt_min = 10000, 
  rtt_max = 500000, 
  max_rtt_penalty = 5000, 
  key = 0x556e5d3fe1c0, 
  pc = 33480, 
  index = "-\214K\221\001\037\202'"
}
(gdb) down
#261939 0x0000556e5c0f8b84 in send_crypto_seqno (buf=buf@entry=0x556e5d406e10, ifp=ifp@entry=0x556e5d3ff680) at message.c:1154
1154        start_message(buf, ifp, MESSAGE_CRYPTO_SEQNO, 4 + INDEX_LEN);
(gdb) p (buf->size - buf->len)
$16 = 68

The exact source code used includes patches from Nexedi and can be found here: https://lab.nexedi.com/nexedi/babeld/tree/hmac-nxd2
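
To illustrate the cycle in the backtrace, here is a much simplified model with a re-entrancy guard. It is a sketch under assumptions, not the hmac branch's code: buf_sketch, flush_sketch and start_message_sketch are hypothetical, and the 12-byte trailer merely mirrors the len=12 seen in the backtrace.

```c
/* Simplified send buffer. */
struct buf_sketch {
    int len;        /* bytes currently queued */
    int size;       /* capacity in bytes */
    int flushing;   /* nonzero while a flush is in progress */
    int flushes;    /* number of completed flushes */
};

static void flush_sketch(struct buf_sketch *buf);

/* Reserves room for a TLV of `len` body bytes plus 2 header bytes,
   flushing first if it does not fit. Returns -1 if there is still
   no room after the flush attempt. */
static int
start_message_sketch(struct buf_sketch *buf, int len)
{
    if(buf->size - buf->len < len + 2)
        flush_sketch(buf);
    if(buf->size - buf->len < len + 2)
        return -1;
    buf->len += len + 2;
    return 0;
}

static void
flush_sketch(struct buf_sketch *buf)
{
    if(buf->flushing)
        return;                 /* re-entered via start_message_sketch */
    buf->flushing = 1;
    /* Stand-in for send_crypto_seqno(): try to append a trailer TLV. */
    start_message_sketch(buf, 12);
    buf->len = 0;               /* stand-in for the actual send */
    buf->flushes++;
    buf->flushing = 0;
}
```

Without the flushing flag, a nearly full buffer would make flush_sketch and start_message_sketch call each other until the stack is exhausted, which matches the shape of the backtrace above.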

1.12 tag missing

Thanks for the new release, unfortunately it wasn't tagged on Git yet.

babeld improvement opportunities

Hello,
first of all, kudos for creating this project!

Recently I got myself into contributing to Althea's fork of babeld and found some places around the upstream codebase that might hold potential for improvements. But on the other hand, maybe there's code which I simply don't understand, hence this issue. I'd really appreciate it if you could share your thoughts on this. Thank you!

Ring buffers in format functions

  • What: The format* functions in util.h use static ring buffers to hold the last 4 return values, e.g.:
const char *
format_thousands(unsigned int value)
{
    static char buf[4][15]; // I mean buffers like this one
    static int i = 0;
    i = (i + 1) % 4;
    snprintf(buf[i], 15, "%d.%.3d", value / 1000, value % 1000);
    return buf[i];
}
  • Why: This doesn't really help with anything and could cause printing bugs that are tremendously difficult to catch.
  • How: A very common and battle-tested approach is to have the user supply their own, e.g.:
#include <stdio.h>
#include <stdlib.h>

char *
format_thousands(unsigned int value, char *buf, size_t buflen)
{
    if (!buf) {
        if (buflen)
            return NULL; /* No buffer, yet a nonzero length: invalid. */
        /* No buffer and a length of 0: allocate one for the caller, who
         * must free() it. This is optional; we could simply return NULL
         * whenever !buf holds true. */
        buf = malloc(15);
        if (!buf)
            return NULL;
        buflen = 15;
    }
    snprintf(buf, buflen, "%u.%.3u", value / 1000, value % 1000);
    return buf;
}
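
The hazard described above can be reproduced with a small sketch. format_thousands_ring copies the shape of the quoted function (with %u for the unsigned argument), and format_thousands_r is a hypothetical caller-supplied-buffer variant without the malloc fallback:

```c
#include <stdio.h>
#include <string.h>

/* Four rotating static slots, as in the function quoted above. */
static const char *
format_thousands_ring(unsigned int value)
{
    static char buf[4][15];
    static int i = 0;
    i = (i + 1) % 4;
    snprintf(buf[i], 15, "%u.%.3u", value / 1000, value % 1000);
    return buf[i];
}

/* Caller-supplied buffer variant: returns buf, or NULL if the
   buffer is unusable. */
static char *
format_thousands_r(unsigned int value, char *buf, size_t buflen)
{
    if(buf == NULL || buflen == 0)
        return NULL;
    snprintf(buf, buflen, "%u.%.3u", value / 1000, value % 1000);
    return buf;
}
```

The pointer returned by the first call to format_thousands_ring is reused by the fifth call, so a caller that keeps five results alive at once (for example five arguments of a single printf) sees the first value silently replaced by the fifth. The _r variant has no such limit because each result lives in storage the caller owns.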

Many pointers are not null-checked and some symbols are not used

  • What: In some functions around the codebase pointer sanity is not checked, e.g.:
static int
kernel_rule_notify(struct kernel_rule *rule, void *closure) // closure is not used
{
    int i;
    if(martian_prefix(rule->src, rule->src_plen)) // Risky business, rule could be NULL
        return 0;

    i = rule->priority - src_table_prio;

    if(i < 0 || SRC_TABLE_NUM <= i)
        return 0;

    kernel_rules_changed = 1;
    return -1;
}
  • Why: It's common practice to react explicitly to null pointers where process-owned memory is expected. Many developers expect this, and it's also smart to log an error when an unexpected NULL pointer occurs.
  • How: Let's just guard all pointer parameters with if statements that check whether they're NULL.

Coverity bug report

I ran a Coverity scan of babeld master (0835d5d), ruled out false
positives and ended up with 5 reported bugs.

  • 277067, 277064, 227087 are out-of-bounds reads/writes whose paths
    are too long to analyze before the reported bug is triggered; they
    might be false positives;
  • 277072, fixed in cf5e98b;
  • 277081, a potential undefined behaviour caused by a right-shift of
    16 bits on a short when too many hellos are missed.
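
The report does not quote the flagged code, so the following is only a sketch of one way to keep such a shift well defined, assuming the value is a 16-bit reachability history shifted right by the number of missed hellos. age_reach_sketch is a hypothetical helper:

```c
#include <stdint.h>

/* Ages a 16-bit reachability history by `missed` hello intervals.
   Clamping keeps the shift count below 16 whatever `missed` is; in
   C, a shift count that reaches or exceeds the width of the promoted
   operand is undefined. */
static uint16_t
age_reach_sketch(uint16_t reach, int missed)
{
    if(missed <= 0)
        return reach;
    if(missed >= 16)
        return 0;               /* every recorded hello has aged out */
    return (uint16_t)(reach >> missed);
}
```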

babeld incorrectly handles retraction updates when no router-id precedes the update TLV

We observed quite a lot of "Received prefix with no router id" + "Couldn't parse packet" log messages in a relatively large babel network lately. This only affects interfaces with bird neighbours.

After some debugging I was able to identify that this problem is caused by messages sent by bird containing Update TLVs with metric 65535 (retraction), but without a preceding router-id. This causes babeld to ignore this update at

babeld/message.c

Lines 721 to 724 in aa29624

if(!have_router_id && message[2] != 0) {
    fprintf(stderr, "Received prefix with no router id.\n");
    goto fail;
}

The Babel RFC doesn't seem to require Update TLVs to be preceded by a router-id TLV if the update is a retraction (metric == 0xffff).
It looks like babeld is handling this case correctly for wildcard retractions by excluding AE == 0 from this error check. This check probably also needs to exclude metric == 0xffff, but I haven't verified if that results in correct behavior.
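
The proposed relaxation could be sketched as a predicate like the one below. router_id_ok_sketch is a hypothetical helper and, as noted above, the resulting behaviour is unverified:

```c
/* Returns 1 if an Update TLV with address encoding `ae` and metric
   `metric` may be accepted, given whether a Router-Id TLV was seen. */
static int
router_id_ok_sketch(int have_router_id, int ae, unsigned metric)
{
    if(have_router_id)
        return 1;
    if(ae == 0)
        return 1;               /* wildcard retraction, already exempt */
    return metric == 0xFFFF;    /* proposed: exempt every retraction */
}
```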

kernel_route: Invalid argument

Hi, I'm using babeld 1.10 which is from yum on CentOS 7 (kernel version 3.10.0-1160.49.1.el7.x86_64).
When I look at the log in systemctl, I find that it keeps outputting kernel_route(ADD): Invalid argument and kernel_route(MODIFY metric): Invalid argument.
Is there any way to fix it without upgrading the kernel? Thank you.

feature request: add prio parameter for filtering rules

So far, filtering rules are evaluated based on their order in the configuration (or uci config for OpenWrt).

This is rather inconvenient when using scripts to alter these configurations (particularly for uci config in OpenWrt), as it may be hard to control the position of a specific entry without rewriting everything again in the new order. A common case is when one wants to add redistribute filters for local routes before the "redistribute local deny" filter that needs to be last.

Since the config is only evaluated "at start" and not updated dynamically, it should be simple enough to provide an ordering of the filtering rules by a prio parameter when reading them (of course, with a default prio where the order will be used again as default).

This would be a great help.
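
A minimal sketch of such an ordering, assuming a singly linked list of rules and a prio field where a lower value is evaluated first. struct prio_filter and insert_by_prio are hypothetical and do not reflect babeld's actual filter structure:

```c
#include <stddef.h>

/* Hypothetical filter node: only the fields needed for ordering. */
struct prio_filter {
    int prio;                   /* lower value is evaluated first */
    const char *text;           /* the rule as written in the config */
    struct prio_filter *next;
};

/* Inserts `f` after every rule whose prio is less than or equal to
   its own, so rules with equal prio keep their configuration order.
   Returns the new head of the list. */
static struct prio_filter *
insert_by_prio(struct prio_filter *head, struct prio_filter *f)
{
    struct prio_filter **p = &head;
    while(*p != NULL && (*p)->prio <= f->prio)
        p = &(*p)->next;
    f->next = *p;
    *p = f;
    return head;
}
```

A catch-all rule such as "redistribute local deny" could then be given a high prio once, and rules added later by a script with a lower prio would still be evaluated before it.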

feature request: do not mark routes as unreachable when some neighbours fail

I noticed that if a neighbour stops responding to hello packets, babeld marks its routes as unreachable very soon and unblocks them after the routes converge again. If a neighbour exits normally, babeld reroutes the traffic to another node within milliseconds. But if the neighbour exits unexpectedly, there is an annoying time window during which connections time out and are dropped.
I know this behaviour exists to avoid routing loops, but could you please provide an option to turn it off? I want a high-availability network with automatic route switching.

Same route announced twice?

I have a config that looks like this:

config general
	option 'local_port' '33123'
	option ipv6_subtrees 'true'
        option ubus_bindings 'true'

config interface
        option ifname 'wlan0'

config filter
	option 'type'	'redistribute'
	option 'ip'	'::/0'

config filter
	option 'type'	'redistribute'
	option 'ip'	'0.0.0.0/0'
	option 'eq'	'32'

However, now the routes look like this:

add xroute 10.12.1.101/32-0.0.0.0/0 prefix 10.12.1.101/32 from 0.0.0.0/0 metric 0
add xroute 10.13.1.101/32-0.0.0.0/0 prefix 10.13.1.101/32 from 0.0.0.0/0 metric 0

xroutes generated from local addresses are missing v4mapped src_prefix

Routes imported from the kernel had an incorrectly set src_prefix/src_plen before the issue was fixed in 68b6c5e.

Similarly, src_prefix/src_plen isn't encoded properly for routes generated from local addresses.

This leads to issues if the same route/address is announced by multiple babeld nodes, as the route generated locally and the route received from a neighbour are not equal internally.
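
A sketch of why the two routes compare as different, assuming IPv4 prefixes are stored in v4-mapped form (::ffff:a.b.c.d, with the prefix length offset by 96) as the title suggests. The helper names are hypothetical:

```c
#include <string.h>

/* The 12-byte lead of an IPv4-mapped IPv6 address (::ffff:0:0/96). */
static const unsigned char v4prefix_sketch[12] =
    { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xFF, 0xFF };

/* Encodes an IPv4 prefix as a 16-byte v4-mapped prefix and returns
   the mapped prefix length (plen + 96). */
static int
v4_to_mapped(const unsigned char v4[4], int plen, unsigned char out[16])
{
    memcpy(out, v4prefix_sketch, 12);
    memcpy(out + 12, v4, 4);
    return plen + 96;
}

/* Two source prefixes are equal only if both the bytes and the
   length agree. */
static int
src_prefix_equal(const unsigned char *a, int alen,
                 const unsigned char *b, int blen)
{
    return alen == blen && memcmp(a, b, 16) == 0;
}
```

Under that assumption, an IPv4 source prefix of 0.0.0.0/0 encodes as ::ffff:0.0.0.0/96, which does not compare equal to an all-zero prefix of length 0.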

runtime error: null pointer passed to memcpy

I was running experimental code and compiled babeld with Address and Undefined Behaviour sanitizers. I encountered the following error, which I believe is not related to my experimental code.

runtime error: null pointer passed as argument 2, which is declared to never be null

babeld/message.c

Lines 1428 to 1429 in a104387

memcpy(channels, route->channels,
       MIN(route->channels_len, MAX_CHANNEL_HOPS));
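
One way to avoid the sanitizer report is to skip the call when there is nothing to copy, since passing a null pointer to memcpy is undefined even with a length of zero. The sketch below is hypothetical: copy_channels_sketch is not a babeld function and the value of MAX_CHANNEL_HOPS_SKETCH is an assumption made for the example.

```c
#include <string.h>

#define MAX_CHANNEL_HOPS_SKETCH 20  /* assumed value, for illustration */

/* Copies at most MAX_CHANNEL_HOPS_SKETCH channel entries from src to
   dst and returns the number copied. The memcpy call is skipped
   entirely when src is NULL or there is nothing to copy. */
static int
copy_channels_sketch(unsigned char *dst,
                     const unsigned char *src, int src_len)
{
    int n = src_len < MAX_CHANNEL_HOPS_SKETCH ?
        src_len : MAX_CHANNEL_HOPS_SKETCH;
    if(src == NULL || n <= 0)
        return 0;
    memcpy(dst, src, (size_t)n);
    return n;
}
```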
