RFT: Babel RTT extension in Bird
Stefan Haller
stefan.haller at stha.de
Fri Apr 22 17:08:48 CEST 2022
Hi Toke,
On Fri, Apr 22, 2022 at 01:48:46AM +0200, Toke Høiland-Jørgensen wrote:
> I've implemented the Babel RTT extension specified in
> draft-ietf-babel-rtt-extension in Bird. I've tested that it talks to
> babeld on a single link and that the two implementations agree on each
> others' (smoothed) RTT values. However, I'd like to subject the code to
> some more tortured testing before submitting it to upstream Bird. So I'm
> sending this note as a request for testing.
Nice work! I replaced the bird binary and changed the interface type to
"tunnel" on a mesh of four hosts. Works great so far!
Things I noticed:
(1) When I forgot to change the config file (one side was type tunnel,
the other side was type wired), the Babel neighbor metric was stuck on 65535. I
think this happens because the expected time stamp was not received and
then the metric computation does not work. While I understand that such
a "broken" setup is not really supported, it was not exactly clear where
to locate the problem.
(2) Also, I think it would be neat if "birdc show babel neigh" showed
latency info (current latency + smoothed value).
(3) Due to route flapping I tried to increase "metric decay" to 60s.
After running "birdc configure" the values became very large for one
link (on one side only).
> bird: babel1: RTT sample for neighbour fe80::3 on wg2: 4294966323 us (srtt 99189.162 ms)
> bird: babel1: Added RTT cost 96 to nbr fe80::3 on wg2 with srtt 99189.162 ms
Nothing changed after >1h. The opposite side was reporting sensible RTT
numbers. After I restarted the daemon, the smoothed value was still off
for this one link:
> bird: babel1: RTT sample for neighbour fe80::3 on wg2: 1241 us (srtt 69656.646 ms)
> bird: babel1: Added RTT cost 96 to nbr fe80::3 on wg2 with srtt 69656.646 ms
The srtt value did not converge after >1h. For all other links the
smoothing works, e.g. for wg1 on the very same host:
> bird: babel1: RTT sample for neighbour fe80::1 on wg0: 14570 us (srtt 15.876 ms)
> bird: babel1: Added RTT cost 5 to nbr fe80::1 on wg0 with srtt 15.876 ms
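For comparison, here is a rough sketch of how quickly a plain exponentially weighted average should recover from a bad starting value. It is Python, not Bird code. The smoothing formula and the 42/256 weight are assumptions (the weight is believed to match babeld's default rtt-decay), and the "metric decay" option may work differently. The function names are made up for illustration.

```python
def smooth(srtt_us, sample_us, decay=42):
    """One smoothing step. `decay` (out of 256) is the weight kept from
    the old value; this formula is an assumption, not taken from Bird."""
    return (decay * srtt_us + (256 - decay) * sample_us) / 256


def samples_until_close(srtt_us, sample_us, tolerance_us=100.0, decay=42):
    """Count how many identical samples are needed until the smoothed
    value is within `tolerance_us` of the sample."""
    count = 0
    while abs(srtt_us - sample_us) > tolerance_us:
        srtt_us = smooth(srtt_us, sample_us, decay)
        count += 1
    return count


# Start from the srtt in the log above (69656.646 ms) and feed 1241 us samples.
print(samples_until_close(69_656_646.0, 1241.0))
```

Under these assumptions the smoothed value gets close to the samples after a small number of updates, so an srtt that stays off for more than an hour would point at something other than slow smoothing.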
After restarting bird once more (without changing anything), it has been
working ever since:
> bird: babel1: RTT sample for neighbour fe80::3 on wg2: 1261 us (srtt 1.313 ms)
In this setup wg2 was a tunnel over the local LAN, so latency was often
< 1000 us. Maybe there is a problem for tiny latencies and/or larger values
of "metric decay"? I did not find a way to reliably reproduce the problem.
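One observation on the first log line above: 4294966323 equals 2^32 - 973, which is what a result of -973 us looks like when it is stored in an unsigned 32-bit integer. The sketch below reproduces that number. It is Python, not Bird code. The RTT formula is a simplified reading of the timestamp exchange in the draft, the timestamps are invented, and whether Bird computes the sample this way is an assumption.

```python
U32 = 1 << 32  # modulus of a 32-bit unsigned counter


def rtt_sample_u32(t_origin, t_receive_remote, t_transmit_remote, t_receive_local):
    """RTT in microseconds, computed modulo 2**32 the way unsigned
    32-bit timestamps would behave (hypothetical helper)."""
    elapsed_local = (t_receive_local - t_origin) % U32
    held_remote = (t_transmit_remote - t_receive_remote) % U32
    return (elapsed_local - held_remote) % U32


def as_signed(sample):
    """Reinterpret a 32-bit unsigned value as a signed one."""
    return sample - U32 if sample >= U32 // 2 else sample


# Invented timestamps: 5000 us elapsed locally, but the neighbour reports
# having held the packet for 5973 us, so the true difference is -973 us.
sample = rtt_sample_u32(1_000_000, 2_000_000, 2_005_973, 1_005_000)
print(sample)             # 4294966323
print(as_signed(sample))  # -973
```

If something like this is happening, a slightly negative difference on a sub-millisecond link would show up as a sample of roughly 4295 seconds unless it is clamped or discarded before smoothing.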
Best regards,
Stefan Haller