<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">MICE (IXP in Minneapolis, MN, USA) has
had a couple of reports of participants seeing BFD issues.</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">In the most recent case, the
participant's router was sending BFD packets to the route server
(confirmed via a packet capture on the route server), but BIRD simply
did not respond. Both their IPv4 and IPv6 sessions to rs1 (the first
route server) were acting this way, while both of their sessions to
rs2 were fine. I compared the two route servers and found two other
inconsistencies: one network on IPv4 and one network on IPv6.</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">Here's what the packet capture shows:<br>
</div>
<blockquote>
<div class="moz-cite-prefix"><font face="monospace">BFD Control
message</font></div>
<div class="moz-cite-prefix"><font face="monospace"> 001. .... =
Protocol Version: 1</font></div>
<div class="moz-cite-prefix"><font face="monospace"> ...0 0011 =
Diagnostic Code: Neighbor Signaled Session Down (0x03)</font></div>
<div class="moz-cite-prefix"><font face="monospace"> 01.. .... =
Session State: Down (0x1)</font></div>
<div class="moz-cite-prefix"><font face="monospace"> Message
Flags: 0x48, Control Plane Independent: Set</font></div>
<div class="moz-cite-prefix"><font face="monospace"> 0... .. =
Poll: Not set</font></div>
<div class="moz-cite-prefix"><font face="monospace"> .0.. .. =
Final: Not set</font></div>
<div class="moz-cite-prefix"><font face="monospace"> ..1. .. =
Control Plane Independent: Set</font></div>
<div class="moz-cite-prefix"><font face="monospace"> ...0 .. =
Authentication Present: Not set</font></div>
<div class="moz-cite-prefix"><font face="monospace"> .... 0. =
Demand: Not set</font></div>
<div class="moz-cite-prefix"><font face="monospace"> .... .0 =
Multipoint: Not set<br>
</font></div>
<div class="moz-cite-prefix"><font face="monospace"> Detect Time
Multiplier: 3 (= 6000 ms Detection time)</font></div>
<div class="moz-cite-prefix"><font face="monospace"> Message
Length: 24 bytes</font></div>
<div class="moz-cite-prefix"><font face="monospace"> My
Discriminator: 0x0000024d</font></div>
<div class="moz-cite-prefix"><font face="monospace"> Your
Discriminator: 0x0c764401</font></div>
<div class="moz-cite-prefix"><font face="monospace"> Desired Min
TX Interval: 2000 ms (2000000 us)</font></div>
<div class="moz-cite-prefix"><font face="monospace"> Required Min
RX Interval: 2000 ms (2000000 us)</font></div>
<div class="moz-cite-prefix"><font face="monospace"> Required Min
Echo Interval: 0 ms (0 us)</font><br>
</div>
</blockquote>
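<div class="moz-cite-prefix">To double-check the bit fields, here is a
minimal Python sketch that decodes the 24-byte mandatory section of a
BFD control packet as laid out in RFC 5880. The byte string is rebuilt
from the field values in the capture above rather than copied from the
pcap, and the names PACKET and decode_bfd_control are made up for
illustration:</div>
<pre>
```python
import struct

# Rebuilt from the field values in the capture; not bytes from the real pcap.
PACKET = bytes.fromhex(
    "23"        # version 1 (top 3 bits), diagnostic 0x03 (low 5 bits)
    "48"        # state Down (top 2 bits), only the C flag set
    "03"        # Detect Time Multiplier
    "18"        # Message Length: 24 bytes
    "0000024d"  # My Discriminator
    "0c764401"  # Your Discriminator
    "001e8480"  # Desired Min TX Interval: 2000000 us
    "001e8480"  # Required Min RX Interval: 2000000 us
    "00000000"  # Required Min Echo RX Interval
)

STATES = {0: "AdminDown", 1: "Down", 2: "Init", 3: "Up"}


def decode_bfd_control(data):
    """Decode the mandatory section of a BFD control packet (RFC 5880, 4.1)."""
    (vers_diag, state_flags, detect_mult, length,
     my_disc, your_disc, min_tx_us, min_rx_us, min_echo_rx_us) = struct.unpack(
        "!BBBBIIIII", data[:24])
    return {
        "version": vers_diag >> 5,
        "diag": vers_diag & 0x1F,
        "state": STATES[state_flags >> 6],
        "poll": bool(state_flags & 0x20),
        "final": bool(state_flags & 0x10),
        "cpi": bool(state_flags & 0x08),
        "auth": bool(state_flags & 0x04),
        "demand": bool(state_flags & 0x02),
        "multipoint": bool(state_flags & 0x01),
        "detect_mult": detect_mult,
        "length": length,
        "my_disc": my_disc,
        "your_disc": your_disc,
        "min_tx_us": min_tx_us,
        "min_rx_us": min_rx_us,
        "min_echo_rx_us": min_echo_rx_us,
        # Multiplier times the sender's Desired Min TX, as the capture displays it.
        "detect_ms": detect_mult * min_tx_us // 1000,
    }


if __name__ == "__main__":
    for name, value in decode_bfd_control(PACKET).items():
        print(name, value)
```
</pre>
<div class="moz-cite-prefix">The detect_ms value is simply the
multiplier times the sender's Desired Min TX Interval, which is how the
capture arrives at 6000 ms. Note that the packet is in state Down but
still carries a nonzero Your Discriminator.</div>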
<div class="moz-cite-prefix"><br>
</div>
This may have started when rs1 was rebooted for patching, or it
might have been due to a fiber cut that disconnected this
participant from the fabric. We are not sure of the exact timing.
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">The participant then reset their BGP
sessions, after which BFD came up.</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">We are running BIRD 2.0.8 from Ubuntu's
repository on Ubuntu 22.04. Note that the config is generated from
the IXP Manager templates, so there are separate BIRD processes for
IPv4 and IPv6.</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">A different participant noted they had
seen BFD issues in the past, in March. That would have been on the
prior route servers, which were running BIRD 1.x (I believe) on
FreeBSD.</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">Note that BFD is configured as passive
(i.e. the participant has to initiate it). Relevant config bits:</div>
<blockquote>
<div class="moz-cite-prefix"><font face="monospace">protocol bfd<br>
{<br>
accept ipv4 direct;<br>
interface "en*" {<br>
passive on;<br>
multiplier 3;<br>
min rx interval 500ms;<br>
min tx interval 500ms;<br>
};<br>
}<br>
<br>
...</font></div>
<div class="moz-cite-prefix"><font face="monospace"><br>
</font></div>
<font face="monospace">protocol bgp pb_0138_as18451 from
tb_rsclient {</font><br>
<font face="monospace"> description "AS18451 - LES.NET";</font><br>
<font face="monospace"> neighbor 206.108.255.175 as 18451;</font><br>
<font face="monospace"> ipv4 {</font><br>
<font face="monospace"> import limit 120 action
restart;</font><br>
<font face="monospace"> import filter f_import_as18451;</font><br>
<font face="monospace"> table t_0138_as18451;</font><br>
<font face="monospace"> export filter f_export_as18451;</font><br>
<font face="monospace"> };</font><br>
<font face="monospace"> bfd on;</font><br>
<font face="monospace">}</font><br>
</blockquote>
<div class="moz-cite-prefix">
<div class="moz-cite-prefix"><br>
</div>
</div>
<div class="moz-cite-prefix">Are there any relevant BFD fixes since
2.0.8? I looked at the git commit log and found this one, though it
doesn't seem like it would apply:</div>
<blockquote><font face="monospace">commit
99872676df45f1a490d3d63f43081afb41477040</font><br>
<font face="monospace">Author: Ondrej Zajicek
<a class="moz-txt-link-rfc2396E" href="mailto:santiago@crfreenet.org">&lt;santiago@crfreenet.org&gt;</a></font><br>
<font face="monospace">Date: Sun Jan 22 23:42:08 2023 +0100</font><br>
<br>
<font face="monospace"> BFD: Improve incoming packet matching</font><br>
<font face="monospace"> </font><br>
<font face="monospace"> For active sessions, ignore received
packets with zero local id and</font><br>
<font face="monospace"> mismatched remote id. That forces a
session timeout instead of an</font><br>
<font face="monospace"> immediate session restart. It makes BFD
sessions more resilient to</font><br>
<font face="monospace"> packet spoofing.</font><br>
<font face="monospace"> </font><br>
<font face="monospace"> Thanks to André Grüneberg for the
suggestion.</font><br>
</blockquote>
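<div class="moz-cite-prefix">For context on what "incoming packet
matching" involves, RFC 5880 section 6.8.6 says that a nonzero Your
Discriminator must be used to select the session, and that the packet
is discarded if no session has that value. Below is a minimal Python
sketch of that rule. It is not BIRD's code; the sessions dict, the
select_session function, and the local discriminator 0x11111111 are
hypothetical. Whether this is what happened on rs1 is not established:</div>
<pre>
```python
def select_session(sessions, your_disc, state, peer_addr):
    """Pick the session for a received BFD packet, or None to discard it.

    sessions is a hypothetical dict: local discriminator -> session dict.
    """
    if your_disc != 0:
        # A nonzero Your Discriminator must select the session directly;
        # if no session has that local discriminator, the packet is dropped.
        return sessions.get(your_disc)
    if state not in ("Down", "AdminDown"):
        # A zero Your Discriminator is only valid in Down or AdminDown.
        return None
    # Zero Your Discriminator: match on other fields (here, the peer address).
    for session in sessions.values():
        if session["peer"] == peer_addr:
            return session
    return None


# Hypothetical state after a restart: the local discriminator has changed.
sessions = {0x11111111: {"peer": "206.108.255.175"}}

# Values as in the capture: state Down, Your Discriminator 0x0c764401.
stale = select_session(sessions, 0x0c764401, "Down", "206.108.255.175")

# The same packet with Your Discriminator zeroed.
fresh = select_session(sessions, 0, "Down", "206.108.255.175")
```
</pre>
<div class="moz-cite-prefix">In this sketch, stale is None (the packet
would be discarded) while fresh matches the session by peer address.</div>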
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">The discussion on the MICE list is here:</div>
<div class="moz-cite-prefix"><a moz-do-not-send="true"
href="https://lists.iphouse.net/cgi-bin/wa?A2=ind2306&L=MICE-DISCUSS&T=0&F=&S=&P=17401">https://lists.iphouse.net/cgi-bin/wa?A2=ind2306&L=MICE-DISCUSS&T=0&F=&S=&P=17401</a></div>
<pre class="moz-signature" cols="72">--
Richard</pre>
</body>
</html>