BIRD continues exporting routes but reports no exports

Hugo Slabbert hugo.slabbert at menlosecurity.com
Wed Mar 15 21:08:29 CET 2023


Closing the loop on this:

A bit over a week in, the boxes on 2.0.7 all consistently reproduce the
issue (`configure` has to be issued twice before export policy changes
actually take effect), whereas the test boxes on 2.0.10 are still
consistently behaving properly (a single `configure` updates exports).
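
For anyone who wants to watch for this on their own boxes, here is a minimal
sketch in Python. The helper names (`exported_prefixes`, `stale_exports`,
`check_protocol`) are made up for illustration, and it assumes `export table
on` is set on the channel and that `birdc` is on the PATH. It compares the ad
hoc `show route export <protocol>` view against `show route export table
<protocol>.<channel>` and reports prefixes that only show up in the export
table, which is the mismatch we saw on 2.0.7 after a single `configure`:

```python
import ipaddress
import subprocess


def exported_prefixes(birdc_output: str) -> set[str]:
    """Collect prefixes from `show route ...` output.

    Loose parsing (an assumption, not BIRD's documented format): any
    unindented line whose first token parses as a CIDR prefix is a route.
    """
    prefixes = set()
    for line in birdc_output.splitlines():
        if not line or line[0].isspace():
            continue  # skip blank lines and indented next-hop/attribute lines
        token = line.split()[0]
        if "/" not in token:
            continue  # skips banners such as "BIRD 2.0.7 ready." or "Table ...:"
        try:
            prefixes.add(str(ipaddress.ip_network(token, strict=False)))
        except ValueError:
            continue
    return prefixes


def stale_exports(filter_view: str, export_table: str) -> set[str]:
    """Prefixes present in the export table but absent from the filter view."""
    return exported_prefixes(export_table) - exported_prefixes(filter_view)


def birdc(*args: str) -> str:
    """Run birdc with the given arguments and return its stdout."""
    result = subprocess.run(
        ["birdc", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def check_protocol(proto: str, channel: str) -> set[str]:
    """Hypothetical wrapper: compare both views for one protocol channel."""
    filter_view = birdc("show", "route", "export", proto)
    export_table = birdc("show", "route", "export", "table", f"{proto}.{channel}")
    return stale_exports(filter_view, export_table)
```

Calling `check_protocol("gw_085ea85_euwest2", "ipv4")` right after a `birdc
configure` would be expected to return an empty set on a healthy box, and a
non-empty set on a box showing the symptoms above. Treat the parsing as a
starting point rather than a robust parser.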

On Wed, Mar 8, 2023 at 2:20 PM Hugo Slabbert <
hugo.slabbert at menlosecurity.com> wrote:

> *nods*
>
> We generally run distro packages unless we have specific requirements for
> newer features or fixes etc.
>
> Current experiments show that the test routers on 2.0.7 started to
> reproduce this issue after about 3 days since BIRD's process start, whereas
> the test boxes we have on 2.0.10 have not yet reproduced the issue after
> about 3 days and 5 days, respectively.
>
> It's always tough to prove a negative, but we'll keep it running through
> the weekend to validate and look at bumping to 2.0.10 across the fleet if
> we're still clear on the 2.0.10 boxes at that point.
>
> On Wed, Mar 8, 2023 at 2:17 PM Ross Tajvar <ross at tajvar.io> wrote:
>
>> By the way - compiling the most recent version of bird is very easy. So
>> even though there's not a package for 2.0.12 for bullseye, I recommend just
>> compiling with the same options as the bird package and running that.
>>
>> On Fri, Mar 3, 2023, 2:07 PM Hugo Slabbert via Bird-users <
>> bird-users at network.cz> wrote:
>>
>>> Ah; thanks. Okay, I was misreading that as just referring to regular
>>> table filtering, not in conjunction with import/export.  I had looked at `show
>>> symbols table` and not seen any indication of it, but missed that these
>>> are present in the `show route export table <p.c>` format regardless.
>>>
>>> Thanks. That confirms that we do in fact see a difference there between
>>> the export table and the ad hoc route export view when this occurs, after a
>>> single call to `birdc configure` (scrubbed slightly here):
>>>
>>> ```
>>> bird> show route export gw_085ea85_euwest2
>>> bird>
>>> bird> show route export table gw_085ea85_euwest2.ipv4
>>> Table export:
>>> 57.140.1.0/24       unicast [<name of static source> 16:41:41.058] * (100)
>>>         via <next hop> on bond0 onlink
>>> ```
>>>
>>> We'll keep an eye here and validate if we do see this returning on
>>> 2.0.10 as well, or if 2.0.10 remains clear.
>>>
>>> On Fri, Mar 3, 2023 at 10:18 AM Alexander Zubkov <green at qrator.net>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> It is documented in recent versions and on the bird's site too. Pay
>>>> attention to this:
>>>>
>>>> [(import|export) table p.c]
>>>>
>>>> On Fri, Mar 3, 2023, 18:32 Hugo Slabbert via Bird-users <
>>>> bird-users at network.cz> wrote:
>>>>
>>>>> Right, so,
>>>>>
>>>>> I've gone ahead and enabled export tables on the channels for the
>>>>> relevant peers, per Alexander's suggestion for possibly getting additional
>>>>> visibility. I don't seem to spot any different views for route status,
>>>>> though. I don't see any particular docs on how to *view* export
>>>>> tables; does enabling export tables make a different view available to look
>>>>> at the export table contents specifically? Or does it just shift the
>>>>> behaviour so `show route export <protocol>` displays the export table
>>>>> contents rather than a point-in-time evaluation of export filters for the
>>>>> specified neighbor?
>>>>>
>>>>> Snippet showing export tables config enabled for the peers:
>>>>>
>>>>> ```
>>>>> template bgp GATEWAY_v6 {
>>>>>     hold time 6;
>>>>>     startup hold time 20;
>>>>>     connect delay time 3;
>>>>>     connect retry time 6;
>>>>>     error wait time 3, 12;
>>>>>     med metric;
>>>>>     allow local as 1;
>>>>>
>>>>>     local fdff::4:2 as MENLO_ASN;
>>>>>
>>>>>     ipv6 {
>>>>>         export table on;
>>>>>         import filter GATEWAY_IMPORT_v6;
>>>>>         export filter GATEWAY_EXPORT_v6;
>>>>>     };
>>>>>     ipv4 {
>>>>>         export table on;
>>>>>         extended next hop on;
>>>>>         add paths rx;
>>>>>         import filter GATEWAY_IMPORT_v4;
>>>>>         export filter GATEWAY_EXPORT_v4;
>>>>>     };
>>>>> }
>>>>> # ...
>>>>> protocol bgp gw_085ea85_euwest2 from GATEWAY_v6 {
>>>>>     neighbor fdff::8005:f4d1 as 65000;
>>>>> }
>>>>> ```
>>>>>
>>>>> I don't see any different behaviour on the affected hosts, though.
>>>>> E.g. a host that just had `configure` called once after setting the
>>>>> draining flag is showing these symptoms, showing nothing for `show
>>>>> route export <protocol>`:
>>>>>
>>>>> ```
>>>>> bird> show route export gw_085ea85_euwest2
>>>>> bird>
>>>>> ```
>>>>>
>>>>> ...but still showing exports under the protocol details:
>>>>>
>>>>> ```
>>>>> bird> show protocols all gw_085ea85_euwest2
>>>>> Name       Proto      Table      State  Since         Info
>>>>> gw_085ea85_euwest2 BGP        ---        up     2023-03-03 16:33:43  Established
>>>>>   BGP state:          Established
>>>>> # ...
>>>>>   Channel ipv6
>>>>>     State:          UP
>>>>>     Table:          master6
>>>>>     Preference:     100
>>>>>     Input filter:   GATEWAY_IMPORT_v6
>>>>>     Output filter:  GATEWAY_EXPORT_v6
>>>>>     Routes:         2 imported, 2 exported, 1 preferred
>>>>>     Route change stats:     received   rejected   filtered    ignored   accepted
>>>>>       Import updates:              3          0          1          0          2
>>>>>       Import withdraws:            0          0        ---          0          0
>>>>>       Export updates:            109          5         96        ---          8
>>>>>       Export withdraws:            2        ---        ---        ---          2
>>>>>     BGP Next hop:   fdff::4:2
>>>>>   Channel ipv4
>>>>>     State:          UP
>>>>>     Table:          master4
>>>>>     Preference:     100
>>>>>     Input filter:   GATEWAY_IMPORT_v4
>>>>>     Output filter:  GATEWAY_EXPORT_v4
>>>>>     Routes:         12 imported, 1 exported, 0 preferred
>>>>>     Route change stats:     received   rejected   filtered    ignored   accepted
>>>>>       Import updates:             12          0          0          0         12
>>>>>       Import withdraws:            0          0        ---          0          0
>>>>>       Export updates:             39          4         31        ---          4
>>>>>       Export withdraws:            0        ---        ---        ---          1
>>>>>     BGP Next hop:   fdff::4:2
>>>>> ```
>>>>>
>>>>> Note this is still on 2.0.7.  We've bumped some hosts to 2.0.10, but
>>>>> as indicated in the previous message, just a simple restart clears this
>>>>> issue from occurring.  We've enabled the export table config on both a
>>>>> 2.0.7 and a 2.0.10 host, to be able to possibly spot if this reoccurs on
>>>>> the 2.0.10 host as well after a period. An example host on 2.0.7 showing
>>>>> this behaviour has been up for ~2 weeks. The box upgraded to 2.0.10 has had
>>>>> BIRD running for just ~16 hours at this point and is not yet showing any
>>>>> issues.
>>>>>
>>>>> On Thu, Mar 2, 2023 at 4:07 PM Hugo Slabbert <
>>>>> hugo.slabbert at menlosecurity.com> wrote:
>>>>>
>>>>>> A slight update on this:
>>>>>>
>>>>>> 3f477ccb does appear to be in 2.0.7, which we're running, so if that's
>>>>>> the fix in question, it may not be what we're hitting.
>>>>>>
>>>>>> This looked to be successful initially when upgrading to 2.0.10. But,
>>>>>> I then checked a box that was still running 2.0.7 and where we could repro
>>>>>> it. I simply restarted bird there, and then could no longer repro it.
>>>>>>
>>>>>> So, just restarting bird at 2.0.7 was sufficient to clear the
>>>>>> problem, at least temporarily, and the bump to 2.0.10 then wasn't a clear
>>>>>> test, given that's obviously a fresh instance of bird running.
>>>>>>
>>>>>> We'll try to validate if the problem eventually returns on the 2.0.7
>>>>>> box(es) after a restart, and if it does *not* return on the 2.0.10
>>>>>> instance, but we don't have a clear timeline on this at the moment if
>>>>>> it's something that only pops up after bird has been running "for a
>>>>>> while".
>>>>>>
>>>>>> On Thu, Mar 2, 2023 at 2:54 PM Hugo Slabbert <
>>>>>> hugo.slabbert at menlosecurity.com> wrote:
>>>>>>
>>>>>>> Was this perhaps 3f477ccb?
>>>>>>>
>>>>>>> > Filters: Function body comparison result now used.
>>>>>>> >
>>>>>>> > Function bodies were compared in post-parse time, yet the result was
>>>>>>> > not used and the functions were incorrectly considered the same as
>>>>>>> > before.
>>>>>>> >
>>>>>>> > Now the result is used to reload affected protocols.
>>>>>>>
>>>>>>> On Thu, Mar 2, 2023 at 2:51 PM Hugo Slabbert <
>>>>>>> hugo.slabbert at menlosecurity.com> wrote:
>>>>>>>
>>>>>>>> ah, right, apologies.
>>>>>>>>
>>>>>>>> bird 2.0.7-4.1 on Debian 11.6, kernel  5.10.136-1
>>>>>>>>
>>>>>>>> Looks like 2.0.7 was released Oct 16 2019 (
>>>>>>>> https://bird.network.cz/?download), so there's a fair chance we might
>>>>>>>> be hitting this? It looks like a 2.0.10 package is available from the
>>>>>>>> bullseye backports, with the most recent being 2.0.12 in bookworm or
>>>>>>>> sid. I'll look at pulling one of those in to validate.
>>>>>>>>
>>>>>>>> > ...where changes in functions sometimes got ignored.
>>>>>>>>
>>>>>>>> This might be reaching, but would that explain the difference
>>>>>>>> between what's shown in route export status output versus what's actually
>>>>>>>> being exported?
>>>>>>>>
>>>>>>>> On Thu, Mar 2, 2023 at 2:39 PM Maria Matejka via Bird-users <
>>>>>>>> bird-users at network.cz> wrote:
>>>>>>>>
>>>>>>>>> Hello!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> > We've tried adding a sleep between when the include snippet that
>>>>>>>>> > changes the DRAIN_NODE value is written and when we hit `birdc
>>>>>>>>> > configure`, but that doesn't appear to make any difference. If we
>>>>>>>>> > execute `birdc configure` *twice*, though, everything's fine: The
>>>>>>>>> > actual exports are stopped. That's true without any sleep or break
>>>>>>>>> > between running configure as well; literally just `birdc configure`
>>>>>>>>> > back to back in the script that manages this.
>>>>>>>>> >
>>>>>>>>> > We do not see any indication of issues in the `birdc configure` runs
>>>>>>>>> > or in BIRD's logs.
>>>>>>>>>
>>>>>>>>> You are not disclosing the version of BIRD you are using. I vaguely
>>>>>>>>> remember that we fixed this kind of bug several years ago where
>>>>>>>>> changes in functions sometimes got ignored.
>>>>>>>>>
>>>>>>>>> Thus if you are not using a recent BIRD version, you are probably
>>>>>>>>> hitting that old bug.
>>>>>>>>>
>>>>>>>>> Maria
>>>>>>>>>
>>>>>>>>