BIRD continues exporting routes but reports no exports
Hugo Slabbert
hugo.slabbert at menlosecurity.com
Wed Mar 15 21:08:29 CET 2023
Closing the loop on this:
A bit over a week in, every box on 2.0.7 consistently reproduces the
issue (`configure` has to be issued twice before export policy changes
actually take effect), whereas the test boxes on 2.0.10 are still
consistently behaving properly (a single `configure` updates exports).
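
For anyone wanting to watch for the same symptom, here is a rough sketch of the check we were doing by eye: compare the "exported" counters from `show protocols all <protocol>` against what `show route export <protocol>` lists. The function names are made up for illustration, the parsing assumes output shaped like the samples quoted further down this thread, and it works on captured text only (it does not call birdc itself). Treat it as a heuristic, not a definitive test.

```python
import re

# Per-channel summary line, e.g. "Routes: 12 imported, 1 exported, 0 preferred"
ROUTES_RE = re.compile(r"Routes:\s+(\d+) imported,\s+(\d+) exported")
# Channel header line, e.g. "Channel ipv4"
CHANNEL_RE = re.compile(r"^\s*Channel\s+(\S+)")
# A route line starts with a prefix such as 57.140.1.0/24 or fdff::/64
PREFIX_RE = re.compile(r"^\s*([0-9A-Fa-f:.]+/\d+)\b")


def exported_counts(protocols_output):
    """Return {channel: exported count} from `show protocols all <proto>` text."""
    counts = {}
    channel = None
    for line in protocols_output.splitlines():
        m = CHANNEL_RE.match(line)
        if m:
            channel = m.group(1)
            continue
        m = ROUTES_RE.search(line)
        if m and channel is not None:
            counts[channel] = int(m.group(2))
    return counts


def exported_prefixes(route_output):
    """Return the set of prefixes listed in `show route export ...` text."""
    prefixes = set()
    for line in route_output.splitlines():
        m = PREFIX_RE.match(line)
        if m:
            prefixes.add(m.group(1))
    return prefixes


def looks_stale(protocols_output, route_output):
    """True when the protocol still counts exports but the export view is empty."""
    total = sum(exported_counts(protocols_output).values())
    return total > 0 and not exported_prefixes(route_output)
```

Feeding it the captured output of the two commands after a single `configure` should return True from `looks_stale` on a box showing the mismatch described here (non-zero exported counters, empty export view), and False once both views agree.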
On Wed, Mar 8, 2023 at 2:20 PM Hugo Slabbert <
hugo.slabbert at menlosecurity.com> wrote:
> *nods*
>
> We generally run distro packages unless we have specific requirements for
> newer features or fixes etc.
>
> Current experiments show that the test routers on 2.0.7 started to
> reproduce this issue after about 3 days since BIRD's process start, whereas
> the test boxes we have on 2.0.10 have not yet reproduced the issue after
> about 3 days and 5 days, respectively.
>
> It's always tough to prove a negative, but we'll keep it running through
> the weekend to validate and look at bumping to 2.0.10 across the fleet if
> we're still clear on the 2.0.10 boxes at that point.
>
> On Wed, Mar 8, 2023 at 2:17 PM Ross Tajvar <ross at tajvar.io> wrote:
>
>> By the way - compiling the most recent version of bird is very easy. So
>> even though there's not a package for 2.0.12 for bullseye, I recommend just
>> compiling with the same options as the bird package and running that.
>>
>> On Fri, Mar 3, 2023, 2:07 PM Hugo Slabbert via Bird-users <
>> bird-users at network.cz> wrote:
>>
>>> Ah; thanks. Okay, I was misreading that as just referring to regular
>>> table filtering, not in conjunction with import/export. I had looked at `show
>>> symbols table` and not seen any indication of it, but missed that these
>>> are present in the `show route export table <p.c>` format regardless.
>>>
>>> Thanks. That confirms that we do in fact see a difference there between
>>> the export table and the ad hoc route export view when this occurs, after a
>>> single call to `birdc configure` (scrubbed slightly here):
>>>
>>> ```
>>> bird> show route export gw_085ea85_euwest2
>>> bird>
>>> bird> show route export table gw_085ea85_euwest2.ipv4
>>> Table export:
>>> 57.140.1.0/24        unicast [<name of static source> 16:41:41.058] * (100)
>>>         via <next hop> on bond0 onlink
>>> ```
>>>
>>> We'll keep an eye here and validate if we do see this returning on
>>> 2.0.10 as well, or if 2.0.10 remains clear.
>>>
>>> On Fri, Mar 3, 2023 at 10:18 AM Alexander Zubkov <green at qrator.net>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> It is documented in recent versions and on the bird's site too. Pay
>>>> attention to this:
>>>>
>>>> [(import|export) table p.c]
>>>>
>>>> On Fri, Mar 3, 2023, 18:32 Hugo Slabbert via Bird-users <
>>>> bird-users at network.cz> wrote:
>>>>
>>>>> Right, so,
>>>>>
>>>>> I've gone ahead and enabled export tables on the channels for the
>>>>> relevant peers, per Alexander's suggestion for possibly getting additional
>>>>> visibility. I don't seem to spot any different views for route status,
>>>>> though. I don't see any particular docs on how to *view* export
>>>>> tables; does enabling export tables make a different view available to look
>>>>> at the export table contents specifically? Or does it just shift the
>>>>> behaviour so `show route export <protocol>` displays the export table
>>>>> contents rather than a point-in-time evaluation of export filters for the
>>>>> specified neighbor?
>>>>>
>>>>> Snippet showing export tables config enabled for the peers:
>>>>>
>>>>> ```
>>>>> template bgp GATEWAY_v6 {
>>>>>     hold time 6;
>>>>>     startup hold time 20;
>>>>>     connect delay time 3;
>>>>>     connect retry time 6;
>>>>>     error wait time 3, 12;
>>>>>     med metric;
>>>>>     allow local as 1;
>>>>>
>>>>>     local fdff::4:2 as MENLO_ASN;
>>>>>
>>>>>     ipv6 {
>>>>>         export table on;
>>>>>         import filter GATEWAY_IMPORT_v6;
>>>>>         export filter GATEWAY_EXPORT_v6;
>>>>>     };
>>>>>     ipv4 {
>>>>>         export table on;
>>>>>         extended next hop on;
>>>>>         add paths rx;
>>>>>         import filter GATEWAY_IMPORT_v4;
>>>>>         export filter GATEWAY_EXPORT_v4;
>>>>>     };
>>>>> }
>>>>> # ...
>>>>> protocol bgp gw_085ea85_euwest2 from GATEWAY_v6 {
>>>>>     neighbor fdff::8005:f4d1 as 65000;
>>>>> }
>>>>> ```
>>>>>
>>>>> I don't see any different behaviour on the affected hosts, though.
>>>>> E.g. a host that just had `configure` called once after setting the
>>>>> draining flag is showing these symptoms, showing nothing for `show
>>>>> route export <protocol>`:
>>>>>
>>>>> ```
>>>>> bird> show route export gw_085ea85_euwest2
>>>>> bird>
>>>>> ```
>>>>>
>>>>> ...but still showing exports under the protocol details:
>>>>>
>>>>> ```
>>>>> bird> show protocols all gw_085ea85_euwest2
>>>>> Name               Proto  Table  State  Since                Info
>>>>> gw_085ea85_euwest2 BGP    ---    up     2023-03-03 16:33:43  Established
>>>>>   BGP state:          Established
>>>>>   # ...
>>>>>   Channel ipv6
>>>>>     State:          UP
>>>>>     Table:          master6
>>>>>     Preference:     100
>>>>>     Input filter:   GATEWAY_IMPORT_v6
>>>>>     Output filter:  GATEWAY_EXPORT_v6
>>>>>     Routes:         2 imported, 2 exported, 1 preferred
>>>>>     Route change stats:  received  rejected  filtered  ignored  accepted
>>>>>       Import updates:           3         0         1        0         2
>>>>>       Import withdraws:         0         0       ---        0         0
>>>>>       Export updates:         109         5        96      ---         8
>>>>>       Export withdraws:         2       ---       ---      ---         2
>>>>>     BGP Next hop:   fdff::4:2
>>>>>   Channel ipv4
>>>>>     State:          UP
>>>>>     Table:          master4
>>>>>     Preference:     100
>>>>>     Input filter:   GATEWAY_IMPORT_v4
>>>>>     Output filter:  GATEWAY_EXPORT_v4
>>>>>     Routes:         12 imported, 1 exported, 0 preferred
>>>>>     Route change stats:  received  rejected  filtered  ignored  accepted
>>>>>       Import updates:          12         0         0        0        12
>>>>>       Import withdraws:         0         0       ---        0         0
>>>>>       Export updates:          39         4        31      ---         4
>>>>>       Export withdraws:         0       ---       ---      ---         1
>>>>>     BGP Next hop:   fdff::4:2
>>>>> ```
>>>>>
>>>>> Note this is still on 2.0.7. We've bumped some hosts to 2.0.10, but
>>>>> as indicated in the previous message, just a simple restart clears this
>>>>> issue from occurring. We've enabled the export table config on both a
>>>>> 2.0.7 and a 2.0.10 host, to be able to possibly spot if this reoccurs on
>>>>> the 2.0.10 host as well after a period. An example host on 2.0.7 showing
>>>>> this behaviour has been up for ~2 weeks. The box upgraded to 2.0.10 has had
>>>>> BIRD running for just ~16 hours at this point and is not yet showing any
>>>>> issues.
>>>>>
>>>>> On Thu, Mar 2, 2023 at 4:07 PM Hugo Slabbert <
>>>>> hugo.slabbert at menlosecurity.com> wrote:
>>>>>
>>>>>> A slight update on this:
>>>>>>
>>>>>> 3f477ccb
>>>>>> <https://isolate-menlo.menlosecurity.com/0/eJwNzcEOgjAMANB_6RmpC5tN9zdrx7AJiBnlovHf5fZu7wtnXyHD0_19ZMTFfC0yvkxH_eDFA8V6xRvqvm3mOLVIpCr3aa7M2h6UopSiJRA3bcRBtJaUYIC-Qw4DuNUrCJEJfn_7GSJQ>
>>>>>> does appear to be in 2.0.7, which we're running, so if that's the fix
>>>>>> in question, that bug may not be our problem.
>>>>>>
>>>>>> This looked to be successful initially when upgrading to 2.0.10. But,
>>>>>> I then checked a box that was still running 2.0.7 and where we could repro
>>>>>> it. I simply restarted bird there, and then could no longer repro it.
>>>>>>
>>>>>> So, just restarting bird at 2.0.7 was sufficient to clear the
>>>>>> problem, at least temporarily, and the bump to 2.0.10 then wasn't a clear
>>>>>> test, given that's obviously a fresh instance of bird running.
>>>>>>
>>>>>> We'll try to validate if the problem eventually returns on the 2.0.7
>>>>>> box(es) after a restart, and if it does *not* return on the 2.0.10
>>>>>> instance, but we don't have a clear timeline at the moment, given this
>>>>>> may be something that only pops up after bird has been running "a while".
>>>>>>
>>>>>> On Thu, Mar 2, 2023 at 2:54 PM Hugo Slabbert <
>>>>>> hugo.slabbert at menlosecurity.com> wrote:
>>>>>>
>>>>>>> Was this perhaps 3f477ccb
>>>>>>> <https://isolate-menlo.menlosecurity.com/0/eJwNzcEOgjAMANB_6RmpC5tN9zdrx7AJiBnlovHf5fZu7wtnXyHD0_19ZMTFfC0yvkxH_eDFA8V6xRvqvm3mOLVIpCr3aa7M2h6UopSiJRA3bcRBtJaUYIC-Qw4DuNUrCJEJfn_7GSJQ>
>>>>>>> ?
>>>>>>>
>>>>>>>     Filters: Function body comparison result now used.
>>>>>>>
>>>>>>>     Function bodies were compared in post-parse time, yet the result
>>>>>>>     was not used and the functions were incorrectly considered the
>>>>>>>     same as before.
>>>>>>>
>>>>>>>     Now the result is used to reload affected protocols.
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Mar 2, 2023 at 2:51 PM Hugo Slabbert <
>>>>>>> hugo.slabbert at menlosecurity.com> wrote:
>>>>>>>
>>>>>>>> ah, right, apologies.
>>>>>>>>
>>>>>>>> bird 2.0.7-4.1 on Debian 11.6, kernel 5.10.136-1
>>>>>>>>
>>>>>>>> Looks like 2.0.7 was released Oct 16 2019 (
>>>>>>>> https://bird.network.cz/?download),
>>>>>>>> so a fair chance we might be hitting this? It looks like something from
>>>>>>>> 2.0.10 is available from the bullseye backports, with the most recent being
>>>>>>>> 2.0.12 in bookworm or sid. I'll look at pulling one of those in to validate.
>>>>>>>>
>>>>>>>> > ...where changes in functions sometimes got ignored.
>>>>>>>>
>>>>>>>>
>>>>>>>> This might be reaching, but would that explain the difference
>>>>>>>> between what's shown in route export status output versus what's actually
>>>>>>>> being exported?
>>>>>>>>
>>>>>>>> On Thu, Mar 2, 2023 at 2:39 PM Maria Matejka via Bird-users <
>>>>>>>> bird-users at network.cz> wrote:
>>>>>>>>
>>>>>>>>> Hello!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> > We've tried adding a sleep between when the include snippet that
>>>>>>>>> > changes the DRAIN_NODE value is written and when we hit `birdc
>>>>>>>>> > configure`, but that doesn't appear to make any difference. If we
>>>>>>>>> > execute `birdc configure` *twice*, though, everything's fine: The
>>>>>>>>> > actual exports are stopped. That's true without any sleep or break
>>>>>>>>> > between running configure as well; literally just `birdc configure`
>>>>>>>>> > back to back in the script that manages this.
>>>>>>>>> >
>>>>>>>>> > We do not see any indication of issues in the `birdc configure` runs
>>>>>>>>> > or in BIRD's logs.
>>>>>>>>>
>>>>>>>>> You are not disclosing the version of BIRD you are using. I vaguely
>>>>>>>>> remember that we fixed this kind of bug several years ago where changes
>>>>>>>>> in functions sometimes got ignored.
>>>>>>>>>
>>>>>>>>> Thus if you are not using a recent BIRD version, you are probably
>>>>>>>>> hitting that old bug.
>>>>>>>>>
>>>>>>>>> Maria
>>>>>>>>>
>>>>>>>>