On Wed, Jun 6, 2018 at 1:59 PM, Thomás S. Bregolin <thoms3rd@gmail.com> wrote:
On Wed, Jun 6, 2018 at 1:33 PM, Thomás S. Bregolin <thoms3rd@gmail.com> wrote:
Hello,
On Tue, Jun 5, 2018 at 3:43 PM, Thomás S. Bregolin <thoms3rd@gmail.com> wrote:
On Tue, Jun 5, 2018 at 2:13 PM, Jan Maria Matejka <jan.matejka@nic.cz> wrote:
Could you please try the attached script or try to create some reproducer for me to see the bug clearly?
I will give the test-withdraw script a go and reply with more information.
I've attached a modified version of the test-withdraw script showing the issue. It seems my problem is related to the "start delay time" option. When I set it to 1, sometimes the withdrawal is sent, and sometimes it isn't. However, when it *isn't* sent, it is *never* sent, no matter how long I wait. The problem is solved by setting the delay to a higher value, or by leaving it at the default of 5 seconds.
I am no longer sure if this is a bug or not. Please let me know if you would like more information about this.
P.S.: removing the time delay solves the issue in the test script, but not in my production environment. Seems I have some more debugging to do.
Some debug logs with a "start delay time" of 20 seconds:

Jun 06 13:11:47 m03 bird[28196]: Reconfiguring
Jun 06 13:11:47 m03 bird[28196]: Removing protocol e_route_20
Jun 06 13:11:47 m03 bird[28196]: kernel1: Reconfigured
Jun 06 13:11:47 m03 bird[28196]: device1: Reconfigured
Jun 06 13:11:47 m03 bird[28196]: bgpint: Reconfigured
Jun 06 13:11:47 m03 bird[28196]: Reconfigured

You can see the protocol is removed but the prefixes are not withdrawn; there are no "bird[28196]: cf_bgplan < removed ... via ... on ..." messages.

- Thomás