generate default route and export to kernel if remote peer is up

Nikola Mitev

7 Sep 2018

8:34 a.m.

Hi,

I have a setup of ISP1 -- R1 -- LAN -- R2 -- ISP2, with BGP peerings from R1 to ISP1 and from R2 to ISP2. Some hosts on the LAN have R1 as primary gateway, others R2, to distribute the load between the ISPs.

I want to add a default route to the kernel on each router, but only if the remote peer is up. The remote peer does not respond to BFD, so that's not an option.

Once both routers have a conditionally defined default route for their ISP, it should be easy to propagate that to each other with an increased metric as a backup route. I am assuming here that if R1 has a default through R2 and ISP1 is down, R1 will respond with an ICMP redirect to any requests from hosts that need to be routed out of the LAN.

I searched for a recipe that would fit the above but found nothing yet; hoping someone here can help :)

Nik


Grant Taylor

7 Sep

3:44 p.m.

On 09/07/2018 02:34 AM, Nikola Mitev wrote:

...

Hi,

Hi,

...

I have a setup of ISP1 -- R1 -- LAN -- R2 -- ISP2 with BGP peerings from R1 to ISP1 and R2 to ISP2

Are your BGP neighbors advertising a default route to you?

I would think that R1 and R2 would be iBGP neighbors (or similar with other protocols) with each other. Thus they would both re-advertise the default that each receives to the other.

This has the added benefit of R2 learning prefixes that are close to ISP1 and routing out that way instead of going out ISP2 and around the Internet to get back to prefixes close to ISP1.

...

Some hosts on the LAN have R1 as primary gateway, others R2 to distribute the load between the ISPs.

Okay.

I'd think seriously about VRRP or ideally GLBP for this.

It's my understanding that Gateway Load Balancing Protocol can allow all GLBP members to be active and share load, whereas VRRP will have one active member.

You can have two VRRP "routers" and divide clients between them that way.

...

I want to add a default route to the kernel on each router but only if the remote peer is up. The remote peer does not respond to BFD so that's not an option.

I've been wanting a solution to this problem for about 20 years.

Specifically, I want to be able to detect whether the static default gateway is functioning and dynamically alter the local routing tables.

I've not found a solution for this yet. (Granted, I've not spent a lot of time trying to find one.)

I had hoped that BFD would do this, but that apparently requires active support from the remote neighbor.

This can get complicated if the local link doesn't go down when the remote neighbor is not reachable. I.e.:

[router]---[switch]-X-[bridging DSL modem]-X-[ISP router]

The Ethernet between the router and the switch is up/up, but the link on either side of the modem is down.

The only way that I've contemplated solving this is to watch traffic coming back from the Internet via the ISP's router, and dynamically modify the local routing tables.

I can see this as a simple test of whether anything is coming in from the ISP -or- something beyond the ISP's router.

Can this be extended to watch routes to / from specific destinations (via the gateway)? Should this be done?

Seeing as how I haven't found an answer for this problem, I'd strongly encourage you to try to get your BGP neighbors to advertise a default route over the existing BGP neighbor sessions.

...

Once both routers have a conditionally defined default route for their ISP, it should be easy to propagate that to each other with increased metric as a backup route. I am assuming here that if R1 has a default through R2 and ISP1 is down, R1 will respond with an ICMP redirect to any requests from hosts that need to be routed out of LAN.

This sounds reasonable to me.

There are obvious issues of IP addressing and possibly NAT if you're not advertising globally routed IP address space for the LAN. Even then, outbound connections and associated incoming replies should be okay. Granted, you may lose state when connections switch from one router's NAT set to the other.

...

I searched for a recipe that would fit the above but found nothing yet, hoping someone here can help :)

I'd love to see a suggestion from someone too. -- Grant. . . . unix || die

Nikola Mitev

8 Sep

10:03 a.m.

On Fri, 2018-09-07 at 09:44 -0600, Grant Taylor wrote:

...

...
I have a setup of ISP1 -- R1 -- LAN -- R2 -- ISP2 with BGP peerings from R1 to ISP1 and R2 to ISP2

Are your BGP neighbors advertising a default route to you?

Unfortunately no. I am creating the second peering now, the one which is live is through a Hurricane Electric 6in4 tunnel - it is a free service and I am not sure how much I can ask of them.

...

I would think that R1 and R2 would be iBGP neighbors (or similar with other protocols) with each other. Thus they would both re-advertise the default that each receives to the other.

This has the added benefit of R2 learning prefixes that are close to ISP1 and routing out that way instead of going out ISP2 and around the Internet to get back to prefixes close to ISP1.

My only concern here is adding the entire BGP routing table to the kernel table - would that be safe to do and easy enough to work with? Is there a better way that I'm missing?

...

...
Some hosts on the LAN have R1 as primary gateway, others R2 to distribute the load between the ISPs.

Okay.

I'd think seriously about VRRP or ideally GLBP for this.

You are right - as it happens I am already redesigning that part. Multiple default gateways distributed with DHCP seemed like a simple solution but it doesn't work for me - not sure what needs to happen for a host to actually make use of the secondary gateway.

...

It's my understanding that Gateway Load Balancing Protocol can allow all GLBP members to be active and share load, whereas VRRP will have one active member.

You can have two VRRP "routers" and divide clients between them that way.

It will have to be VRRP since the routers are both PC Engines APU boards running Debian.

...

...
I want to add a default route to the kernel on each router but only if the remote peer is up. The remote peer does not respond to BFD so that's not an option.

I've been wanting a solution to this problem for about 20 years.

Specifically I want to be able to detect if the static default gateway is functioning or not and dynamically alter the local routing tables. — I've not found a solution for this yet. (Granted, I've not spent a lot of time trying to find one.)

A negative answer is still a good answer :) Should be able to script some pinging/BGP connection state tracking solution. Just wanted to be sure I'm not reinventing the wheel as it will no doubt take some time to test & get it right.

...

I had hoped that BFD would do this, but that apparently requires active support from the remote neighbor.

Yep.

...

This can get complicated if the local link doesn't go down when the remote neighbor is not reachable. I.e.:

[router]---[switch]-X-[bridging DSL modem]-X-[ISP router]

The Ethernet between the router and the switch is up/up, but the link on either side of the modem is down.

In my case it's a 6in4 tunnel which should go down if the remote goes, but I am yet to find out in what ways the remote may fail. It seems perfectly possible that the tunnel remains up but the BGP session breaks, etc. The BGP session breaking doesn't mean outbound routing is broken, but it is likely to cause some asymmetric routing as the replies start coming back through the other ISP.

...

The only way that I've contemplated solving this is to watch traffic coming back from the Internet via the ISP's router, and dynamically modify the local routing tables.

If you mean BGP session traffic on TCP/179 that could work, otherwise my link might not be busy enough at times.

...

I can see this as a simple test of is anything coming in from the ISP -or- something beyond the ISP's router.

Can this be extended to watch routes to / from specific destinations (via the gateway)? Should this be done?

It's easy on a Linux box. I'm thinking of tracking the BGP session with e.g. 'ss -npt state established | grep :179 | wc -l', say every second, and then doing a ping every 5s or so. It will need up/down wait timers tuned.
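The polling idea above could be sketched roughly as follows. This is only an illustration, not something either poster wrote: PEER, PROBE_TARGET, GW, DEV and the two thresholds are placeholder names and values, the addresses come from documentation ranges, and for simplicity both checks run on the same 5-second cadence rather than the separate 1s/5s intervals mentioned. The up/down "wait timers" are modelled as counts of consecutive good or bad polls.

```shell
#!/bin/sh
# Sketch only: every name and value below is a placeholder, not from the thread.
PEER="${PEER:-192.0.2.1}"                  # BGP neighbor address
PROBE_TARGET="${PROBE_TARGET:-192.0.2.1}"  # address to ping
GW="${GW:-192.0.2.1}"                      # next hop for the default route
DEV="${DEV:-eth0}"                         # ISP-facing interface
UP_AFTER="${UP_AFTER:-3}"                  # consecutive good polls before "up"
DOWN_AFTER="${DOWN_AFTER:-3}"              # consecutive bad polls before "down"

# Succeeds if an established TCP session on port 179 mentions $PEER.
# The grep is a loose substring match on the ss output.
bgp_established() {
    ss -nt state established '( sport = :179 or dport = :179 )' | grep -qF "$PEER"
}

# Succeeds if a single ping is answered within 2 seconds.
peer_pings() {
    ping -c 1 -W 2 "$PROBE_TARGET" >/dev/null 2>&1
}

# update_state STATE GOOD BAD RESULT
# Pure helper: prints the new "STATE GOOD BAD" after one poll result (ok|fail).
update_state() {
    state=$1 good=$2 bad=$3 result=$4
    if [ "$result" = ok ]; then
        good=$((good + 1)); bad=0
        if [ "$good" -ge "$UP_AFTER" ]; then state=up; fi
    else
        bad=$((bad + 1)); good=0
        if [ "$bad" -ge "$DOWN_AFTER" ]; then state=down; fi
    fi
    echo "$state $good $bad"
}

# Poll forever; add or remove the default route only on a state change.
monitor() {
    state=down good=0 bad=0
    while :; do
        if bgp_established && peer_pings; then result=ok; else result=fail; fi
        prev=$state
        set -- $(update_state "$state" "$good" "$bad" "$result")
        state=$1 good=$2 bad=$3
        if [ "$state" != "$prev" ]; then
            if [ "$state" = up ]; then
                ip route replace default via "$GW" dev "$DEV"
            else
                ip route del default via "$GW" dev "$DEV"
            fi
        fi
        sleep 5
    done
}
```

Nothing runs until `monitor` is called (as root). For the 6in4 case the same commands would need `ip -6` and IPv6 addresses.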

...

Seeing as how I haven't found an answer for this problem, I'd strongly encourage you to try to get your BGP neighbors to advertise a default route over the existing BGP neighbor sessions.

I'll ask and see how far I get :)

...

...
I searched for a recipe that would fit the above but found nothing yet, hoping someone here can help :)

I'd love to see a suggestion from someone too.

Thanks for your reply.

Grant Taylor

4:11 p.m.

On 09/08/2018 04:03 AM, Nikola Mitev wrote:

...

Unfortunately no. I am creating the second peering now, the one which is live is through a Hurricane Electric 6in4 tunnel - it is a free service and I am not sure how much I can ask of them.

Okay.

I have found it's often worthwhile to politely ask. Sometimes people will do what you ask, or tell you how to accomplish what you want with services that they do offer.

...

My only concern here is adding the entire BGP routing table to the kernel table - would that be safe to do and easy enough to work with? Is there a better way that I'm missing?

It's my understanding that the BGP table(s) (RIB(s)) is (are) inside of BIRD and not the actual kernel. BIRD will then take some routes from the multiple RIB(s) that it's processing and put them into the kernel's routing table (FIB(?)).

As such, it's perfectly safe to have multiple BGP connections (RIBs) in BIRD. The only potential gotcha is memory. (I have no idea how much memory is needed, it's just the only potential limitation that I can think of.)

...

You are right - as it happens I am already redesigning that part. Multiple default gateways distributed with DHCP seemed like a simple solution but it doesn't work for me - not sure what needs to happen for a host to actually make use of the secondary gateway.

I think relying on the hosts to do some sort of load balancing, or on some having affinity to one gateway and others having affinity to the other gateway, is going to be very prone to failure.

That's where GLBP comes into play. GLBP enables the routers to appear as a large virtual router that utilizes multiple routers. I think there's some sort of loose proportion of what traffic goes to what router, possibly based on tuneable settings. All clients think they have the single gateway. I think the "magic" is done at the MAC to IP layer. I.e. some clients think the router's MAC is aa:aa:aa:aa:aa:aa and others think it's bb:bb:bb:bb:bb:bb. GLBP also allows one router to act as both if the other router is offline.

Conversely, HSRP and VRRP both function with a single /active/ device.

I don't know the current state of GLBP on non-Cisco equipment. But that's the route that I would try to go.

...

It will have to be VRRP since the routers are both PC Engines APU boards running Debian.

I don't know if VRRP is your only option or not. Do research on first hop redundancy protocols (FHRP) and what's supported under Linux.

...

A negative answer is still a good answer :) Should be able to script some pinging/BGP connection state tracking solution. Just wanted to be sure I'm not reinventing the wheel as it will no doubt take some time to test & get it right.

Agreed.

If your luck is anything like mine, the day to day steady state operation will be easy to achieve. The problem will come with partial / total failures of something, or a weird combination of partials on either side.

I would still suggest some sort of dynamic routing protocol between R1 and R2. That way they both know what the other can get to, including a default gateway if they have one. This will allow you to remove the local gateway in the event that the connection to the directly attached ISP is inaccessible, and the other router to learn about said problem. This is important if the other router also has a problem (say Backhoe Bob took out data to both ISPs), so that they don't ping pong traffic between each other, each thinking that the other has a route to the internet.

If I were to try to script something like this today, I'd do it with a few timers. The first being when the last outgoing traffic was sent and the second being when the last incoming traffic was received. As long as the second (incoming) timer is lower than the first (outgoing) timer, I think it's safe to say the connection to the ISP's router is functional.

In the event that the second (incoming) timer is higher than the first (outgoing) timer, I'd start a third (dead gateway) timer. If the third (dead gateway) timer ever reaches zero, then I'd know that there is a problem with the local ISP and I'd withdraw the local default gateway.

Fortunately, BIRD has the ability to monitor kernel routing table changes (like withdrawing a local default gateway) and update things accordingly. This also means that the other router will learn the status change.

I think you would then have a choice: either withdraw the local router from the FHRP -or- stay in and use the other router (and its default gateway) as the route out.

This is also why it's important for both routers to have an idea of each other's state (at least for the default gateway). You want both routers to be able to return a no route to host message quickly if there is no functional default gateway.
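One possible reading of the three-timer idea, as a small shell helper. This is an interpretation rather than Grant's implementation: it treats the "timers" as the epoch timestamps of the last outgoing and last incoming packet, and replaces the countdown with a check of how long the link has been silent. The function name, its arguments and the DEAD_AFTER threshold are all made up for illustration, and how the timestamps get collected is left open.

```shell
#!/bin/sh
# gateway_verdict NOW LAST_TX LAST_RX DEAD_AFTER
# All arguments are hypothetical inputs, in epoch seconds (DEAD_AFTER is a duration):
#   NOW        - current time
#   LAST_TX    - when traffic was last sent towards the ISP
#   LAST_RX    - when traffic was last received from the ISP
#   DEAD_AFTER - how long unanswered traffic is tolerated
# Prints one of: ok, suspect, dead.
gateway_verdict() {
    now=$1 last_tx=$2 last_rx=$3 dead_after=$4
    if [ "$last_rx" -ge "$last_tx" ]; then
        # Something came back at or after our last transmission.
        echo ok
    elif [ $((now - last_rx)) -lt "$dead_after" ]; then
        # We have sent since the last reply, but not for long enough to act.
        echo suspect
    else
        # Unanswered for at least DEAD_AFTER seconds: withdraw the default route.
        echo dead
    fi
}
```

For example, `gateway_verdict "$(date +%s)" "$last_tx" "$last_rx" 10` would be called on every poll, and only a `dead` verdict would trigger removal of the local default route.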

...

In my case it's a 6in4 tunnel which should go down if the remote goes, but I am yet to find out in what ways the remote may fail. It seems perfectly possible that the tunnel remains up but the BGP session breaks, etc. The BGP session breaking doesn't mean outbound routing is broken, but it is likely to cause some asymmetric routing as the replies start coming back through the other ISP.

You can likely simulate a failure by adding a {bad,discard,reject} route for the IPv4 address of the remote 6in4 tunnel endpoint, thus breaking the tunnel in a controlled manner for testing. You could do similar for the remote BGP endpoint too, for similar reasons.

I don't recall if you're using your own provider independent globally routed IPs or not. If you're not, you will be in a strange situation where the IPs that were going out provider 1 will likely not work to come back from provider 2.

In some ways, NAT does make this a little bit better as it does provide a clean delineation of where IPs are used. It also helps avoid the issue of provider aggregate IPs.
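On Linux, the controlled failure described above might look something like this with iproute2. TUNNEL_REMOTE is a placeholder (a documentation address), and the `run` wrapper with its DRY_RUN switch is just a convenience added here so the commands can be printed before being executed for real.

```shell
#!/bin/sh
# TUNNEL_REMOTE is a placeholder for the IPv4 address of the remote 6in4 endpoint.
TUNNEL_REMOTE="${TUNNEL_REMOTE:-192.0.2.1}"

# With DRY_RUN set to a non-empty value, print the command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Silently drop everything sent to the tunnel endpoint.
break_tunnel() {
    run ip route add blackhole "$TUNNEL_REMOTE/32"
}

# Remove the blackhole route again.
restore_tunnel() {
    run ip route del blackhole "$TUNNEL_REMOTE/32"
}
```

`blackhole` discards packets silently; the `unreachable` and `prohibit` route types return an ICMP error instead, which may exercise a different failure path. The same pair of functions could be pointed at the BGP neighbor's address.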

...

If you mean BGP session traffic on TCP/179 that could work, otherwise my link might not be busy enough at times.

I wasn't thinking BGP traffic per se. I was thinking any traffic coming in from the ISP.

This is also why you need the separate timers for incoming and outgoing traffic, to find the delta between them. (The third timer is to detect if the errant state is going on for too long.)

...

It's easy on a linux box. I'm thinking track the BGP session with e.g. 'ss -npt state established | grep :179 |wc -l' say every second and then doing a ping every 5s or so. It will need up/down wait timers tuned.

Such does work, particularly for humans. IMHO that doesn't work as well for automation to monitor things. I would likely configure a couple of IPTables rules to send (select traffic) to a user space process via NETLINK (there might be a different / better method now). That way you don't need to rely on scraping kernel status tables or the overhead of sniffing traffic. The user space process would manage the counters and dynamically add / remove the configured default gateway to / from the kernel routing table.

...

I'll ask and see how far I get :)

Fair enough.

I would ask for the following three categories of routes:

· default gateway
· provider routes
· provider customer routes

I highly doubt that you would get, much less want to process, a full default free zone feed from one, much less two providers.

...

Thanks for your reply.

You're welcome. Please keep me (us?) in the loop. I'm curious to learn how things turn out. -- Grant. . . . unix || die

Grant Taylor

4:28 p.m.

On 09/08/2018 10:11 AM, Grant Taylor wrote:

...

If I were to try to script something like this today, I'd do it with a few timers. The first being when the last outgoing traffic was sent and the second being when the last incoming traffic was received. As long as the second (incoming) timer is lower than first (outgoing) timer, I think it's safe to say the connection to the ISP's router is functional.

In the event that the second (incoming) timer is higher than the first (outgoing) timer, I'd start a third (dead gateway) timer. If the third (dead gateway) timer ever reaches zero, then I'd know that there is a problem with the local ISP and I'd withdraw the local default gateway.

Now my brain is chewing on this.

What I've outlined will detect the transition from normal / steady state to errant state. But as it's written, it will never detect that the local ISP connection is usable again because there is no traffic to monitor.

As such, I'd likely have a separate routing table with only the ISP's connection and the associated default gateway. That way it's possible to send probe traffic (even when the main routing table has a different default gateway) to detect when the local ISP's connection is usable again. [1] If / when the local ISP's connection is usable, add their default gateway to the main local routing table and allow BIRD to do its thing.

[1] You need to decide what to do with established connections; do you bring them back to the local ISP, thus possibly breaking session state, or do you rely on route caching to "gracefully" bring things back.

Note: I have never gotten Dead Gateway Detection to do what I want in any reliable manner. DGD tends to rely on link state and / or special kernel parameters [2]. Even when it does function, I've found that it does not do what I want it to do.

[2] I think you have to tell the kernel to hold onto unreachable routes -and- you need to have probe traffic to kick the kernel to realize that the gateway is reachable again.

-- Grant. . . . unix || die
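A rough sketch of the separate probe table using iproute2 policy routing follows. Every value is a placeholder: ISP_GW, ISP_DEV, PROBE_SRC, PROBE_TARGET and table number 100 are invented for illustration, and selecting the table by source address is just one of several ways to steer probe traffic. The `run` wrapper with DRY_RUN only prints the commands.

```shell
#!/bin/sh
# All values below are placeholders, not taken from the thread.
ISP_GW="${ISP_GW:-192.0.2.1}"                 # ISP next hop
ISP_DEV="${ISP_DEV:-eth0}"                    # ISP-facing interface
PROBE_SRC="${PROBE_SRC:-192.0.2.2}"           # local address on the ISP-facing link
PROBE_TARGET="${PROBE_TARGET:-198.51.100.1}"  # something beyond the ISP's router
PROBE_TABLE="${PROBE_TABLE:-100}"             # dedicated routing table number

# With DRY_RUN set to a non-empty value, print the command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Run once: the probe table holds only the ISP's default route, and traffic
# sourced from PROBE_SRC is looked up there regardless of the main table.
setup_probe_table() {
    run ip route replace default via "$ISP_GW" dev "$ISP_DEV" table "$PROBE_TABLE"
    run ip rule add from "$PROBE_SRC" lookup "$PROBE_TABLE"
}

# One probe through the ISP link; succeeds if a reply arrives within 2 seconds.
probe_isp() {
    run ping -c 1 -W 2 -I "$PROBE_SRC" "$PROBE_TARGET"
}
```

Once `probe_isp` has succeeded for long enough, the ISP's default route would go back into the main table. For an IPv6 link such as the 6in4 tunnel, the equivalent commands would use `ip -6` and IPv6 addresses.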
