<div dir="ltr"><div dir="ltr"><div><div>Hi,</div><div>From an architectural point of view, this patch is one of several that will help us migrate from an iBGP-only datacenter to an eBGP-only datacenter. The idea is to have a feature that is as simple as possible, with almost no side implications, that can be turned on during the migration and turned off afterwards. A confederation would have a similar effect, but it would bring other side effects and would also require changes in other places, leading to the maintenance window we are trying to avoid with this set of patches.</div><div><br></div><div>I am going to go into a bit more detail in a talk at NANOG71 in a couple of weeks, but let me try to summarize it here.</div><div><br></div><div>The patches in question are:</div><div><br></div><div><a href="http://bird.network.cz/pipermail/bird-users/2017-March/011084.html">http://bird.network.cz/pipermail/bird-users/2017-March/011084.html</a></div><div><a href="http://bird.network.cz/pipermail/bird-users/2017-February/010925.html">http://bird.network.cz/pipermail/bird-users/2017-February/010925.html</a></div><div><a href="http://bird.network.cz/pipermail/bird-users/2017-August/011468.html">http://bird.network.cz/pipermail/bird-users/2017-August/011468.html</a> </div><div><a href="http://bird.network.cz/pipermail/bird-users/2017-August/011469.html">http://bird.network.cz/pipermail/bird-users/2017-August/011469.html</a></div><div><br></div><div>By combining the patches above we are able to migrate a full POP from iBGP with route reflectors to an eBGP-only setup without human intervention and without having to drain POPs:</div><div><br></div><div>1. The patch `Secondary remote AS support for protocol BGP` lets us configure the devices ahead of time to accept OPEN messages from different ASes, so we don't have to bother synchronizing changes.</div><div>2. 
The `Allow exchanging LOCAL_PREF with eBGP peers` patch allows us to keep operating as we do today, regardless of whether the POP is made up of eBGP or iBGP speakers.</div><div>3. For the last two patches, let's look at this scenario:</div><div><br></div><div><font face="monospace">                /---- spine00 (AS65000)</font></div><div><font face="monospace">leaf00 (AS12345)</font></div><div><font face="monospace">                \---- spine01 (AS12345)</font></div><div><font face="monospace"><br></font></div><div>Prefix A as received from the spines:</div><div><br></div><div><font face="monospace">from spine00:</font></div><div><font face="monospace"><span style="white-space:pre">      </span>protocol: eBGP</font></div><div><font face="monospace"><span style="white-space:pre">    </span>AS PATH:  65000 $AS_PATH </font></div><div><font face="monospace"><br></font></div><div><font face="monospace">from spine01:</font></div><div><font face="monospace">        protocol: iBGP</font></div><div><font face="monospace"><span style="white-space:pre">     </span>AS PATH:  $AS_PATH</font></div><div><br></div><div>a. The patch "Implement iebgp_peer mode" takes care of the eBGP vs iBGP comparison. It doesn't try to do much else.</div><div>b. Now that eBGP vs iBGP is no longer a concern for this set of peers, we need to fix the AS_PATH; otherwise the path from spine01 will be shorter and thus win the election. This is where the patch "Implement skip_private_as_path_prefix" comes in. Instead of making the iBGP speaker prepend some AS, we would rather skip the leading private ASes on routes from the eBGP speaker. The reason is that we have many spines, so prepending would require synchronizing the change across all of them to avoid sending all the traffic via a single spine (or, alternatively, scheduling a maintenance, which is what we are trying to avoid).</div><div><br></div><div>So now, with all the features in place, we can:</div><div><br></div><div>1. 
Ahead of time, enable all the features in the corresponding sessions: local-pref over eBGP, both the current and the future AS configured, skipping of leading private ASes, and comparing eBGP as iBGP.</div><div>2. During the migration, change "local as" to the new private AS, switch by switch, causing only a brief BGP session flap with no other implication.</div><div><br></div><div>And this is how we are migrating from iBGP to eBGP without human intervention and without having to drain POPs. Once the migration is complete we can remove the "compare_as_ibgp" flag.</div><div><br></div><div>In summary, I am not sure how useful some of these features would be in a permanent design. They might allow people to build `confederation-like` setups that are a bit simpler than real confederations, but our goal is to have a path to migrate POPs autonomously without having to put them in maintenance mode. I understand these features are quite "temporary" and you might be tempted to dismiss them because of that, but I think they are quite useful in production scenarios, as they can simplify operations dramatically.</div><div><br></div><div>Regards.</div></div><div>David Barroso</div><div><br></div><div><br></div><div class="gmail_quote"><div dir="ltr">On Tue, 19 Sep 2017 at 15:08 Ondrej Zajicek &lt;<a href="mailto:santiago@crfreenet.org" target="_blank">santiago@crfreenet.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Sat, Aug 12, 2017 at 10:22:06PM +0300, Lennert Buytenhek wrote:<br>

> Implement an 'internal eBGP' peer mode, where the remote peer uses a<br>

> different AS number than we do, as if it were an eBGP peer, but we<br>

> treat the peer as if were an AS-internal peer.  This enables<br>

> implementing a network setup according to the model documented in<br>

> RFC7938.  This makes two changes to the BGP route propagation logic:<br>

><br>

> * When we are propagating a route to or from an internal eBGP peer,<br>

>   we will avoid resetting the MED attribute.<br>

><br>

> * When comparing BGP-learned routes, we will consider routes learned<br>

>   from an 'internal eBGP' peer as iBGP routes as far as the RFC 4271<br>

>   9.1.2.2. d) check (Prefer external peers) is concerned.<br>

<br>

Hi<br>

<br>

I am inclined to integrate it (perhaps with some tweaks), but i wonder if<br>

it has some advantages compared to standardized solutions, namely BGP<br>

confederations and BGP route reflectors. It seems to me like a third way<br>

to do the same. Analogy with BGP confederations is obvious, BGP route<br>

reflectors are usually used in different way, but could be configured to<br>

work analogically (every router as RR, every IBGP link as mutual RR<br>

client). So why another approach?<br>

<br>

Also, it seems to me that it handles BGP_MED, but does not change<br>

behavior for BGP_NEXT_HOP nor BGP_LOCAL_PREF. Why?<br>

<br>

As BGP in 1.6.x branch and 2.0 branch diverged significantly, i am<br>

inclined to add the feature just to 2.0 branch, to avoid double work<br>

and reuse BGP confederation code that does essentially the same.<br>

<br>

--<br>

Elen sila lumenn' omentielvo<br>

<br>

Ondrej 'Santiago' Zajicek (email: <a href="mailto:santiago@crfreenet.org" target="_blank">santiago@crfreenet.org</a>)<br>

OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, <a href="http://wwwkeys.pgp.net" rel="noreferrer" target="_blank">wwwkeys.pgp.net</a>)<br>

"To err is human -- to blame it on a computer is even more so."<br><br>

</blockquote></div></div></div><div dir="ltr">-- <br></div><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div style="color:rgb(0,0,0);font-family:sans-serif;font-size:small;line-height:19.5px">David Barroso</div><div style="color:rgb(0,0,0);font-family:sans-serif;font-size:small;line-height:19.5px"><a href="https://www.linkedin.com/in/dbarrosop">linkedin</a><span class="inbox-inbox-inbox-Apple-converted-space"> </span>|<span class="inbox-inbox-inbox-Apple-converted-space"> </span><a href="https://twitter.com/dbarrosop">twitter</a><span class="inbox-inbox-inbox-Apple-converted-space"> </span>|<span class="inbox-inbox-inbox-Apple-converted-space"> </span><a href="https://github.com/dbarrosop">github</a></div></div></div>
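<div dir="ltr"><br></div>
<div dir="ltr">For readers who want to see why both of the last two patches are needed in the leaf00 scenario, here is a minimal sketch of the relevant part of best-path selection. It is a toy model in Python, not BIRD code: the names <code>Route</code>, <code>best_peers</code>, <code>skip_private_prefix</code> and <code>ebgp_as_ibgp</code> are hypothetical, only three tie-breakers are modelled (LOCAL_PREF, AS_PATH length, external before internal), and the upstream path 64496 64497 is made up from documentation AS numbers to stand in for $AS_PATH.</div>

```python
from dataclasses import dataclass
from itertools import dropwhile
from typing import List

# Private-use AS number ranges (16-bit and 32-bit).
PRIVATE_RANGES = (range(64512, 65535), range(4200000000, 4294967295))


def is_private_as(asn: int) -> bool:
    return any(asn in r for r in PRIVATE_RANGES)


@dataclass
class Route:
    peer: str            # hypothetical label for the advertising neighbour
    external: bool       # True if learned over an eBGP session
    as_path: List[int]
    local_pref: int = 100


def preference_key(route: Route, skip_private_prefix: bool, ebgp_as_ibgp: bool):
    """Smaller key wins. Simplified: LOCAL_PREF, AS_PATH length, eBGP over iBGP."""
    path = route.as_path
    if skip_private_prefix:
        # Ignore leading private ASes when measuring the path length.
        path = list(dropwhile(is_private_as, path))
    # With ebgp_as_ibgp set, the external/internal tie-breaker is neutralised.
    external = route.external and not ebgp_as_ibgp
    return (-route.local_pref, len(path), 0 if external else 1)


def best_peers(routes, skip_private_prefix=False, ebgp_as_ibgp=False):
    """Return the peers whose routes tie for best under this simplified model."""
    keys = [preference_key(r, skip_private_prefix, ebgp_as_ibgp) for r in routes]
    best = min(keys)
    return [r.peer for r, k in zip(routes, keys) if k == best]


upstream = [64496, 64497]  # stand-in for $AS_PATH (documentation ASNs)
routes = [
    Route("spine00", external=True, as_path=[65000] + upstream),
    Route("spine01", external=False, as_path=upstream),
]

print(best_peers(routes))                                  # no features enabled
print(best_peers(routes, skip_private_prefix=True))        # only path fix
print(best_peers(routes, skip_private_prefix=True, ebgp_as_ibgp=True))
```

<div dir="ltr">In this model, with no features enabled the shorter path from spine01 wins; skipping the private prefix alone hands the win to spine00 through the external-over-internal rule; only with both options enabled do the two spines tie, which is the state we want while a POP is half migrated.</div>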