[Firehol-support] better understanding link-balancer and PBR

Wed Dec 7 19:16:34 GMT 2016

Again, my replies are inline in your text.

On Wed, Dec 7, 2016 at 8:14 PM, Spike <spike at drba.org> wrote:

> Thanks for your thoughts Costa, this kind of readily available insight and
> help along with great SW makes me really happy I chose firehol.
>
> inline below (most of my considerations come from reading
> http://linux-ip.net/html/routing-selection.html#routing-selection-adv):
>
> On Wed, Dec 7, 2016 at 3:12 AM Tsaousis, Costa <costa at tsaousis.gr> wrote:
>
>
>> It simplifies routing significantly. Without this inheritance, policy
>> based routing would be a lot more complicated. Imagine it. You have your
>> static routes and 2 upstream providers. How would you say that lan server1
>> is to be routed via ISP2, without losing your static routes?
>>
>
> I'm not sure I see the point still. If the rules were not copied over they
> would still exist in main. Since PBR follows the rules in priority order
> and continues to the next rule if a match is not found, with l-b's default
> behavior, a static route found in main would win over the default route
> pointing to nexthop and the other table, no?
>

Let's see an example: assume you have 2 DMZs: 10.0.1.0/24 and 10.0.2.0/24.
You normally have 2 static routes for them in table main. Next, you have 2
ISPs: A and B.

Link-Balancer allows you to say: "I want server 10.0.1.2 to be routed via
ISP B". This will create a policy rule that forces server 10.0.1.2 to use
routing table B. But then, server 10.0.1.2 needs to communicate with server
10.0.2.2 at the other DMZ.

If link-balancer did not copy the static routes of table main into table B,
then when server 10.0.1.2 attempts to talk to server 10.0.2.2 the packets
would be forwarded to your ISP B, because the default route would be the
only match in table B. You would need to add another policy rule, to say
"I want server 10.0.1.2 to communicate to 10.0.2.0/24 via table main" to
prevent this.

So, link-balancer copies the static routes, to provide a seamless
experience for your static routes. You only need to care about the default
gateway in policy based routing. All static routes just work.
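
A toy model of the example above, in case it helps. This is only a sketch
of longest-prefix matching in a single table, not link-balancer code; the
names `lookup`, `STATIC`, `dev dmz1`, `dev dmz2` and `via ISP-B` are made up
for illustration, and 8.8.8.8 just stands for "some internet host".

```python
import ipaddress

def lookup(table, dst):
    """Longest-prefix match of dst in one routing table; None if no route."""
    dst = ipaddress.ip_address(dst)
    matches = [(ipaddress.ip_network(prefix), via)
               for prefix, via in table.items()
               if dst in ipaddress.ip_network(prefix)]
    if not matches:
        return None
    # The most specific (longest) prefix wins, as in a real routing table.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Static routes for the two DMZs, as they would exist in table main.
STATIC = {"10.0.1.0/24": "dev dmz1", "10.0.2.0/24": "dev dmz2"}

# Table B without and with the static routes copied in (hypothetical contents).
table_b_bare = {"0.0.0.0/0": "via ISP-B"}
table_b_copied = {**STATIC, "0.0.0.0/0": "via ISP-B"}

# Server 10.0.1.2 is forced to table B and talks to 10.0.2.2 in the other DMZ.
bare = lookup(table_b_bare, "10.0.2.2")        # leaks to the ISP
copied = lookup(table_b_copied, "10.0.2.2")    # stays local
internet = lookup(table_b_copied, "8.8.8.8")   # still leaves via ISP B

print(bare, "|", copied, "|", internet)
```

With the copy in place, only the default route differs between the tables,
which is why the policy rules only have to care about the default gateway.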

>> Q2) l-b generates a nexthop default route using the GWs I configured as
>> default. When the packet encounters that do they go back to look at the
>> rules and then match Table1 for GW1 or Table2 for GW2 depending on nexthop
>> selected? If not, then what are those tables set up for? the main table
>> would already know how to reach those destinations since they are local.
>>
>>
>> This is done with policy based routing. Check: ip rule show or the
>> policy section in link-balancer.conf
>>
>
> maybe I didn't ask this clearly, lemme try again. I'm wondering if when
> the kernel chooses the default nexthop route in main that triggers another
> pass of the rules or not. Does that make more sense?
>

No. Policy based routing (ip rules) is applied before the routing
decision, and the routing decision is the lookup in the routing table
itself. So when a packet is to be forwarded, the policy rules are consulted
first to select the routing table (this is what policy based routing
means), and then a lookup is made in that routing table to decide where to
send the packet. Choosing the nexthop default route in main does not
trigger another pass of the rules.
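
The order of operations can be sketched like this. It is a simplified model
under stated assumptions, not the kernel's implementation: rules only match
on source address, the priorities 100 and 32766 and the table contents are
invented for the example, and `route` / `table_lookup` are hypothetical
helper names. 192.0.2.10 stands for an arbitrary internet host.

```python
import ipaddress

def table_lookup(table, dst):
    """Longest-prefix match inside one table; None when nothing matches."""
    dst = ipaddress.ip_address(dst)
    best = None
    for prefix, via in table.items():
        net = ipaddress.ip_network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, via)
    return best[1] if best else None

def route(rules, tables, src, dst):
    """Walk the rules once, in priority order; the first table that has a
    route for dst decides. There is no second pass over the rules."""
    src = ipaddress.ip_address(src)
    for _prio, selector, name in sorted(rules):
        if src in ipaddress.ip_network(selector):
            via = table_lookup(tables[name], dst)
            if via is not None:
                return name, via
    return None, None

TABLES = {
    "main": {"10.0.1.0/24": "dev dmz1",
             "10.0.2.0/24": "dev dmz2",
             "0.0.0.0/0": "via ISP-A"},
    "B": {"0.0.0.0/0": "via ISP-B"},
}
# (priority, source selector, table); lower priority number is checked first.
RULES = [(32766, "0.0.0.0/0", "main"), (100, "10.0.1.2/32", "B")]

pinned = route(RULES, TABLES, "10.0.1.2", "192.0.2.10")
other = route(RULES, TABLES, "10.0.1.3", "192.0.2.10")
print(pinned)
print(other)
```

Note that a table is only skipped when it has no matching route at all.
Since table B has a default route, a lookup in B always matches and never
falls through to main.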

>> Q3) my understanding is that routes are cached, so even after a link has
>> gone down a client will still make the same choice in terms of routing a
>> certain ip. Is that correct? ie it won't look at the rule or tables and
>> just pick the cached route. So for example if when 2 GWs were up, and
>> packets were routed through GW1, with Table1 having GW1 as its default
>> route, and then GW1 went down, subsequent packets would still route through
>> GW1 until the cached route expired. Is that correct? If that's true, then
>> what's the point of changing the default route in Table1 to use GW2 when
>> the rule that pointed to GW1 is removed anyway?
>>
>>
>> hm... I don't know how the routing cache works exactly. I know however,
>> that in all cases I have encountered so far, my problem was only the
>> iptables connection tracker, especially when NAT is involved or CONNMARK is
>> used.
>> I had to run conntrack to delete all the entries of the failed gateway,
>> to prevent long timeouts.
>>
>
> ah, interesting point. I found this on route caching which was a good read
> even tho some of the info is deprecated in newer kernels:
>
> https://vincent.bernat.im/en/blog/2011-ipv4-route-cache-linux.html
>
>
>
>> This ping-pong case is common if the check depends on the presence or not
>> of routes.
>>
>
> oh, good to know, but I honestly don't see why. If I have 2 GWs and one
> fails, why would the detection ping-pong between FAILED and OK? it seems it
> should stay failed, no?
>
> I found this in the docs for L-B:
>
> *Link Balancer will automatically either use a fallback gateway or copy
> the default-gateway of the origin table to the new table, so that traffic
> will continue to be served by the routing table that all its gateways went
> down. Of course, when the interface is restored, Link Balancer will restore
> the proper default gateway for this interface.*
>
> this seems to be the problem with the ping-pong to me: if GW1 failed and
> Table1's default GW(GW1) is replaced by GW2, then obviously the next run of
> L-B would succeed, no? being a default route even if the ping selects the
> source address of the dead GW it'd still go through. Am I misunderstanding
> something?
>

The whole idea is to keep policy based routing as stable as possible. So,
link-balancer will never change policy based rules based on the state of
the gateways. The only thing that changes when gateways change state is the
default gateway of one or more routing tables. It does this optimally. For
example, if you had PPPoE connections, when a PPPoE link goes down, only
the default gateway of the tables that use this gateway is affected (the
kernel removes the default gateway automatically by itself). So,
link-balancer just restores a gateway on these tables. Nothing more.

The ping-pong is a logical problem. You are using servers on the internet
to test the gateways. Just make sure each of these servers is always
reached via the same gateway, regardless of the state of the gateways. If
you achieve this, there will be no ping-pong. This is also why the RAS of
your ISPs are probably a better choice. If you have multiple ISPs, you will
most probably not be able to ping the RAS of ISP A via ISP B.
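
To illustrate the logic, here is a toy simulation. It is not how
link-balancer is implemented, just a model under these assumptions: GW1 is
really down for the whole run, GW2 is up, and whenever the check for GW1
fails the default gateway of its table is replaced by GW2 (and restored
when the check succeeds). `run_checks` and `pinned` are hypothetical names.

```python
def run_checks(pinned, rounds=6):
    """Return the state reported for GW1 on each check run.

    pinned=True : the test server is only ever reached through GW1.
    pinned=False: the test traffic follows whatever default gateway
                  the table currently has.
    """
    gw_alive = {"GW1": False, "GW2": True}  # GW1 stays down throughout
    table1_default = "GW1"
    states = []
    for _ in range(rounds):
        path = "GW1" if pinned else table1_default
        ok = gw_alive[path]
        states.append("OK" if ok else "FAILED")
        # On failure fall back to GW2, on success restore GW1.
        table1_default = "GW1" if ok else "GW2"
    return states

print("follows default:", run_checks(pinned=False))
print("pinned to GW1:  ", run_checks(pinned=True))
```

When the check follows the table's default gateway, the fallback itself
makes the next check succeed, so the state alternates. When the test server
is pinned to GW1, the state stays FAILED until GW1 really comes back.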

Costa