kernel: BIRD kernel syncer protocol issues on Linux
Hello, BIRD developers!

I have found a few issues with the kernel syncer on Linux when route
learning is enabled.

There are a few problems in the netlink code in sysdep/linux/rtnetlink.c:

 * Use after free when accessing nl_table_map[] in nl_parse_route().

 * Possible race condition between the allocation of the rx buffer
   nl_async_rx_buffer for the async socket in nl_open_async() and its
   usage in nl_async_hook().

 * Socket descriptor leak in the error path of nl_open_async() when the
   bind(2) call fails.

These problems are addressed by the patch
bird-1.3.11-fix-shutdow~ath-in-rtnetlink.patch

Additionally, the patch bird-1.3.11-fix-building-with-DBG-on.patch fixes
building with -DGLOBAL_DEBUG or -DLOCAL_DEBUG enabled (these enable the
large amount of debugging code available through the DBG() macro).

Information to reproduce the use after free in nl_parse_route()
===============================================================

BIRD configuration
------------------

  # Configure logging
  log stderr all;
  # More chances to trigger crash.
  log syslog all;

  router id 172.16.1.1;

  protocol device devices {
    scan time 120;
  }

  protocol kernel kernel10 {
    debug all;
    persist no;
    scan time 120;
    learn yes;
    device routes no;
    kernel table 10;
    import all;
    export none;
  }

Start BIRD, compiled with "-DGLOBAL_DEBUG", in debugging mode
-------------------------------------------------------------

# bird -d
olock: init
Parsing configuration file `/etc/bird.conf'
Protocol preconfig: Direct Static Pipe BGP Kernel Device
Protocol postconfig: devices kernel10
sysdep_commit
global_commit
rt_commit
rt_commit:
master: created
Allocating FIB hash of order 10: 1024 entries, 0 low, 4096 high
done
protos_commit
protos_commit:
26-09-2013 10:57:45 <TRACE> kernel10: Initializing
done
Protocol start
...
Kicking kernel10 up
26-09-2013 10:57:45 <TRACE> kernel10: Starting
Allocating FIB hash of order 10: 1024 entries, 0 low, 4096 high
kernel10 reporting state transition HUNGRY/DOWN -> */UP
kernel10: Scheduling meal
Connecting protocol kernel10 to table master
26-09-2013 10:57:45 <TRACE> kernel10: Connected to table master
26-09-2013 10:57:45 <TRACE> kernel10: State changed to feed
do_commit finished with 0 obstacles remaining
26-09-2013 10:57:45 <INFO> Started
Entering I/O loop.
Feeding protocol devices
Feeding protocol devices continued
Announcing routes to new protocol devices
Protocol devices up and running
Feeding protocol kernel10
Feeding protocol kernel10 continued
Announcing routes to new protocol kernel10
26-09-2013 10:57:45 <TRACE> kernel10: State changed to up
Protocol kernel10 up and running
KRT: Scanning routing table
26-09-2013 10:57:45 <TRACE> kernel10: Scanning routing table
KRT: Got 192.168.10.0/24, type=6, oif=-1, table=10, prid=3, proto=kernel10
26-09-2013 10:57:45 <TRACE> kernel10: 192.168.10.0/24: [alien] created
KRT: Got 192.168.20.0/24, type=6, oif=-1, table=20, prid=3, proto=(none)
...
KRT: Pruning table master
26-09-2013 10:57:45 <TRACE> kernel10: Pruning table master
KRT: Pruning inherited routes
26-09-2013 10:57:45 <TRACE> kernel10: Pruning inherited routes
192.168.10.0/24: announcing (metric=0)
26-09-2013 10:57:45 <TRACE> kernel10 > added [best] 192.168.10.0/24 blackhole
26-09-2013 10:57:45 <TRACE> kernel10 < rejected by protocol 192.168.10.0/24 blackhole

After startup BIRD learns the route from KRT 10
-----------------------------------------------

# birdc 'show route table master all'
BIRD 1.3.11 ready.
192.168.10.0/24    blackhole [kernel10 10:57] * (10)
        Type: inherit unicast univ
        Kernel.source: 3
        Kernel.metric: 0

Change KRT ID to 20 ("kernel table 20" configuration option) and tell BIRD about the configuration change
---------------------------------------------------------------------------------------------------------

# birdc configure
BIRD 1.3.11 ready.
Reading configuration from /etc/bird.conf
Reconfiguration in progress

Debug output after reconfiguration
----------------------------------

Parsing configuration file `/etc/bird.conf'
Protocol preconfig: Direct Static Pipe BGP Kernel Device
Protocol postconfig: devices kernel10
26-09-2013 10:57:54 <INFO> Reconfiguring
sysdep_commit
global_commit
rt_commit
rt_commit:
master: same
done
protos_commit
protos_commit:
devices: same
26-09-2013 10:57:54 <INFO> Restarting protocol kernel10
+++ adding obstacle 1
Kicking kernel10 down
26-09-2013 10:57:54 <TRACE> kernel10: Shutting down
KRT: Flushing kernel routes
26-09-2013 10:57:54 <TRACE> kernel10: Flushing kernel routes
kernel10 reporting state transition HAPPY/UP -> */DOWN
kernel10: Scheduling flush
26-09-2013 10:57:54 <TRACE> kernel10: State changed to flush
Pruning neighbors
done
Protocol start
do_commit finished with 1 obstacles remaining
Pruning route table master
26-09-2013 10:57:54 <TRACE> kernel10 > removed [sole] 192.168.10.0/24 blackhole
Pruning route table master
Flushing protocol kernel10
26-09-2013 10:57:54 <TRACE> kernel10: State changed to down
Protocol kernel10 down
kernel10 has shut down for reconfiguration
+++ deleting obstacle 1
26-09-2013 10:57:54 <TRACE> kernel10: Initializing
Kicking kernel10 up
26-09-2013 10:57:54 <TRACE> kernel10: Starting
Allocating FIB hash of order 10: 1024 entries, 0 low, 4096 high
kernel10 reporting state transition HUNGRY/DOWN -> */UP
kernel10: Scheduling meal
Connecting protocol kernel10 to table master
...
KRT: Scanning routing table
26-09-2013 10:57:54 <TRACE> kernel10: Scanning routing table
KRT: Got 192.168.10.0/24, type=6, oif=-1, table=10, prid=3, proto=kernel10
26-09-2013 10:57:54 <TRACE> kernel10: 192.168.10.0/24: [alien] created
KRT: Got 192.168.20.0/24, type=6, oif=-1, table=20, prid=3, proto=kernel10
26-09-2013 10:57:54 <TRACE> kernel10: 192.168.20.0/24: [alien] created

**************************************************************************
The route from KRT ID 10 is still seen and accepted in nl_parse_route()
in sysdep/linux/rtnetlink.c.
**************************************************************************

...
KRT: Pruning table master
26-09-2013 10:57:54 <TRACE> kernel10: Pruning table master
KRT: Pruning inherited routes
26-09-2013 10:57:54 <TRACE> kernel10: Pruning inherited routes
192.168.10.0/24: announcing (metric=0)
26-09-2013 10:57:54 <TRACE> kernel10 > added [best] 192.168.10.0/24 blackhole
26-09-2013 10:57:54 <TRACE> kernel10 < rejected by protocol 192.168.10.0/24 blackhole
192.168.20.0/24: announcing (metric=0)
26-09-2013 10:57:54 <TRACE> kernel10 > added [best] 192.168.20.0/24 blackhole
26-09-2013 10:57:54 <TRACE> kernel10 < rejected by protocol 192.168.20.0/24 blackhole
26-09-2013 10:57:54 <INFO> Reconfigured
Feeding protocol kernel10
Feeding protocol kernel10 continued
Announcing routes to new protocol kernel10
26-09-2013 10:57:54 <TRACE> kernel10 < rejected by protocol 192.168.10.0/24 blackhole
26-09-2013 10:57:54 <TRACE> kernel10 < rejected by protocol 192.168.20.0/24 blackhole
26-09-2013 10:57:54 <TRACE> kernel10: State changed to up
Protocol kernel10 up and running

After reconfiguration finishes, the route from KRT ID 10 is still present in the BRT "master" table, to which the syncer is attached
------------------------------------------------------------------------------------------------------------------------------------

# birdc 'show route table master all'
BIRD 1.3.11 ready.
192.168.10.0/24    blackhole [kernel10 10:57] * (10)
        Type: inherit unicast univ
        Kernel.source: 3
        Kernel.metric: 0
192.168.20.0/24    blackhole [kernel10 10:57] * (10)
        Type: inherit unicast univ
        Kernel.source: 3
        Kernel.metric: 0

Adding routes to KRT ID 10 while BIRD uses KRT ID 20 causes them to be stored in the "master" table
---------------------------------------------------------------------------------------------------

# ip -4 route add blackhole 192.168.30.0/24 table 10
# birdc 'show route table master all'
BIRD 1.3.11 ready.
192.168.10.0/24    blackhole [kernel10 10:57] * (10)
        Type: inherit unicast univ
        Kernel.source: 3
        Kernel.metric: 0
192.168.20.0/24    blackhole [kernel10 10:57] * (10)
        Type: inherit unicast univ
        Kernel.source: 3
        Kernel.metric: 0
192.168.30.0/24    blackhole [kernel10 10:58] * (10)
        Type: inherit unicast univ
        Kernel.source: 3
        Kernel.metric: 0

Debugging output from BIRD
--------------------------

KRT: Received async route notification (24)
KRT: Got 192.168.30.0/24, type=6, oif=-1, table=10, prid=3, proto=kernel10
26-09-2013 11:15:24 <TRACE> kernel10: 192.168.30.0/24: [alien async] created
krt_learn_async: distributing change
26-09-2013 11:15:24 <TRACE> kernel10 > added [best] 192.168.30.0/24 blackhole
26-09-2013 11:15:24 <TRACE> kernel10 < rejected by protocol 192.168.30.0/24 blackhole

It is also possible to trigger a crash or an infinite loop (spotted with
BIRD 1.3.9 on another system), but I have no instructions to reproduce
this reliably. What seems to help, at least: logging to syslog or to a
file along with logging to stderr, and changing the name of the protocol
(kernel10 -> kernel20 etc.) after a route has been learned from the other
KRT, or waiting for the periodic KRT scan, possibly followed by a protocol
name change.

-- 
SP5475-RIPE
Sergey Popovich
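The stale lookup shown in the reproduction above (routes from table 10 still matched to kernel10 after the protocol moved to table 20) can be illustrated with a minimal sketch. It assumes the map is a plain array indexed by kernel table ID that holds pointers to protocol instances. All names here (struct syncer, table_map, syncer_start, syncer_shutdown, route_owner) are hypothetical and are not taken from rtnetlink.c or from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_TABLES 256              /* hypothetical size of the lookup array */

struct syncer {                     /* hypothetical stand-in for a protocol instance */
    int table_id;
};

/* Maps a kernel table ID to the instance attached to it, or NULL. */
static struct syncer *table_map[NUM_TABLES];

/* Attach a new instance to a kernel table; returns NULL on failure. */
struct syncer *syncer_start(int table_id)
{
    if (table_id < 0 || table_id >= NUM_TABLES || table_map[table_id])
        return NULL;

    struct syncer *s = malloc(sizeof *s);
    if (!s)
        return NULL;

    s->table_id = table_id;
    table_map[table_id] = s;
    return s;
}

/* Detach and free an instance. */
void syncer_shutdown(struct syncer *s)
{
    if (!s)
        return;

    assert(table_map[s->table_id] == s);
    /* Clear the slot before freeing, so a later lookup cannot return
     * a pointer to freed memory. */
    table_map[s->table_id] = NULL;
    free(s);
}

/* Route parser side: find the owner of a route, NULL means "ignore it". */
struct syncer *route_owner(int table_id)
{
    if (table_id < 0 || table_id >= NUM_TABLES)
        return NULL;
    return table_map[table_id];
}
```

With the slot cleared on shutdown, restarting the instance on table 20 leaves route_owner(10) returning NULL, so a route parser built this way would drop routes from table 10 instead of handing them to a stale pointer.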
On Thu, Sep 26, 2013 at 12:21:02PM +0300, Sergey Popovich wrote:
> Hello, BIRD developers!
>
> I have found a few issues with the kernel syncer on Linux when route
> learning is enabled.
>
> There are a few problems in the netlink code in sysdep/linux/rtnetlink.c:
>
>  * Use after free when accessing nl_table_map[] in nl_parse_route().
>
>  * Possible race condition between the allocation of the rx buffer
>    nl_async_rx_buffer for the async socket in nl_open_async() and its
>    usage in nl_async_hook().

This race condition couldn't really happen, because BIRD is
single-threaded, but your change makes the code cleaner.

>  * Socket descriptor leak in the error path of nl_open_async() when the
>    bind(2) call fails.
>
> These problems are addressed by the patch
> bird-1.3.11-fix-shutdow~ath-in-rtnetlink.patch

Thanks, applied (both).

-- 
Elen sila lumenn' omentielvo

Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
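For readers without the patch at hand, the descriptor leak on a failed bind(2) follows a common pattern, sketched below under stated assumptions. The function name open_route_listener and its groups parameter are hypothetical; this is not the code of nl_open_async() nor the applied fix, only the general shape of closing the socket on the error path.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: open a routing netlink socket subscribed to the
 * given multicast groups. Returns the descriptor, or -1 with errno set. */
int open_route_listener(unsigned int groups)
{
    int fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0)
        return -1;

    struct sockaddr_nl sa;
    memset(&sa, 0, sizeof sa);
    sa.nl_family = AF_NETLINK;
    sa.nl_groups = groups;

    if (bind(fd, (struct sockaddr *) &sa, sizeof sa) < 0) {
        /* Without this close() the descriptor would stay open on every
         * failed attempt. Keep errno from bind() for the caller. */
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    return fd;
}
```

A caller would pass, for example, RTMGRP_IPV4_ROUTE and treat -1 as "no async notifications available", knowing that no descriptor is left behind.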