[patch] Add contextual out-of-bound checks in RTR Prefix PDU handler
Hi, A broken RPKI Cache could contextually underflow or overflow the Max Length value in RTR IPv4 / IPv6 Prefix PDUs. Below is a changeset proposal which adds specific out-of-bounds checks and schedules a reconnect for a later moment. This aligns BIRD's behavior with other COTS BGP implementations. RFC 6810 and RFC 8210 specify that the "Max Length" value MUST NOT be less than the Prefix Length element (underflow). On the other side overflow of the Max Length element also is possible, it being an 8-bit unsigned integer allows for values larger than 32 or 128. When a PDU is received where the Max Length field is corrputed, the RTR client (BIRD) should immediately terminate the session, flush all data learned from that cache, and log an error for the operator. If this condition is hit, it seems better to follow the same approach as for RPKI_CS_ERROR_TRANSPORT, because rpki_force_restart_proto() causes an "as fast as possible" reconnect loop between BIRD and the RPKI Cache server, which increases resource consumption on both sides. Kind regards, Job diff --git proto/rpki/packets.c proto/rpki/packets.c index dd11f997..09e20f38 100644 --- proto/rpki/packets.c +++ proto/rpki/packets.c @@ -737,6 +737,22 @@ rpki_handle_prefix_pdu(struct rpki_cache *cache, const struct pdu_header *pdu) net_addr_union addr = {}; rpki_prefix_pdu_2_net_addr(pdu, &addr); + if (type == IPV4_PREFIX) { + if (addr.roa4.max_pxlen > IP4_MAX_PREFIX_LENGTH || addr.roa4.max_pxlen < addr.roa4.pxlen) { + RPKI_WARN(cache->p, "Received a corrupted packet from cache server"); + rpki_send_error_pdu(cache, CORRUPT_DATA, pdu->len, pdu, "Corrupted PDU"); + rpki_cache_change_state(cache, RPKI_CS_ERROR_FATAL); + return RPKI_ERROR; + } + } else { + if (addr.roa6.max_pxlen > IP6_MAX_PREFIX_LENGTH || addr.roa6.max_pxlen < addr.roa6.pxlen) { + RPKI_WARN(cache->p, "Received a corrupted packet from cache server"); + rpki_send_error_pdu(cache, CORRUPT_DATA, pdu->len, pdu, "Corrupted PDU"); + rpki_cache_change_state(cache, RPKI_CS_ERROR_FATAL); + return RPKI_ERROR; + } + } + if (cf->ignore_max_length) { if (type == IPV4_PREFIX) diff --git proto/rpki/rpki.c proto/rpki/rpki.c index ab0837f3..91b69da0 100644 --- proto/rpki/rpki.c +++ proto/rpki/rpki.c @@ -288,9 +288,6 @@ rpki_cache_change_state(struct rpki_cache *cache, const enum rpki_cache_state ne case RPKI_CS_ERROR_FATAL: /* Fatal protocol error occurred. */ - rpki_force_restart_proto(cache->p); - break; - case RPKI_CS_ERROR_TRANSPORT: /* Error on the transport socket occurred. */ rpki_close_connection(cache);
Here is an updated version of the changeset. The problematic PDU is now in the correct order echoed to the RTR Cache server, making troubleshooting with tcpdump/wireshark more productive! :) Kind regards, Job diff --git proto/rpki/packets.c proto/rpki/packets.c index dd11f997..3d024504 100644 --- proto/rpki/packets.c +++ proto/rpki/packets.c @@ -737,6 +737,26 @@ rpki_handle_prefix_pdu(struct rpki_cache *cache, const struct pdu_header *pdu) net_addr_union addr = {}; rpki_prefix_pdu_2_net_addr(pdu, &addr); + if (type == IPV4_PREFIX) { + if (addr.roa4.max_pxlen < addr.roa4.pxlen || addr.roa4.max_pxlen > IP4_MAX_PREFIX_LENGTH) { + RPKI_WARN(cache->p, "Received corrupt packet from RPKI cache server: invalid Max Length"); + byte tmp[pdu->len]; + const struct pdu_header *hton_pdu = rpki_pdu_back_to_network_byte_order((void *) tmp, (const void *) pdu); + rpki_send_error_pdu(cache, CORRUPT_DATA, pdu->len, tmp, "Corrupted PDU"); + rpki_cache_change_state(cache, RPKI_CS_ERROR_FATAL); + return RPKI_ERROR; + } + } else { + if (addr.roa6.max_pxlen < addr.roa6.pxlen || addr.roa6.max_pxlen > IP6_MAX_PREFIX_LENGTH) { + RPKI_WARN(cache->p, "Received corrupt packet from RPKI cache server: invalid Max Length"); + byte tmp[pdu->len]; + const struct pdu_header *hton_pdu = rpki_pdu_back_to_network_byte_order((void *) tmp, (const void *) pdu); + rpki_send_error_pdu(cache, CORRUPT_DATA, pdu->len, tmp, "Corrupted PDU"); + rpki_cache_change_state(cache, RPKI_CS_ERROR_FATAL); + return RPKI_ERROR; + } + } + if (cf->ignore_max_length) { if (type == IPV4_PREFIX) diff --git proto/rpki/rpki.c proto/rpki/rpki.c index ab0837f3..91b69da0 100644 --- proto/rpki/rpki.c +++ proto/rpki/rpki.c @@ -288,9 +288,6 @@ rpki_cache_change_state(struct rpki_cache *cache, const enum rpki_cache_state ne case RPKI_CS_ERROR_FATAL: /* Fatal protocol error occurred. */ - rpki_force_restart_proto(cache->p); - break; - case RPKI_CS_ERROR_TRANSPORT: /* Error on the transport socket occurred. */ rpki_close_connection(cache);
Apologies for the noise... perhaps three attempts is a charm? :-) The "Prefix Length" can also contextually overflow, the below changeset also checks that specific element. I noticed that having RPKI_CS_ERROR_FATAL fall through to RPKI_CS_ERROR_TRANSPORT did not result in the desired behavior (which is to flush all data related to the corrupted RPKI cache), so I left that change out. This means that a potential for a 'reconnect storm' remains. Adding a retry sleep timer before scheduling the next connect attempt if RPKI_CS_ERROR_FATAL happened should probably be investigated separately. Kind regards, Job diff --git proto/rpki/packets.c proto/rpki/packets.c index dd11f997..fa94846e 100644 --- proto/rpki/packets.c +++ proto/rpki/packets.c @@ -737,6 +737,30 @@ rpki_handle_prefix_pdu(struct rpki_cache *cache, const struct pdu_header *pdu) net_addr_union addr = {}; rpki_prefix_pdu_2_net_addr(pdu, &addr); + if (type == IPV4_PREFIX) { + if (addr.roa4.max_pxlen < addr.roa4.pxlen + || addr.roa4.max_pxlen > IP4_MAX_PREFIX_LENGTH + || addr.roa4.pxlen > IP4_MAX_PREFIX_LENGTH) { + RPKI_WARN(cache->p, "Received corrupt packet from RPKI cache server: invalid Max Length"); + byte tmp[pdu->len]; + const struct pdu_header *hton_pdu = rpki_pdu_back_to_network_byte_order((void *) tmp, (const void *) pdu); + rpki_send_error_pdu(cache, CORRUPT_DATA, pdu->len, tmp, "Corrupted PDU: invalid pxlen or max_pxlen"); + rpki_cache_change_state(cache, RPKI_CS_ERROR_FATAL); + return RPKI_ERROR; + } + } else { + if (addr.roa6.max_pxlen < addr.roa6.pxlen + || addr.roa6.max_pxlen > IP6_MAX_PREFIX_LENGTH + || addr.roa6.pxlen > IP6_MAX_PREFIX_LENGTH) { + RPKI_WARN(cache->p, "Received corrupt packet from RPKI cache server: invalid Max Length"); + byte tmp[pdu->len]; + const struct pdu_header *hton_pdu = rpki_pdu_back_to_network_byte_order((void *) tmp, (const void *) pdu); + rpki_send_error_pdu(cache, CORRUPT_DATA, pdu->len, tmp, "Corrupted PDU: invalid pxlen or max_pxlen"); + rpki_cache_change_state(cache, RPKI_CS_ERROR_FATAL); + return RPKI_ERROR; + } + } + if (cf->ignore_max_length) { if (type == IPV4_PREFIX)
Hi, I've aligned the text that is locally logged with the encapsulated error message sent to the broken RPKI cache. Also fixed a compiler warning that snuck into my previous patch: now passing the correct pointer (hton_pdu) to rpki_send_error_pdu(). Kind regards, Job diff --git proto/rpki/packets.c proto/rpki/packets.c index dd11f997..7a1eeb0f 100644 --- proto/rpki/packets.c +++ proto/rpki/packets.c @@ -737,6 +737,30 @@ rpki_handle_prefix_pdu(struct rpki_cache *cache, const struct pdu_header *pdu) net_addr_union addr = {}; rpki_prefix_pdu_2_net_addr(pdu, &addr); + if (type == IPV4_PREFIX) { + if (addr.roa4.max_pxlen < addr.roa4.pxlen + || addr.roa4.max_pxlen > IP4_MAX_PREFIX_LENGTH + || addr.roa4.pxlen > IP4_MAX_PREFIX_LENGTH) { + RPKI_WARN(cache->p, "Received corrupt packet from RPKI cache server: invalid pxlen or max_pxlen"); + byte tmp[pdu->len]; + const struct pdu_header *hton_pdu = rpki_pdu_back_to_network_byte_order((void *) tmp, (const void *) pdu); + rpki_send_error_pdu(cache, CORRUPT_DATA, pdu->len, hton_pdu, "Corrupted PDU: invalid pxlen or max_pxlen"); + rpki_cache_change_state(cache, RPKI_CS_ERROR_FATAL); + return RPKI_ERROR; + } + } else { + if (addr.roa6.max_pxlen < addr.roa6.pxlen + || addr.roa6.max_pxlen > IP6_MAX_PREFIX_LENGTH + || addr.roa6.pxlen > IP6_MAX_PREFIX_LENGTH) { + RPKI_WARN(cache->p, "Received corrupt packet from RPKI cache server: invalid pxlen or max_pxlen"); + byte tmp[pdu->len]; + const struct pdu_header *hton_pdu = rpki_pdu_back_to_network_byte_order((void *) tmp, (const void *) pdu); + rpki_send_error_pdu(cache, CORRUPT_DATA, pdu->len, hton_pdu, "Corrupted PDU: invalid pxlen or max_pxlen"); + rpki_cache_change_state(cache, RPKI_CS_ERROR_FATAL); + return RPKI_ERROR; + } + } + if (cf->ignore_max_length) { if (type == IPV4_PREFIX)
Ping :-) On Fri, 17 Sep 2021 at 21:34, Job Snijders <job@fastly.com> wrote:
Hi,
I've aligned the text that is locally logged with the encapsulated error message sent to the broken RPKI cache. Also fixed a compiler warning that snuck into my previous patch: now passing the correct pointer (hton_pdu) to rpki_send_error_pdu().
Kind regards,
Job
diff --git proto/rpki/packets.c proto/rpki/packets.c index dd11f997..7a1eeb0f 100644 --- proto/rpki/packets.c +++ proto/rpki/packets.c @@ -737,6 +737,30 @@ rpki_handle_prefix_pdu(struct rpki_cache *cache, const struct pdu_header *pdu) net_addr_union addr = {}; rpki_prefix_pdu_2_net_addr(pdu, &addr);
+ if (type == IPV4_PREFIX) { + if (addr.roa4.max_pxlen < addr.roa4.pxlen + || addr.roa4.max_pxlen > IP4_MAX_PREFIX_LENGTH + || addr.roa4.pxlen > IP4_MAX_PREFIX_LENGTH) { + RPKI_WARN(cache->p, "Received corrupt packet from RPKI cache server: invalid pxlen or max_pxlen"); + byte tmp[pdu->len]; + const struct pdu_header *hton_pdu = rpki_pdu_back_to_network_byte_order((void *) tmp, (const void *) pdu); + rpki_send_error_pdu(cache, CORRUPT_DATA, pdu->len, hton_pdu, "Corrupted PDU: invalid pxlen or max_pxlen"); + rpki_cache_change_state(cache, RPKI_CS_ERROR_FATAL); + return RPKI_ERROR; + } + } else { + if (addr.roa6.max_pxlen < addr.roa6.pxlen + || addr.roa6.max_pxlen > IP6_MAX_PREFIX_LENGTH + || addr.roa6.pxlen > IP6_MAX_PREFIX_LENGTH) { + RPKI_WARN(cache->p, "Received corrupt packet from RPKI cache server: invalid pxlen or max_pxlen"); + byte tmp[pdu->len]; + const struct pdu_header *hton_pdu = rpki_pdu_back_to_network_byte_order((void *) tmp, (const void *) pdu); + rpki_send_error_pdu(cache, CORRUPT_DATA, pdu->len, hton_pdu, "Corrupted PDU: invalid pxlen or max_pxlen"); + rpki_cache_change_state(cache, RPKI_CS_ERROR_FATAL); + return RPKI_ERROR; + } + } + if (cf->ignore_max_length) { if (type == IPV4_PREFIX)
On Sun, Nov 14, 2021 at 07:26:44PM +0100, Job Snijders wrote:
Ping :-)
Thanks, merged. Sorry for keeping it open for so long. btw, the original 2-condition patch was ok, because condition addr.roaX.pxlen <= IPX_MAX_PREFIX_LENGTH is deduced from transitivity.
On Fri, 17 Sep 2021 at 21:34, Job Snijders <job@fastly.com> wrote:
Hi,
I've aligned the text that is locally logged with the encapsulated error message sent to the broken RPKI cache. Also fixed a compiler warning that snuck into my previous patch: now passing the correct pointer (hton_pdu) to rpki_send_error_pdu().
Kind regards,
Job
diff --git proto/rpki/packets.c proto/rpki/packets.c index dd11f997..7a1eeb0f 100644 --- proto/rpki/packets.c +++ proto/rpki/packets.c @@ -737,6 +737,30 @@ rpki_handle_prefix_pdu(struct rpki_cache *cache, const struct pdu_header *pdu) net_addr_union addr = {}; rpki_prefix_pdu_2_net_addr(pdu, &addr);
+ if (type == IPV4_PREFIX) { + if (addr.roa4.max_pxlen < addr.roa4.pxlen + || addr.roa4.max_pxlen > IP4_MAX_PREFIX_LENGTH + || addr.roa4.pxlen > IP4_MAX_PREFIX_LENGTH) { + RPKI_WARN(cache->p, "Received corrupt packet from RPKI cache server: invalid pxlen or max_pxlen"); + byte tmp[pdu->len]; + const struct pdu_header *hton_pdu = rpki_pdu_back_to_network_byte_order((void *) tmp, (const void *) pdu); + rpki_send_error_pdu(cache, CORRUPT_DATA, pdu->len, hton_pdu, "Corrupted PDU: invalid pxlen or max_pxlen"); + rpki_cache_change_state(cache, RPKI_CS_ERROR_FATAL); + return RPKI_ERROR; + } + } else { + if (addr.roa6.max_pxlen < addr.roa6.pxlen + || addr.roa6.max_pxlen > IP6_MAX_PREFIX_LENGTH + || addr.roa6.pxlen > IP6_MAX_PREFIX_LENGTH) { + RPKI_WARN(cache->p, "Received corrupt packet from RPKI cache server: invalid pxlen or max_pxlen"); + byte tmp[pdu->len]; + const struct pdu_header *hton_pdu = rpki_pdu_back_to_network_byte_order((void *) tmp, (const void *) pdu); + rpki_send_error_pdu(cache, CORRUPT_DATA, pdu->len, hton_pdu, "Corrupted PDU: invalid pxlen or max_pxlen"); + rpki_cache_change_state(cache, RPKI_CS_ERROR_FATAL); + return RPKI_ERROR; + } + } + if (cf->ignore_max_length) { if (type == IPV4_PREFIX)
-- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
participants (2)
-
Job Snijders -
Ondrej Zajicek