Vulnerability Disclosure: Memory Exhaustion DoS in Core Lightning

Summary

Following up on my previous disclosure, I am releasing details of a second Denial-of-Service (DoS) vulnerability in Core Lightning (CLN), discovered during my Summer of Bitcoin 2025 internship. It allowed a remote peer to trigger unbounded memory growth in the connect daemon (connectd), leading to an Out-of-Memory (OOM) system crash.

This vulnerability has been patched in Core Lightning release v26.04. All node operators are advised to upgrade accordingly.

Background

Core Lightning uses a multi-daemon architecture meant to isolate faults. The Lightning Network relies on a “gossip” protocol to propagate channel announcements, channel updates, node announcements, and related messages. In CLN, the gossip daemon (gossipd) is responsible for managing this global view of the network. It receives messages from connectd and processes them.

The network can be noisy, so connectd must be efficient. Because it processes untrusted external input, it must also be robust against malicious messages.

Discovery

This vulnerability was discovered using a new fuzz target I developed, fuzz-gossipd-connectd, aimed at testing the robustness of gossipd’s state machines. Here is a brief overview of how the target works:

/* Overview pseudocode. Helpers that read fuzzer input take the cursor by
 * pointer, so data advances and size shrinks as bytes are consumed. */

/* 0 - In each fuzz run, perform the following steps: */
void run(const u8 *data, size_t size)
{
  /* 1 - Mock the required setup for connectd-gossipd communication. */
  initialize_setup(data, size);

  /* 2 - State loop. Repeat until there's no more fuzzer data left. */
  while (size) {

    /* 3 - Use a fuzzer byte as the decision variable to perform one of
     * the three possible operations: */
    switch (consume_int(&data, &size) % OP_COUNT) {

    case NEW_PEER:
      /* 3.1 - Register a fake peer to attribute messages to. */
      connectd_new_peer(daemon, create_random_peer_id());
      break;

    case RECV_GOSSIP:
      /* 3.2.1 - Generate a cryptographically valid gossip message. Can be one of: */
      /* - channel_announcement */
      /* - channel_update */
      /* - node_announcement */
      /* - reply_channel_range */
      /* - reply_short_channel_ids_end */
      msg = create_gossip_msg(&data, &size, random_peer());

      /* 3.2.2 - Parse the gossip message just created. */
      handle_recv_gossip(daemon, msg);
      break;

    case PEER_GONE:
      /* 3.3 - Delete a peer from the network map. */
      connectd_peer_gone(daemon, remove_random_peer());
      break;
    }
  }
}

For more details about the target, see the corresponding Pull Request.
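The dispatch in step 3 depends on turning raw fuzzer bytes into decisions. The following is a minimal sketch of that idea, not the target's actual code: the helper consume_u8(), the function next_op(), and the enum fuzz_op are hypothetical names introduced here for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operation set, mirroring the three cases in the overview. */
enum fuzz_op { OP_NEW_PEER, OP_RECV_GOSSIP, OP_PEER_GONE, OP_COUNT };

/* Take one byte from the front of the fuzzer input and advance the cursor.
 * Returns 0 once the input is exhausted, so callers never read past the end. */
static uint8_t consume_u8(const uint8_t **data, size_t *size)
{
	uint8_t b;

	if (*size == 0)
		return 0;
	b = **data;
	(*data)++;
	(*size)--;
	return b;
}

/* Map the next input byte onto one of the operations. */
static enum fuzz_op next_op(const uint8_t **data, size_t *size)
{
	return (enum fuzz_op)(consume_u8(data, size) % OP_COUNT);
}
```

Because every decision is derived from the input bytes, a given fuzzer input always replays the same sequence of operations, which is what makes a crashing input reproducible.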

Vulnerability

A remote attacker can trigger a denial-of-service (DoS) condition in the connectd sub-daemon by flooding it with a high volume of channel_update gossip messages. This causes an internal message queue within connectd to grow without bounds, allocating memory for each new message. The process consumes all available system RAM, leading to extreme system unresponsiveness and a complete freeze (swap death), effectively crashing the node.

The root cause is a resource management issue in connectd-gossipd’s inter-daemon message queue. do_enqueue() allocates a new copy of every incoming message. Under a high-volume message flood, messages are enqueued far faster than they are dequeued and processed. This leads to unbounded memory growth.

A warning for excessive queue length exists but is only triggered once due to a warned_once flag, giving no further indication of the ongoing memory consumption as it grows into gigabytes.
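The pattern described above can be illustrated with a small stand-alone sketch. This is not CLN's code: struct msg_queue, enqueue() and queue_free() are simplified stand-ins written for this post, and only the 250,000 threshold and the warned_once idea come from the behaviour observed. It shows a queue that copies every message, has no upper bound, and warns exactly once.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Queue length at which the single warning fires (figure quoted in this post). */
#define WARN_THRESHOLD 250000

struct msg_node {
	struct msg_node *next;
	size_t len;
	unsigned char data[]; /* private copy of the message */
};

struct msg_queue {
	struct msg_node *head, *tail;
	size_t count;      /* messages currently queued */
	size_t bytes;      /* payload bytes currently held */
	bool warned_once;  /* set after the first warning, never cleared */
	unsigned warnings; /* total warnings emitted */
};

/* Copy msg onto the tail of the queue. Nothing limits how large count gets. */
static bool enqueue(struct msg_queue *q, const unsigned char *msg, size_t len)
{
	struct msg_node *n = malloc(sizeof(*n) + len);

	if (!n)
		return false;
	n->next = NULL;
	n->len = len;
	memcpy(n->data, msg, len);

	if (q->tail)
		q->tail->next = n;
	else
		q->head = n;
	q->tail = n;
	q->count++;
	q->bytes += len;

	/* The flag silences every warning after the first one. */
	if (q->count > WARN_THRESHOLD && !q->warned_once) {
		fprintf(stderr, "excessive queue length: %zu\n", q->count);
		q->warned_once = true;
		q->warnings++;
	}
	return true;
}

/* Release every queued copy and reset the counters. */
static void queue_free(struct msg_queue *q)
{
	while (q->head) {
		struct msg_node *n = q->head;
		q->head = n->next;
		free(n);
	}
	q->tail = NULL;
	q->count = 0;
	q->bytes = 0;
}
```

If the producer outpaces the consumer, count and bytes in this sketch keep climbing while warnings stays at 1, which matches the lack of any further log output during the flood.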

Verification

An attack program was created to verify the vulnerability. The program acts as a malicious Lightning Network peer that connects to a CLN node and orchestrates a message flood to trigger the crash:

1. Initialization and Connection: The attacker connects to the victim CLN node.
2. Initial Handshake: The attacker completes the init message handshake with the victim.
3. The Malicious Flood: The attacker begins sending a continuous, high-volume stream of channel_update gossip messages to the victim node.
4. Unbounded Queue Growth and Crash:
   - connectd receives the messages and places them into an internal msg_queue.
   - The queue’s length quickly exceeds 250,000 items, causing a single “excessive queue length” backtrace to be logged.
   - As the flood continues, the queue grows into the millions, with each message consuming more memory. The warned_once flag prevents further warnings.
   - The memory consumption of the connectd process grows until all physical RAM is exhausted and the system freezes completely.
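To put these queue lengths in perspective, here is a rough back-of-the-envelope estimate. It rests on two assumptions that were not measured: each queued item holds one channel_update of 138 bytes (its BOLT 7 wire size including the htlc_maximum_msat field), and each item carries a hypothetical 64 bytes of allocator and bookkeeping overhead. The real per-item cost inside connectd may differ considerably.

```c
#include <stdint.h>

/* Assumed wire size of one channel_update, type field included. */
#define ASSUMED_MSG_LEN 138u
/* Hypothetical per-item allocation overhead; an illustrative guess only. */
#define ASSUMED_OVERHEAD 64u

/* Estimated memory held by a queue of `items` messages. */
static uint64_t queue_memory_bytes(uint64_t items, uint64_t msg_len,
				   uint64_t overhead)
{
	return items * (msg_len + overhead);
}
```

Under these assumptions, queue_memory_bytes(250000, ASSUMED_MSG_LEN, ASSUMED_OVERHEAD) is about 50 MB, and ten million queued items come to about 2 GB. In other words, the single warning would fire while memory use still looks harmless, long before the queue becomes dangerous.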

Criticality

Severity: Medium/High (DoS)
Attack Vector: Remote (Peer-to-Peer)
Consequence: A remote attacker can reliably crash any publicly accessible CLN node by flooding it with peer-to-peer traffic. This requires no on-chain funds or channel relationship, making it a potent vector for disrupting network liquidity.

Fix

The reason for storing channel_update messages internally is to avoid losing gossip for a channel that we haven’t discovered yet. This lets us reach the latest state of a channel more quickly when its channel_announcement arrives later, eliminating the need for the channel’s peers to rebroadcast the gossip.

However, since storing too many of these messages leads to this vulnerability, the chosen fix (as suggested by @rustyrussell) is to drop messages once the queue passes a cutoff, set at 500,000.
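The general idea of a capped queue can be sketched as follows. This is a simplified illustration rather than the merged patch: struct bounded_queue, bq_enqueue() and bq_dequeue() are hypothetical names, and the sketch assumes that incoming messages are the ones dropped once the cap is reached. Only the 500,000 figure comes from the fix discussion.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Cutoff quoted in this post; the cap is a field so smaller values can be tested. */
#define MAX_QUEUED_MSGS 500000

struct bq_node {
	struct bq_node *next;
	size_t len;
	unsigned char data[]; /* private copy of the message */
};

struct bounded_queue {
	struct bq_node *head, *tail;
	size_t count;   /* messages currently queued */
	size_t cap;     /* maximum number of queued messages */
	size_t dropped; /* messages rejected because the queue was full */
};

/* Queue a copy of msg, or drop it if the queue is already at its cap. */
static bool bq_enqueue(struct bounded_queue *q, const unsigned char *msg,
		       size_t len)
{
	struct bq_node *n;

	if (q->count >= q->cap) {
		q->dropped++;
		return false; /* memory use stays bounded by cap */
	}
	n = malloc(sizeof(*n) + len);
	if (!n)
		return false;
	n->next = NULL;
	n->len = len;
	memcpy(n->data, msg, len);
	if (q->tail)
		q->tail->next = n;
	else
		q->head = n;
	q->tail = n;
	q->count++;
	return true;
}

/* Remove and return the oldest message; the caller must free() it. */
static struct bq_node *bq_dequeue(struct bounded_queue *q)
{
	struct bq_node *n = q->head;

	if (!n)
		return NULL;
	q->head = n->next;
	if (!q->head)
		q->tail = NULL;
	q->count--;
	return n;
}
```

A queue initialised with cap set to MAX_QUEUED_MSGS can never hold more than that many copies, so a flood costs the attacker bandwidth without growing the victim's memory past a fixed ceiling. The trade-off is that some legitimate gossip may be dropped during a flood and has to be learned again later.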

More details can be accessed at the fix’s Pull Request.

Timeline

22/07/2025: Target pushed upstream.
22/07/2025-31/07/2025: Improvements made to the target, dealing with false positives.
01/08/2025: Vulnerability confirmed using the attack program. Vulnerability disclosed to Matt.
10/08/2025: Investigation performed. Cause of the memory usage pinpointed.
11/08/2025: Vulnerability confirmed by Matt.
12/08/2025: Draft of the vulnerability report created. Final report sent to CLN’s security mailing list.
14/08/2025: Reply by Rusty confirming the issue and disclosing the proposed fix.
18/08/2025: Fix merged to master.

Lessons Learned

This bug emphasizes that internal interfaces are attack surfaces too. Even if connectd handles some throttling, it must also defend itself against internal floods. Fuzzing the IPC layer (Daemon-to-Daemon) is just as critical as fuzzing the external Network layer.

Special thanks to my mentor Matt Morehouse for guiding the entire process, especially the triage and disclosure.