In this document, we define the notion of mempool and characterize its role in CometBFT. First, we provide an overview of what is a mempool, and relate it to other blockchains. Then, the interactions with the consensus and client application are detailed. A formalization of the mempool follows. This formalization is readable in Quint here.
The mempool acts as an entry point to consensus. It permits to disseminate transactions from one node to another, for their eventual inclusion into a block. To this end, the mempool maintains a replicated set, or pool, of transactions. Transactions in the mempool are consumed by consensus to create the next proposed block. Once a new block in the blockchain is decided, the mempool is refreshed. We shall detail how shortly.
A transaction can be received from a local client, or a remote disseminating process. Each transaction is subject to a test by the client application. This test verifies that the transaction is valid. Such a test provides some form of protection against byzantine agents, whether they be clients or other system nodes. It also serves to optimize the overall utility of the blockchain. Validity can be simply syntactical which is stateless, or a more complex verification that is state-dependent. If the transaction is valid, the local process further propagates it in the system using a gossip (or an anti-entropy) mechanism.
In other blockchains. The notion of mempool appears in all blockchains, but with varying definitions and/or implementations. For instance in Ethereum, the mempool contains two types of transactions: processable and pending ones. To be pending, a transactions must first succeed in a series of tests. Some of these tests are syntactic ones (e.g., valid source address), while others are state-dependent (e.g., enough gas, at most one pending transactions per address, etc). Narwhal is the mempool abstraction for the Tusk and Bullshark protocols. It provides strong global guarantees. In particular, once a transaction is added to the mempool, it is guaranteed to be available at any later point in time.
In what follows, we present the interactions of the mempool with other parts of CometBFT.
Some of the specificities of the current implementation (CListMempool
) are also detailed.
RPC server To add a new transaction to the mempool, a client submits it through an appropriate RPC endpoint. This endpoint is offered by some of the system nodes (but not necessarily all of them).
Gossip protocol Transactions can also be received from other nodes, through a gossiping mechanism.
ABCI application
As pointed out above, the mempool should only store and disseminate valid transactions.
It is up to the ABCI (client) application to define whether a transaction is valid.
Transactions received locally are sent to the application to be validated, through the checkTx
method from the mempool ABCI connection.
Such a check indicates with a flag whether it is the first time (or not) that the transaction is sent for validation.
Transactions that are validated by the application are later added to the mempool.
Transactions tagged as invalid are simply dropped.
The validity of a transaction may depend on the state of the client application.
In particular, some transactions that are valid in some state of the application may later become invalid.
The state of the application is updated when consensus commits a block of transactions.
When this happens, the transactions still in the mempool have to be validated again.
We further detail this mechanism below.
Consensus The consensus protocol consumes transactions stored in the mempool to build blocks to be proposed. To this end, consensus requests from the mempool a list of transactions. A limit on the total number of bytes, or transactions, may be specified. In the current implementation, the mempool is stored as a list of transactions. The call returns the longest prefix of the list that matches the imposed limits. Notice that at this point the transactions returned to consensus are not removed from the mempool. This comes from the fact that the block is proposed but not decided yet.
Proposing a block is the prerogative of the nodes acting as validators.
At all the full nodes (validators or not), consensus is responsible for committing blocks of transactions to the blockchain.
Once a block is committed, all the transactions included in the block are removed from the mempool.
This happens with an update
call to the mempool.
Before doing this call, CometBFT takes a lock
on the mempool.
Then, it flush
the connection with the client application.
When flush
returns, all the pending validation requests are answered and/or dropped.
Both operations aim at preventing any concurrent checkTx
while the mempool is updated.
At the end of update
, all the transactions still in the mempool are re-validated (asynchronously) against the new state of the client application.
This procedure is executed with a call to recheckTxs
.
Finally, consensus removes its lock on the mempool by issuing a call to unlock
.
In what follows, we formalize the notion of mempool.
To this end, we first provide a (brief) definition of what is a ledger, that is a replicated log of transactions.
At a process
Ledger.
We use the standard definition of (BFT) SMR in the context of blockchain, where each process
As standard, the ledger ensures that:
-
(Gap-freedom) There is no gap between two entries at a correct process:
$\forall i \in \mathbb{N}. \forall p \in Correct. \square(p.ledger[i] \neq \bot \implies (i=0 \vee p.ledger[i-1] \neq \bot))$ ; -
(Agreement) No two correct processes have different ledger entries; formally:
$\forall i \in \mathbb{N}. \forall p,q \in Correct. \square((p.ledger[i] = \bot) \vee (q.ledger[i] = \bot) \vee (p.ledger[i] = q.ledger[i]))$ ; -
(Validity) If some transaction appears at an index
$i$ at a correct process, then a process submitted it at that index:
$\forall p \in Correct. \exists q \in Processes. \forall i \in \mathbb{N}. \square(tx \in p.ledger[i] \implies tx \in \bigcup_q q.submitted[i]$ ). -
(Termination) If a correct process submits a block at its current height, eventually its height get incremented:
$\forall p \in Correct. \square((h=p.height \wedge p.submitted[h] \neq \varnothing) \implies \lozenge(p.height>h))$
Mempool.
A mempool is a replicated set of transactions.
At a process
Only the mempool is used as an input for the ledger:
INV1.
Committed transactions are not in the mempool:
INV2.
In blockchain, a transaction is (or not) valid in a given state.
That is, a transaction can be valid (or not) at a given height of the ledger.
To model this, consider a transaction
INV3.
Finally, we require some progress from the mempool.
Namely, if a transaction appears at a correct process then eventually it is committed or forever invalid.
INV4
The above invariant ensures that if a transaction enters the mempool (at a correct process), then it eventually leaves it at a later time.
For this to be true, the client application must ensure that the validity of a transaction converges toward some value.
This means that there exists a height after which
Practical considerations. Invariants INV2 and INV3 require to atomically update the mempool when transactions are newly committed. To maintain such invariants in an implementation, standard thread-safe mechanisms (e.g., monitors and locks) can be used.
Another practical concern is that INV2 requires to traverse the whole ledger, which might be too expensive.
Instead, we would like to maintain this only over the last
INV2a.
INV3 requires to have a green light from the client application before adding a transaction to the mempool.
For efficiency, such a validation needs to be made at most
INV3a.
For further information regarding the current implementation of the mempool in CometBFT, the reader may consult this document in the knowledge base.