Celestia App Specifications
Data Structures
Data Structures Overview
Type Aliases
name | type |
---|---|
Address | byte[32] |
Amount | uint64 |
Graffiti | byte[MAX_GRAFFITI_BYTES] |
HashDigest | byte[32] |
Height | int64 |
NamespaceID | byte[NAMESPACE_ID_BYTES] |
Nonce | uint64 |
Round | int32 |
StateSubtreeID | byte |
Timestamp | google.protobuf.Timestamp |
VotingPower | uint64 |
Blockchain Data Structures
Block
Blocks are the top-level data structure of the Celestia blockchain.
name | type | description |
---|---|---|
header | Header | Block header. Contains primarily identification info and commitments. |
availableDataHeader | AvailableDataHeader | Header of available data. Contains commitments to erasure-coded data. |
availableData | AvailableData | Data that is erasure-coded for availability. |
lastCommit | Commit | Previous block's Tendermint commit. |
Header
Block header, which is fully downloaded by both full clients and light clients.
name | type | description |
---|---|---|
version | ConsensusVersion | The consensus version struct. |
chainID | string | The CHAIN_ID . |
height | Height | Block height. The genesis block is at height 1 . |
timestamp | Timestamp | Timestamp of this block. |
lastHeaderHash | HashDigest | Previous block's header hash. |
lastCommitHash | HashDigest | Previous block's Tendermint commit hash. |
consensusHash | HashDigest | Hash of consensus parameters for this block. |
stateCommitment | HashDigest | The state root after this block's transactions are applied. |
availableDataOriginalSharesUsed | uint64 | The number of shares used in the original data square that are not tail padding. |
availableDataRoot | HashDigest | Root of commitments to erasure-coded data. |
proposerAddress | Address | Address of this block's proposer. |
The size of the original data square, availableDataOriginalSquareSize, isn't explicitly declared in the block header. Instead, it is implicitly computed as the smallest power of 2 whose square is at least availableDataOriginalSharesUsed (in other words, its square is the smallest power of 4 that is at least availableDataOriginalSharesUsed).
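As an illustration, a minimal Go sketch of this computation (the function name squareSize is an assumption, not part of the spec):

```go
// squareSize returns the implied availableDataOriginalSquareSize: the smallest
// power of 2 whose square is at least sharesUsed. Illustrative sketch only.
func squareSize(sharesUsed uint64) uint64 {
	k := uint64(1)
	for k*k < sharesUsed {
		k *= 2
	}
	return k
}
```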
The header hash is the hash of the serialized header.
AvailableDataHeader
name | type | description |
---|---|---|
rowRoots | HashDigest[] | Commitments to all erasure-coded data. |
colRoots | HashDigest[] | Commitments to all erasure-coded data. |
The row and column roots commit to the erasure-coded data shares in square layout for this block. The availableDataRoot of the header is computed using the compact row and column roots as described here.
The number of row and column roots is each availableDataOriginalSquareSize * 2, and must be a power of 2. Note that the minimum availableDataOriginalSquareSize is 1 (not 0), therefore the number of row and column roots is each at least 2.
Implementations can prune rows containing only tail padding as they are implicitly available.
AvailableData
Data that is erasure-coded for data availability checks.
name | type | description |
---|---|---|
transactionData | TransactionData | Transaction data. Transactions modify the validator set and balances, and pay fees for messages to be included. |
intermediateStateRootData | IntermediateStateRootData | Intermediate state roots used for fraud proofs. |
payForBlobData | PayForBlobData | PayForBlob data. Transactions that pay for blobs to be included. |
messageData | MessageData | Message data. Messages are app data. |
Commit
name | type | description |
---|---|---|
height | Height | Block height. |
round | Round | Round. Incremented on view change. |
headerHash | HashDigest | Header hash of the previous block. |
signatures | CommitSig[] | List of signatures. |
Timestamp
Timestamp is a type alias.
Celestia uses google.protobuf.Timestamp to represent time.
HashDigest
HashDigest is a type alias.
Output of the hashing function. Exactly 256 bits (32 bytes) long.
TransactionFee
name | type | description |
---|---|---|
tipRate | uint64 | The tip rate for this transaction. |
Abstraction over transaction fees.
Address
Address is a type alias.
Addresses are the hash digest of the public key.
Addresses have a length of 32 bytes.
CommitSig
enum CommitFlag : uint8_t {
CommitFlagAbsent = 1,
CommitFlagCommit = 2,
CommitFlagNil = 3,
};
name | type | description |
---|---|---|
commitFlag | CommitFlag | |
validatorAddress | Address | |
timestamp | Timestamp | |
signature | Signature |
Signature
name | type | description |
---|---|---|
r | byte[32] | r value of the signature. |
vs | byte[32] | 1-bit v value followed by last 255 bits of s value of signature. |
Output of the signing process.
ConsensusVersion
name | type | description |
---|---|---|
block | uint64 | The VERSION_BLOCK . |
app | uint64 | The VERSION_APP . |
Serialization
Objects that are committed to or signed over require a canonical serialization. This is done using a deterministic (and thus, bijective) variant of protobuf defined here.
Note: there are two requirements for a serialization scheme, should this need to be changed:
- Must be bijective.
- Serialization must include the length of dynamic structures (e.g. arrays with variable length).
Hashing
All protocol-level hashing is done using SHA-2-256 as defined in FIPS 180-4. SHA-2-256 outputs a digest that is 256 bits (i.e. 32 bytes) long.
Libraries implementing SHA-2-256 are available in Go (https://pkg.go.dev/crypto/sha256) and Rust (https://docs.rs/sha2).
Unless otherwise indicated explicitly, objects are first serialized before being hashed.
Public-Key Cryptography
Consensus-critical data is authenticated using ECDSA, with the curve secp256k1. A highly-optimized library is available in C (https://github.com/bitcoin-core/secp256k1), with wrappers in Go (https://pkg.go.dev/github.com/ethereum/go-ethereum/crypto/secp256k1) and Rust (https://docs.rs/crate/secp256k1).
Public keys are encoded in uncompressed form, as the concatenation of the x and y values. No prefix is needed to distinguish between encoding schemes as this is the only encoding supported.
Deterministic signatures (RFC-6979) should be used when signing, but this is not enforced at the protocol level as it cannot be.
Signatures are represented as the r and s (each 32 bytes), and v (1-bit) values of the signature. r and s take on their usual meaning (see: SEC 1, 4.1.3 Signing Operation), while v is used for recovering the public key from a signature more quickly (see: SEC 1, 4.1.6 Public Key Recovery Operation). Only low-s values in signatures are valid (i.e. s <= secp256k1.n//2); s can be replaced with -s mod secp256k1.n during the signing process if it is high. Given this, the first bit of s will always be 0, and can be used to store the 1-bit v value.
v represents the parity of the Y component of the point, 0 for even and 1 for odd. The X component of the point is assumed to always be low, since the possibility of it being high is negligible.
Putting it all together, the encoding for signatures is:
|    32 bytes   ||           32 bytes           |
[256-bit r value][1-bit v value][255-bit s value]
This encoding scheme is derived from EIP 2098: Compact Signature Representation.
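A minimal Go sketch of packing a signature into this layout, assuming s is already in low-s form (so its top bit is 0); the function name is illustrative, not part of the spec:

```go
// encodeSignature packs r, the 1-bit v (parity of the point's Y component), and
// the low-s value into the 64-byte layout [256-bit r][1-bit v][255-bit s].
// Illustrative sketch; assumes s is already low-s, so its most significant bit is 0.
func encodeSignature(r, s [32]byte, v byte) [64]byte {
	var sig [64]byte
	copy(sig[:32], r[:])
	copy(sig[32:], s[:])
	sig[32] |= (v & 1) << 7 // store v in the most significant bit of the s half
	return sig
}
```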
Merkle Trees
Merkle trees are used to authenticate various pieces of data across the Celestia stack, including transactions, messages, the validator set, etc. This section provides an overview of the different tree types used, and specifies how to construct them.
Binary Merkle Tree
Binary Merkle trees are constructed in the same fashion as described in Certificate Transparency (RFC-6962), except for using a different hashing function. Leaves are hashed once to get leaf node values and internal node values are the hash of the concatenation of their children (either leaf nodes or other internal nodes).
Nodes contain a single field:
name | type | description |
---|---|---|
v | HashDigest | Node value. |
The base case (an empty tree) is defined as the hash of the empty string:
node.v = 0xe3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
For leaf node node of leaf data d:
node.v = h(0x00, serialize(d))
For internal node node with children l and r:
node.v = h(0x01, l.v, r.v)
Note that rather than duplicating the last node if there are an odd number of nodes (the Bitcoin design), trees are allowed to be imbalanced. In other words, the height of each leaf may be different. For an example, see Section 2.1.3 of Certificate Transparency (RFC-6962).
Leaves and internal nodes are hashed differently: the one-byte 0x00 is prepended for leaf nodes while 0x01 is prepended for internal nodes. This avoids a second-preimage attack where internal nodes are presented as leaves of trees with leaves at different heights.
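A minimal Go sketch of the domain-separated hashing above, using crypto/sha256 (helper names are illustrative):

```go
import "crypto/sha256"

// leafHash computes h(0x00, serialize(d)) for already-serialized leaf data.
func leafHash(serialized []byte) [32]byte {
	return sha256.Sum256(append([]byte{0x00}, serialized...))
}

// nodeHash computes h(0x01, l.v, r.v) for two child node values.
func nodeHash(l, r [32]byte) [32]byte {
	data := append([]byte{0x01}, l[:]...)
	data = append(data, r[:]...)
	return sha256.Sum256(data)
}
```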
BinaryMerkleTreeInclusionProof
name | type | description |
---|---|---|
siblings | HashDigest[] | Sibling hash values, ordered starting from the leaf's neighbor. |
A proof for a leaf in a binary Merkle tree, as per Section 2.1.1 of Certificate Transparency (RFC-6962).
Namespace Merkle Tree
Shares in Celestia are associated with a provided namespace ID. The Namespace Merkle Tree (NMT) is a variation of the Merkle Interval Tree, which is itself an extension of the Merkle Sum Tree. It allows for compact proofs around the inclusion or exclusion of shares with particular namespace IDs.
Nodes contain three fields:
name | type | description |
---|---|---|
n_min | NamespaceID | Min namespace ID in subtree rooted at this node. |
n_max | NamespaceID | Max namespace ID in subtree rooted at this node. |
v | HashDigest | Node value. |
The base case (an empty tree) is defined as:
node.n_min = 0x0000000000000000
node.n_max = 0x0000000000000000
node.v = 0xe3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
For leaf node node of share data d:
node.n_min = d.namespaceID
node.n_max = d.namespaceID
node.v = h(0x00, d.namespaceID, d.rawData)
The namespaceID field here is the namespace ID of the leaf share, which is a NAMESPACE_ID_BYTES-long byte array.
Leaves in an NMT must be lexicographically sorted by namespace ID in ascending order.
For internal node node with children l and r:
node.n_min = min(l.n_min, r.n_min)
if l.n_min == PARITY_SHARE_NAMESPACE_ID
  node.n_max = PARITY_SHARE_NAMESPACE_ID
else if r.n_min == PARITY_SHARE_NAMESPACE_ID
  node.n_max = l.n_max
else
  node.n_max = max(l.n_max, r.n_max)
node.v = h(0x01, l.n_min, l.n_max, l.v, r.n_min, r.n_max, r.v)
Note that the above snippet leverages the property that leaves are sorted by namespace ID: if l.n_min is PARITY_SHARE_NAMESPACE_ID, so must {l,r}.n_max. By construction, either both the min and max namespace IDs of a node will be PARITY_SHARE_NAMESPACE_ID, or neither will: if r.n_min is PARITY_SHARE_NAMESPACE_ID, so must r.n_max.
For some intuition: the min and max namespace IDs for subtree roots with at least one non-parity leaf (which includes the root of an NMT, as the right half of an NMT as used in Celestia will be parity shares) ignore the namespace ID for the parity leaves. Subtree roots with only parity leaves have their min and max namespace ID set to PARITY_SHARE_NAMESPACE_ID. This allows for shorter proofs into the tree than if the namespace ID of parity shares was not ignored (which would cause the max namespace ID of the root to always be PARITY_SHARE_NAMESPACE_ID).
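A Go sketch of the internal-node rule above, including the handling of PARITY_SHARE_NAMESPACE_ID; the nmtNode type and helper names are assumptions for illustration, not normative definitions:

```go
import (
	"bytes"
	"crypto/sha256"
)

// nmtNode is an illustrative NMT node: a namespace range plus a hash value.
type nmtNode struct {
	nMin, nMax [8]byte // NAMESPACE_ID_BYTES = 8
	v          [32]byte
}

// parityNS is PARITY_SHARE_NAMESPACE_ID (0xFFFFFFFFFFFFFFFF).
var parityNS = [8]byte{0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}

// internalNode combines children l and r following the rules above.
func internalNode(l, r nmtNode) nmtNode {
	var n nmtNode
	if bytes.Compare(l.nMin[:], r.nMin[:]) <= 0 {
		n.nMin = l.nMin
	} else {
		n.nMin = r.nMin
	}
	switch {
	case l.nMin == parityNS:
		n.nMax = parityNS
	case r.nMin == parityNS:
		n.nMax = l.nMax
	default:
		if bytes.Compare(l.nMax[:], r.nMax[:]) >= 0 {
			n.nMax = l.nMax
		} else {
			n.nMax = r.nMax
		}
	}
	data := []byte{0x01}
	for _, b := range [][]byte{l.nMin[:], l.nMax[:], l.v[:], r.nMin[:], r.nMax[:], r.v[:]} {
		data = append(data, b...)
	}
	n.v = sha256.Sum256(data)
	return n
}
```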
A compact commitment can be computed by taking the hash of the serialized root node.
NamespaceMerkleTreeInclusionProof
name | type | description |
---|---|---|
siblingValues | HashDigest[] | Sibling hash values, ordered starting from the leaf's neighbor. |
siblingMins | NamespaceID[] | Sibling min namespace IDs. |
siblingMaxes | NamespaceID[] | Sibling max namespace IDs. |
When verifying an NMT proof, the root hash is checked by reconstructing the root node root_node with the computed root_node.v (computed as with a plain Merkle proof) and the provided rootNamespaceIDMin and rootNamespaceIDMax as the root_node.n_min and root_node.n_max, respectively.
Sparse Merkle Tree
Sparse Merkle Trees (SMTs) are sparse, i.e. they contain mostly empty leaves. They can be used as key-value stores for arbitrary data, as each leaf is keyed by its index in the tree. Storage efficiency is achieved through clever use of implicit defaults, avoiding the need to store empty leaves.
Additional rules are added on top of plain binary Merkle trees:
- Default values are given to leaf nodes with empty leaves.
- While the above rule is sufficient to pre-compute the values of intermediate nodes that are roots of empty subtrees, a further simplification is to extend this default value to all nodes that are roots of empty subtrees. The 32-byte zero, i.e. 0x0000000000000000000000000000000000000000000000000000000000000000, is used as the default value. This rule takes precedence over the above one.
- The number of hashing operations can be reduced to be logarithmic in the number of non-empty leaves on average, assuming a uniform distribution of non-empty leaf keys. An internal node that is the root of a subtree that contains exactly one non-empty leaf is replaced by that leaf's leaf node.
Nodes contain a single field:
name | type | description |
---|---|---|
v | HashDigest | Node value. |
In the base case, where a sparse Merkle tree has height = 0, the root of a tree is defined as the hash of the empty string:
node.v = 0xe3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
When a sparse Merkle tree has a height of 0, it can have no leaves, and, therefore, no default value children. The root is then calculated as the hash of the empty string, similar to that of an empty binary Merkle tree.
For a tree with height > 0, the root of an empty tree is defined as the default value:
node.v = 0x0000000000000000000000000000000000000000000000000000000000000000
Note that this is in contrast to the base case of the sparse and binary Merkle trees, where the root is the hash of the empty string. When a sparse Merkle tree has a height greater than 0, a new tree instance is composed of default value leaves. Nodes containing only default value children have the default value as well. Applying these rules recursively percolates the default value up to the tree's root.
For leaf node node of leaf data d with key k:
node.v = h(0x00, k, h(serialize(d)))
The key of leaf nodes must be prepended, since the index of a leaf node that is not at maximum depth cannot be determined without this information. Leaf values are hashed so that they do not need to be included in full in non-membership proofs.
For internal node node with children l and r:
node.v = h(0x01, l.v, r.v)
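A minimal Go sketch of the SMT leaf hash above; the function name is illustrative:

```go
import "crypto/sha256"

// smtLeafHash computes h(0x00, k, h(serialize(d))) for a 32-byte key and
// already-serialized leaf data. Hashing the value keeps non-membership proofs small.
func smtLeafHash(key [32]byte, serialized []byte) [32]byte {
	valueHash := sha256.Sum256(serialized)
	data := append([]byte{0x00}, key[:]...)
	data = append(data, valueHash[:]...)
	return sha256.Sum256(data)
}
```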
SparseMerkleTreeInclusionProof
SMTs can further be extended with compact proofs. Merkle proofs are composed, among other things, of a list of sibling node values. We note that, since nodes that are roots of empty subtrees have known values (the default value), these values do not need to be provided explicitly; it is sufficient to simply identify which siblings in the Merkle branch are roots of empty subtrees, which can be done with one bit per sibling.
For a Merkle branch of height h, an h-bit value is appended to the proof. The lowest bit corresponds to the sibling of the leaf node, and each higher bit corresponds to the next parent. A value of 1 indicates that the next value in the list of values provided explicitly in the proof should be used, and a value of 0 indicates that the default value should be used.
A proof into an SMT is structured as:
name | type | description |
---|---|---|
depth | uint16 | Depth of the leaf node. The root node is at depth 0 . Must be <= 256 . |
siblings | HashDigest[] | Sibling hash values, ordered starting from the leaf's neighbor. |
includedSiblings | byte[32] | Bitfield of explicitly included sibling hashes. |
The includedSiblings is ordered by most-significant-byte first, with each byte ordered by most-significant-bit first. The lowest bit corresponds to the leaf node level.
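A Go sketch of how a verifier might recover the i-th sibling (i = 0 being the leaf's neighbor) from a compact proof under the bit ordering above; all names are illustrative assumptions:

```go
// siblingAt returns the i-th sibling hash from a compact SMT proof, consuming an
// explicit sibling from siblings only when the corresponding bit is 1; otherwise
// it returns the 32-byte zero default value. Illustrative sketch: the bitfield is
// most-significant-byte first, with the lowest bit at the leaf level.
func siblingAt(includedSiblings [32]byte, siblings [][32]byte, used *int, i int) [32]byte {
	var defaultValue [32]byte // all zeroes
	byteIdx := 31 - i/8       // lowest-order bits live in the last byte
	if (includedSiblings[byteIdx]>>uint(i%8))&1 == 0 {
		return defaultValue
	}
	s := siblings[*used]
	*used++
	return s
}
```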
Erasure Coding
In order to enable trust-minimized light clients (i.e. light clients that do not rely on an honest majority of validating state assumption), it is critical that light clients can determine whether the data in each block is available or not, without downloading the whole block itself. The technique used here was formally described in the paper Fraud and Data Availability Proofs: Maximising Light Client Security and Scaling Blockchains with Dishonest Majorities.
The remainder of the subsections below specify the 2D Reed-Solomon erasure coding scheme used, along with the format of shares and how available data is arranged into shares.
Reed-Solomon Erasure Coding
Note that while data is laid out in a two-dimensional square, rows and columns are erasure coded using a standard one-dimensional encoding.
Reed-Solomon erasure coding is used as the underlying coding scheme. The parameters are:
- 16-bit Galois field
- availableDataOriginalSquareSize original pieces (maximum of AVAILABLE_DATA_ORIGINAL_SQUARE_MAX)
- availableDataOriginalSquareSize parity pieces (maximum of AVAILABLE_DATA_ORIGINAL_SQUARE_MAX) (i.e. availableDataOriginalSquareSize * 2 total pieces), for an erasure efficiency of 50%. In other words, any 50% of the pieces from the availableDataOriginalSquareSize * 2 total pieces are enough to recover the original data.
- SHARE_SIZE bytes per piece

Note that availableDataOriginalSquareSize may vary each block, and is decided by the block proposer of that block. Leopard-RS is a C library that implements the above scheme with quasilinear runtime.
2D Reed-Solomon Encoding Scheme
The 2-dimensional data layout is described in this section. The roots of NMTs for each row and column across four quadrants of data in a 2k * 2k matrix of shares, Q0 to Q3 (shown below), must be computed. In other words, 2k row roots and 2k column roots must be computed. The row and column roots are stored in the rowRoots and colRoots fields of the AvailableDataHeader.
The data of Q0 is the original data, and the remaining quadrants are parity data. Setting k = availableDataOriginalSquareSize, the original data first must be split into shares and arranged into a k * k matrix. Then the parity data can be computed.
Where A -> B indicates that B is computed using erasure coding from A:
- Q0 -> Q1 for each row in Q0 and Q1
- Q0 -> Q2 for each column in Q0 and Q2
- Q2 -> Q3 for each row in Q2 and Q3
Note that the parity data in Q3 will be identical if it is vertically extended from Q1 or horizontally extended from Q2.
As an example, the parity data in the second column of Q2 (in striped purple) is computed by extending the original data in the second column of Q0 (in solid blue).
Now that all four quadrants of the 2k * 2k matrix are filled, the row and column roots can be computed. To do so, each row/column is used as the leaves of a NMT, for which the compact root is computed (i.e. an extra hash operation over the NMT root is used to produce a single HashDigest). In this example, the fourth row root value is computed as the NMT root of the fourth row of Q0 and the fourth row of Q1 as leaves.
Finally, the availableDataRoot of the block Header is computed as the Merkle root of the binary Merkle tree with the row and column roots as leaves, in that order.
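A Go sketch of this final step; both function names are illustrative assumptions. binaryMerkleRoot below follows the RFC-6962-style tree from the earlier section, and since the number of leaves here (4k) is a power of two, splitting at the midpoint coincides with the RFC 6962 split point.

```go
import "crypto/sha256"

// binaryMerkleRoot computes the root of the binary Merkle tree described earlier
// (0x00 prefix for leaves, 0x01 for internal nodes). It splits at the midpoint,
// which matches RFC 6962 for power-of-two leaf counts. Illustrative sketch.
func binaryMerkleRoot(leaves [][]byte) [32]byte {
	switch len(leaves) {
	case 0:
		return sha256.Sum256(nil) // empty tree: hash of the empty string
	case 1:
		return sha256.Sum256(append([]byte{0x00}, leaves[0]...))
	}
	mid := len(leaves) / 2
	l, r := binaryMerkleRoot(leaves[:mid]), binaryMerkleRoot(leaves[mid:])
	return sha256.Sum256(append(append([]byte{0x01}, l[:]...), r[:]...))
}

// availableDataRoot commits to all row roots followed by all column roots.
func availableDataRoot(rowRoots, colRoots [][]byte) [32]byte {
	leaves := append(append([][]byte{}, rowRoots...), colRoots...)
	return binaryMerkleRoot(leaves)
}
```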
Share
name | type | description |
---|---|---|
namespaceID | NamespaceID | Namespace ID of the share. |
rawData | byte[SHARE_SIZE] | Raw share data. |
A share is a fixed-size data chunk associated with a namespace ID, whose data will be erasure-coded and committed to in Namespace Merkle trees.
A sequence is a contiguous set of shares that contain semantically relevant data. A sequence should be parsed together because data may be split across share boundaries. One sequence exists per reserved namespace and per message.
- The first NAMESPACE_ID_BYTES of a share's raw data rawData is the namespace ID of that share, namespaceID.
- The next SHARE_INFO_BYTES bytes are for share information with the following structure:
  - The first 7 bits represent the share version in big endian form (initially, this will be 0000000 for version 0);
  - The last bit is a sequence start indicator, that is 1 if the share is at the start of a sequence or 0 if it is a continuation share.

The remainder of a share's raw data rawData is interpreted differently depending on the namespace ID.
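A minimal Go sketch of composing and parsing the share information byte described above; the names are illustrative assumptions:

```go
// infoByte packs the 7-bit share version (the first, i.e. most significant, 7 bits)
// and the sequence start indicator (the last bit). Illustrative sketch.
func infoByte(version uint8, isSequenceStart bool) byte {
	b := (version & 0x7F) << 1
	if isSequenceStart {
		b |= 1
	}
	return b
}

// parseInfoByte is the inverse: it splits an info byte back into version and flag.
func parseInfoByte(b byte) (version uint8, isSequenceStart bool) {
	return b >> 1, b&1 == 1
}
```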
Compact Share
For shares with a reserved namespace ID through NAMESPACE_ID_MAX_RESERVED:

Note: The first NAMESPACE_ID_BYTES of a share's raw data rawData is the namespace ID of that share, namespaceID. The next SHARE_INFO_BYTES bytes are for share information.
- If this is the first share of a sequence, the next SEQUENCE_BYTES contain a big endian uint32 that represents the length of the sequence that follows, in bytes.
- The next SHARE_RESERVED_BYTES bytes are the starting byte of the length of the canonically serialized first request that starts in the share, or 0 if there is none, as a big endian uint32.
- The remaining SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_INFO_BYTES - SEQUENCE_BYTES (only if this is the first share of a sequence) - SHARE_RESERVED_BYTES bytes are transactions, intermediate state roots, or PayForBlob transaction data. Each transaction, intermediate state root, or PayForBlob transaction is prefixed with a varint of the length of that unit.
- If there is insufficient transaction, intermediate state root, or PayForBlob transaction data to fill the share, the remaining bytes are filled with 0.
First share in a sequence:
where reserved bytes would be 17 as a binary big endian uint32 ([0b00000000, 0b00000000, 0b00000000, 0b00010001]).
Continuation share in a sequence:
where reserved bytes would be 80 as a binary big endian uint32 ([0b00000000, 0b00000000, 0b00000000, 0b01010000]).
Sparse Share
For shares with a namespace ID above NAMESPACE_ID_MAX_RESERVED but below PARITY_SHARE_NAMESPACE_ID:

Note: The first NAMESPACE_ID_BYTES of a share's raw data rawData is the namespace ID of that share, namespaceID. The next SHARE_INFO_BYTES bytes are for share information.
- If this is the first share of a sequence, the next SEQUENCE_BYTES contain a big endian uint32 that represents the length of the sequence that follows, in bytes.
- The remaining SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_INFO_BYTES - SEQUENCE_BYTES (only if this is the first share of a sequence) bytes are message data. Message data are opaque bytes of data that are included in the block but do not impact the state. In other words, the remaining bytes have no special meaning and are simply used to store data.
- If there is insufficient message data to fill the share, the remaining bytes are filled with 0.
First share in a sequence:
Continuation share in a sequence:
Parity Share
For shares with a namespace ID equal to PARITY_SHARE_NAMESPACE_ID (i.e. parity shares):
- Bytes carry no special meaning.
Namespace Padding Share
A namespace padding share acts as padding between blobs so that the subsequent blob may begin at an index that conforms to the non-interactive default rules. A namespace padding share contains the namespace ID of the blob that precedes it in the data square so that the data square can retain the property that all shares are ordered by namespace.
The first NAMESPACE_ID_BYTES of a share's raw data rawData is the namespace ID of the blob that precedes this padding share. The next SHARE_INFO_BYTES bytes are for share information. The sequence start indicator is always 1. The version bits are filled with the share version. The sequence length is zeroed out. The remaining SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_INFO_BYTES - SEQUENCE_BYTES bytes are filled with 0.
Reserved Padding Share
Reserved padding shares are placed after the last reserved namespace share in the data square so that the first blob can start at an index that conforms to non-interactive default rules. Clients can safely ignore the contents of these shares because they don't contain any significant data.
For shares with a namespace ID equal to RESERVED_PADDING_NAMESPACE_ID (i.e. reserved padding shares):
The first NAMESPACE_ID_BYTES of a share's raw data rawData is the namespace ID of that share, namespaceID. The next SHARE_INFO_BYTES bytes are for share information. The sequence start indicator is always 1. The version bits are filled with the share version. The sequence length is zeroed out. The remaining SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_INFO_BYTES - SEQUENCE_BYTES bytes are filled with 0.
Tail Padding Share
Tail padding shares are placed after the last blob in the data square so that the number of shares in the data square is a perfect square. Clients can safely ignore the contents of these shares because they don't contain any significant data.
For shares with a namespace ID equal to TAIL_PADDING_NAMESPACE_ID (i.e. tail padding shares):
The first NAMESPACE_ID_BYTES of a share's raw data rawData is the namespace ID of that share, namespaceID. The next SHARE_INFO_BYTES bytes are for share information. The sequence start indicator is always 1. The version bits are filled with the share version. The sequence length is zeroed out. The remaining SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_INFO_BYTES - SEQUENCE_BYTES bytes are filled with 0.
Arranging Available Data Into Shares
The previous sections described how some original data, arranged into a k * k matrix, can be extended into a 2k * 2k matrix and committed to with NMT roots. This section specifies how available data (which includes transactions, intermediate state roots, PayForBlob transactions, and messages) is arranged into the matrix in the first place.
Then,
- For each of transactionData, intermediateStateRootData, and PayForBlob transactions, serialize:
  - For each request in the list:
    - Serialize the request (individually).
    - Compute the length of each serialized request, serialize the length, and prepend the serialized request with its serialized length.
  - Split up the length/request pairs into SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_RESERVED_BYTES-byte chunks.
  - Create a share out of each chunk. This data has a reserved namespace ID, so the first NAMESPACE_ID_BYTES + SHARE_RESERVED_BYTES bytes for these shares must be set specially.
- Concatenate the lists of shares in the order: transactions, intermediate state roots, PayForBlob transactions.
Note that by construction, each share only has a single namespace, and that the list of concatenated shares is lexicographically ordered by namespace ID.
These shares are arranged in the first quadrant (Q0) of the availableDataOriginalSquareSize*2 * availableDataOriginalSquareSize*2 available data matrix in row-major order. In the example below, each reserved data element takes up exactly one share.
Each message in the list messageData:
- Serialize the message (individually).
- Compute the length of each serialized message, serialize the length, and prepend the serialized message with its serialized length.
- Split up the length/message pairs into SHARE_SIZE - NAMESPACE_ID_BYTES-byte chunks.
- Create a share out of each chunk. The first NAMESPACE_ID_BYTES bytes for these shares is set to the namespace ID.
For each message, it is placed in the available data matrix, with row-major order, as follows:
- Place the first share of the message at the next unused location in the matrix, then place the remaining shares in the following locations.
Transactions must commit to a Merkle root of a list of hashes that are each guaranteed (assuming the block is valid) to be subtree roots in one or more of the row NMTs. For additional info, see the rationale document for this section.
However, with only the rule above, interaction between the block producer and transaction sender may be required to compute a commitment to the message the transaction sender can sign over. To remove interaction, messages can optionally be laid out using a non-interactive default:
- Place the first share of the message at the next unused location in the matrix whose column is aligned with the largest power of 2 that is not larger than the message length or availableDataOriginalSquareSize, then place the remaining shares in the following locations unless there are insufficient unused locations in the row (see the sketch after this section).
- If there are insufficient unused locations in the row, place the first share of the message at the first column of the next row. Then place the remaining shares in the following locations. By construction, any message whose length is greater than availableDataOriginalSquareSize will be placed in this way.
In the example below, two messages (of lengths 2 and 1, respectively) are placed using the aforementioned default non-interactive rules.
The non-interactive default rules may introduce empty shares that do not belong to any message (in the example above, the top-right share is empty). These are zeroes with namespace ID equal to either TAIL_TRANSACTION_PADDING_NAMESPACE_ID if between a request with a reserved namespace ID and a message, or the namespace ID of the previous message if succeeded by a message. See the rationale doc for more info.
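A Go sketch of the column-alignment step of the non-interactive default rule; it only computes the aligned starting index within a row (the row-overflow case handled by the second rule is omitted), and all names are illustrative assumptions:

```go
// nextAlignedIndex returns the first share index >= cursor whose column is a
// multiple of the alignment required by the non-interactive default rule: the
// largest power of 2 that is not larger than the message length (in shares) or
// the original square size. Illustrative sketch; it does not handle the case
// where the message does not fit in the remainder of the row.
func nextAlignedIndex(cursor, msgLenShares, squareSize uint64) uint64 {
	align := uint64(1)
	for align*2 <= msgLenShares && align*2 <= squareSize {
		align *= 2
	}
	if rem := (cursor % squareSize) % align; rem != 0 {
		cursor += align - rem
	}
	return cursor
}
```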
Available Data
TransactionData
name | type | description |
---|---|---|
wrappedTransactions | WrappedTransaction[] | List of wrapped transactions. |
WrappedTransaction
Wrapped transactions include additional metadata by the block proposer that is committed to in the available data matrix.
name | type | description |
---|---|---|
index | uint64 | Index of this transaction in the list of wrapped transactions. This information is lost when splitting transactions into fixed-sized shares, and needs to be re-added here for fraud proof support. Allows linking a transaction to an intermediate state root. |
transaction | Transaction | Actual transaction. |
messageStartIndex | uint64 | Optional, only used if transaction pays for a message or padding. Share index (in row-major order) of first share of message this transaction pays for. Needed for light verification of proper message inclusion. |
Transaction
name | type | description |
---|---|---|
signedTransactionData | SignedTransactionData | Data payload that is signed over. |
signature | Signature | Signature. |
SignedTransactionData
enum TransactionType : uint8_t {
Transfer = 1,
MsgPayForData = 2,
CreateValidator = 3,
BeginUnbondingValidator = 4,
UnbondValidator = 5,
CreateDelegation = 6,
BeginUnbondingDelegation = 7,
UnbondDelegation = 8,
Burn = 9,
RedelegateCommission = 10,
RedelegateReward = 11,
};
Signed transaction data comes in a number of types:
- Transfer
- MsgPayForData
- CreateValidator
- BeginUnbondingValidator
- UnbondValidator
- CreateDelegation
- BeginUnbondingDelegation
- UnbondDelegation
- Burn
- RedelegateCommission
- RedelegateReward
Common fields are denoted here to avoid repeating descriptions:
name | type | description |
---|---|---|
type | TransactionType | Type of the transaction. Each type indicates a different state transition. |
amount | Amount | Amount of coins to send, in 1u . |
to | Address | Recipient's address. |
fee | TransactionFee | The fee information for this transaction. |
nonce | Nonce | Nonce of sender. |
SignedTransactionDataTransfer
name | type | description |
---|---|---|
type | TransactionType | Must be TransactionType.Transfer . |
amount | Amount | |
to | Address | |
fee | TransactionFee | |
nonce | Nonce |
Transfers amount coins to to.
SignedTransactionDataMsgPayForData
name | type | description |
---|---|---|
type | TransactionType | Must be TransactionType.MsgPayForData . |
fee | TransactionFee | |
nonce | Nonce | |
messageNamespaceID | NamespaceID | Namespace ID of message this transaction pays a fee for. |
messageSize | uint32 | Size of message this transaction pays a fee for, in bytes. |
messageShareCommitment | HashDigest | Commitment to message shares (details below). |
Pays for the inclusion of a message in the same block.
The commitment to message shares messageShareCommitment is a Merkle root of message share roots. Each message share root is a subtree root in a row NMT. For rationale, see the rationale doc.
SignedTransactionDataCreateValidator
name | type | description |
---|---|---|
type | TransactionType | Must be TransactionType.CreateValidator . |
fee | TransactionFee | |
nonce | Nonce | |
commissionRate | Decimal |
Create a new Validator at this address.
SignedTransactionDataBeginUnbondingValidator
name | type | description |
---|---|---|
type | TransactionType | Must be TransactionType.BeginUnbondingValidator . |
fee | TransactionFee | |
nonce | Nonce |
Begin unbonding the Validator at this address.
SignedTransactionDataUnbondValidator
name | type | description |
---|---|---|
type | TransactionType | Must be TransactionType.UnbondValidator . |
fee | TransactionFee | |
nonce | Nonce |
Finish unbonding the Validator at this address.
SignedTransactionDataCreateDelegation
name | type | description |
---|---|---|
type | TransactionType | Must be TransactionType.CreateDelegation . |
amount | Amount | |
to | Address | |
fee | TransactionFee | |
nonce | Nonce |
Create a new Delegation of amount coins worth of voting power for the validator with address to.
SignedTransactionDataBeginUnbondingDelegation
name | type | description |
---|---|---|
type | TransactionType | Must be TransactionType.BeginUnbondingDelegation . |
fee | TransactionFee | |
nonce | Nonce |
Begin unbonding the Delegation at this address.
SignedTransactionDataUnbondDelegation
name | type | description |
---|---|---|
type | TransactionType | Must be TransactionType.UnbondDelegation . |
fee | TransactionFee | |
nonce | Nonce |
Finish unbonding the Delegation at this address.
SignedTransactionDataBurn
name | type | description |
---|---|---|
type | TransactionType | Must be TransactionType.Burn . |
amount | Amount | |
fee | TransactionFee | |
nonce | Nonce | |
graffiti | Graffiti | Graffiti to indicate the reason for burning. |
SignedTransactionRedelegateCommission
name | type | description |
---|---|---|
type | TransactionType | Must be TransactionType.RedelegateCommission . |
to | Address | |
fee | TransactionFee | |
nonce | Nonce |
Assigns validator's pending commission to a delegation.
SignedTransactionRedelegateReward
name | type | description |
---|---|---|
type | TransactionType | Must be TransactionType.RedelegateReward . |
fee | TransactionFee | |
nonce | Nonce |
Adds delegation's pending rewards to voting power.
PayForBlobData
IntermediateStateRootData
name | type | description |
---|---|---|
wrappedIntermediateStateRoots | WrappedIntermediateStateRoot[] | List of wrapped intermediate state roots. |
WrappedIntermediateStateRoot
name | type | description |
---|---|---|
index | uint64 | Index of this intermediate state root in the list of intermediate state roots. This information is lost when splitting intermediate state roots into fixed-sized shares, and needs to be re-added here for fraud proof support. Allows linking an intermediate state root to a transaction. |
intermediateStateRoot | IntermediateStateRoot | Intermediate state root. Used for fraud proofs. |
IntermediateStateRoot
name | type | description |
---|---|---|
root | HashDigest | Root of intermediate state, which is composed of the global state and the validator set. |
MessageData
name | type | description |
---|---|---|
messages | Message[] | List of messages. |
Message
name | type | description |
---|---|---|
namespaceID | NamespaceID | Namespace ID of this message. |
rawData | byte[] | Raw message bytes. |
State
The state of the Celestia chain is intentionally restricted to containing only account balances and the validator set metadata. One unified Sparse Merkle Tree is maintained for the entire chain state, the state tree. The root of this tree is committed to in the block header.
The state tree is separated into 2**(8*STATE_SUBTREE_RESERVED_BYTES) subtrees, each of which can be used to store a different component of the state. This is done by slicing off the highest STATE_SUBTREE_RESERVED_BYTES bytes from the key and replacing them with the appropriate reserved state subtree ID. Reducing the key size within subtrees also reduces the collision resistance of keys by 8*STATE_SUBTREE_RESERVED_BYTES bits, but this is not an issue due to the number of bits removed being small.
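A minimal Go sketch of deriving a state-tree key under this scheme with STATE_SUBTREE_RESERVED_BYTES = 1; the function name is an illustrative assumption:

```go
import "crypto/sha256"

// stateKey hashes a raw key and replaces its highest byte with the reserved
// subtree ID, e.g. stateKey(ACCOUNTS_SUBTREE_ID, address) for an account leaf.
// Illustrative sketch assuming STATE_SUBTREE_RESERVED_BYTES = 1.
func stateKey(subtreeID byte, rawKey []byte) [32]byte {
	k := sha256.Sum256(rawKey)
	k[0] = subtreeID
	return k
}
```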
A number of subtrees are maintained:
StateElement
Data structure for state elements is given below:
name | type | description |
---|---|---|
key | byte[32] | Keys are byte arrays with size 32. |
value | Account, Delegation, Validator, ActiveValidatorCount, ActiveVotingPower, ProposerBlockReward, ProposerInitialVotingPower, ValidatorQueueHead, MessagePaidHead | value can be of different types depending on the state elements listed below. There exists a unique protobuf for different state elements. |
Account
enum AccountStatus : uint8_t {
None = 1,
DelegationBonded = 2,
DelegationUnbonding = 3,
ValidatorQueued = 4,
ValidatorBonded = 5,
ValidatorUnbonding = 6,
ValidatorUnbonded = 7,
};
name | type | description |
---|---|---|
balance | Amount | Coin balance. |
nonce | Nonce | Account nonce. Every outgoing transaction from this account increments the nonce. |
status | AccountStatus | Validator or delegation status of this account. |
The status of an account indicates whether it is a validator (AccountStatus.Validator*), delegating to a validator (AccountStatus.Delegation*), or neither (AccountStatus.None). Being a validator and delegating are mutually exclusive, and only a single validator can be delegated to.
Delegations have two statuses:
- DelegationBonded: This delegation is enabled for a Queued or Bonded validator. Delegations to a Queued validator can be withdrawn immediately, while delegations for a Bonded validator must be unbonded first.
- DelegationUnbonding: This delegation is unbonding. It will remain in this status for at least UNBONDING_DURATION blocks, and while unbonding may still be slashed. Once the unbonding duration has expired, the delegation can be withdrawn.
Validators have four statuses:
- ValidatorQueued: This validator has entered the queue to become an active validator. Once the next validator set transition occurs, if this validator has sufficient voting power (including its own stake and stake delegated to it) to be in the top MAX_VALIDATORS validators by voting power, it will become an active, i.e. ValidatorBonded, validator. Until bonded, this validator can immediately exit the queue.
- ValidatorBonded: This validator is active and bonded. It can propose new blocks and vote on proposed blocks. Once bonded, an active validator must go through an unbonding process until its stake can be freed.
- ValidatorUnbonding: This validator is in the process of unbonding, which can be voluntary (the validator decided to stop being an active validator) or forced (the validator committed a slashable offence and was kicked from the active validator set). Validators will remain in this status for at least UNBONDING_DURATION blocks, and while unbonding may still be slashed.
- ValidatorUnbonded: This validator has completed its unbonding and has withdrawn its stake. The validator object will remain in this status until delegatedCount reaches zero, at which point it is destroyed.
In the accounts subtree, accounts (i.e. leaves) are keyed by the hash of their address. The first byte is then replaced with ACCOUNTS_SUBTREE_ID.
Delegation
name | type | description |
---|---|---|
validator | Address | The validator being delegated to. |
stakedBalance | VotingPower | Delegated stake, in 4u . |
beginEntry | PeriodEntry | Entry when delegation began. |
endEntry | PeriodEntry | Entry when delegation ended (i.e. began unbonding). |
unbondingHeight | Height | Block height delegation began unbonding. |
Delegation objects represent a delegation.
In the delegation subtree, delegations are keyed by the hash of their address. The first byte is then replaced with DELEGATIONS_SUBTREE_ID.
Validator
name | type | description |
---|---|---|
commissionRewards | uint64 | Validator's commission rewards, in 1u . |
commissionRate | Decimal | Commission rate. |
delegatedCount | uint32 | Number of accounts delegating to this validator. |
votingPower | VotingPower | Total voting power as staked balance + delegated stake, in 4u . |
pendingRewards | Amount | Rewards collected so far this period, in 1u . |
latestEntry | PeriodEntry | Latest entry, used for calculating reward distribution. |
unbondingHeight | Height | Block height validator began unbonding. |
isSlashed | bool | If this validator has been slashed or not. |
slashRate | Decimal | Optional, only if isSlashed is set. Rate at which this validator has been slashed. |
next | Address | Next validator in the queue. Zero if this validator is not in the queue. |
Validator objects represent all the information needed to keep track of a validator.
In the validators subtrees, validators are keyed by the hash of their address. The first byte is then replaced with ACTIVE_VALIDATORS_SUBTREE_ID for the active validator set or INACTIVE_VALIDATORS_SUBTREE_ID for the inactive validator set. Active validators are bonded (i.e. ValidatorBonded), while inactive validators are not bonded (i.e. ValidatorUnbonded). By construction, the validators subtrees will be a subset of a mirror of the accounts subtree.
The validator queue (i.e. validators with status ValidatorQueued) is a subset of the inactive validator set. This queue is represented as a linked list, with each validator pointing to the next validator in the queue, and the head of the linked list stored in ValidatorQueueHead.
ActiveValidatorCount
name | type | description |
---|---|---|
numValidators | uint32 | Number of active validators. |
Since the active validator set is stored in a Sparse Merkle Tree, there is no compact way of proving that the number of active validators exceeds MAX_VALIDATORS without keeping track of the number of active validators. The active validator count is stored in the active validators subtree, and is keyed with 0 (i.e. 0x0000000000000000000000000000000000000000000000000000000000000000), with the first byte replaced with ACTIVE_VALIDATORS_SUBTREE_ID.
ActiveVotingPower
name | type | description |
---|---|---|
votingPower | uint64 | Active voting power. |
Since the active validator set is stored in a Sparse Merkle Tree, there is no compact way of proving the active voting power. The active voting power is stored in the active validators subtree, and is keyed with 1 (i.e. 0x0000000000000000000000000000000000000000000000000000000000000001), with the first byte replaced with ACTIVE_VALIDATORS_SUBTREE_ID.
ProposerBlockReward
name | type | description |
---|---|---|
reward | uint64 | Total block reward (subsidy + fees) in current block so far. Reset each block. |
The current block reward for the proposer is kept track of here. This is keyed with 2 (i.e. 0x0000000000000000000000000000000000000000000000000000000000000002), with the first byte replaced with ACTIVE_VALIDATORS_SUBTREE_ID.
ProposerInitialVotingPower
name | type | description |
---|---|---|
votingPower | uint64 | Voting power of the proposer at the start of each block. Set each block. |
The proposer's voting power at the beginning of the block is kept track of here. This is keyed with 3 (i.e. 0x0000000000000000000000000000000000000000000000000000000000000003), with the first byte replaced with ACTIVE_VALIDATORS_SUBTREE_ID.
ValidatorQueueHead
name | type | description |
---|---|---|
head | Address | Address of inactive validator at the head of the validator queue. |
The head of the queue for validators that are waiting to become active validators is stored in the inactive validators subtree, and is keyed with 0 (i.e. 0x0000000000000000000000000000000000000000000000000000000000000000), with the first byte replaced with INACTIVE_VALIDATORS_SUBTREE_ID.
If the queue is empty, head is set to the default value (i.e. the hash of the leaf is the default value for a Sparse Merkle Tree).
PeriodEntry
name | type | description |
---|---|---|
rewardRate | Amount | Rewards per unit of voting power accumulated so far, in 1u . |
Decimal
name | type | description |
---|---|---|
numerator | uint64 | Rational numerator. |
denominator | uint64 | Rational denominator. |
Represents a (potentially) non-integer number.
MessagePaid
name | type | description |
---|---|---|
start | uint64 | Share index (in row-major order) of first share paid for (inclusive). |
finish | uint64 | Share index (in row-major order) of last share paid for (inclusive). |
next | HashDigest | Next transaction ID in the list. |
MessagePaidHead
name | type | description |
---|---|---|
head | HashDigest | Transaction hash at the head of the list (has the smallest start index). |
The head of the list of paid message shares is stored in the message share paid subtree, and is keyed with 0 (i.e. 0x0000000000000000000000000000000000000000000000000000000000000000), with the first byte replaced with MESSAGE_PAID_SUBTREE_ID.
If the paid list is empty, head is set to the default value (i.e. the hash of the leaf is the default value for a Sparse Merkle Tree).
Consensus Parameters
Various consensus parameters are committed to in the block header, such as limits and constants.
name | type | description |
---|---|---|
version | ConsensusVersion | The consensus version struct. |
chainID | string | The CHAIN_ID . |
shareSize | uint64 | The SHARE_SIZE . |
shareReservedBytes | uint64 | The SHARE_RESERVED_BYTES . |
availableDataOriginalSquareMax | uint64 | The AVAILABLE_DATA_ORIGINAL_SQUARE_MAX . |
In order to compute the consensusHash field in the block header, the above list of parameters is hashed.
Consensus Rules
System Parameters
Units
name | SI | value | description |
---|---|---|---|
1u | 1u | 10**0 | 1 unit. |
2u | k1u | 10**3 | 1000 units. |
3u | M1u | 10**6 | 1000000 units. |
4u | G1u | 10**9 | 1000000000 units. |
Constants
name | type | value | unit | description |
---|---|---|---|---|
AVAILABLE_DATA_ORIGINAL_SQUARE_MAX | uint64 | share | Maximum number of rows/columns of the original data shares in square layout. | |
AVAILABLE_DATA_ORIGINAL_SQUARE_TARGET | uint64 | share | Target number of rows/columns of the original data shares in square layout. | |
BLOCK_TIME | uint64 | second | Block time, in seconds. | |
CHAIN_ID | string | "Celestia" | Chain ID. Each chain assigns itself a (unique) ID. | |
GENESIS_COIN_COUNT | uint64 | 10**8 | 4u | (= 100000000) Number of coins at genesis. |
MAX_GRAFFITI_BYTES | uint64 | 32 | byte | Maximum size of transaction graffiti, in bytes. |
MAX_VALIDATORS | uint16 | 64 | Maximum number of active validators. | |
NAMESPACE_ID_BYTES | uint64 | 8 | byte | Size of namespace ID, in bytes. |
NAMESPACE_ID_MAX_RESERVED | uint64 | 255 | Value of maximum reserved namespace ID (inclusive). 1 byte worth of IDs. | |
SEQUENCE_BYTES | uint64 | 4 | byte | The number of bytes used to store the sequence length in the first share of a sequence |
SHARE_INFO_BYTES | uint64 | 1 | byte | The number of bytes used for share information |
SHARE_RESERVED_BYTES | uint64 | 4 | byte | The number of bytes used to store the location of the first unit in a compact share. Must be able to represent any integer up to and including SHARE_SIZE - 1 . |
SHARE_SIZE | uint64 | 512 | byte | Size of transaction and message shares, in bytes. |
STATE_SUBTREE_RESERVED_BYTES | uint64 | 1 | byte | Number of bytes reserved to identify state subtrees. |
UNBONDING_DURATION | uint32 | block | Duration, in blocks, for unbonding a validator or delegation. | |
VERSION_APP | uint64 | 1 | Version of the Celestia application. Breaking changes (hard forks) must update this parameter. | |
VERSION_BLOCK | uint64 | 1 | Version of the Celestia chain. Breaking changes (hard forks) must update this parameter. |
Reserved Namespace IDs
name | type | value | description |
---|---|---|---|
TRANSACTION_NAMESPACE_ID | NamespaceID | 0x0000000000000001 | Transactions: requests that modify the state. |
INTERMEDIATE_STATE_ROOT_NAMESPACE_ID | NamespaceID | 0x0000000000000002 | Intermediate state roots, committed after every transaction. |
EVIDENCE_NAMESPACE_ID | NamespaceID | 0x0000000000000003 | Evidence: fraud proofs or other proof of slashable action. |
RESERVED_PADDING_NAMESPACE_ID | NamespaceID | 0x00000000000000FF | Padding after all reserved namespaces but before blobs. |
TAIL_PADDING_NAMESPACE_ID | NamespaceID | 0xFFFFFFFFFFFFFFFE | Tail padding for messages: padding after all messages to fill up the original data square. |
PARITY_SHARE_NAMESPACE_ID | NamespaceID | 0xFFFFFFFFFFFFFFFF | Parity shares: extended shares in the available data matrix. |
Reserved State Subtree IDs
name | type | value |
---|---|---|
ACCOUNTS_SUBTREE_ID | StateSubtreeID | 0x01 |
ACTIVE_VALIDATORS_SUBTREE_ID | StateSubtreeID | 0x02 |
INACTIVE_VALIDATORS_SUBTREE_ID | StateSubtreeID | 0x03 |
DELEGATIONS_SUBTREE_ID | StateSubtreeID | 0x04 |
MESSAGE_PAID_SUBTREE_ID | StateSubtreeID | 0x05 |
Rewards and Penalties
name | type | value | unit | description |
---|---|---|---|---|
SECONDS_PER_YEAR | uint64 | 31536000 | second | Seconds per year. Omit leap seconds. |
TARGET_ANNUAL_ISSUANCE | uint64 | 2 * 10**6 | 4u | (= 2000000) Target number of coins to issue per year. |
Leader Selection
TODO
Fork Choice
The Tendermint consensus protocol is fork-free by construction under an honest majority of stake assumption.
If a block has a valid commit, it is part of the canonical chain. If equivocation evidence is detected for more than 1/3 of voting power, the node must halt.
Block Validity
The validity of a newly-seen block, block, is determined by two components, detailed in subsequent sections:
- Block structure: whether the block header is valid, and data in a block is arranged into a valid and matching data root (i.e. syntax).
- State transition: whether the application of transactions in the block produces a matching and valid state root (i.e. semantics).
Pseudocode in this section is not in any specific language and should be interpreted as being in a neutral and sane language.
Block Structure
Before executing state transitions, the structure of the block must be verified.
The following block fields are acquired from the network and parsed (i.e. deserialized). If they cannot be parsed, the block is ignored but is not explicitly considered invalid by consensus rules. Further implications of ignoring a block are found in the networking spec.
If the above fields are parsed successfully, the available data block.availableData is acquired in erasure-coded form as a list of share rows, then parsed. If it cannot be parsed, the block is ignored but not explicitly invalid, as above.
block.header
The block header block.header (header for short) is the first thing that is downloaded from the new block, and commits to everything inside the block in some way. For previous block prev (if prev is not known, then the block is ignored), and previous block header prev.header, the following checks must be true:
availableDataOriginalSquareSize is computed as described here.
- header.height == prev.header.height + 1.
- header.timestamp > prev.header.timestamp.
- header.lastHeaderHash == the header hash of prev.
- header.lastCommitHash == the hash of lastCommit.
- header.consensusHash == the value computed here.
- header.stateCommitment == the root of the state, computed with the application of all state transitions in this block.
- availableDataOriginalSquareSize <= AVAILABLE_DATA_ORIGINAL_SQUARE_MAX.
- header.availableDataRoot == the Merkle root of the tree with the row and column roots of block.availableDataHeader as leaves.
- header.proposerAddress == the leader for header.height.
block.availableDataHeader
The available data header block.availableDataHeader (availableDataHeader for short) is then processed. This commits to the available data, which is only downloaded after the consensus commit is processed. The following checks must be true:
- Length of availableDataHeader.rowRoots == availableDataOriginalSquareSize * 2.
- Length of availableDataHeader.colRoots == availableDataOriginalSquareSize * 2.
- The length of each element in availableDataHeader.rowRoots and availableDataHeader.colRoots must be 32.
block.lastCommit
The last commit block.lastCommit (lastCommit for short) is processed next. This is the Tendermint commit (i.e. polka of votes) for the previous block. For previous block prev and previous block header prev.header, the following checks must be true:
- lastCommit.height == prev.header.height.
- lastCommit.round >= 1.
- lastCommit.headerHash == the header hash of prev.
- Length of lastCommit.signatures <= MAX_VALIDATORS.
- Each of lastCommit.signatures must be a valid CommitSig.
- The sum of the votes for prev in lastCommit must be at least 2/3 (rounded up) of the voting power of prev's next validator set.
block.availableData
The block's available data (analogous to transactions in contemporary blockchain designs) block.availableData (availableData for short) is finally processed. The list of share rows is parsed into the actual data structures using the reverse of the process to encode available data into shares; if parsing fails here, the block is invalid.
Once parsed, the following checks must be true:
- The commitments of the erasure-coded extended availableData must match those in header.availableDataHeader. Implicitly, this means that both rows and columns must be ordered lexicographically by namespace ID since they are committed to in a Namespace Merkle Tree.
- Length of availableData.intermediateStateRootData == length of availableData.transactionData + length of availableData.payForBlobData + 2. (Two additional state transitions are the begin and end block implicit transitions.)
State Transitions
Once the basic structure of the block has been validated, state transitions must be applied to compute the new state and state root.
For this section, the variable state represents the state tree, with state.accounts[k], state.inactiveValidatorSet[k], state.activeValidatorSet[k], and state.delegationSet[k] being shorthand for the leaf in the state tree in the accounts, inactive validator set, active validator set, and delegation set subtrees with pre-hashed key k. E.g. state.accounts[a] is shorthand for state[(ACCOUNTS_SUBTREE_ID << 8*(32-STATE_SUBTREE_RESERVED_BYTES)) | ((-1 >> 8*STATE_SUBTREE_RESERVED_BYTES) & hash(a))].
State transitions are applied in the following order:
block.availableData.transactionData
Transactions are applied to the state. Note that transactions mutate the state (essentially, the validator set and minimal balances), while messages do not.
block.availableData.transactionData is simply a list of WrappedTransactions. For each wrapped transaction in this list, wrappedTransaction, with index i (starting from 0), the following checks must be true:
- wrappedTransaction.index == i.
For wrappedTransaction's transaction transaction, the following checks must be true:
- transaction.signature must be a valid signature over transaction.signedTransactionData.
Finally, each wrappedTransaction is processed depending on its transaction type. These are specified in the next subsections, where tx is short for transaction.signedTransactionData, and sender is the recovered signing address. We will define a few helper functions:
tipCost(y, z) = y * z
totalCost(x, y, z) = x + tipCost(y, z)
where x above is the amount of coins sent by the transaction authorizer, y above is the tip rate set in the transaction, and z above is the measure of the block space used by the transaction (i.e. size in bytes).
Four additional helper functions are defined to manage the validator queue:
- findFromQueue(power), which returns the address of the last validator in the validator queue with voting power greater than or equal to power, or 0 if the queue is empty or no validators in the queue have at least power voting power.
- parentFromQueue(address), which returns the address of the parent in the validator queue of the validator with address address, or 0 if address is not in the queue or is the head of the queue.
- validatorQueueInsert, defined as
function validatorQueueInsert(validator)
# Insert the new validator into the linked list
parent = findFromQueue(validator.votingPower)
if parent != 0
if state.accounts[parent].status == AccountStatus.ValidatorBonded
validator.next = state.activeValidatorSet[parent].next
state.activeValidatorSet[parent].next = sender
else
validator.next = state.inactiveValidatorSet[parent].next
state.inactiveValidatorSet[parent].next = sender
else
validator.next = state.validatorQueueHead
state.validatorQueueHead = sender
- validatorQueueRemove, defined as
function validatorQueueRemove(validator, sender)
# Remove existing validator from the linked list
parent = parentFromQueue(sender)
if parent != 0
if state.accounts[parent].status == AccountStatus.ValidatorBonded
state.activeValidatorSet[parent].next = validator.next
validator.next = 0
else
state.inactiveValidatorSet[parent].next = validator.next
validator.next = 0
else
state.validatorQueueHead = validator.next
validator.next = 0
Note that light clients cannot perform a linear search through a linked list, and are instead provided logarithmic proofs (e.g. in the case of parentFromQueue
, a proof to the parent is provided, which should have address
as its next validator).
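As an informal sketch of how a full node could evaluate findFromQueue and parentFromQueue by walking the queue; the in-memory representation, example addresses, and function names below are toy stand-ins (in the state tree, entries live in the active and inactive validator subtrees).

```python
from dataclasses import dataclass

@dataclass
class QueuedValidator:
    votingPower: int
    next: int  # address of the next validator in the queue, or 0

def find_from_queue(queue: dict, head: int, power: int) -> int:
    """Address of the last validator in the queue with votingPower >= power, or 0."""
    best, current = 0, head
    while current != 0 and queue[current].votingPower >= power:
        best, current = current, queue[current].next
    return best

def parent_from_queue(queue: dict, head: int, address: int) -> int:
    """Address of the validator whose next pointer is address, or 0 if none (or head)."""
    current = head
    while current != 0:
        if queue[current].next == address:
            return current
        current = queue[current].next
    return 0

# Queue ordered by descending voting power: 0xA (100) -> 0xB (50) -> 0xC (10).
queue = {0xA: QueuedValidator(100, 0xB), 0xB: QueuedValidator(50, 0xC), 0xC: QueuedValidator(10, 0)}
assert find_from_queue(queue, 0xA, 40) == 0xB
assert parent_from_queue(queue, 0xA, 0xC) == 0xB
```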
In addition, three helper functions are defined to manage the message paid list:
- findFromMessagePaidList(start), which returns the transaction ID of the last transaction in the message paid list with finish greater than start, or 0 if the list is empty or no transactions in the list have a finish of at least start.
- parentFromMessagePaidList(txid), which returns the transaction ID of the parent in the message paid list of the transaction with ID txid, or 0 if txid is not in the list or is the head of the list.
- messagePaidListInsert, defined as
function messagePaidListInsert(tx, txid)
# Insert the new transaction into the linked list
parent = findFromMessagePaidList(tx.messageStartIndex)
state.messagesPaid[txid].start = tx.messageStartIndex
numShares = ceil(tx.messageSize / SHARE_SIZE)
state.messagesPaid[txid].finish = tx.messageStartIndex + numShares - 1
if parent != 0
state.messagesPaid[txid].next = state.messagesPaid[parent].next
state.messagesPaid[parent].next = txid
else
state.messagesPaid[txid].next = state.messagePaidHead
state.messagePaidHead = txid
We define a helper function to compute F1 entries:
function compute_new_entry(reward, power)
if power == 0
return 0
return reward // power
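For intuition, here is a small made-up walkthrough of how these period entries accumulate per-unit-of-voting-power rewards, and how a delegation's payout is later derived from them.

```python
def compute_new_entry(reward: int, power: int) -> int:
    if power == 0:
        return 0
    return reward // power

# A validator with 1000 voting power accrues 5000 in pendingRewards over one period.
latest_entry = 0
latest_entry += compute_new_entry(5000, 1000)   # latestEntry is now 5

# A delegation of 200 bonded at beginEntry = 0 and closed at endEntry = 5 is owed
# stakedBalance * (endEntry - beginEntry) = 200 * 5 = 1000, mirroring the payout
# formula used when a delegation is unbonded below.
begin_entry, end_entry, staked_balance = 0, latest_entry, 200
assert staked_balance * (end_entry - begin_entry) == 1000
```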
After applying a transaction, the new state root is computed.
SignedTransactionDataTransfer
bytesPaid = len(tx)
The following checks must be true:
- tx.type == TransactionType.Transfer.
- totalCost(tx.amount, tx.fee.tipRate, bytesPaid) <= state.accounts[sender].balance.
- tx.nonce == state.accounts[sender].nonce + 1.
Apply the following to the state:
state.accounts[sender].nonce += 1
state.accounts[sender].balance -= totalCost(tx.amount, tx.fee.tipRate, bytesPaid)
state.accounts[tx.to].balance += tx.amount
state.activeValidatorSet.proposerBlockReward += tipCost(tx.fee.tipRate, bytesPaid)
SignedTransactionDataMsgPayForData
bytesPaid = len(tx) + tx.messageSize
currentStartFinish = state.messagesPaid[findFromMessagePaidList(tx.messageStartIndex)]
parentStartFinish = state.messagesPaid[parentFromMessagePaidList(findFromMessagePaidList(tx.messageStartIndex))]
The following checks must be true:
- tx.type == TransactionType.MsgPayForData.
- totalCost(0, tx.fee.tipRate, bytesPaid) <= state.accounts[sender].balance.
- tx.nonce == state.accounts[sender].nonce + 1.
- The ceil(tx.messageSize / SHARE_SIZE) shares starting at index tx.messageStartIndex must have namespace ID tx.messageNamespaceID.
- tx.messageShareCommitment == computed as described here.
- parentStartFinish.finish < tx.messageStartIndex.
- currentStartFinish.start == 0 or currentStartFinish.start > tx.messageStartIndex + ceil(tx.messageSize / SHARE_SIZE).
Apply the following to the state:
state.accounts[sender].nonce += 1
state.accounts[sender].balance -= totalCost(0, tx.fee.tipRate, bytesPaid)
messagePaidListInsert(tx, id(tx))
state.activeValidatorSet.proposerBlockReward += tipCost(tx.fee.tipRate, bytesPaid)
SignedTransactionDataCreateValidator
bytesPaid = len(tx)
The following checks must be true:
- tx.type == TransactionType.CreateValidator.
- totalCost(0, tx.fee.tipRate, bytesPaid) <= state.accounts[sender].balance.
- tx.nonce == state.accounts[sender].nonce + 1.
- tx.commissionRate.denominator > 0.
- tx.commissionRate.numerator <= tx.commissionRate.denominator.
- state.accounts[sender].status == AccountStatus.None.
Apply the following to the state:
state.accounts[sender].nonce += 1
state.accounts[sender].balance -= totalCost(0, tx.fee.tipRate, bytesPaid)
state.accounts[sender].status = AccountStatus.ValidatorQueued
validator = new Validator
validator.commissionRate = tx.commissionRate
validator.delegatedCount = 0
validator.votingPower = 0
validator.pendingRewards = 0
validator.latestEntry = PeriodEntry(0)
validator.unbondingHeight = 0
validator.isSlashed = false
validatorQueueInsert(validator)
state.inactiveValidatorSet[sender] = validator
state.activeValidatorSet.proposerBlockReward += tipCost(tx.fee.tipRate, bytesPaid)
SignedTransactionDataBeginUnbondingValidator
bytesPaid = len(tx)
The following checks must be true:
- tx.type == TransactionType.BeginUnbondingValidator.
- totalCost(0, tx.fee.tipRate, bytesPaid) <= state.accounts[sender].balance.
- tx.nonce == state.accounts[sender].nonce + 1.
- state.accounts[sender].status == AccountStatus.ValidatorQueued or state.accounts[sender].status == AccountStatus.ValidatorBonded.
Apply the following to the state:
state.accounts[sender].nonce += 1
state.accounts[sender].balance -= totalCost(0, tx.fee.tipRate, bytesPaid)
if state.accounts[sender].status == AccountStatus.ValidatorQueued
    validator = state.inactiveValidatorSet[sender]
else if state.accounts[sender].status == AccountStatus.ValidatorBonded
    validator = state.activeValidatorSet[sender]
    delete state.activeValidatorSet[sender]
state.accounts[sender].status = AccountStatus.ValidatorUnbonding
validator.unbondingHeight = block.height + 1
validator.latestEntry += compute_new_entry(validator.pendingRewards, validator.votingPower)
validator.pendingRewards = 0
validatorQueueRemove(validator, sender)
state.inactiveValidatorSet[sender] = validator
state.activeValidatorSet.activeVotingPower -= validator.votingPower
state.activeValidatorSet.proposerBlockReward += tipCost(tx.fee.tipRate, bytesPaid)
SignedTransactionDataUnbondValidator
bytesPaid = len(tx)
The following checks must be true:
- tx.type == TransactionType.UnbondValidator.
- totalCost(0, tx.fee.tipRate, bytesPaid) <= state.accounts[sender].balance.
- tx.nonce == state.accounts[sender].nonce + 1.
- state.accounts[sender].status == AccountStatus.ValidatorUnbonding.
- state.inactiveValidatorSet[sender].unbondingHeight + UNBONDING_DURATION < block.height.
Apply the following to the state:
validator = state.inactiveValidatorSet[sender]
state.accounts[sender].nonce += 1
state.accounts[sender].balance -= totalCost(0, tx.fee.tipRate, bytesPaid)
state.accounts[sender].status = AccountStatus.ValidatorUnbonded
state.accounts[sender].balance += validator.commissionRewards
state.inactiveValidatorSet[sender] = validator
if validator.delegatedCount == 0
state.accounts[sender].status = AccountStatus.None
delete state.inactiveValidatorSet[sender]
state.activeValidatorSet.proposerBlockReward += tipCost(tx.fee.tipRate, bytesPaid)
SignedTransactionDataCreateDelegation
bytesPaid = len(tx)
The following checks must be true:
- tx.type == TransactionType.CreateDelegation.
- totalCost(tx.amount, tx.fee.tipRate, bytesPaid) <= state.accounts[sender].balance.
- state.accounts[tx.to].status == AccountStatus.ValidatorQueued or state.accounts[tx.to].status == AccountStatus.ValidatorBonded.
- tx.nonce == state.accounts[sender].nonce + 1.
- state.accounts[sender].status == AccountStatus.None.
Apply the following to the state:
state.accounts[sender].nonce += 1
state.accounts[sender].balance -= totalCost(tx.amount, tx.fee.tipRate, bytesPaid)
state.accounts[sender].status = AccountStatus.DelegationBonded
if state.accounts[tx.to].status == AccountStatus.ValidatorQueued
validator = state.inactiveValidatorSet[tx.to]
else if state.accounts[tx.to].status == AccountStatus.ValidatorBonded
validator = state.activeValidatorSet[tx.to]
delegation = new Delegation
delegation.status = DelegationStatus.Bonded
delegation.validator = tx.to
delegation.stakedBalance = tx.amount
delegation.beginEntry = validator.latestEntry
delegation.endEntry = PeriodEntry(0)
delegation.unbondingHeight = 0
validator.latestEntry += compute_new_entry(validator.pendingRewards, validator.votingPower)
validator.pendingRewards = 0
validator.delegatedCount += 1
validator.votingPower += tx.amount
# Update the validator in the linked list by first removing then inserting
validatorQueueRemove(validator, delegation.validator)
validatorQueueInsert(validator)
state.delegationSet[sender] = delegation
if state.accounts[tx.to].status == AccountStatus.ValidatorQueued
state.inactiveValidatorSet[tx.to] = validator
else if state.accounts[tx.to].status == AccountStatus.ValidatorBonded
state.activeValidatorSet[tx.to] = validator
state.activeValidatorSet.activeVotingPower += tx.amount
state.activeValidatorSet.proposerBlockReward += tipCost(tx.fee.tipRate, bytesPaid)
SignedTransactionDataBeginUnbondingDelegation
bytesPaid = len(tx)
The following checks must be true:
- tx.type == TransactionType.BeginUnbondingDelegation.
- totalCost(0, tx.fee.tipRate, bytesPaid) <= state.accounts[sender].balance.
- tx.nonce == state.accounts[sender].nonce + 1.
- state.accounts[sender].status == AccountStatus.DelegationBonded.
Apply the following to the state:
state.accounts[sender].nonce += 1
state.accounts[sender].balance -= totalCost(0, tx.fee.tipRate, bytesPaid)
state.accounts[sender].status = AccountStatus.DelegationUnbonding
delegation = state.delegationSet[sender]
if state.accounts[delegation.validator].status == AccountStatus.ValidatorQueued ||
state.accounts[delegation.validator].status == AccountStatus.ValidatorUnbonding ||
state.accounts[delegation.validator].status == AccountStatus.ValidatorUnbonded
validator = state.inactiveValidatorSet[delegation.validator]
else if state.accounts[delegation.validator].status == AccountStatus.ValidatorBonded
validator = state.activeValidatorSet[delegation.validator]
delegation.status = DelegationStatus.Unbonding
delegation.endEntry = validator.latestEntry
delegation.unbondingHeight = block.height + 1
validator.latestEntry += compute_new_entry(validator.pendingRewards, validator.votingPower)
validator.pendingRewards = 0
validator.delegatedCount -= 1
validator.votingPower -= delegation.stakedBalance
# Update the validator in the linked list by first removing then inserting
# Only do this if the validator is actually in the queue (i.e. bonded or queued)
if state.accounts[delegation.validator].status == AccountStatus.ValidatorBonded ||
state.accounts[delegation.validator].status == AccountStatus.ValidatorQueued
validatorQueueRemove(validator, delegation.validator)
validatorQueueInsert(validator)
state.delegationSet[sender] = delegation
if state.accounts[delegation.validator].status == AccountStatus.ValidatorQueued ||
state.accounts[delegation.validator].status == AccountStatus.ValidatorUnbonding ||
state.accounts[delegation.validator].status == AccountStatus.ValidatorUnbonded
state.inactiveValidatorSet[delegation.validator] = validator
else if state.accounts[delegation.validator].status == AccountStatus.ValidatorBonded
state.activeValidatorSet[delegation.validator] = validator
state.activeValidatorSet.activeVotingPower -= delegation.stakedBalance
state.activeValidatorSet.proposerBlockReward += tipCost(tx.fee.tipRate, bytesPaid)
SignedTransactionDataUnbondDelegation
bytesPaid = len(tx)
The following checks must be true:
- tx.type == TransactionType.UnbondDelegation.
- totalCost(0, tx.fee.tipRate, bytesPaid) <= state.accounts[sender].balance.
- tx.nonce == state.accounts[sender].nonce + 1.
- state.accounts[sender].status == AccountStatus.DelegationUnbonding.
- state.delegationSet[sender].unbondingHeight + UNBONDING_DURATION < block.height.
Apply the following to the state:
delegation = state.delegationSet[sender]
state.accounts[sender].nonce += 1
state.accounts[sender].balance -= totalCost(0, tx.fee.tipRate, bytesPaid)
state.accounts[sender].status = AccountStatus.None
# Return the delegated stake
state.accounts[sender].balance += delegation.stakedBalance
# Also disperse rewards (commission has already been levied)
state.accounts[sender].balance += delegation.stakedBalance * (delegation.endEntry - delegation.beginEntry)
if state.accounts[delegation.validator].status == AccountStatus.ValidatorQueued ||
state.accounts[delegation.validator].status == AccountStatus.ValidatorUnbonding ||
state.accounts[delegation.validator].status == AccountStatus.ValidatorUnbonded
validator = state.inactiveValidatorSet[delegation.validator]
else if state.accounts[delegation.validator].status == AccountStatus.ValidatorBonded
validator = state.activeValidatorSet[delegation.validator]
if validator.delegatedCount == 0 &&
state.accounts[delegation.validator].status == AccountStatus.ValidatorUnbonded
state.accounts[delegation.validator].status = AccountStatus.None
delete state.inactiveValidatorSet[delegation.validator]
delete state.delegationSet[sender]
state.activeValidatorSet.proposerBlockReward += tipCost(tx.fee.tipRate, bytesPaid)
SignedTransactionDataBurn
bytesPaid = len(tx)
The following checks must be true:
- tx.type == TransactionType.Burn.
- totalCost(tx.amount, tx.fee.tipRate, bytesPaid) <= state.accounts[sender].balance.
- tx.nonce == state.accounts[sender].nonce + 1.
Apply the following to the state:
state.accounts[sender].nonce += 1
state.accounts[sender].balance -= totalCost(tx.amount, tx.fee.tipRate, bytesPaid)
state.activeValidatorSet.proposerBlockReward += tipCost(tx.fee.tipRate, bytesPaid)
SignedTransactionRedelegateCommission
bytesPaid = len(tx)
The following checks must be true:
- tx.type == TransactionType.RedelegateCommission.
- totalCost(0, tx.fee.tipRate, bytesPaid) <= state.accounts[sender].balance.
- tx.nonce == state.accounts[sender].nonce + 1.
- state.accounts[tx.to].status == AccountStatus.DelegationBonded.
- state.accounts[sender].status == AccountStatus.ValidatorBonded.
Apply the following to the state:
state.accounts[sender].nonce += 1
state.accounts[sender].balance -= totalCost(0, tx.fee.tipRate, bytesPaid)
delegation = state.delegationSet[tx.to]
validator = state.activeValidatorSet[delegation.validator]
# Force-redelegate pending rewards for delegation
pendingRewards = delegation.stakedBalance * (validator.latestEntry - delegation.beginEntry)
delegation.stakedBalance += pendingRewards
delegation.beginEntry = validator.latestEntry
validator.latestEntry += compute_new_entry(validator.pendingRewards, validator.votingPower)
validator.pendingRewards = 0
# Assign pending commission rewards to delegation
commissionRewards = validator.commissionRewards
delegation.stakedBalance += commissionRewards
validator.commissionRewards = 0
# Update voting power
validator.votingPower += pendingRewards + commissionRewards
state.activeValidatorSet.activeVotingPower += pendingRewards + commissionRewards
state.delegationSet[tx.to] = delegation
state.activeValidatorSet[delegation.validator] = validator
state.activeValidatorSet.proposerBlockReward += tipCost(tx.fee.tipRate, bytesPaid)
SignedTransactionRedelegateReward
bytesPaid = len(tx)
The following checks must be true:
- tx.type == TransactionType.RedelegateReward.
- totalCost(0, tx.fee.tipRate, bytesPaid) <= state.accounts[sender].balance.
- tx.nonce == state.accounts[sender].nonce + 1.
- state.accounts[sender].status == AccountStatus.DelegationBonded.
- state.accounts[state.delegationSet[sender].validator].status == AccountStatus.ValidatorBonded.
Apply the following to the state:
state.accounts[sender].nonce += 1
state.accounts[sender].balance -= totalCost(0, tx.fee.tipRate, bytesPaid)
delegation = state.delegationSet[sender]
validator = state.activeValidatorSet[delegation.validator]
# Redelegate pending rewards for delegation
pendingRewards = delegation.stakedBalance * (validator.latestEntry - delegation.beginEntry)
delegation.stakedBalance += pendingRewards
delegation.beginEntry = validator.latestEntry
validator.latestEntry += compute_new_entry(validator.pendingRewards, validator.votingPower)
validator.pendingRewards = 0
# Update voting power
validator.votingPower += pendingRewards
state.activeValidatorSet.activeVotingPower += pendingRewards
state.delegationSet[sender] = delegation
state.activeValidatorSet[delegation.validator] = validator
state.activeValidatorSet.proposerBlockReward += tipCost(tx.fee.tipRate, bytesPaid)
Begin Block
At the beginning of the block, rewards are distributed to the block proposer.
Apply the following to the state:
proposer = state.activeValidatorSet[block.header.proposerAddress]
# Compute block subsidy and save to state for use in end block.
rewardFactor = (TARGET_ANNUAL_ISSUANCE * BLOCK_TIME) / (SECONDS_PER_YEAR * sqrt(GENESIS_COIN_COUNT))
blockReward = rewardFactor * sqrt(state.activeValidatorSet.activeVotingPower)
state.activeValidatorSet.proposerBlockReward = blockReward
# Save proposer's initial voting power to state for use in end block.
state.activeValidatorSet.proposerInitialVotingPower = proposer.votingPower
state.activeValidatorSet[block.header.proposerAddress] = proposer
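A sketch of the subsidy computation with purely hypothetical constants (the real values of TARGET_ANNUAL_ISSUANCE, BLOCK_TIME, SECONDS_PER_YEAR, and GENESIS_COIN_COUNT are protocol parameters defined elsewhere in this spec):

```python
from math import sqrt

# Hypothetical parameter values, for illustration only.
TARGET_ANNUAL_ISSUANCE = 10_000_000
BLOCK_TIME = 15                      # seconds
SECONDS_PER_YEAR = 31_536_000
GENESIS_COIN_COUNT = 1_000_000_000

def block_reward(active_voting_power: int) -> float:
    # With SECONDS_PER_YEAR / BLOCK_TIME blocks per year, annual issuance works out to
    # TARGET_ANNUAL_ISSUANCE * sqrt(activeVotingPower / GENESIS_COIN_COUNT), i.e. the
    # target is reached when all genesis coins are staked.
    reward_factor = (TARGET_ANNUAL_ISSUANCE * BLOCK_TIME) / (SECONDS_PER_YEAR * sqrt(GENESIS_COIN_COUNT))
    return reward_factor * sqrt(active_voting_power)
```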
End Block
Apply the following to the state:
account = state.accounts[block.header.proposerAddress]
if account.status == AccountStatus.ValidatorUnbonding ||
account.status == AccountStatus.ValidatorUnbonded
proposer = state.inactiveValidatorSet[block.header.proposerAddress]
else if account.status == AccountStatus.ValidatorBonded
proposer = state.activeValidatorSet[block.header.proposerAddress]
# Flush the outstanding pending rewards.
proposer.latestEntry += compute_new_entry(proposer.pendingRewards, proposer.votingPower)
proposer.pendingRewards = 0
blockReward = state.activeValidatorSet.proposerBlockReward
commissionReward = proposer.commissionRate.numerator * blockReward // proposer.commissionRate.denominator
proposer.commissionRewards += commissionReward
proposer.pendingRewards += blockReward - commissionReward
# Even though the voting power hasn't changed yet, we consider this a period change.
proposer.latestEntry += compute_new_entry(proposer.pendingRewards, state.activeValidatorSet.proposerInitialVotingPower)
proposer.pendingRewards = 0
if account.status == AccountStatus.ValidatorUnbonding ||
account.status == AccountStatus.ValidatorUnbonded
state.inactiveValidatorSet[block.header.proposerAddress] = proposer
else if account.status == AccountStatus.ValidatorBonded
state.activeValidatorSet[block.header.proposerAddress] = proposer
At the end of a block, the top MAX_VALIDATORS
validators by voting power, among those with voting power greater than zero, are or become active (bonded). For newly-bonded validators, the entire validator object is moved to the active validators subtree and their status is changed to bonded. Previously-bonded validators that are no longer in the top MAX_VALIDATORS
validators begin unbonding.
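A minimal sketch of how this selection could be performed by walking the validator queue (which validatorQueueInsert keeps ordered by descending voting power); the queue representation, example addresses, and function name here are illustrative only.

```python
from collections import namedtuple

QueueEntry = namedtuple("QueueEntry", ["votingPower", "next"])

def select_bonded(queue: dict, head: int, max_validators: int) -> list:
    """Addresses of the top max_validators validators in queue order with voting power > 0."""
    bonded, current = [], head
    while current != 0 and len(bonded) < max_validators:
        if queue[current].votingPower > 0:
            bonded.append(current)
        current = queue[current].next
    return bonded

# Queue 0xA (100) -> 0xB (50) -> 0xC (0) with MAX_VALIDATORS = 2 selects [0xA, 0xB].
queue = {0xA: QueueEntry(100, 0xB), 0xB: QueueEntry(50, 0xC), 0xC: QueueEntry(0, 0)}
assert select_bonded(queue, 0xA, 2) == [0xA, 0xB]
```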
Bonding validators is simply setting their status to AccountStatus.ValidatorBonded
. The logic for validator unbonding is found here, minus transaction sender updates (nonce, balance, and fee).
Finally, the state subtree with ID MESSAGE_PAID_SUBTREE_ID
is deleted.
This end block implicit state transition is a single state transition, and only has a single intermediate state root associated with it.
Honest Block Proposer
This document describes the tasks of an honest block proposer when assembling a new block. Performing these actions is not enforced by the consensus rules; a proposer may deviate from them so long as a valid block is produced.
Deciding on a Block Size
Before arranging available data into shares, the size of the original data's square must be determined.
There are two restrictions on the original data's square size:
- It must be at most AVAILABLE_DATA_ORIGINAL_SQUARE_MAX.
- It must be a power of 2.
With these restrictions in mind, the block proposer performs the following actions:
- Collect as many transactions and messages from the mempool as possible, such that the total number of shares is at most AVAILABLE_DATA_ORIGINAL_SQUARE_MAX ** 2.
- Compute the smallest square size that is a power of 2 that can fit the number of shares.
- Attempt to lay out the collected transactions and messages in the current square.
  - If the square is too small to fit all transactions and messages (which may happen due to needing to insert padding between messages) and the square size is smaller than AVAILABLE_DATA_ORIGINAL_SQUARE_MAX, double the size of the square and repeat the above step.
Note: the maximum padding shares between messages should be at most twice the number of message shares. Doubling the square size (i.e. quadrupling the number of shares in the square) should thus only have to happen at most once.
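A sketch of the square-size computation implied by the steps above; the helper name is ours and the upper-bound check against AVAILABLE_DATA_ORIGINAL_SQUARE_MAX is omitted for brevity.

```python
def minimum_square_size(share_count: int) -> int:
    """Smallest power-of-2 square size whose square can hold share_count shares."""
    size = 1
    while size * size < share_count:
        size *= 2
    return size

# 11 shares need a 4x4 square; 16 shares fit exactly into a 4x4 square.
assert minimum_square_size(11) == 4
assert minimum_square_size(16) == 4
```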
Laying out Transactions and Messages
Networking
Wire Format
AvailableData
name | type | description |
---|---|---|
availableDataRows | AvailableDataRow[] | List of rows. |
AvailableDataRow
name | type | description |
---|---|---|
shares | Share[] | Shares in a row. |
ConsensusProposal
Defined as ConsensusProposal
:
message ConsensusProposal {
SignedMsgType type = 1;
int32 round = 2;
int32 pol_round = 3;
// 32-byte hash
// Proposed block header
Header header = 4;
AvailableDataHeader da_header = 5;
// 64-byte signature
bytes proposer_signature = 6;
}
When receiving a new block proposal proposal
from the network, the following steps are performed in order. Here, must indicates that, if the check fails, peers must be blacklisted (to prevent DoS attacks), and should indicates that, if the check fails, the network message can simply be ignored.
proposal.type
must be aSignedMsgType
.proposal.round
is processed identically to Tendermint.proposal.pol_round
is processed identically to Tendermint.proposal.header
must be well-formed.proposal.header.version.block
must beVERSION_BLOCK
.proposal.header.version.app
must beVERSION_APP
.proposal.header.height
should be previous known height + 1.proposal.header.chain_id
must beCHAIN_ID
.proposal.header.time
is processed identically to Tendermint.proposal.header.last_header_hash
must be previous block's header hash.proposal.header.last_commit_hash
must be the previous block's commit hash.proposal.header.consensus_hash
must be the hash of consensus parameters.proposal.header.state_commitment
must be the state root after applying the previous block's transactions.proposal.header.available_data_original_shares_used
must be at mostAVAILABLE_DATA_ORIGINAL_SQUARE_MAX ** 2
.proposal.header.available_data_root
must be the root ofproposal.da_header
.proposal.header.proposer_address
must be the correct leader.proposal.da_header
must be well-formed.- The number of elements in
proposal.da_header.row_roots
and proposal.da_header.col_roots
must be equal. - The number of elements in
proposal.da_header.row_roots
must be the same as computed here. proposal.proposer_signature
must be a valid digital signature over the header hash ofproposal.header
that recovers toproposal.header.proposer_address
.- For full nodes,
proposal.da_header
must be the result of computing the roots of the shares (received separately). - For light nodes,
proposal.da_header
should be sampled from for availability.
MsgWirePayForData
Defined as MsgWirePayForData
:
message MsgWirePayForData {
TransactionFee fee = 1;
uint64 nonce = 2;
// 8-byte namespace ID
bytes message_namespace_id = 3;
uint64 message_size = 4;
bytes message = 5;
repeated MessageCommitmentAndSignature message_commitment_and_signature = 6;
}
Accepting a MsgWirePayForData
into the mempool requires different logic than other transactions in Celestia, since it leverages the paradigm of block proposers being able to malleate transaction data. Unlike SignedTransactionDataMsgPayForData (the canonical data type that is included in blocks and committed to with a data root in the block header), each MsgWirePayForData
(the over-the-wire representation of the same) has potentially multiple signatures.
Transaction senders who want to pay for a message will create a SignedTransactionDataMsgPayForData object, stx
, filling in the stx.messageShareCommitment
field based on the non-interactive default rules for k = AVAILABLE_DATA_ORIGINAL_SQUARE_MAX
, then signing it to get a transaction tx
. This process is repeated with successively smaller values of k, decreasing by powers of 2, until k * k <= stx.messageSize, at which point there would be insufficient shares to include both the message and transaction. Using the rest of the signed transaction data along with the pairs of (tx.signedTransactionData.messageShareCommitment, tx.signature)
, a MsgWirePayForData
object is constructed.
Receiving a MsgWirePayForData
object from the network follows the reverse process: for each message_commitment_and_signature
, verify using the non-interactive default rules that the signature is valid.
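The sender-side loop might look roughly like the sketch below; commitment_for_k and sign are hypothetical stand-ins for the non-interactive default commitment rules and the signature scheme, and the constant value shown is illustrative.

```python
AVAILABLE_DATA_ORIGINAL_SQUARE_MAX = 128  # illustrative value only

def commitment_for_k(message: bytes, k: int) -> bytes:
    """Placeholder: messageShareCommitment under the non-interactive default rules for square size k."""
    raise NotImplementedError

def sign(signed_transaction_data) -> bytes:
    """Placeholder: signature over the canonical SignedTransactionDataMsgPayForData."""
    raise NotImplementedError

def build_commitment_and_signature_pairs(message: bytes, message_size: int) -> list:
    """One (commitment, signature) pair per candidate square size k, largest k first."""
    pairs = []
    k = AVAILABLE_DATA_ORIGINAL_SQUARE_MAX
    # Decrease k by powers of 2 until k * k <= stx.messageSize, mirroring the rule above.
    while k >= 1 and k * k > message_size:
        commitment = commitment_for_k(message, k)
        pairs.append((commitment, sign(commitment)))
        k //= 2
    return pairs
```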
Invalid Erasure Coding
If a malicious block producer incorrectly computes the 2D Reed-Solomon code for a block's data, a fraud proof for this can be presented. We assume that the light clients have the AvailableDataHeader and the Header for each block. Hence, given a ShareProof, they can verify if the rowRoot
or colRoot
specified by isCol
and position
commits to the corresponding Share. Similarly, given the height
of a block, they can access all elements within the AvailableDataHeader and the Header of the block.
ShareProof
name | type | description |
---|---|---|
share | Share | The share. |
proof | NamespaceMerkleTreeInclusionProof | The Merkle proof of the share in the offending row or column root. |
isCol | bool | A Boolean indicating if the proof is from a row root or column root; false if it is a row root. |
position | uint64 | The index of the share in the offending row or column. |
BadEncodingFraudProof
Defined as BadEncodingFraudProof
:
// https://github.com/celestiaorg/celestia-specs/blob/master/specs/networking.md#badencodingfraudproof
message BadEncodingFraudProof {
// height of the block with the offending row or column
int64 height = 1;
// the available shares in the offending row or column and their Merkle proofs
// array of ShareProofs
repeated ShareProof shareProofs = 2;
// a Boolean indicating if it is an offending row or column; false if it is a row
bool isCol = 3;
// the index of the offending row or column in the square
uint64 position = 4;
}
name | type | description |
---|---|---|
height | Height | Height of the block with the offending row or column. |
shareProofs | ShareProof[] | The available shares in the offending row or column. |
isCol | bool | A Boolean indicating if it is an offending row or column; false if it is a row. |
position | uint64 | The index of the offending row or column in the square. |
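Informally, a light client's verification of such a proof might proceed as in the sketch below: check each ShareProof against the root it identifies, reconstruct and re-erasure-code the offending row or column, and accept the proof only if the recomputed root differs from the committed one. The helpers root_for, verify_share_proof, recover_and_extend, and nmt_root are hypothetical stand-ins for procedures specified elsewhere.

```python
def root_for(share_proof, befp, da_header) -> bytes:
    """Placeholder: the row or column root from the AvailableDataHeader identified by
    the ShareProof's isCol and position fields, per the prose above."""
    raise NotImplementedError

def verify_share_proof(share, merkle_proof, root) -> bool:
    """Placeholder: Namespace Merkle Tree inclusion check of share under root."""
    raise NotImplementedError

def recover_and_extend(share_proofs) -> list:
    """Placeholder: Reed-Solomon recovery and re-extension of the offending row/column."""
    raise NotImplementedError

def nmt_root(shares) -> bytes:
    """Placeholder: Namespace Merkle Tree root over a row or column of shares."""
    raise NotImplementedError

def verify_bad_encoding_proof(befp, da_header) -> bool:
    """True iff befp demonstrates an incorrectly erasure-coded row or column."""
    committed_root = (da_header.colRoots if befp.isCol else da_header.rowRoots)[befp.position]

    # 1. Every provided share must be committed to by the root its ShareProof points at.
    for sp in befp.shareProofs:
        if not verify_share_proof(sp.share, sp.proof, root_for(sp, befp, da_header)):
            return False

    # 2. Reconstruct the offending row/column and re-apply the erasure code; the fraud
    #    proof holds iff the recomputed root differs from the committed one.
    return nmt_root(recover_and_extend(befp.shareProofs)) != committed_root
```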
Invalid State Update
If a malicious block producer incorrectly computes the state, a fraud proof for this can be presented. We assume that the light clients have the AvailableDataHeader and the Header for each block. Hence, given a ShareProof, they can verify if the rowRoot
or colRoot
specified by isCol
and position
commits to the corresponding Share. Similarly, given the height
of a block, they can access all elements within the AvailableDataHeader and the Header of the block.
StateFraudProof
Defined as StateFraudProof
:
// https://github.com/celestiaorg/celestia-specs/blob/master/specs/networking.md#statefraudproof
message StateFraudProof {
// height of the block with the intermediate state roots
// Subtracting one from height gives the height of the block with the transactions.
int64 height = 1;
// shares containing the transactions and their Merkle proofs
// isCol within the ShareProof must be false.
// array of ShareProofs
repeated ShareProof transactionShareProofs = 2;
// shares containing the intermediate state roots and their Merkle proofs
// isCol within the ShareProof must be false.
// array of ShareProofs
repeated ShareProof isrShareProofs = 3;
// index for connecting the WrappedIntermediateStateRoot and WrappedTransaction after shares are parsed
uint64 index = 4;
// state elements that were changed by the transactions
// array of StateElements
repeated StateElement intermediateStateElements = 5;
// sparse Merkle tree inclusion proofs for the state elements
// array of SparseMerkleTreeInclusionProofs
repeated SparseMerkleTreeInclusionProof stateInclusionProofs = 6;
}
name | type | description |
---|---|---|
height | Height | Height of the block with the intermediate state roots. Subtracting one from height gives the height of the block with the transactions. |
transactionShareProofs | ShareProof[] | isCol of type bool must be false . |
isrShareProofs | ShareProof[] | isCol of type bool must be false . |
index | uint64 | Index for connecting the WrappedIntermediateStateRoot and WrappedTransaction after shares are parsed. |
intermediateStateElements | StateElement[] | State elements that were changed by the transactions. |
stateInclusionProofs | SparseMerkleTreeInclusionProof[] | SparseMerkleTree inclusion proofs for the state elements. |
Rationale
Message Layout
Preamble
Celestia uses a data availability scheme that allows nodes to determine whether a block's data was published without downloading the whole block. The core of this scheme is arranging data in a two-dimensional matrix then applying erasure coding to each row and column. This document describes the rationale for how data—transactions, messages, and other data—is actually arranged. Familiarity with the originally proposed data layout format is assumed.
Message Layout Rationale
Block data consists of:
- Cosmos SDK module transactions (e.g. MsgSend). These modify the Celestia chain's state.
- Celestia-specific transactions (e.g. PayForBlobs). These modify the Celestia chain's state.
- Intermediate state roots: required for fraud proofs of the aforementioned transactions.
- Messages: binary blobs which do not modify the Celestia state, but which are intended for a Celestia application identified with a provided namespace ID.
We want to arrange this data into a k * k
matrix of fixed-sized shares, which will later be committed to in Namespace Merkle Trees (NMTs).
The simplest way we can imagine arranging block data is to simply serialize it all in no particular order, split it into fixed-sized shares, then arrange those shares into the k * k
matrix in row-major order. However, this naive scheme can be improved in a number of ways, described below.
First, we impose some ground rules:
- Data must be ordered by namespace ID. This makes queries into a NMT commitment of that data more efficient.
- Since non-message data are not naturally intended for particular namespaces, we assign reserved namespaces for them. A range of namespaces is reserved for this purpose, starting from the lowest possible namespace ID.
- By construction, the above two rules mean that non-message data always precedes message data in the row-major matrix, even when considering single rows or columns.
- Data with different namespaces must not be in the same share. This might cause a small amount of wasted block space, but makes the NMT easier to reason about in general since leaves are guaranteed to belong to a single namespace.
Transactions can pay fees for a message to be included in the same block as the transaction itself. However, we do not want serialized transactions to include the entire message they pay for (which is the case in other blockchains with native execution, e.g. calldata in Ethereum transactions or OP_RETURN data in Bitcoin transactions), otherwise every node that validates the sanctity of the Celestia coin would need to download all message data. Transactions must therefore only include a commitment to (i.e. some hash of) the message they pay fees for. If implemented naively (e.g. with a simple hash of the message, or a simple binary Merkle tree root of the message), this can lead to a data availability problem, as there are no guarantees that the data behind these commitments is actually part of the block data.
To that end, we impose some additional rules onto messages only: messages must be placed in a way such that both the transaction sender and the block producer can be held accountable—a necessary property for e.g. fee burning. Accountable in this context means that
- The transaction sender must pay sufficient fees for message inclusion.
- The block proposer cannot claim that a message was included when it was not (which implies that a transaction and the message it pays for must be included in the same block).
Specifically, messages must begin at a new share, unlike non-message data which can span multiple shares. We note a nice property from this rule: if the transaction sender knows 1) k
, the size of the matrix, 2) the starting location of their message in a row, and 3) the length of the message (they know this since they are sending the message), then they can actually compute a sequence of roots to subtrees in the row NMTs. More importantly, anyone can compute this, and can compute the simple Merkle root of these subtree roots.
This, however, requires the block producer to interact with the transaction sender to provide them the starting location of their message. This can be done selectively, but is not ideal as a default for e.g. end-user wallets.
Non-Interactive Default Rules
As a non-consensus-critical default, we can impose one additional rule on message placement to make the possible starting locations of messages sufficiently predictable and constrained such that users can deterministically compute subtree roots without interaction:
Messages start at an index that is a multiple of the message minimum square size. The message minimum square size is the smallest square that can contain the message in isolation (i.e. a square with only this message and no other transactions or messages).
In the constraint mentioned above, the number of rows/columns in the minimum square size should be a power of 2.
With the above constraint, we can compute subtree roots deterministically. To compute the subtree roots, split the message into chunks whose maximum size is the message minimum square size. As an example, a message of length 11 has a minimum square size of 4, since a 2 * 2 = 4 square is too small to hold it while 11 is not greater than 4 * 4 = 16 total shares. Split the message into chunks of length 4, 4, 2, 1. The resulting slices are the leaves of subtrees whose roots can be computed. These subtree roots will be present as internal nodes in the NMT of some row(s).
This is similar to Merkle Mountain Ranges, though with the largest subtree bounded by the message minimum square size rather than being unbounded.
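A small sketch of the chunk-splitting step in the example above, working purely in share counts (the function name is ours):

```python
def subtree_chunk_sizes(message_shares: int, min_square_size: int) -> list:
    """Split a message into chunks of at most min_square_size shares, then successively
    smaller powers of 2 for the remainder, as in the length-11 example above."""
    sizes, remaining, chunk = [], message_shares, min_square_size
    while remaining > 0:
        while chunk > remaining:
            chunk //= 2
        sizes.append(chunk)
        remaining -= chunk
    return sizes

# A message of 11 shares with minimum square size 4 splits into chunks of 4, 4, 2, 1.
assert subtree_chunk_sizes(11, 4) == [4, 4, 2, 1]
```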
The last piece of the puzzle is determining which row the message is placed at (or, more specifically, the starting location). This is needed to keep the block producer accountable. To this end, the block producer simply augments each fee-paying transaction with some metadata: the starting location of the message the transaction pays for.
Caveats
The message placement rules described above conflict with the first rule that shares must be ordered by namespace ID, as shares between two messages that are not placed adjacent to each other do not have a natural namespace they belong to. This is resolved by requiring that such shares have a value of zero and a namespace ID equal to the preceding message's. Since their value is known, they can be omitted from NMT proofs of all shares of a given namespace ID.