Data availability sampling (DAS) is critical to scaling blockchains while maintaining decentralization [ASB18,HASW23]. In our previous post, we informally introduced DAS and showed how to leverage interaction to reduce overhead for a promising construction based on the FRI protocol. Our newest work, ZODA, short for ‘Zero-Overhead Data Availability,’ takes these ideas a step further. In particular, we show that, with a small change, an extended data square construction, such as that used in Celestia, can be turned into a (validity) proof of its own correctness. Importantly, the change incurs negligible additional work for encoders and no incremental communication costs for the network. This post is an informal introduction to the ZODA protocol.
Basics of encoding. To explain ZODA, we briefly sketch how DAS protocols work today. In production, DAS protocols, including those contemplated by Ethereum and implemented by Celestia, block data is organized into a matrix

Sampling. Light nodes then sample entries from this extended data square,

Encoding correctness. However, if
Fraud-proof construction. The first idea, initially proposed in [ASB18] and now used in the Celestia protocol, is for full nodes to download all of

This approach assumes that each light node is connected to at least one honest full node that can furnish a fraud proof in the event of a bad encoding. While there is zero prover overhead, each full node needs to download the entire data and re-compute the extended data square
2D KZG construction. Another approach, initially proposed in [But20], is to commit to each row and column of
Drawbacks. While both approaches to proving the correctness of
Other systems. Recently, hashing-based proof systems have emerged as a potential alternative for DAS thanks to the works of [HASW23,HASW24]. These constructions feature low prover overhead, don’t require a trusted setup, and are plausibly post-quantum secure. However, they impose extremely high overheads on the network by requiring each light node to redundantly download a (large) non-interactive proof of proximity. In our previous post, we showed how to leverage interaction to reduce this overhead substantially. ZODA takes this idea to its logical extreme: what if we leveraged interaction to the maximum extent possible to reduce the proof overhead to zero?
ZODA. The main idea for ZODA is simple. In fact, ZODA looks almost identical to the fraud-proof-based construction we presented in the first figure, with one small tweak. As before, we start by encoding the columns of

Surprisingly, this tweak makes the encoding provably correct! Specifically, nodes sample random rows and columns and check that these are valid codewords which are also consistent at their intersection. Repeating this procedure for enough rows and columns establishes two properties with high probability. First, the matrix

Figure 1: ZODA verification. A light node verifies that a row and column are valid codewords and are consistent at their intersection (marked with a green checkmark).
Computational overhead. The construction also has essentially zero prover overhead. This may seem counter-intuitive: usually, when considering general computation, going from a fraud-proof to a validity-proof system (for example, switching from an optimistic to a ZK rollup) imposes a few orders of magnitude of prover overhead. Indeed, if we were to prove the construction of
Conclusion. What are the implications in practice? The most obvious points relate to increasing scalability and lowering trust assumptions. It is possible to run ZODA with cryptographic security (soundness
Acknowledgments
We’d like to deeply thank John Adler, Dev Ojha, and Mustafa Al-Bassam for their time and all of the helpful conversations regarding Celestia, data availability sampling, and applications of ZODA to consensus and scaling, which were invaluable in writing this paper. We would also like to thank Kobi Gurkan for helpful edits and suggestions to this post.
Citations
[AHIV17] Scott Ames, Carmit Hazay, Yuval Ishai, and Muthuramakrishnan Venkitasubramaniam. Ligero: Lightweight sublinear arguments without a trusted setup. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 2087–2104, Dallas Texas USA, October 2017. ACM.
[ASB18] Mustafa Al-Bassam, Alberto Sonnino, and Vitalik Buterin. Fraud proofs: Maximising light client security and scaling blockchains with dishonest majorities. CoRR, abs/1809.09044, 2018.
[But20] Vitalik Buterin. 2d data availability with kate commitments, 2020. Accessed: 14 September 2024.
[HASW23] Mathias Hall-Andersen, Mark Simkin, and Benedikt Wagner. Foundations of data availability sampling. Cryptology ePrint Archive, Paper 2023/1079, 2023.
[HASW24] Mathias Hall-Andersen, Mark Simkin, and Benedikt Wagner. FRIDA: Data availability sampling from FRI. Cryptology ePrint Archive, Paper 2024/248, 2024.
[KZG10] Aniket Kate, Gregory M. Zaverucha, and Ian Goldberg. Constant-size commitments to polynomials and their applications. In Masayuki Abe, editor, Advances in Cryptology – ASIACRYPT 2010, pages 177–194, Berlin, Heidelberg, 2010. Springer Berlin Heidelberg.
[Val24] ValarDragon. Use DAS to speedup consensus throughput, 2024. Accessed: 2024-11-09.