v0.4 Herrenberg
Erasure Coding
Where Erasure Coding Fits in Herrenberg

Erasure coding in Xandeum's Herrenberg release (v0.4.0) enables redundant, fault-tolerant data storage by splitting data into shards with parity information, allowing full reconstruction even if some shards are lost. It integrates with the new gossip protocol to distribute these shards across pNode "pods," ensuring tamper-proof, scalable storage coordination.

Erasure coding works hand in hand with the gossip protocol, which uses it to distribute data shards among pods for resilient, global-scale storage. A core outcome of erasure coding is full data reconstructability: 100% data recovery from distributed shards, even under failures.

What Is Erasure Coding?

Erasure coding is a data protection technique that adds redundancy to original data by splitting it into fragments (shards), encoding them with parity information, and distributing them across multiple locations. This allows reconstruction of the full data even if some shards are lost or corrupted, similar to RAID systems but more efficient for distributed networks like blockchains. It is based on error-correcting codes (e.g., Reed-Solomon) and provides fault tolerance without full duplication, balancing storage costs and reliability.

Erasure Coding in Xandeum

Xandeum uses erasure coding to enable scalable, redundant storage on Solana, splitting data into pages and distributing encoded shards across pNodes (provider nodes) for exabyte-level capacity with blockchain-grade integrity. Here's a breakdown:

Data Splitting and Encoding

Data is divided into fixed-size pages (e.g., optimized for Solana's transaction limits), then encoded using erasure codes such as Reed-Solomon generalizations. This creates data shards plus parity shards, allowing reconstruction from a subset (e.g., tolerating up to a configurable number of failures).

Configurable Redundancy

Users set redundancy levels (e.g., 1.5x to 3x overhead), determining how many extra parity shards are generated. This stores data on
a subset of pNodes rather than all validators, optimizing costs while ensuring availability, e.g., recovering from 30-50% node failures without data loss.

Distribution Across pNodes

Encoded shards are spread across decentralized pNodes in "pods," using gossip protocols for coordination. This offloads storage from Solana validators (vNodes), which supervise via cryptographic proofs.

Security and Integrity

pNodes generate Merkle proofs for tamper-proofing; vNodes verify them using threshold signature schemes (TSS). Combined with Solana's Proof of History (PoH), this ensures data can't be altered undetected.

Availability and Reconstruction

The system supports primitives like poke (store), peek (retrieve), and prove (verify). Data is highly available without full replication; e.g., the original data can be reconstructed from any sufficient subset of shards, even during partial network failures.

Integration with Solana

Erasure coding ties into Solana transactions for on-chain access, with fees in SOL funding the ecosystem. It is designed for sedApps (storage-enabled dApps), enabling low-latency operations at scale.

For implementation code or deeper math, refer to Xandeum's GitHub repos or whitepaper for Reed-Solomon specifics.
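To make the split-encode-reconstruct cycle concrete, here is a toy sketch in Python. It is not Xandeum's implementation: where Xandeum uses Reed-Solomon generalizations that tolerate a configurable number of losses, this sketch uses a single XOR parity shard (the simplest erasure code, tolerating exactly one lost shard). All function names are hypothetical.

```python
# Toy erasure code: k data shards plus one XOR parity shard.
# Real deployments use Reed-Solomon codes that tolerate m >= 1 losses;
# XOR parity is the m = 1 special case, shown here for clarity.

def split_into_shards(data: bytes, k: int) -> list:
    """Pad data with zero bytes and split it into k equal-size shards."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\x00")
    return [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]

def xor_parity(shards: list) -> bytes:
    """Compute a parity shard as the byte-wise XOR of all data shards."""
    parity = bytearray(len(shards[0]))
    for shard in shards:
        for i, b in enumerate(shard):
            parity[i] ^= b
    return bytes(parity)

def reconstruct(shards: list, parity: bytes) -> list:
    """Rebuild at most one missing shard (marked None) from the parity."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can only recover a single lost shard")
    if missing:
        rebuilt = bytearray(parity)
        for s in shards:
            if s is not None:
                for i, b in enumerate(s):
                    rebuilt[i] ^= b
        shards[missing[0]] = bytes(rebuilt)
    return shards

data = b"xandeum page contents"
shards = split_into_shards(data, k=4)
parity = xor_parity(shards)

shards[2] = None                      # simulate a lost shard
recovered = reconstruct(shards, parity)
assert b"".join(recovered).rstrip(b"\x00") == data
```

A Reed-Solomon code generalizes this by computing m parity shards instead of one, so that the original data survives the loss of any m of the k + m shards.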
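The relationship between the configurable redundancy level and fault tolerance mentioned above can be sketched with a small calculation. This assumes a maximum-distance-separable code such as Reed-Solomon, where k data shards plus m parity shards survive any m losses; the function name is hypothetical.

```python
# With k data shards and m parity shards, storage overhead is (k + m) / k
# and any m shard losses are survivable.

def redundancy_profile(k: int, m: int) -> dict:
    total = k + m
    return {
        "overhead": total / k,        # e.g. 1.5x ... 3x
        "tolerated_losses": m,        # shards that may vanish
        "loss_fraction": m / total,   # fraction of shard-holders that may fail
    }

# A 1.5x configuration: 10 data shards + 5 parity shards.
p = redundancy_profile(k=10, m=5)
assert p["overhead"] == 1.5
assert round(p["loss_fraction"], 2) == 0.33   # ~33% of holders may fail

# A 3x configuration: 5 data shards + 10 parity shards.
q = redundancy_profile(k=5, m=10)
assert q["overhead"] == 3.0
assert round(q["loss_fraction"], 2) == 0.67   # ~67% of holders may fail
```

This is why the 1.5x-3x overhead range in the text maps to surviving roughly 30-50% (or more) node failures, compared with full replication, which needs 2x-3x overhead just to survive one or two losses.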
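The Merkle-proof mechanism described under "Security and Integrity" can also be illustrated with a minimal sketch: a pNode proves it still holds an untampered shard by presenting the shard plus sibling hashes, and a verifier recomputes the root. This is a generic Merkle construction, not Xandeum's actual scheme, and omits the TSS and PoH layers; all names are hypothetical.

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def merkle_root(leaves: list) -> bytes:
    """Hash leaves pairwise up to a single root."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last node if odd
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves: list, index: int) -> list:
    """Sibling hashes from leaf to root; bool marks a right-hand sibling."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling > index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify(leaf: bytes, proof: list, root: bytes) -> bool:
    """Recompute the root from a leaf and its proof path."""
    node = h(leaf)
    for sibling, is_right in proof:
        node = h(node + sibling) if is_right else h(sibling + node)
    return node == root

shards = [b"shard-0", b"shard-1", b"shard-2", b"shard-3"]
root = merkle_root(shards)              # committed on-chain by the vNodes
proof = merkle_proof(shards, 2)         # produced by the pNode holding shard 2
assert verify(b"shard-2", proof, root)       # shard intact
assert not verify(b"tampered", proof, root)  # tampering is detected
```

The key property is that the verifier needs only the root (a constant-size commitment) and a logarithmic number of sibling hashes, never the full shard set.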