Description

Longhorn aims to manage storage across multiple nodes. SPDK's RAID5F module works well for local disks, relying on full-stripe writes and XOR parity. Sharded volumes, by contrast, are designed to scale out across nodes, with independent units that can optionally be replicated. Studying the RAID5F module at the code level (tracing the I/O path, analyzing the parity-generation logic, and examining how data and parity are organized across stripes) helps build a foundational understanding of distributed storage mechanics, which in turn clarifies why sharding is a natural approach for cluster-scale volumes.

Many of these concepts (data distribution, stripe/chunk mapping, full-stripe versus partial-write handling, and rebuild logic) closely mirror the challenges Longhorn will face, so gaining clarity here should help future designs align with proven SPDK patterns.
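
As a rough illustration of stripe/chunk mapping and the full-stripe check, here is a minimal Python sketch. It is not taken from the SPDK source: the function names, the parity-rotation rule, and the parameters (`num_devices`, `chunk_blocks`) are all assumptions chosen for clarity, and the layout RAID5F actually uses should be confirmed while tracing the code.

```python
from typing import NamedTuple


class ChunkLocation(NamedTuple):
    stripe_index: int     # which stripe the block falls in
    device_index: int     # which member device holds the block
    offset_in_chunk: int  # block offset inside that device's chunk


def parity_device(stripe_index: int, num_devices: int) -> int:
    # Assumed rotation rule: parity starts on the last device and moves
    # one device to the left with each stripe. Real layouts may differ.
    return (num_devices - 1) - (stripe_index % num_devices)


def locate_block(block: int, num_devices: int, chunk_blocks: int) -> ChunkLocation:
    """Map a logical block number to (stripe, device, offset)."""
    data_chunks = num_devices - 1            # one chunk per stripe holds parity
    stripe_blocks = data_chunks * chunk_blocks
    stripe_index, offset_in_stripe = divmod(block, stripe_blocks)
    data_chunk_index, offset_in_chunk = divmod(offset_in_stripe, chunk_blocks)
    p = parity_device(stripe_index, num_devices)
    # Data chunks fill the devices in order, skipping the parity device.
    device_index = data_chunk_index if data_chunk_index < p else data_chunk_index + 1
    return ChunkLocation(stripe_index, device_index, offset_in_chunk)


def is_full_stripe_write(offset_blocks: int, num_blocks: int,
                         num_devices: int, chunk_blocks: int) -> bool:
    """True when a write starts on a stripe boundary and covers whole stripes."""
    stripe_blocks = (num_devices - 1) * chunk_blocks
    return (num_blocks > 0
            and offset_blocks % stripe_blocks == 0
            and num_blocks % stripe_blocks == 0)
```

With four devices and 8-block chunks, each stripe carries 24 data blocks, so `locate_block(47, 4, 8)` lands in stripe 1 on the device just past that stripe's parity device. A write that fails `is_full_stripe_write` would need partial-write handling (read-modify-write), which is the case a full-stripe-only design avoids.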

Goals

  • Produce a full end-to-end call-chain trace of the RAID5F I/O and rebuild flows.
  • Understand the stripe mathematics: data/parity layout, parity calculation, and reconstruction logic.
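
The parity calculation and reconstruction logic named in the second goal rest on one XOR identity: the parity chunk is the byte-wise XOR of all data chunks, so any single missing chunk equals the XOR of the survivors. The sketch below demonstrates that identity in Python. The helper names (`xor_chunks`, `make_stripe`, `reconstruct`) are hypothetical and do not correspond to SPDK functions, and parity is placed last in the list purely for simplicity.

```python
from functools import reduce
from typing import List, Sequence


def xor_chunks(chunks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equally sized chunks."""
    chunks = list(chunks)
    if not chunks:
        raise ValueError("need at least one chunk")
    length = len(chunks[0])
    if any(len(c) != length for c in chunks):
        raise ValueError("all chunks must have the same length")
    # zip(*chunks) yields one tuple of bytes per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def make_stripe(data_chunks: Sequence[bytes]) -> List[bytes]:
    """Return the data chunks followed by their XOR parity chunk."""
    return list(data_chunks) + [xor_chunks(data_chunks)]


def reconstruct(stripe: Sequence[bytes], missing_index: int) -> bytes:
    """Rebuild the chunk at missing_index from every other chunk in the stripe."""
    survivors = [c for i, c in enumerate(stripe) if i != missing_index]
    return xor_chunks(survivors)


data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
stripe = make_stripe(data)
# Losing any one chunk, data or parity, is recoverable from the rest.
recovered = [reconstruct(stripe, i) for i in range(len(stripe))]
```

A rebuild flow applies the same operation stripe by stripe to repopulate a replaced device. A production implementation would work on large aligned buffers rather than Python `bytes`.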

Resources

  • https://github.com/spdk/spdk

Looking for hackers with the skills:

spdk

This project is part of:

Hack Week 25

Activity

  • 5 days ago: chinyahuang added keyword "spdk" to this project.
  • 5 days ago: chinyahuang started this project.
  • 5 days ago: chinyahuang originated this project.
