Description

Casky is a lightweight, crash-safe key-value store written in C, designed for fast storage and retrieval of data with a minimal footprint. Built using Test-Driven Development (TDD), Casky ensures reliability while keeping the codebase clean and maintainable. It is inspired by Bitcask and aims to provide a simple, embeddable storage engine that can be integrated into microservices, IoT devices, and other C-based applications.

Objectives:

  • Implement a minimal key-value store with append-only file storage.
  • Support crash-safe persistence and recovery.
  • Expose a simple public API: store(key, value), load(key), delete(key).
  • Follow TDD methodology for robust and testable code.
  • Provide a foundation for future extensions, such as in-memory caching, compaction, and eventual integration with vector-based databases like PixelDB.

Why This Project is Interesting:

Casky combines low-level C programming with modern database concepts, making it an ideal playground to explore storage engines, crash safety, and performance optimization. It’s small enough to complete during Hack Week, yet it provides a solid base for future experiments and more complex projects.

Goals

  • Working prototype with append-only storage and memtable.
  • TDD test suite covering core functionality and recovery.
  • Demonstration of basic operations: insert, load, delete.
  • Optional bonus: LRU caching, file compaction, performance benchmarks.

Future Directions:

After Hack Week, Casky can evolve into a backend engine for projects like PixelDB, supporting vector storage and approximate nearest-neighbor search and combining low-level performance with modern AI retrieval applications.

Resources

  • The Bitcask paper: https://riak.com/assets/bitcask-intro.pdf
  • The Casky repository: https://github.com/thesp0nge/casky

Looking for hackers with the skills:

database

This project is part of:

Hack Week 25

Activity

  • 10 days ago: wfrisch liked this project.
  • 12 days ago: pperego added keyword "database" to this project.
  • 12 days ago: pperego originated this project.

