Description
Casky is a lightweight, crash-safe key-value store written in C, designed for fast storage and retrieval of data with a minimal footprint. Built using Test-Driven Development (TDD), Casky ensures reliability while keeping the codebase clean and maintainable. It is inspired by Bitcask and aims to provide a simple, embeddable storage engine that can be integrated into microservices, IoT devices, and other C-based applications.
Objectives:
- Implement a minimal key-value store with append-only file storage.
- Support crash-safe persistence and recovery.
- Expose a simple public API: store(key, value), load(key), delete(key).
- Follow TDD methodology for robust and testable code.
- Provide a foundation for future extensions, such as in-memory caching, compaction, and eventual integration with vector-based databases like PixelDB.
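The objectives above can be illustrated with a minimal sketch of an append-only log. This is not Casky's actual code: the names `kv_store`, `kv_load`, `kv_delete` and the record layout are hypothetical, chosen only to show the idea. Every write appends a record, a delete appends a tombstone, and a read scans the log so that the last record for a key wins. A record cut short by a crash fails to read in full and is ignored, which gives a basic form of recovery. The sketch has no checksums and no in-memory index; a real engine would want both.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record layout: [key_len][val_len][key bytes][value bytes].
 * val_len == TOMBSTONE marks a deletion and carries no value bytes. */
#define TOMBSTONE UINT32_MAX

/* Append one record to the end of the log. val == NULL writes a tombstone. */
static int log_append(const char *path, const char *key, const char *val)
{
    FILE *f = fopen(path, "ab");
    if (!f)
        return -1;
    uint32_t klen = (uint32_t)strlen(key);
    uint32_t vlen = val ? (uint32_t)strlen(val) : TOMBSTONE;
    int ok = fwrite(&klen, sizeof klen, 1, f) == 1 &&
             fwrite(&vlen, sizeof vlen, 1, f) == 1 &&
             fwrite(key, 1, klen, f) == klen &&
             (val == NULL || fwrite(val, 1, vlen, f) == vlen);
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

int kv_store(const char *path, const char *key, const char *val)
{
    return log_append(path, key, val);
}

int kv_delete(const char *path, const char *key)
{
    return log_append(path, key, NULL);
}

/* Scan the whole log; the last record for the key wins.
 * Returns a malloc'd string the caller must free, or NULL if the key is
 * absent or deleted. */
char *kv_load(const char *path, const char *key)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;
    char *result = NULL;
    uint32_t klen, vlen;
    while (fread(&klen, sizeof klen, 1, f) == 1 &&
           fread(&vlen, sizeof vlen, 1, f) == 1) {
        uint32_t n = (vlen == TOMBSTONE) ? 0 : vlen;
        char *k = malloc((size_t)klen + 1);
        char *v = malloc((size_t)n + 1);
        if (!k || !v || fread(k, 1, klen, f) != klen ||
            fread(v, 1, n, f) != n) {
            /* Truncated or unreadable tail: stop and keep earlier state. */
            free(k);
            free(v);
            break;
        }
        k[klen] = '\0';
        v[n] = '\0';
        if (strcmp(k, key) == 0) {
            free(result);
            if (vlen == TOMBSTONE) {
                result = NULL;
                free(v);
            } else {
                result = v;
            }
        } else {
            free(v);
        }
        free(k);
    }
    fclose(f);
    return result;
}
```

Scanning the file on every read is linear in the log size. Bitcask-style engines avoid this by keeping an in-memory index from each key to the position of its latest record.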
Why This Project is Interesting:
Casky combines low-level C programming with modern database concepts, making it an ideal playground to explore storage engines, crash safety, and performance optimization. It’s small enough to complete during Hackweek, yet it provides a solid base for future experiments and more complex projects.
Goals
- Working prototype with append-only storage and memtable.
- TDD test suite covering core functionality and recovery.
- Demonstration of basic operations: insert, load, delete.
- Optional bonus: LRU caching, file compaction, performance benchmarks.
Future Directions:
After Hackweek, Casky can evolve into a backend engine for projects like PixelDB, supporting vector storage and approximate nearest neighbor search, combining low-level performance with cutting-edge AI retrieval applications.
Resources
- The Bitcask paper: https://riak.com/assets/bitcask-intro.pdf
- The Casky repository: https://github.com/thesp0nge/casky
Day 1
[0.10.0] - 2025-12-01
Added
- Core in-memory KeyDir and EntryNode structures
- API functions: casky_open, casky_close, casky_put, casky_get, casky_delete
- Hash function: casky_djb2_hash_xor
- Error handling via casky_errno
- Unit tests for all APIs using standard asserts
- Test cleanup of temporary files
Changed
- None (first MVP)
Fixed
- None (first MVP)
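To make the in-memory structures from this changelog entry more concrete, here is a minimal sketch of a chained hash table keyed by the XOR variant of the djb2 hash. It is an illustration, not Casky's implementation: the type and function names (`keydir`, `entry_node`, `keydir_put`, `keydir_get`), the fixed bucket count, and the `offset` field are all assumptions, the last one borrowed from the Bitcask idea of mapping each key to the position of its latest record.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define KEYDIR_BUCKETS 64 /* hypothetical fixed bucket count */

/* XOR variant of Bernstein's djb2: h = (h * 33) ^ byte, seeded with 5381. */
uint32_t djb2_xor(const char *s)
{
    uint32_t h = 5381;
    for (; *s; s++)
        h = (h * 33u) ^ (unsigned char)*s;
    return h;
}

/* One key in a bucket's chain (field names are assumptions). */
typedef struct entry_node {
    char *key;
    long offset; /* position of the key's latest record in the data file */
    struct entry_node *next;
} entry_node;

typedef struct {
    entry_node *buckets[KEYDIR_BUCKETS];
} keydir;

/* Insert the key or update its offset. Returns 0 on success, -1 on OOM. */
int keydir_put(keydir *kd, const char *key, long offset)
{
    entry_node **slot = &kd->buckets[djb2_xor(key) % KEYDIR_BUCKETS];
    for (entry_node *n = *slot; n; n = n->next) {
        if (strcmp(n->key, key) == 0) {
            n->offset = offset;
            return 0;
        }
    }
    entry_node *n = malloc(sizeof *n);
    if (!n)
        return -1;
    n->key = malloc(strlen(key) + 1);
    if (!n->key) {
        free(n);
        return -1;
    }
    strcpy(n->key, key);
    n->offset = offset;
    n->next = *slot; /* push onto the front of the chain */
    *slot = n;
    return 0;
}

/* Returns the stored offset, or -1 if the key is not present. */
long keydir_get(const keydir *kd, const char *key)
{
    const entry_node *n = kd->buckets[djb2_xor(key) % KEYDIR_BUCKETS];
    for (; n; n = n->next)
        if (strcmp(n->key, key) == 0)
            return n->offset;
    return -1;
}

/* Release every node and leave the table empty. */
void keydir_free(keydir *kd)
{
    for (size_t i = 0; i < KEYDIR_BUCKETS; i++) {
        entry_node *n = kd->buckets[i];
        while (n) {
            entry_node *next = n->next;
            free(n->key);
            free(n);
            n = next;
        }
        kd->buckets[i] = NULL;
    }
}
```

A zero-initialised `keydir` is an empty table. A fixed bucket count keeps the sketch short, but a production table would resize as the number of keys grows.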
This project is part of:
Hack Week 25
Similar Projects
GRIT: GRaphs In Time by fvanlankvelt
Description
The current implementation of the Time-Travelling Topology database, StackGraph, has served SUSE Observability well over the years. However, it depends on a number of complex components: ZooKeeper, HDFS, HBase, and Tephra. These lead to a large number of failure scenarios and many parameters to tweak for optimal performance.
The goal of this project is to take the high-level requirements (time-travelling topology, querying over time, transactional changes to topology, scalability) and design/prototype key components, to see where they would lead us if we were to start from scratch today.
An example would be to use RocksDB to persist topology history. Its user-defined timestamps seem to match well with time-travelling, and it has transaction support with fine-grained conflict detection.
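The match between timestamped versions and time-travelling can be sketched without any database at all. The example below does not use RocksDB or its API. It is a hypothetical, self-contained illustration of the read rule that timestamped storage enables: given the versions of one key sorted by timestamp, a read "as of" time T returns the newest version whose timestamp is not later than T. The `version` struct and `read_as_of` function are invented for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* One historical version of a single key (hypothetical layout). */
typedef struct {
    uint64_t ts;       /* timestamp at which this version was written */
    const char *value; /* NULL means the key was deleted at ts */
} version;

/* versions[] must be sorted by ascending ts.
 * Returns the newest version with ts <= as_of, or NULL if the key did not
 * exist yet at that time. */
const version *read_as_of(const version *versions, size_t n, uint64_t as_of)
{
    size_t lo = 0, hi = n; /* binary search for the first ts > as_of */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (versions[mid].ts <= as_of)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo == 0 ? NULL : &versions[lo - 1];
}
```

Because old versions are never overwritten, a query at any past timestamp sees the topology exactly as it was then, while new writes simply append newer versions.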
Goals
Determine the feasibility of implementing the model on a whole new architecture. See how to model the graph and its history such that updates and querying are performant and transactional conflicts are minimized. Build a prototype to validate the model.
Resources
Backend developers, preferably experienced in distributed systems. Programming language: Scala 3, with some C++ for low-level components.
Uyuni read-only replica by cbosdonnat
Description
For now, there is no possible HA setup for Uyuni. The idea is to explore setting up a read-only shadow instance of a Uyuni server and make it as useful as possible.
Possible things to look at:
- Live sync of the database, probably using the WAL. Some of the tables may have to be skipped or some features disabled on the RO instance (taskomatic, PXT sessions…)
- Can we use a load balancer that routes read-only requests to either instance and all other requests to the RW one? For example, packages, PXE data, and API GET requests can be served by both instances. The rest would go to the RW instance.
Goals
- Prepare a document explaining how to do it.
- Open a PR with the code changes needed to support it.
Collection and organisation of information about Bulgarian schools by iivanov
Description
To achieve this, it will be necessary to:
- Collect/download raw data from various government and non-governmental organizations
- Clean up the raw data and organise it in some kind of database.
- Create a tool to make queries easy.
- Or perhaps feed all the data into an AI model and ask questions in natural language.
Goals
Selecting a particular school will provide information such as:
- School scores on national exams.
- School scores on the external evaluation exams.
- School town, municipality and region.
- Employment rate in a town or municipality.
- Average health of the population in the region.
Resources
Some of these are available only in Bulgarian.
- https://danybon.com/klasazia
- https://nvoresults.com/index.html
- https://ri.mon.bg/active-institutions
- https://www.nsi.bg/nrnm/ekatte/archive
Work on kqlite (Lightweight remote SQLite with high availability and auto failover). by epenchev
Description
Continue the work on kqlite (Lightweight remote SQLite with high availability and auto failover).
It's a solution for applications that require High Availability but don't need all the features of a complete RDBMS and whose use case fits SQLite.
kqlite could also serve as a lightweight storage backend for K8s (https://docs.k3s.io/datastore) and the Edge, allowing HA with only 2 nodes.
Goals
Push kqlite to a beta version.
Make kqlite usable as a library for Go programs.
Resources
https://github.com/kqlite/kqlite