Description

To achieve this it will be necessary:

  • Collect/download raw data from various government and non-governmental organizations
  • Clean up raw data and organise it in some kind database.
  • Create tool to make queries easy.
  • Or perhaps dump all data into AI and ask questions in natural language.

Goals

By selecting particular school information like this will be provided:

  • School scores on national exams.
  • School scores from the external evaluations exams.
  • School town, municipality and region.
  • Employment rate in a town or municipality.
  • Average health of the population in the region.

Resources

Some of these are available only in bulgarian.

  • https://danybon.com/klasazia
  • https://nvoresults.com/index.html
  • https://ri.mon.bg/active-institutions
  • https://www.nsi.bg/nrnm/ekatte/archive

Results

  • Information about all Bulgarian schools with their scores during recent years cleaned and organised into SQL tables
  • Information about all Bulgarian villages, cities, municipalities and districts cleaned and organised into SQL tables
  • Information about all Bulgarian villages and cities census since beginning of this century cleaned and organised into SQL tables.
  • Information about all Bulgarian municipalities about religion, ethnicity cleaned and organised into SQL tables.
  • Data successfully loaded to locally running Ollama with help to Vanna.AI
  • Seems to be usable.

TODO

  • Add more statistical information about municipalities and ....

Code and data

Looking for hackers with the skills:

python database flask

This project is part of:

Hack Week 25

Activity

  • 11 days ago: duwe liked this project.
  • 11 days ago: sndirsch liked this project.
  • 20 days ago: mkoutny liked this project.
  • 20 days ago: rtsvetkov liked this project.
  • 20 days ago: iivanov added keyword "python" to this project.
  • 20 days ago: iivanov added keyword "database" to this project.
  • 20 days ago: iivanov added keyword "flask" to this project.
  • 20 days ago: iivanov started this project.
  • 20 days ago: iivanov originated this project.

  • Comments

    • rtsvetkov
      20 days ago by rtsvetkov | Reply

      I'm really excited to see some results... Even raw and preliminary

      • iivanov
        12 days ago by iivanov | Reply

        Initial version is ready. Don't expect too much. It is somehow usable. For better results use queries in Bulgarian language ;-) like:

        • Колко общини има в България?
        • Коя е най-малката от тях през 2005 година?
        • Коя учебна институция има най-добър резултат от изпитите по математика през 2024 година?
        • В кое населено място се намира?
        • Колко е голямо? ...

    Similar Projects

    Song Search with CLAP by gcolangiuli

    Description

    Contrastive Language-Audio Pretraining (CLAP) is an open-source library that enables the training of a neural network on both Audio and Text descriptions, making it possible to search for Audio using a Text input. Several pre-trained models for song search are already available on huggingface

    SUSE Hackweek AI Song Search

    Goals

    Evaluate how CLAP can be used for song searching and determine which types of queries yield the best results by developing a Minimum Viable Product (MVP) in Python. Based on the results of this MVP, future steps could include:

    • Music Tagging;
    • Free text search;
    • Integration with an LLM (for example, with MCP or the OpenAI API) for music suggestions based on your own library.

    The code for this project will be entirely written using AI to better explore and demonstrate AI capabilities.

    Result

    In this MVP we implemented:

    • Async Song Analysis with Clap model
    • Free Text Search of the songs
    • Similar song search based on vector representation
    • Containerised version with web interface

    We also documented what went well and what can be improved in the use of AI.

    You can have a look at the result here:

    Future implementation can be related to performance improvement and stability of the analysis.

    References


    Update M2Crypto by mcepl

    There are couple of projects I work on, which need my attention and putting them to shape:

    Goal for this Hackweek

    • Put M2Crypto into better shape (most issues closed, all pull requests processed)
    • More fun to learn jujutsu
    • Play more with Gemini, how much it help (or not).
    • Perhaps, also (just slightly related), help to fix vis to work with LuaJIT, particularly to make vis-lspc working.


    Liz - Prompt autocomplete by ftorchia

    Description

    Liz is the Rancher AI assistant for cluster operations.

    Goals

    We want to help users when sending new messages to Liz, by adding an autocomplete feature to complete their requests based on the context.

    Example:

    • User prompt: "Can you show me the list of p"
    • Autocomplete suggestion: "Can you show me the list of p...od in local cluster?"

    Example:

    • User prompt: "Show me the logs of #rancher-"
    • Chat console: It shows a drop-down widget, next to the # character, with the list of available pod names starting with "rancher-".

    Technical Overview

    1. The AI agent should expose a new ws/autocomplete endpoint to proxy autocomplete messages to the LLM.
    2. The UI extension should be able to display prompt suggestions and allow users to apply the autocomplete to the Prompt via keyboard shortcuts.

    Resources

    GitHub repository


    Enhance git-sha-verify: A tool to checkout validated git hashes by gpathak

    Description

    git-sha-verify is a simple shell utility to verify and checkout trusted git commits signed using GPG key. This tool helps ensure that only authorized or validated commit hashes are checked out from a git repository, supporting better code integrity and security within the workflow.

    Supports:

    • Verifying commit authenticity signed using gpg key
    • Checking out trusted commits

    Ideal for teams and projects where the integrity of git history is crucial.

    Goals

    A minimal python code of the shell script exists as a pull request.

    The goal of this hackweek is to:

    • DONE: Add more unit tests
      • New and more tests can be added later
    • Partially DONE: Make the python code modular
    • DONE: Add code coverage if possible

    Resources


    Testing and adding GNU/Linux distributions on Uyuni by juliogonzalezgil

    Join the Gitter channel! https://gitter.im/uyuni-project/hackweek

    Uyuni is a configuration and infrastructure management tool that saves you time and headaches when you have to manage and update tens, hundreds or even thousands of machines. It also manages configuration, can run audits, build image containers, monitor and much more!

    Currently there are a few distributions that are completely untested on Uyuni or SUSE Manager (AFAIK) or just not tested since a long time, and could be interesting knowing how hard would be working with them and, if possible, fix whatever is broken.

    For newcomers, the easiest distributions are those based on DEB or RPM packages. Distributions with other package formats are doable, but will require adapting the Python and Java code to be able to sync and analyze such packages (and if salt does not support those packages, it will need changes as well). So if you want a distribution with other packages, make sure you are comfortable handling such changes.

    No developer experience? No worries! We had non-developers contributors in the past, and we are ready to help as long as you are willing to learn. If you don't want to code at all, you can also help us preparing the documentation after someone else has the initial code ready, or you could also help with testing :-)

    The idea is testing Salt (including bootstrapping with bootstrap script) and Salt-ssh clients

    To consider that a distribution has basic support, we should cover at least (points 3-6 are to be tested for both salt minions and salt ssh minions):

    1. Reposync (this will require using spacewalk-common-channels and adding channels to the .ini file)
    2. Onboarding (salt minion from UI, salt minion from bootstrap scritp, and salt-ssh minion) (this will probably require adding OS to the bootstrap repository creator)
    3. Package management (install, remove, update...)
    4. Patching
    5. Applying any basic salt state (including a formula)
    6. Salt remote commands
    7. Bonus point: Java part for product identification, and monitoring enablement
    8. Bonus point: sumaform enablement (https://github.com/uyuni-project/sumaform)
    9. Bonus point: Documentation (https://github.com/uyuni-project/uyuni-docs)
    10. Bonus point: testsuite enablement (https://github.com/uyuni-project/uyuni/tree/master/testsuite)

    If something is breaking: we can try to fix it, but the main idea is research how supported it is right now. Beyond that it's up to each project member how much to hack :-)

    • If you don't have knowledge about some of the steps: ask the team
    • If you still don't know what to do: switch to another distribution and keep testing.

    This card is for EVERYONE, not just developers. Seriously! We had people from other teams helping that were not developers, and added support for Debian and new SUSE Linux Enterprise and openSUSE Leap versions :-)

    In progress/done for Hack Week 25

    Guide

    We started writin a Guide: Adding a new client GNU Linux distribution to Uyuni at https://github.com/uyuni-project/uyuni/wiki/Guide:-Adding-a-new-client-GNU-Linux-distribution-to-Uyuni, to make things easier for everyone, specially those not too familiar wht Uyuni or not technical.

    openSUSE Leap 16.0

    The distribution will all love!

    https://en.opensuse.org/openSUSE:Roadmap#DRAFTScheduleforLeap16.0

    Curent Status We started last year, it's complete now for Hack Week 25! :-D

    • [W] Reposync (this will require using spacewalk-common-channels and adding channels to the .ini file) NOTE: Done, client tools for SLMicro6 are using as those for SLE16.0/openSUSE Leap 16.0 are not available yet
    • [W] Onboarding (salt minion from UI, salt minion from bootstrap scritp, and salt-ssh minion) (this will probably require adding OS to the bootstrap repository creator)
    • [W] Package management (install, remove, update...). Works, even reboot requirement detection


    Casky – Lightweight C Key-Value Engine with Crash Recovery by pperego

    Description

    Casky is a lightweight, crash-safe key-value store written in C, designed for fast storage and retrieval of data with a minimal footprint. Built using Test-Driven Development (TDD), Casky ensures reliability while keeping the codebase clean and maintainable. It is inspired by Bitcask and aims to provide a simple, embeddable storage engine that can be integrated into microservices, IoT devices, and other C-based applications.

    Objectives:

    • Implement a minimal key-value store with append-only file storage.
    • Support crash-safe persistence and recovery.
    • Expose a simple public API: store(key, value), load(key), delete(key).
    • Follow TDD methodology for robust and testable code.
    • Provide a foundation for future extensions, such as in-memory caching, compaction, and eventual integration with vector-based databases like PixelDB.

    Why This Project is Interesting:

    Casky combines low-level C programming with modern database concepts, making it an ideal playground to explore storage engines, crash safety, and performance optimization. It’s small enough to complete during Hackweek, yet it provides a solid base for future experiments and more complex projects.

    Goals

    • Working prototype with append-only storage and memtable.
    • TDD test suite covering core functionality and recovery.
    • Demonstration of basic operations: insert, load, delete.
    • Optional bonus: LRU caching, file compaction, performance benchmarks.

    Future Directions:

    After Hackweek, Casky can evolve into a backend engine for projects like PixelDB, supporting vector storage and approximate nearest neighbor search, combining low-level performance with cutting-edge AI retrieval applications.

    Resources

    The Bitcask paper: https://riak.com/assets/bitcask-intro.pdf The Casky repository: https://github.com/thesp0nge/casky

    Day 1

    [0.10.0] - 2025-12-01

    Added

    • Core in-memory KeyDir and EntryNode structures
    • API functions: caskyopen, caskyclose, caskyput, caskyget, casky_delete
    • Hash function: caskydjb2hash_xor
    • Error handling via casky_errno
    • Unit tests for all APIs using standard asserts
    • Test cleanup of temporary files

    Changed

    • None (first MVP)

    Fixed

    • None (first MVP)

    Day 2

    [0.20.0] - 2025-12-02


    GRIT: GRaphs In Time by fvanlankvelt

    Description

    The current implementation of the Time-Travelling Topology database, StackGraph, has served SUSE Observability well over the years. But it is dependent on a number of complex components - Zookeeper, HDFS, HBase, Tephra. These lead to a large number of failure scenarios and parameters to tweak for optimal performance.

    The goal of this project is to take the high-level requirements (time-travelling topology, querying over time, transactional changes to topology, scalability) and design/prototype key components, to see where they would lead us if we were to start from scratch today.

    An example would be to use RocksDB to persist topology history. Its user-defined timestamps seem to match well with time-travelling, has transaction support with fine-grained conflict detection.

    Goals

    Determine feasibility of implementing the model on a whole new architecture. See how to model the graph and its history such that updates and querying are performant, transactional conflicts are minimized. Build a prototype to validate the model.

    Resources

    Backend developers, preferably experienced in distributed systems. Programming language: scala 3 with some C++ for low-level.

    Progress

    The project has started at github GRaphs In Time - a C++ project that

    • embeds RocksDB for persistence,
    • uses (nu)Raft for replication/consensus,
    • supports large transactions, with
    • SNAPSHOT isolation
    • time-travel
    • graph operations (add/remove vertex/edge, indexes)


    Port the classic browser game HackTheNet to PHP 8 by dgedon

    Description

    The classic browser game HackTheNet from 2004 still runs on PHP 4/5 and MySQL 5 and needs a port to PHP 8 and e.g. MariaDB.

    Goals

    • Port the game to PHP 8 and MariaDB 11
    • Create a container where the game server can simply be started/stopped

    Resources

    • https://github.com/nodeg/hackthenet


    Work on kqlite (Lightweight remote SQLite with high availability and auto failover). by epenchev

    Description

    Continue the work on kqlite (Lightweight remote SQLite with high availability and auto failover).
    It's a solution for applications that require High Availability but don't need all the features of a complete RDBMS and can fit SQLite in their use case.
    Also kqlite can be considered to be used as a lightweight storage backend for K8s (https://docs.k3s.io/datastore) and the Edge, and allowing to have only 2 Nodes for HA.

    Goals

    Push kqlite to a beta version.
    kqlite as library for Go programs.

    Resources

    https://github.com/kqlite/kqlite


    Uyuni read-only replica by cbosdonnat

    Description

    For now, there is no possible HA setup for Uyuni. The idea is to explore setting up a read-only shadow instance of an Uyuni and make it as useful as possible.

    Possible things to look at:

    • live sync of the database, probably using the WAL. Some of the tables may have to be skipped or some features disabled on the RO instance (taskomatic, PXT sessions…)
    • Can we use a load balancer that routes read-only queries to either instance and the other to the RW one? For example, packages or PXE data can be served by both, the API GET requests too. The rest would be RW.

    Goals

    • Prepare a document explaining how to do it.
    • PR with the needed code changes to support it