Description

To achieve this, it will be necessary to:

Collect/download raw data from various governmental and non-governmental organizations.
Clean up the raw data and organise it in some kind of database.
Create a tool to make queries easy.
Or perhaps dump all the data into an AI and ask questions in natural language.
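
The clean-up and load steps above can be sketched as follows. This is a minimal illustration, assuming SQLite as the database and a made-up CSV layout; the column names, the `school_score` table and the sample rows are hypothetical placeholders, not the project's actual schema or data.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; the real sources use their own column layouts.
RAW_CSV = """school_id,school_name,settlement,year,subject,score
1001, Example School A ,Sofia,2024,math,71.5
1002,Example School B,Plovdiv,2024,math,
1001, Example School A ,Sofia,2023,math,68.0
"""


def clean_rows(text):
    """Yield typed, whitespace-stripped rows, skipping records without a score."""
    for row in csv.DictReader(io.StringIO(text)):
        if not row["score"].strip():
            continue
        yield (
            int(row["school_id"]),
            row["school_name"].strip(),
            row["settlement"].strip(),
            int(row["year"]),
            row["subject"].strip(),
            float(row["score"]),
        )


def load(conn, rows):
    """Create the (assumed) table if needed and insert the cleaned rows."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS school_score (
               school_id INTEGER, school_name TEXT, settlement TEXT,
               year INTEGER, subject TEXT, score REAL,
               PRIMARY KEY (school_id, year, subject))"""
    )
    conn.executemany(
        "INSERT OR REPLACE INTO school_score VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, clean_rows(RAW_CSV))
count = conn.execute("SELECT COUNT(*) FROM school_score").fetchone()[0]
```

Keeping the cleaning step as a generator separate from the load step makes it easy to add one cleaner per data source while reusing the same loader.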

Goals

Selecting a particular school will provide information such as:

School scores on national exams.
School scores on the external evaluation exams.
School town, municipality and region.
Employment rate in a town or municipality.
Average health of the population in the region.
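
A lookup of this kind could be expressed as a join across a few tables. The sketch below assumes a hypothetical four-table SQLite layout (`municipality`, `settlement`, `school`, `exam_score`) with placeholder rows; the real tables and columns may differ, and health statistics are left out because their shape is not described here.

```python
import sqlite3

# Assumed schema and placeholder rows, for illustration only.
SCHEMA = """
CREATE TABLE municipality (id INTEGER PRIMARY KEY, name TEXT, region TEXT,
                           employment_rate REAL);
CREATE TABLE settlement (id INTEGER PRIMARY KEY, name TEXT,
                         municipality_id INTEGER REFERENCES municipality(id));
CREATE TABLE school (id INTEGER PRIMARY KEY, name TEXT,
                     settlement_id INTEGER REFERENCES settlement(id));
CREATE TABLE exam_score (school_id INTEGER REFERENCES school(id),
                         year INTEGER, exam TEXT, score REAL);
INSERT INTO municipality VALUES (1, 'Municipality M', 'Region R', 0.62);
INSERT INTO settlement VALUES (10, 'Town T', 1);
INSERT INTO school VALUES (100, 'Example School', 10);
INSERT INTO exam_score VALUES (100, 2024, 'national', 4.8);
INSERT INTO exam_score VALUES (100, 2024, 'external', 61.0);
"""


def school_profile(conn, school_id):
    """Return location, employment rate and exam scores for one school."""
    place = conn.execute(
        """SELECT s.name AS school, t.name AS settlement,
                  m.name AS municipality, m.region, m.employment_rate
           FROM school s
           JOIN settlement t ON t.id = s.settlement_id
           JOIN municipality m ON m.id = t.municipality_id
           WHERE s.id = ?""",
        (school_id,),
    ).fetchone()
    if place is None:
        return None
    scores = conn.execute(
        "SELECT year, exam, score FROM exam_score "
        "WHERE school_id = ? ORDER BY year, exam",
        (school_id,),
    ).fetchall()
    profile = dict(place)
    profile["scores"] = [tuple(row) for row in scores]
    return profile


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be converted to dicts by column name
conn.executescript(SCHEMA)
profile = school_profile(conn, 100)
```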

Resources

Some of these are available only in Bulgarian.

https://danybon.com/klasazia
https://nvoresults.com/index.html
https://ri.mon.bg/active-institutions
https://www.nsi.bg/nrnm/ekatte/archive

Results

Information about all Bulgarian schools, with their scores from recent years, cleaned and organised into SQL tables.
Information about all Bulgarian villages, cities, municipalities and districts, cleaned and organised into SQL tables.
Census information for all Bulgarian villages and cities since the beginning of this century, cleaned and organised into SQL tables.
Information about religion and ethnicity in all Bulgarian municipalities, cleaned and organised into SQL tables.
Data successfully loaded into a locally running Ollama with the help of Vanna.AI.
Seems to be usable.

TODO

Add more statistical information about municipalities and ....

Code and data

GitHub

Looking for hackers with the skills:

python database flask

This project is part of:

Hack Week 25

Activity

about 2 months ago: iivanov added keyword "python" to this project.

about 2 months ago: iivanov added keyword "database" to this project.

about 2 months ago: iivanov added keyword "flask" to this project.

about 2 months ago: iivanov started this project.

about 2 months ago: iivanov originated this project.

Comments

about 2 months ago by rtsvetkov

I'm really excited to see some results... Even raw and preliminary
- about 2 months ago by iivanov
  
  The initial version is ready. Don't expect too much. It is somewhat usable. For better results, use queries in Bulgarian ;-) like:
  - Колко общини има в България? (How many municipalities are there in Bulgaria?)
  - Коя е най-малката от тях през 2005 година? (Which was the smallest of them in 2005?)
  - Коя учебна институция има най-добър резултат от изпитите по математика през 2024 година? (Which educational institution had the best result in the mathematics exams in 2024?)
  - В кое населено място се намира? (In which settlement is it located?)
  - Колко е голямо? (How large is it?) ...

Similar Projects

python

Help Create A Chat Control Resistant Turnkey Chatmail/Deltachat Relay Stack - Rootless Podman Compose, OpenSUSE BCI, Hardened, & SELinux by 3nd5h1771fy

Description

The Mission: Decentralized & Sovereign Messaging

FYI: If you have never heard of "Chatmail", you can visit their site here, but simply put, it can be thought of as the underlying protocol/platform that decentralized messengers like DeltaChat use for their communications. Do not confuse it with the honeypot-looking, non-open-source, paid-for product with better SEO that directs you to chatmailsecure(dot)com.

In an era of increasing centralized surveillance by unaccountable bad actors (aka BigTech), "Chat Control," and the erosion of digital privacy, the need for sovereign communication infrastructure is critical. Chatmail is a pioneering initiative that bridges the gap between classic email and modern instant messaging, offering metadata-minimized, end-to-end encrypted (E2EE) communication that is interoperable and open.

However, unless you are a seasoned sysadmin, the current recommended deployment method of a Chatmail relay is rigid, fragile, difficult to properly secure, and effectively takes over the entire host the "relay" is deployed on.

Why This Matters

A simple, host-agnostic, reproducible deployment lowers the entry cost for anyone wanting to run a privacy‑preserving, decentralized messaging relay. In an era of perpetually resurrected chat‑control legislation threats, EU digital‑sovereignty drives, and the many dangers of using big‑tech messaging platforms (Apple iMessage, WhatsApp, FB Messenger, Instagram, SMS, Google Messages, etc.) for any type of communication, providing an easy‑to‑use alternative empowers:

Censorship resistance - No single entity controls the relay; operators can spin up new nodes quickly.
Surveillance mitigation - End‑to‑end OpenPGP encryption ensures relay operators never see plaintext.
Digital sovereignty - Communities can host their own infrastructure under local jurisdiction, aligning with national data‑policy goals.

By turning the Chatmail relay into a plug‑and‑play container stack, we enable broader adoption, foster a resilient messaging fabric, and give developers, activists, and hobbyists a concrete tool to defend privacy online.

Goals

As indicated earlier, this project aims to drastically simplify the deployment of a Chatmail relay. By converting this architecture into a portable, containerized stack using Podman and openSUSE base container images, we can allow anyone to deploy their own censorship-resistant, privacy-preserving communications node in minutes.

Our goal for Hack Week: package every component into containers built on openSUSE/MicroOS base images, initially orchestrated with a single container-compose.yml (podman-compose compatible). The stack will:

Run on any host that supports Podman (including optimizations and enhancements for SELinux‑enabled systems).
Allow network decoupling by refactoring configurations to move from file-system-constrained Unix sockets to internal TCP networking, allowing containers to achieve stricter isolation.
Use SELinux for enhanced security: with purpose-built utilities such as udica, we can quickly generate custom SELinux policies for the container stack, ensuring stricter confinement than typical Docker deployments.
Allow the use of bind or remote mounted volumes for shared data (/var/vmail, DKIM keys, TLS certs, etc.).
Replace the local DNS server requirement with a remote DNS‑provider API for DKIM/TXT record publishing.

By delivering a turnkey, host-agnostic, reproducible deployment, we lower the barrier for individuals and small communities to launch their own chatmail relays, fostering a decentralized, censorship‑resistant messaging ecosystem that can serve DeltaChat users and/or future services adopting this protocol.
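
As an illustration of the DKIM/TXT publishing goal, the sketch below prepares the record name and TXT strings that would be handed to a DNS provider. The DKIM tag syntax (`v=DKIM1; k=rsa; p=...`), the `<selector>._domainkey.<domain>` naming and the 255-byte limit per TXT string are standard; the `build_update` payload shape is purely hypothetical, since each provider API defines its own request format.

```python
def dkim_record_name(selector, domain):
    """DKIM keys are published at <selector>._domainkey.<domain>."""
    return f"{selector}._domainkey.{domain}"


def dkim_txt_value(public_key_b64):
    """Build the TXT value for an RSA DKIM key (base64-encoded public key)."""
    return f"v=DKIM1; k=rsa; p={public_key_b64}"


def split_txt(value, limit=255):
    """Split a long TXT value into strings of at most `limit` characters."""
    return [value[i:i + limit] for i in range(0, len(value), limit)]


def build_update(selector, domain, public_key_b64):
    """Hypothetical payload; adapt the keys to the chosen provider's API."""
    return {
        "name": dkim_record_name(selector, domain),
        "type": "TXT",
        "strings": split_txt(dkim_txt_value(public_key_b64)),
    }


# Placeholder key material, not a real key.
update = build_update("mail", "example.org", "A" * 392)
```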

Resources

Enhance git-sha-verify: A tool to checkout validated git hashes by gpathak

Description

git-sha-verify is a simple shell utility to verify and check out trusted git commits signed with a GPG key. This tool helps ensure that only authorized or validated commit hashes are checked out from a git repository, supporting better code integrity and security within the workflow.

Supports:

Verifying the authenticity of commits signed with a GPG key
Checking out trusted commits

Ideal for teams and projects where the integrity of git history is crucial.
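
The verify-then-checkout idea can be sketched in Python as below. This is not the tool's actual implementation, only an illustration built on the standard `git verify-commit` and `git checkout --detach` commands; the function name and the injectable `run` parameter (used to make the logic testable without a repository) are assumptions.

```python
import subprocess


def verify_and_checkout(sha, repo=".", run=subprocess.run):
    """Check out `sha` only if git accepts its GPG signature.

    Returns True when the commit was verified and checked out,
    False when signature verification failed.
    """
    verify = run(
        ["git", "-C", repo, "verify-commit", sha],
        capture_output=True,
        text=True,
    )
    if verify.returncode != 0:
        # Unsigned commit, unknown key, or bad signature: do not check out.
        return False
    run(["git", "-C", repo, "checkout", "--detach", sha], check=True)
    return True
```

Note that `git verify-commit` succeeds for any signature that the local GPG keyring considers valid, so restricting checkouts to specific trusted keys would need an additional check.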

Goals

A minimal Python version of the shell script exists as a pull request.

The goal of this hackweek is to:

DONE: Add more unit tests
- New and more tests can be added later
Partially DONE: Make the python code modular
DONE: Add code coverage if possible

Resources

Link to GitHub Repository: https://github.com/openSUSE/git-sha-verify

Bring to Cockpit + System Roles capabilities from YAST by miguelpc

Bring to Cockpit + System Roles features from YAST

Cockpit and System Roles have been added to SLES 16. There are several capabilities in YAST that are not yet present in Cockpit and System Roles. We will follow the principle of "automate first, UI later", with System Roles as the automation component and Cockpit as the UI.

Goals

The idea is to implement service configuration in System Roles and then add a UI to manage these in Cockpit. Some capabilities will require a specific Cockpit module, as they will interact with a resource that is already configured.

Resources

A plan on capabilities missing and suggested implementation is available here: https://docs.google.com/spreadsheets/d/1ZhX-Ip9MKJNeKSYV3bSZG4Qc5giuY7XSV0U61Ecu9lo/edit

Linux System Roles:

https://linux-system-roles.github.io/
https://build.opensuse.org/package/show/openSUSE:Factory/ansible-linux-system-roles Package on sle16 ansible-linux-system-roles

First meeting Hackweek catchup

Monday, December 1 · 11:00 – 12:00
Time zone: Europe/Madrid
Google Meet link: https://meet.google.com/rrc-kqch-hca

Update M2Crypto by mcepl

There are a couple of projects I work on that need my attention and need to be put into shape:

M2Crypto

Goal for this Hackweek

Put M2Crypto into better shape (most issues closed, all pull requests processed)
More fun to learn jujutsu
Play more with Gemini to see how much it helps (or not).
Perhaps also (just slightly related) help fix vis to work with LuaJIT, particularly to get vis-lspc working.

Improve chore and screen time doc generator script `wochenplaner` by gniebler

Description

I wrote a little Python script to generate PDF docs, which can be used to track daily chore completion and screen time usage for several people, with one page per person/week.

I named this script wochenplaner and have been using it for a few months now.

It needs some improvements and adjustments in how the screen time should be tracked and how chores are displayed.

Goals

Fix chore field separation lines
Change screen time tracking logic from "global" (week-long) to daily subtraction and weekly addition of remainders (more intuitive than the current "weekly time budget" method).
Add logic to fill in chore fields/lines, ideally with pictures, falling back to text.
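
One possible reading of the new screen time logic is sketched below: each day starts from a fixed daily budget, the time used is subtracted, and the unused remainders are added up at the end of the week. The function name, the minute-based units and the rule that overuse counts as a zero remainder are assumptions, not the script's actual behaviour.

```python
def track_week(daily_budget_min, used_per_day):
    """Return (per-day remainders, weekly total of remainders) in minutes."""
    remainders = []
    for used in used_per_day:
        # Daily subtraction; overuse is clamped to zero rather than carried over.
        remainders.append(max(daily_budget_min - used, 0))
    # Weekly addition of whatever was left unused each day.
    return remainders, sum(remainders)


remainders, weekly_total = track_week(60, [30, 60, 75, 0, 45, 60, 20])
# remainders == [30, 0, 0, 60, 15, 0, 40], weekly_total == 145
```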

Resources

tbd (Gitlab repo)

database

Uyuni read-only replica by cbosdonnat

Description

For now, there is no possible HA setup for Uyuni. The idea is to explore setting up a read-only shadow instance of a Uyuni server and make it as useful as possible.

Possible things to look at:

Live sync of the database, probably using the WAL. Some of the tables may have to be skipped or some features disabled on the RO instance (taskomatic, PXT sessions…).
Can we use a load balancer that routes read-only queries to either instance and all other queries to the RW one? For example, packages or PXE data can be served by both, and so can the API GET requests. The rest would go to the RW instance.
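
The routing rule in the second point can be illustrated with a small sketch. It assumes that the HTTP method alone decides whether a request is read-only (GET and HEAD here), which a real setup would have to verify per endpoint; the `Router` class and backend names are hypothetical.

```python
import itertools

# Assumption: these methods never modify state on the server.
READ_ONLY_METHODS = {"GET", "HEAD"}


class Router:
    """Send read-only requests to either instance, everything else to RW."""

    def __init__(self, rw_backend, ro_backend):
        self.rw = rw_backend
        # Round-robin between both instances for read-only traffic.
        self._readers = itertools.cycle([rw_backend, ro_backend])

    def pick(self, method):
        if method.upper() in READ_ONLY_METHODS:
            return next(self._readers)
        return self.rw


router = Router("uyuni-rw", "uyuni-ro")
picks = [router.pick(m) for m in ("GET", "GET", "POST", "get", "PUT")]
```

Replication lag is the main caveat of such a split: a read routed to the RO instance right after a write may not see that write yet.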

Goals

Prepare a document explaining how to do it.
PR with the needed code changes to support it

Work on kqlite (Lightweight remote SQLite with high availability and auto failover). by epenchev

Description

Continue the work on kqlite (Lightweight remote SQLite with high availability and auto failover).
It's a solution for applications that require high availability but don't need all the features of a complete RDBMS and can fit SQLite into their use case.
kqlite can also be considered as a lightweight storage backend for K8s (https://docs.k3s.io/datastore) and the Edge, allowing HA with only 2 nodes.

Goals

Push kqlite to a beta version.
kqlite as library for Go programs.

Resources

https://github.com/kqlite/kqlite

GRIT: GRaphs In Time by fvanlankvelt

Description

The current implementation of the Time-Travelling Topology database, StackGraph, has served SUSE Observability well over the years. But it is dependent on a number of complex components - Zookeeper, HDFS, HBase, Tephra. These lead to a large number of failure scenarios and parameters to tweak for optimal performance.

The goal of this project is to take the high-level requirements (time-travelling topology, querying over time, transactional changes to topology, scalability) and design/prototype key components, to see where they would lead us if we were to start from scratch today.

An example would be to use RocksDB to persist topology history. Its user-defined timestamps seem to match well with time-travelling, and it has transaction support with fine-grained conflict detection.
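
To make the time-travel idea concrete, here is a minimal in-memory sketch of timestamped versions per key, where a read at time `ts` returns the newest value written at or before `ts`. It only illustrates the concept; it does not use or mirror the RocksDB API, and the `VersionedStore` class, the key naming and the use of `None` as a deletion marker are assumptions.

```python
import bisect
from collections import defaultdict


class VersionedStore:
    """Keep every version of a key, ordered by timestamp."""

    def __init__(self):
        self._timestamps = defaultdict(list)  # key -> sorted timestamps
        self._values = defaultdict(list)      # key -> values, same order

    def put(self, key, ts, value):
        """Record `value` for `key` as of `ts` (None marks a deletion)."""
        i = bisect.bisect_right(self._timestamps[key], ts)
        self._timestamps[key].insert(i, ts)
        self._values[key].insert(i, value)

    def get(self, key, ts):
        """Return the value visible at `ts`, or None if absent or deleted."""
        i = bisect.bisect_right(self._timestamps.get(key, []), ts)
        return self._values[key][i - 1] if i else None


store = VersionedStore()
store.put("edge:a->b", 10, "healthy")
store.put("edge:a->b", 20, "degraded")
store.put("edge:a->b", 30, None)  # edge removed at t=30
```

With this layout, `store.get("edge:a->b", 15)` sees the state as of t=15 even after later updates and the deletion have been recorded.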

Goals

Determine feasibility of implementing the model on a whole new architecture. See how to model the graph and its history such that updates and querying are performant, transactional conflicts are minimized. Build a prototype to validate the model.

Resources

Backend developers, preferably experienced in distributed systems. Programming language: Scala 3 with some C++ for low-level work.

Progress

The project has started on GitHub as GRaphs In Time, a C++ project that

embeds RocksDB for persistence,
uses (nu)Raft for replication/consensus,
supports large transactions, with
SNAPSHOT isolation
time-travel
graph operations (add/remove vertex/edge, indexes)

Casky – Lightweight C Key-Value Engine with Crash Recovery by pperego

Description

Casky is a lightweight, crash-safe key-value store written in C, designed for fast storage and retrieval of data with a minimal footprint. Built using Test-Driven Development (TDD), Casky ensures reliability while keeping the codebase clean and maintainable. It is inspired by Bitcask and aims to provide a simple, embeddable storage engine that can be integrated into microservices, IoT devices, and other C-based applications.

Objectives:

Implement a minimal key-value store with append-only file storage.
Support crash-safe persistence and recovery.
Expose a simple public API: store(key, value), load(key), delete(key).
Follow TDD methodology for robust and testable code.
Provide a foundation for future extensions, such as in-memory caching, compaction, and eventual integration with vector-based databases like PixelDB.
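
The append-only design with crash recovery can be sketched in a few lines of Python. This is a Bitcask-style illustration of the technique, not Casky's actual C code or file format: the record layout (two little-endian 32-bit lengths, then key and value), the tombstone marker and the `TinyCask` class are all assumptions.

```python
import os
import struct

HEADER = struct.Struct("<II")  # key length, value length
TOMBSTONE = 0xFFFFFFFF         # value-length marker meaning "deleted"


class TinyCask:
    def __init__(self, path):
        self.path = path
        self.keydir = {}  # key -> (value offset, value length)
        open(path, "ab").close()  # make sure the log file exists
        self._recover()
        self._log = open(path, "ab")

    def _recover(self):
        """Replay the log to rebuild keydir; drop a torn record at the tail."""
        good = 0
        with open(self.path, "rb") as f:
            while True:
                header = f.read(HEADER.size)
                if len(header) < HEADER.size:
                    break
                klen, vlen = HEADER.unpack(header)
                key = f.read(klen)
                body_len = 0 if vlen == TOMBSTONE else vlen
                value_offset = f.tell()
                if len(key) < klen or len(f.read(body_len)) < body_len:
                    break
                if vlen == TOMBSTONE:
                    self.keydir.pop(key, None)
                else:
                    self.keydir[key] = (value_offset, vlen)
                good = f.tell()
        os.truncate(self.path, good)  # cut off any incomplete trailing write

    def _append(self, record):
        self._log.seek(0, os.SEEK_END)
        start = self._log.tell()
        self._log.write(record)
        self._log.flush()
        os.fsync(self._log.fileno())  # make the record durable before returning
        return start

    def store(self, key, value):
        start = self._append(HEADER.pack(len(key), len(value)) + key + value)
        self.keydir[key] = (start + HEADER.size + len(key), len(value))

    def load(self, key):
        entry = self.keydir.get(key)
        if entry is None:
            return None
        offset, length = entry
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)

    def delete(self, key):
        if key in self.keydir:
            self._append(HEADER.pack(len(key), TOMBSTONE) + key)
            del self.keydir[key]

    def close(self):
        self._log.close()
```

Because records are only ever appended, a crash can at worst leave one incomplete record at the end of the file, which recovery detects and truncates. Old versions accumulate until a compaction step rewrites the log, which this sketch omits.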

Why This Project is Interesting:

Casky combines low-level C programming with modern database concepts, making it an ideal playground to explore storage engines, crash safety, and performance optimization. It’s small enough to complete during Hackweek, yet it provides a solid base for future experiments and more complex projects.

Goals

Working prototype with append-only storage and memtable.
TDD test suite covering core functionality and recovery.
Demonstration of basic operations: insert, load, delete.
Optional bonus: LRU caching, file compaction, performance benchmarks.

Future Directions:

After Hackweek, Casky can evolve into a backend engine for projects like PixelDB, supporting vector storage and approximate nearest neighbor search, combining low-level performance with cutting-edge AI retrieval applications.

Resources

The Bitcask paper: https://riak.com/assets/bitcask-intro.pdf
The Casky repository: https://github.com/thesp0nge/casky

Day 1

[0.10.0] - 2025-12-01

Added

Core in-memory KeyDir and EntryNode structures
API functions: casky_open, casky_close, casky_put, casky_get, casky_delete
Hash function: casky_djb2_hash_xor
Error handling via casky_errno
Unit tests for all APIs using standard asserts
Test cleanup of temporary files
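
The hash function's name suggests the XOR variant of the well-known djb2 string hash. A sketch of that variant is shown below, assuming a 32-bit result; Casky's C implementation may use a different width or details.

```python
def djb2_xor(data: bytes) -> int:
    """djb2 hash, XOR variant: h = (h * 33) ^ byte, starting from 5381."""
    h = 5381
    for byte in data:
        # Mask to 32 bits to mimic unsigned overflow in C (an assumption).
        h = ((h * 33) ^ byte) & 0xFFFFFFFF
    return h


empty_hash = djb2_xor(b"")   # 5381, the initial value
a_hash = djb2_xor(b"a")      # (5381 * 33) ^ 97 == 177604
```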

Changed

None (first MVP)

Fixed

None (first MVP)

Day 2

[0.20.0] - 2025-12-02

Sim racing track database by avicenzi

Description

Do you wonder which tracks are available in each sim racing game? Wonder no more.

Goals

Create a simple website that includes details about sim racing games.

The website should be static and built with Alpine.JS and TailwindCSS. Data should be consumed from JSON, easily done with Alpine.JS.

The main goal is to gather track information, because tracks vary by game. Older games might have older layouts, and newer games might have up-to-date layouts. Some games include historical layouts, some are laser scanned. Many tracks are available as DLCs.
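
One way to capture these variations is a flat list of records, one per game/track/layout combination, exported as JSON for the static site to consume. The sketch below uses Python only to generate and check such a file; the field names and the sample entries are hypothetical placeholders, not collected data.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TrackEntry:
    game: str
    track: str
    layout: str
    laser_scanned: bool
    dlc: bool
    historical: bool


def to_json(entries):
    """Serialize entries to the JSON text the site would load."""
    return json.dumps([asdict(e) for e in entries], indent=2)


def tracks_for_game(json_text, game):
    """Return the sorted, de-duplicated track names available in one game."""
    rows = json.loads(json_text)
    return sorted({row["track"] for row in rows if row["game"] == game})


# Placeholder entries for illustration only.
entries = [
    TrackEntry("Game A", "Track X", "GP", True, False, False),
    TrackEntry("Game A", "Track X", "1970s layout", False, True, True),
    TrackEntry("Game B", "Track Y", "GP", False, False, False),
]
data = to_json(entries)
```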

Initially include official tracks from:

These games have a short list of tracks and DLCs.

Resources

The hardest part is collecting information about tracks in each game. Active games usually have information on their website or even on Steam. Older games might be on Fandom or a Wiki. Real track information can be extracted from Wikipedia or the track website.
