Description

Uyuni currently has no high-availability (HA) setup. The idea is to explore setting up a read-only shadow instance of a Uyuni server and making it as useful as possible.

Possible things to look at:

  • Live sync of the database, probably using the WAL. Some tables may have to be skipped, or some features disabled, on the RO instance (taskomatic, PXT sessions…).
  • Can we use a load balancer that routes read-only requests to either instance and all other requests to the RW one? For example, packages and PXE data can be served by both instances, and so can API GET requests. Everything else would go to the RW instance.
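
The load-balancer idea in the second bullet can be sketched as a routing rule. The following is a minimal Python sketch, not Uyuni code: the `Backend` and `Router` names, the example URLs, and the assumption that the HTTP method alone identifies a read-only request are all illustrative. Some GET endpoints may still write to the database (session handling, for example), so a real rule would likely need refinement.

```python
from dataclasses import dataclass
from itertools import cycle

# Assumption: requests with these methods never modify server state.
READ_ONLY_METHODS = {"GET", "HEAD"}


@dataclass(frozen=True)
class Backend:
    name: str
    url: str        # hypothetical example URL
    writable: bool


class Router:
    """Send read-only requests to any backend and everything else to the RW one."""

    def __init__(self, backends):
        writable = [b for b in backends if b.writable]
        if len(writable) != 1:
            raise ValueError("expected exactly one writable backend")
        self._rw = writable[0]
        # Reads rotate over all backends, including the RW one.
        self._readers = cycle(backends)

    def pick(self, method: str) -> Backend:
        if method.upper() in READ_ONLY_METHODS:
            return next(self._readers)
        return self._rw


router = Router([
    Backend("primary", "https://uyuni-rw.example.com", writable=True),
    Backend("shadow", "https://uyuni-ro.example.com", writable=False),
])
```

Calling `router.pick("POST")` always returns the RW backend, while successive `router.pick("GET")` calls alternate between the two instances.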

Goals

  • Prepare a document explaining how to do it.
  • Open a PR with the code changes needed to support it.

Looking for hackers with the skills:

uyuni ha database postgresql

This project is part of:

Hack Week 25

Activity

  • about 1 month ago: j_renner liked this project.
  • about 1 month ago: e_bischoff liked this project.
  • about 1 month ago: deneb_alpha liked this project.
  • about 1 month ago: epenchev liked this project.
  • about 1 month ago: juliogonzalezgil liked this project.
  • about 1 month ago: ygutierrez liked this project.
  • about 1 month ago: oholecek liked this project.
  • about 1 month ago: oholecek joined this project.
  • about 1 month ago: oscar-barrios liked this project.
  • about 1 month ago: cbosdonnat added keyword "database" to this project.
  • about 1 month ago: cbosdonnat added keyword "postgresql" to this project.
  • about 1 month ago: cbosdonnat added keyword "ha" to this project.
  • about 1 month ago: cbosdonnat added keyword "uyuni" to this project.
  • about 1 month ago: cbosdonnat started this project.
  • about 1 month ago: cbosdonnat originated this project.

  • Comments

    • epenchev
      about 1 month ago by epenchev

      Hi, I think there are a few solutions that might help.

      Since I deal a lot with HA and databases, I would like to share my thoughts.

      One possible solution would be pgpool-II, which provides master-replica load balancing and automatic failover for PostgreSQL.

      This approach is described in detail here: https://medium.com/@deysouvik700/scaling-postgresql-with-pgpool-ii-master-replica-load-balancing-and-automatic-failover-091983d4dd9a. In the example architecture, the pgpool proxy itself is a single point of failure. The setup could be extended with a second proxy instance, with both instances managed by keepalived and a virtual IP. Of course, there are other resources you can refer to as well.

      Another possible solution, which is more automated, would be CloudNativePG (cnpg). However, this requires a K8s cluster for your stateful PostgreSQL workload, ideally an HA cluster of at least 3 nodes. That is the minimal setup, and all 3 nodes must combine the control plane and worker roles; otherwise the standard setup grows to 5 nodes (control plane nodes plus additional worker nodes). With cnpg you can create multiple services (rw, ro, r) within your cluster and point clients to them: https://cloudnative-pg.io/documentation/1.27/service_management/ and https://cloudnative-pg.io/documentation/1.27/architecture/.

      Something more experimental that I have been working on recently, and that I hope will be much easier to operate, is https://github.com/kqlite/kqlite. It is SQLite over the PostgreSQL wire protocol, with support for replication and clustering. However, this limits the database functionality to what SQLite offers: PostgreSQL-specific features and data types will not work with kqlite.

      P.S. There is also plenty of documentation on the standard approach: Patroni + HAProxy + etcd.

      • cbosdonnat
        about 1 month ago by cbosdonnat

        The first issue will be the replication of the DB itself. Since we have sequences and those are not logically replicated, we will have to check the possible options there.
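
One option to check for the sequence problem, if logical replication is used, is to copy sequence positions separately. The sketch below is an assumption-laden illustration, not a tested procedure for Uyuni: it assumes the current values are read from the `pg_sequences` view on the RW instance and that the generated statements are applied on a writable logical subscriber. The function name and the sequence name in the usage are hypothetical.

```python
def setval_statements(sequence_values):
    """Build one setval() call per sequence from {qualified_name: last_value}.

    sequence_values is assumed to come from a query such as
    SELECT schemaname, sequencename, last_value FROM pg_sequences
    run on the source database.
    """
    statements = []
    for name, last_value in sorted(sequence_values.items()):
        if last_value is None:
            # pg_sequences reports NULL for a sequence that was never used.
            continue
        # is_called = true: the next nextval() advances past last_value.
        statements.append(f"SELECT setval('{name}', {int(last_value)}, true);")
    return statements


# Hypothetical sequence name, for illustration only.
for stmt in setval_statements({"public.example_id_seq": 1041}):
    print(stmt)
```

The sketch interpolates names directly into SQL, so it is only suitable for trusted catalog output. With physical (WAL streaming) replication this step would not apply, since the standby is read-only and follows the primary's sequences.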

    Similar Projects

    Enhance setup wizard for Uyuni by PSuarezHernandez

    Description

    This project wants to enhance the initial setup of Uyuni after its installation, so that it is easier for a user to start using it.

    Uyuni currently uses "uyuni-tools" (mgradm) as the installation entrypoint to trigger the installation of Uyuni on the given host, but it does not really perform an initial setup, for instance:

    • user creation
    • adding products / channels
    • generating bootstrap repos
    • create activation keys
    • ...

    Goals

    • Provide initial setup wizard as part of mgradm uyuni installation

    Resources


    Enable more features in mcp-server-uyuni by j_renner

    Description

    I would like to contribute to mcp-server-uyuni (the MCP server for Uyuni / Multi-Linux Manager) by exposing additional features as tools. There are lots of relevant features to be found throughout the API, for example:

    • System operations and infos
    • System groups
    • Maintenance windows
    • Ansible
    • Reporting
    • ...

    At the end of the week I managed to enable basic system group operations:

    • List all system groups visible to the user
    • Create new system groups
    • List systems assigned to a group
    • Add and remove systems from groups

    Goals

    • Set up test environment locally with the MCP server and client + a recent MLM server [DONE]
    • Identify features and use cases offering a benefit with limited effort required for enablement [DONE]
    • Create a PR to the repo [DONE]

    Resources


    Flaky Tests AI Finder for Uyuni and MLM Test Suites by oscar-barrios

    Description

    Our current Grafana dashboards provide a great overview of test suite health, including a panel for "Top failed tests." However, identifying which of these failures are due to legitimate bugs versus intermittent "flaky tests" is a manual, time-consuming process. These flaky tests erode trust in our test suites and slow down development.

    This project aims to build a simple but powerful Python script that automates flaky test detection. The script will directly query our Prometheus instance for the historical data of each failed test, using the jenkins_build_test_case_failure_age metric. It will then format this data and send it to the Gemini API with a carefully crafted prompt, asking it to identify which tests show a flaky pattern.

    The final output will be a clean JSON list of the most probable flaky tests, which can then be used to populate a new "Top Flaky Tests" panel in our existing Grafana test suite dashboard.

    Goals

    By the end of Hack Week, we aim to have a single, working Python script that:

    1. Connects to Prometheus and executes a query to fetch detailed test failure history.
    2. Processes the raw data into a format suitable for the Gemini API.
    3. Successfully calls the Gemini API with the data and a clear prompt.
    4. Parses the AI's response to extract a simple list of flaky tests.
    5. Saves the list to a JSON file that can be displayed in Grafana.
    6. Populates a new panel in our dashboard that lists the flaky tests.
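
Step 2 above could look roughly like the following. This is a minimal sketch under stated assumptions, not the project's script: it assumes the input is the JSON body of a Prometheus range query (`resultType` "matrix"), that the test name lives in a label called `case` (a hypothetical label name), and that a drop in `jenkins_build_test_case_failure_age` means the test passed and then failed again. The function name is illustrative.

```python
def summarize_failures(prom_response, label="case"):
    """Turn a Prometheus range-query result into one compact record per test."""
    if prom_response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    summary = []
    for series in prom_response["data"]["result"]:
        # Each sample is [unix_timestamp, "value-as-string"].
        ages = [int(float(value)) for _, value in series["values"]]
        # Assumption: a decreasing age means the failure streak was interrupted.
        resets = sum(1 for prev, cur in zip(ages, ages[1:]) if cur < prev)
        summary.append({
            "test": series["metric"].get(label, "unknown"),
            "samples": len(ages),
            "max_age": max(ages, default=0),
            "resets": resets,
        })
    return summary
```

A compact summary like this keeps the prompt small; the per-test records can be serialized with `json.dumps` and embedded in the request text.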

    Resources

    Outcome


    Ansible to Salt integration by vizhestkov

    Description

    We already have initial integration of Ansible in Salt with the possibility to run playbooks from the salt-master on the salt-minion used as an Ansible Control node.

    In this project I want to check whether it is possible to make Ansible work over the Salt transport: basically, run Ansible playbooks through the existing, established Salt (ZeroMQ) transport without using ssh at all.

    This could be a good way for end users to reuse Ansible playbooks, or run the Ansible modules they are used to, on existing Salt (or Uyuni/SUSE Multi Linux Manager) infrastructure without complex configuration.

    Goals

    • [v] Prepare the testing environment with Salt and Ansible installed
    • [v] Discover Ansible codebase to figure out possible ways of integration
    • [v] Create Salt/Uyuni inventory module
    • [v] Make basic modules work without a separate ssh connection, reusing the existing Salt connection instead
    • [v] Test some most basic playbooks

    Resources

    GitHub page

    Video of the demo


    Move Uyuni Test Framework from Selenium to Playwright + AI by oscar-barrios

    Description

    This project aims to migrate the existing Uyuni Test Framework from Selenium to Playwright. The move will improve the stability, speed, and maintainability of our end-to-end tests by leveraging Playwright's modern features. We'll be rewriting the current Selenium code in Ruby to Playwright code in TypeScript, which includes updating the test framework runner, step definitions, and configurations. This is also necessary because we're moving from Cucumber Ruby to CucumberJS.

    If you're still curious about the AI in the title, it was just a way to grab your attention. Thanks for your understanding.

    Nah, let's be honest: AI helped a lot to vibe code a good part of the Ruby methods of the test framework, moving them to TypeScript, along with the migration from Capybara to Playwright. I've been using "Cline" as a plugin for the WebStorm IDE, with the Gemini API behind it.


    Goals

    • Migrate Core tests including Onboarding of clients
    • Improve test reliability: Measure and confirm a significant reduction of flakiness.
    • Implement a robust framework: Establish a well-structured and reusable Playwright test framework using CucumberJS.

    Resources


    Casky – Lightweight C Key-Value Engine with Crash Recovery by pperego

    Description

    Casky is a lightweight, crash-safe key-value store written in C, designed for fast storage and retrieval of data with a minimal footprint. Built using Test-Driven Development (TDD), Casky ensures reliability while keeping the codebase clean and maintainable. It is inspired by Bitcask and aims to provide a simple, embeddable storage engine that can be integrated into microservices, IoT devices, and other C-based applications.

    Objectives:

    • Implement a minimal key-value store with append-only file storage.
    • Support crash-safe persistence and recovery.
    • Expose a simple public API: store(key, value), load(key), delete(key).
    • Follow TDD methodology for robust and testable code.
    • Provide a foundation for future extensions, such as in-memory caching, compaction, and eventual integration with vector-based databases like PixelDB.
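
The append-only storage and recovery objectives can be illustrated with a small Bitcask-style sketch. It is written in Python for brevity and is not Casky's C implementation: the `TinyStore` class, its record layout, and the tombstone marker are assumptions made for this example, and it omits the checksums a real engine would use to detect corrupted records.

```python
import os
import struct
import tempfile

HEADER = struct.Struct("<II")  # key length, value length
TOMBSTONE = 0xFFFFFFFF         # value-length marker for a deleted key


class TinyStore:
    """Append-only log plus an in-memory index (keydir) of live values."""

    def __init__(self, path):
        self.path = path
        self.keydir = {}  # key -> (value offset, value length)
        open(path, "ab").close()  # create the log if it does not exist
        self._end = self._recover()
        self._log = open(path, "ab")

    def _recover(self):
        """Rebuild keydir by replaying the log; drop a half-written tail."""
        good = 0
        with open(self.path, "rb") as f:
            while True:
                header = f.read(HEADER.size)
                if len(header) < HEADER.size:
                    break
                klen, vlen = HEADER.unpack(header)
                size = 0 if vlen == TOMBSTONE else vlen
                body = f.read(klen + size)
                if len(body) < klen + size:
                    break
                key = body[:klen]
                if vlen == TOMBSTONE:
                    self.keydir.pop(key, None)
                else:
                    self.keydir[key] = (good + HEADER.size + klen, vlen)
                good = f.tell()
        os.truncate(self.path, good)  # discard any incomplete record
        return good

    def _append(self, key, value):
        vlen = TOMBSTONE if value is None else len(value)
        record = HEADER.pack(len(key), vlen) + key + (value or b"")
        self._log.write(record)
        self._log.flush()
        os.fsync(self._log.fileno())  # make the record durable before indexing
        start, self._end = self._end, self._end + len(record)
        return start + HEADER.size + len(key)

    def store(self, key, value):
        self.keydir[key] = (self._append(key, value), len(value))

    def load(self, key):
        if key not in self.keydir:
            return None
        offset, length = self.keydir[key]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)

    def delete(self, key):
        if self.keydir.pop(key, None) is not None:
            self._append(key, None)

    def close(self):
        self._log.close()


path = os.path.join(tempfile.mkdtemp(), "data.log")
db = TinyStore(path)
db.store(b"k", b"v1")
db.store(b"k", b"v2")  # the old record stays in the log; the index moves on
db.close()
reopened = TinyStore(path)  # rebuilds the index by replaying the log
```

Because records are only ever appended, a crash can at worst leave an incomplete record at the end of the file, which recovery truncates. Stale records accumulate until a compaction step rewrites the log, which this sketch does not include.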

    Why This Project is Interesting:

    Casky combines low-level C programming with modern database concepts, making it an ideal playground to explore storage engines, crash safety, and performance optimization. It’s small enough to complete during Hackweek, yet it provides a solid base for future experiments and more complex projects.

    Goals

    • Working prototype with append-only storage and memtable.
    • TDD test suite covering core functionality and recovery.
    • Demonstration of basic operations: insert, load, delete.
    • Optional bonus: LRU caching, file compaction, performance benchmarks.

    Future Directions:

    After Hackweek, Casky can evolve into a backend engine for projects like PixelDB, supporting vector storage and approximate nearest neighbor search, combining low-level performance with cutting-edge AI retrieval applications.

    Resources

    The Bitcask paper: https://riak.com/assets/bitcask-intro.pdf
    The Casky repository: https://github.com/thesp0nge/casky

    Day 1

    [0.10.0] - 2025-12-01

    Added

    • Core in-memory KeyDir and EntryNode structures
    • API functions: casky_open, casky_close, casky_put, casky_get, casky_delete
    • Hash function: casky_djb2_hash_xor
    • Error handling via casky_errno
    • Unit tests for all APIs using standard asserts
    • Test cleanup of temporary files

    Changed

    • None (first MVP)

    Fixed

    • None (first MVP)

    Day 2

    [0.20.0] - 2025-12-02


    Work on kqlite (Lightweight remote SQLite with high availability and auto failover). by epenchev

    Description

    Continue the work on kqlite (Lightweight remote SQLite with high availability and auto failover).
    It's a solution for applications that require High Availability but don't need all the features of a complete RDBMS and can fit SQLite in their use case.
    kqlite could also serve as a lightweight storage backend for K8s (https://docs.k3s.io/datastore) and the Edge, making HA possible with only 2 nodes.

    Goals

    Push kqlite to a beta version.
    kqlite as library for Go programs.

    Resources

    https://github.com/kqlite/kqlite


    GRIT: GRaphs In Time by fvanlankvelt

    Description

    The current implementation of the Time-Travelling Topology database, StackGraph, has served SUSE Observability well over the years. But it is dependent on a number of complex components - Zookeeper, HDFS, HBase, Tephra. These lead to a large number of failure scenarios and parameters to tweak for optimal performance.

    The goal of this project is to take the high-level requirements (time-travelling topology, querying over time, transactional changes to topology, scalability) and design/prototype key components, to see where they would lead us if we were to start from scratch today.

    An example would be to use RocksDB to persist topology history. Its user-defined timestamps seem to match well with time-travelling, and it has transaction support with fine-grained conflict detection.

    Goals

    Determine the feasibility of implementing the model on a whole new architecture. See how to model the graph and its history such that updates and querying are performant and transactional conflicts are minimized. Build a prototype to validate the model.

    Resources

    Backend developers, preferably experienced in distributed systems. Programming language: Scala 3, with some C++ for low-level parts.

    Progress

    The project has started on GitHub as GRaphs In Time, a C++ project that

    • embeds RocksDB for persistence,
    • uses (nu)Raft for replication/consensus,
    • supports large transactions, with
    • SNAPSHOT isolation
    • time-travel
    • graph operations (add/remove vertex/edge, indexes)


    Port the classic browser game HackTheNet to PHP 8 by dgedon

    Description

    The classic browser game HackTheNet from 2004 still runs on PHP 4/5 and MySQL 5 and needs a port to PHP 8 and e.g. MariaDB.

    Goals

    • Port the game to PHP 8 and MariaDB 11
    • Create a container where the game server can simply be started/stopped

    Resources

    • https://github.com/nodeg/hackthenet


    Collection and organisation of information about Bulgarian schools by iivanov

    Description

    To achieve this it will be necessary:

    • Collect/download raw data from various government and non-governmental organizations
    • Clean up raw data and organise it in some kind of database.
    • Create a tool to make queries easy.
    • Or perhaps dump all data into AI and ask questions in natural language.

    Goals

    When a particular school is selected, information like this will be provided:

    • School scores on national exams.
    • School scores from the external evaluations exams.
    • School town, municipality and region.
    • Employment rate in a town or municipality.
    • Average health of the population in the region.

    Resources

    Some of these are available only in Bulgarian.

    • https://danybon.com/klasazia
    • https://nvoresults.com/index.html
    • https://ri.mon.bg/active-institutions
    • https://www.nsi.bg/nrnm/ekatte/archive

    Results

    • Information about all Bulgarian schools with their scores during recent years cleaned and organised into SQL tables
    • Information about all Bulgarian villages, cities, municipalities and districts cleaned and organised into SQL tables
    • Census information about all Bulgarian villages and cities since the beginning of this century cleaned and organised into SQL tables.
    • Information about religion and ethnicity in all Bulgarian municipalities cleaned and organised into SQL tables.
    • Data successfully loaded into a locally running Ollama with the help of Vanna.AI
    • Seems to be usable.

    TODO

    • Add more statistical information about municipalities and ....

    Code and data