Description

For now installing Uyuni on Kubernetes requires running mgradm on a cluster node... which is not what users would do in the Kubernetes world. The idea is to implement an installation based only on helm charts and probably an operator.

Goals

Install Uyuni from Rancher UI.

Resources

Looking for hackers with the skills:

uyuni kubernetes golang operator

This project is part of:

Hack Week 24

Activity

  • about 1 year ago: ncarmo liked this project.
  • about 1 year ago: j_renner liked this project.
  • about 1 year ago: vizhestkov liked this project.
  • about 1 year ago: jmeza liked this project.
  • about 1 year ago: wombelix liked this project.
  • about 1 year ago: cbosdonnat added keyword "uyuni" to this project.
  • about 1 year ago: cbosdonnat added keyword "kubernetes" to this project.
  • about 1 year ago: cbosdonnat added keyword "golang" to this project.
  • about 1 year ago: cbosdonnat added keyword "operator" to this project.
  • about 1 year ago: ygutierrez liked this project.
  • about 1 year ago: joachimwerner liked this project.
  • about 1 year ago: atgracey liked this project.
  • about 1 year ago: juliogonzalezgil liked this project.
  • about 1 year ago: dgedon liked this project.
  • about 1 year ago: cbosdonnat started this project.
  • about 1 year ago: cbosdonnat originated this project.

  • Comments

    • cbosdonnat
      about 1 year ago by cbosdonnat | Reply

      At the end of the hackweek 24, the result is very encouraging:

      • The server setup can now run in a Job instead of inside the running deployment
      • The server installs correctly and the deployment is ready
      • Salt systems can bootstrap when using LoadBalancer services on k3s.
      • Uninstalling the custom server resource cleans everything out of the box.
      • The only things the user needs is to define the secrets and SSL certificates or the issuers for cert-manager, as well as an uyuni server custom resource.

      The code:

      What's next:

      • Implement migration from an old RPM-based server
      • Implement update / upgrade of the server
      • Play with more network setups
      • Test with more kubernetes distros

    • cbosdonnat
      about 1 year ago by cbosdonnat | Reply

      Marked the project as completed as the initial stage is complete. PRs will eventually be polished and merged

    • cbosdonnat
      about 1 year ago by cbosdonnat | Reply

      Demo YAML file and video are available in https://github.com/cbosdo/uyuni-operator/tree/main/docs

    Similar Projects

    Flaky Tests AI Finder for Uyuni and MLM Test Suites by oscar-barrios

    Description

    Our current Grafana dashboards provide a great overview of test suite health, including a panel for "Top failed tests." However, identifying which of these failures are due to legitimate bugs versus intermittent "flaky tests" is a manual, time-consuming process. These flaky tests erode trust in our test suites and slow down development.

    This project aims to build a simple but powerful Python script that automates flaky test detection. The script will directly query our Prometheus instance for the historical data of each failed test, using the jenkins_build_test_case_failure_age metric. It will then format this data and send it to the Gemini API with a carefully crafted prompt, asking it to identify which tests show a flaky pattern.

    The final output will be a clean JSON list of the most probable flaky tests, which can then be used to populate a new "Top Flaky Tests" panel in our existing Grafana test suite dashboard.

    Goals

    By the end of Hack Week, we aim to have a single, working Python script that:

    1. Connects to Prometheus and executes a query to fetch detailed test failure history.
    2. Processes the raw data into a format suitable for the Gemini API.
    3. Successfully calls the Gemini API with the data and a clear prompt.
    4. Parses the AI's response to extract a simple list of flaky tests.
    5. Saves the list to a JSON file that can be displayed in Grafana.
    6. New panel in our Dashboard listing the Flaky tests

    Resources

    Outcome


    Move Uyuni Test Framework from Selenium to Playwright + AI by oscar-barrios

    Description

    This project aims to migrate the existing Uyuni Test Framework from Selenium to Playwright. The move will improve the stability, speed, and maintainability of our end-to-end tests by leveraging Playwright's modern features. We'll be rewriting the current Selenium code in Ruby to Playwright code in TypeScript, which includes updating the test framework runner, step definitions, and configurations. This is also necessary because we're moving from Cucumber Ruby to CucumberJS.

    If you're still curious about the AI in the title, it was just a way to grab your attention. Thanks for your understanding.

    Nah, let's be honest add-emoji AI helped a lot to vibe code a good part of the Ruby methods of the Test framework, moving them to Typescript, along with the migration from Capybara to Playwright. I've been using "Cline" as plugin for WebStorm IDE, using Gemini API behind it.


    Goals

    • Migrate Core tests including Onboarding of clients
    • Improve test reliabillity: Measure and confirm a significant reduction of flakiness.
    • Implement a robust framework: Establish a well-structured and reusable Playwright test framework using the CucumberJS

    Resources


    Testing and adding GNU/Linux distributions on Uyuni by juliogonzalezgil

    Join the Gitter channel! https://gitter.im/uyuni-project/hackweek

    Uyuni is a configuration and infrastructure management tool that saves you time and headaches when you have to manage and update tens, hundreds or even thousands of machines. It also manages configuration, can run audits, build image containers, monitor and much more!

    Currently there are a few distributions that are completely untested on Uyuni or SUSE Manager (AFAIK) or just not tested since a long time, and could be interesting knowing how hard would be working with them and, if possible, fix whatever is broken.

    For newcomers, the easiest distributions are those based on DEB or RPM packages. Distributions with other package formats are doable, but will require adapting the Python and Java code to be able to sync and analyze such packages (and if salt does not support those packages, it will need changes as well). So if you want a distribution with other packages, make sure you are comfortable handling such changes.

    No developer experience? No worries! We had non-developers contributors in the past, and we are ready to help as long as you are willing to learn. If you don't want to code at all, you can also help us preparing the documentation after someone else has the initial code ready, or you could also help with testing :-)

    The idea is testing Salt and Salt-ssh clients, but NOT traditional clients, which are deprecated.

    To consider that a distribution has basic support, we should cover at least (points 3-6 are to be tested for both salt minions and salt ssh minions):

    1. Reposync (this will require using spacewalk-common-channels and adding channels to the .ini file)
    2. Onboarding (salt minion from UI, salt minion from bootstrap scritp, and salt-ssh minion) (this will probably require adding OS to the bootstrap repository creator)
    3. Package management (install, remove, update...)
    4. Patching
    5. Applying any basic salt state (including a formula)
    6. Salt remote commands
    7. Bonus point: Java part for product identification, and monitoring enablement
    8. Bonus point: sumaform enablement (https://github.com/uyuni-project/sumaform)
    9. Bonus point: Documentation (https://github.com/uyuni-project/uyuni-docs)
    10. Bonus point: testsuite enablement (https://github.com/uyuni-project/uyuni/tree/master/testsuite)

    If something is breaking: we can try to fix it, but the main idea is research how supported it is right now. Beyond that it's up to each project member how much to hack :-)

    • If you don't have knowledge about some of the steps: ask the team
    • If you still don't know what to do: switch to another distribution and keep testing.

    This card is for EVERYONE, not just developers. Seriously! We had people from other teams helping that were not developers, and added support for Debian and new SUSE Linux Enterprise and openSUSE Leap versions :-)

    Pending

    Debian 13

    The new version of the beloved Debian GNU/Linux OS

    • [W] Reposync (this will require using spacewalk-common-channels and adding channels to the .ini file)
    • [ ] Onboarding (salt minion from UI, salt minion from bootstrap script, and salt-ssh minion) (this will probably require adding OS to the bootstrap repository creator)
    • [ ] Package management (install, remove, update...)
    • [ ] Patching (if patch information is available, could require writing some code to parse it, but IIRC we have support for Ubuntu already). Probably not for Debian as IIRC we don't support patches yet.
    • [ ] Applying any basic salt state (including a formula)
    • [ ] Salt remote commands
    • [ ] Bonus point: Java part for product identification, and monitoring enablement
    • [ ] Bonus point: sumaform enablement (https://github.com/uyuni-project/sumaform)
    • [ ] Bonus point: Documentation (https://github.com/uyuni-project/uyuni-docs)
    • [ ] Bonus point: testsuite enablement (https://github.com/uyuni-project/uyuni/tree/master/testsuite)


    Enable more features in mcp-server-uyuni by j_renner

    Description

    I would like to contribute to mcp-server-uyuni, the MCP server for Uyuni / Multi-Linux Manager) exposing additional features as tools. There is lots of relevant features to be found throughout the API, for example:

    • System operations and infos
    • System groups
    • Maintenance windows
    • Ansible
    • Reporting
    • ...

    Goals

    • Set up test environment locally with the MCP server and client + a recent MLM server and 1-2 managed clients
    • Identify features and use cases offering a benefit with limited effort required for enablement
    • Create a PR to the repo

    Resources


    Set Uyuni to manage edge clusters at scale by RDiasMateus

    Description

    Prepare a Poc on how to use MLM to manage edge clusters. Those cluster are normally equal across each location, and we have a large number of them.

    The goal is to produce a set of sets/best practices/scripts to help users manage this kind of setup.

    Goals

    step 1: Manual set-up

    Goal: Have a running application in k3s and be able to update it using System Update Controler (SUC)

    • Deploy Micro 6.2 machine
    • Deploy k3s - single node

      • https://docs.k3s.io/quick-start
    • Build/find a simple web application (static page)

      • Build/find a helmchart to deploy the application
    • Deploy the application on the k3s cluster

    • Install App updates through helm update

    • Install OS updates using MLM

    step 2: Automate day 1

    Goal: Trigger the application deployment and update from MLM

    • Salt states For application (with static data)
      • Deploy the application helmchart, if not present
      • install app updates through helmchart parameters
    • Link it to GIT
      • Define how to link the state to the machines (based in some pillar data? Using configuration channels by importing the state? Naming convention?)
      • Use git update to trigger helmchart app update
    • Recurrent state applying configuration channel?

    step 3: Multi-node cluster

    Goal: Use SUC to update a multi-node cluster.

    • Create a multi-node cluster
    • Deploy application
      • call the helm update/install only on control plane?
    • Install App updates through helm update
    • Prepare a SUC for OS update (k3s also? How?)
      • https://github.com/rancher/system-upgrade-controller
      • https://documentation.suse.com/cloudnative/k3s/latest/en/upgrades/automated.html
      • Update/deploy the SUC?
      • Update/deploy the SUC CRD with the update procedure


    Rancher/k8s Trouble-Maker by tonyhansen

    Project Description

    When studying for my RHCSA, I found trouble-maker, which is a program that breaks a Linux OS and requires you to fix it. I want to create something similar for Rancher/k8s that can allow for troubleshooting an unknown environment.

    Goals for Hackweek 25

    • Update to modern Rancher and verify that existing tests still work
    • Change testing logic to populate secrets instead of requiring a secondary script
    • Add new tests

    Goals for Hackweek 24 (Complete)

    • Create a basic framework for creating Rancher/k8s cluster lab environments as needed for the Break/Fix
    • Create at least 5 modules that can be applied to the cluster and require troubleshooting

    Resources

    • https://github.com/celidon/rancher-troublemaker
    • https://github.com/rancher/terraform-provider-rancher2
    • https://github.com/rancher/tf-rancher-up
    • https://github.com/rancher/quickstart


    Kubernetes-Based ML Lifecycle Automation by lmiranda

    Description

    This project aims to build a complete end-to-end Machine Learning pipeline running entirely on Kubernetes, using Go, and containerized ML components.

    The pipeline will automate the lifecycle of a machine learning model, including:

    • Data ingestion/collection
    • Model training as a Kubernetes Job
    • Model artifact storage in an S3-compatible registry (e.g. Minio)
    • A Go-based deployment controller that automatically deploys new model versions to Kubernetes using Rancher
    • A lightweight inference service that loads and serves the latest model
    • Monitoring of model performance and service health through Prometheus/Grafana

    The outcome is a working prototype of an MLOps workflow that demonstrates how AI workloads can be trained, versioned, deployed, and monitored using the Kubernetes ecosystem.

    Goals

    By the end of Hack Week, the project should:

    1. Produce a fully functional ML pipeline running on Kubernetes with:

      • Data collection job
      • Training job container
      • Storage and versioning of trained models
      • Automated deployment of new model versions
      • Model inference API service
      • Basic monitoring dashboards
    2. Showcase a Go-based deployment automation component, which scans the model registry and automatically generates & applies Kubernetes manifests for new model versions.

    3. Enable continuous improvement by making the system modular and extensible (e.g., additional models, metrics, autoscaling, or drift detection can be added later).

    4. Prepare a short demo explaining the end-to-end process and how new models flow through the system.

    Resources

    Project Repository


    The Agentic Rancher Experiment: Do Androids Dream of Electric Cattle? by moio

    Rancher is a beast of a codebase. Let's investigate if the new 2025 generation of GitHub Autonomous Coding Agents and Copilot Workspaces can actually tame it. A GitHub robot mascot trying to lasso a blue bull with a Kubernetes logo tatooed on it


    The Plan

    Create a sandbox GitHub Organization, clone in key Rancher repositories, and let the AI loose to see if it can handle real-world enterprise OSS maintenance - or if it just hallucinates new breeds of Kubernetes resources!

    Specifically, throw "Agentic Coders" some typical tasks in a complex, long-lived open-source project, such as:


    The Grunt Work: generate missing GoDocs, unit tests, and refactorings. Rebase PRs.

    The Complex Stuff: fix actual (historical) bugs and feature requests to see if they can traverse the complexity without (too much) human hand-holding.

    Hunting Down Gaps: find areas lacking in docs, areas of improvement in code, dependency bumps, and so on.


    If time allows, also experiment with Model Context Protocol (MCP) to give agents context on our specific build pipelines and CI/CD logs.

    Why?

    We know AI can write "Hello World." and also moderately complex programs from a green field. But can it rebase a 3-month-old PR with conflicts in rancher/rancher? I want to find the breaking point of current AI agents to determine if and how they can help us to reduce our technical debt, work faster and better. At the same time, find out about pitfalls and shortcomings.

    The Outputs

    ❥ A "State of the Agentic Union" for SUSE engineers, detailing what works, what explodes, and how much coffee we can drink while the robots do the rebasing.

    ❥ Honest, Daily Updates With All the Gory Details


    Self-Scaling LLM Infrastructure Powered by Rancher by ademicev0

    Self-Scaling LLM Infrastructure Powered by Rancher

    logo


    Description

    The Problem

    Running LLMs can get expensive and complex pretty quickly.

    Today there are typically two choices:

    1. Use cloud APIs like OpenAI or Anthropic. Easy to start with, but costs add up at scale.
    2. Self-host everything - set up Kubernetes, figure out GPU scheduling, handle scaling, manage model serving... it's a lot of work.

    What if there was a middle ground?

    What if infrastructure scaled itself instead of making you scale it?

    Can we use existing Rancher capabilities like CAPI, autoscaling, and GitOps to make this simpler instead of building everything from scratch?

    Project Repository: github.com/alexander-demicev/llmserverless


    What This Project Does

    A key feature is hybrid deployment: requests can be routed based on complexity or privacy needs. Simple or low-sensitivity queries can use public APIs (like OpenAI), while complex or private requests are handled in-house on local infrastructure. This flexibility allows balancing cost, privacy, and performance - using cloud for routine tasks and on-premises resources for sensitive or demanding workloads.

    A complete, self-scaling LLM infrastructure that:

    • Scales to zero when idle (no idle costs)
    • Scales up automatically when requests come in
    • Adds more nodes when needed, removes them when demand drops
    • Runs on any infrastructure - laptop, bare metal, or cloud

    Think of it as "serverless for LLMs" - focus on building, the infrastructure handles itself.

    How It Works

    A combination of open source tools working together:

    Flow:

    • Users interact with OpenWebUI (chat interface)
    • Requests go to LiteLLM Gateway
    • LiteLLM routes requests to:
      • Ollama (Knative) for local model inference (auto-scales pods)
      • Or cloud APIs for fallback


    Technical talks at universities by agamez

    Description

    This project aims to empower the next generation of tech professionals by offering hands-on workshops on containerization and Kubernetes, with a strong focus on open-source technologies. By providing practical experience with these cutting-edge tools and fostering a deep understanding of open-source principles, we aim to bridge the gap between academia and industry.

    For now, the scope is limited to Spanish universities, since we already have the contacts and have started some conversations.

    Goals

    • Technical Skill Development: equip students with the fundamental knowledge and skills to build, deploy, and manage containerized applications using open-source tools like Kubernetes.
    • Open-Source Mindset: foster a passion for open-source software, encouraging students to contribute to open-source projects and collaborate with the global developer community.
    • Career Readiness: prepare students for industry-relevant roles by exposing them to real-world use cases, best practices, and open-source in companies.

    Resources

    • Instructors: experienced open-source professionals with deep knowledge of containerization and Kubernetes.
    • SUSE Expertise: leverage SUSE's expertise in open-source technologies to provide insights into industry trends and best practices.


    Play with the userfaultfd(2) system call and download on demand using HTTP Range Requests with Golang by rbranco

    Description

    The userfaultfd(2) is a cool system call to handle page faults in user-space. This should allow me to list the contents of an ISO or similar archive without downloading the whole thing. The userfaultfd(2) part can also be done in theory with the PROT_NONE mprotect + SIGSEGV trick, for complete Unix portability, though reportedly being slower.

    Goals

    1. Create my own library for userfaultfd(2) in Golang.
    2. Create my own library for HTTP Range Requests.
    3. Complete portability with Unix.
    4. Benchmarks.
    5. Contribute some tests to LTP.

    Resources

    1. https://docs.kernel.org/admin-guide/mm/userfaultfd.html
    2. https://www.cons.org/cracauer/cracauer-userfaultfd.html


    Rewrite Distrobox in go (POC) by fabriziosestito

    Description

    Rewriting Distrobox in Go.

    Main benefits:

    • Easier to maintain and to test
    • Adapter pattern for different container backends (LXC, systemd-nspawn, etc.)

    Goals

    • Build a minimal starting point with core commands
    • Keep the CLI interface compatible: existing users shouldn't notice any difference
    • Use a clean Go architecture with adapters for different container backends
    • Keep dependencies minimal and binary size small
    • Benchmark against the original shell script

    Resources

    • Upstream project: https://github.com/89luca89/distrobox/
    • Distrobox site: https://distrobox.it/
    • ArchWiki: https://wiki.archlinux.org/title/Distrobox


    go-git: unlocking SHA256-based repository cloning ahead of git v3 by pgomes

    Description

    The go-git library implements the git internals in pure Go, so that any Go application can handle not only Git repositories, but also lower-level primitives (e.g. packfiles, idxfiles, etc) without needing to shell out to the git binary.

    The focus for this Hackweek is to fast track key improvements for the project ahead of the upstream release of Git V3, which may take place at some point next year.

    Goals

    Stretch goals

    Resources

    • https://github.com/go-git/go-git/
    • https://go-git.github.io/docs/


    Create a Cloud-Native policy engine with notifying capabilities to optimize resource usage by gbazzotti

    Description

    The goal of this project is to begin the initial phase of development of an all-in-one Cloud-Native Policy Engine that notifies resource owners when their resources infringe predetermined policies. This was inspired by a current issue in the CES-SRE Team where other solutions seemed to not exactly correspond to the needs of the specific workloads running on the Public Cloud Team space.

    The initial architecture can be checked out on the Repository listed under Resources.

    Among the features that will differ this project from other monitoring/notification systems:

    • Pre-defined sensible policies written at the software-level, avoiding a learning curve by requiring users to write their own policies
    • All-in-one functionality: logging, mailing and all other actions are not required to install any additional plugins/packages
    • Easy account management, being able to parse all required configuration by a single JSON file
    • Eliminate integrations by not requiring metrics to go through a data-agreggator

    Goals

    • Create a minimal working prototype following the workflow specified on the documentation
    • Provide instructions on installation/usage
    • Work on email notifying capabilities

    Resources


    SUSE Health Check Tools by roseswe

    SUSE HC Tools Overview

    A collection of tools written in Bash or Go 1.24++ to make life easier with handling of a bunch of tar.xz balls created by supportconfig.

    Background: For SUSE HC we receive a bunch of supportconfig tar balls to check them for misconfiguration, areas for improvement or future changes.

    Main focus on these HC are High Availability (pacemaker), SLES itself and SAP workloads, esp. around the SUSE best practices.

    Goals

    • Overall improvement of the tools
    • Adding new collectors
    • Add support for SLES16

    Resources

    csv2xls* example.sh go.mod listprodids.txt sumtext* trails.go README.md csv2xls.go exceltest.go go.sum m.sh* sumtext.go vercheck.py* config.ini csvfiles/ getrpm* listprodids* rpmdate.sh* sumxls* verdriver* credtest.go example.py getrpm.go listprodids.go sccfixer.sh* sumxls.go verdriver.go

    docollall.sh* extracthtml.go gethostnamectl* go.sum numastat.go cpuvul* extractcluster.go firmwarebug* gethostnamectl.go m.sh* numastattest.go cpuvul.go extracthtml* firmwarebug.go go.mod numastat* xtr_cib.sh*

    $ getrpm -r pacemaker >> Product ID: 2795 (SUSE Linux Enterprise Server for SAP Applications 15 SP7 x86_64), RPM Name: +--------------+----------------------------+--------+--------------+--------------------+ | Package Name | Version | Arch | Release | Repository | +--------------+----------------------------+--------+--------------+--------------------+ | pacemaker | 2.1.10+20250718.fdf796ebc8 | x86_64 | 150700.3.3.1 | sle-ha/15.7/x86_64 | | pacemaker | 2.1.9+20250410.471584e6a2 | x86_64 | 150700.1.9 | sle-ha/15.7/x86_64 | +--------------+----------------------------+--------+--------------+--------------------+ Total packages found: 2