The goal of this project is two fold.
The first is to better learn and understand why Kubernetes might do something in the way that it does (especially in the control plane)
The second is to create a container orchestration tool like no one has ever seen before.
Sound interesting?
We will have daily meetings june 24th - 28th at 12pm EST where everyone can join and sync what they are doing and plan to do.
The idea is NOT to make another kubernetes. It is to rethink how they did everything.
The basis of Gary is on Promise Theory, I will give a run down of what that is in the first meeting and might write something up if I get the chance.
Want to read more now? check out the docs here on github have a idea? make a PR!
Looking for hackers with the skills:
This project is part of:
Hack Week 18
Activity
Comments
Be the first to comment!
Similar Projects
Exploring Modern AI Trends and Kubernetes-Based AI Infrastructure by jluo
Description
Build a solid understanding of the current landscape of Artificial Intelligence and how modern cloud-native technologies—especially Kubernetes—support AI workloads.
Goals
Use Gemini Learning Mode to guide the exploration, surface relevant concepts, and structure the learning journey:
- Gain insight into the latest AI trends, tools, and architectural concepts.
- Understand how Kubernetes and related cloud-native technologies are used in the AI ecosystem (model training, deployment, orchestration, MLOps).
Resources
Red Hat AI Topic Articles
- https://www.redhat.com/en/topics/ai
Kubeflow Documentation
- https://www.kubeflow.org/docs/
Q4 2025 CNCF Technology Landscape Radar report:
- https://www.cncf.io/announcements/2025/11/11/cncf-and-slashdata-report-finds-leading-ai-tools-gaining-adoption-in-cloud-native-ecosystems/
- https://www.cncf.io/wp-content/uploads/2025/11/cncfreporttechradar_111025a.pdf
Agent-to-Agent (A2A) Protocol
- https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/
Rancher/k8s Trouble-Maker by tonyhansen
Project Description
When studying for my RHCSA, I found trouble-maker, which is a program that breaks a Linux OS and requires you to fix it. I want to create something similar for Rancher/k8s that can allow for troubleshooting an unknown environment.
Goals for Hackweek 25
- Update to modern Rancher and verify that existing tests still work
- Change testing logic to populate secrets instead of requiring a secondary script
- Add new tests
Goals for Hackweek 24 (Complete)
- Create a basic framework for creating Rancher/k8s cluster lab environments as needed for the Break/Fix
- Create at least 5 modules that can be applied to the cluster and require troubleshooting
Resources
- https://github.com/celidon/rancher-troublemaker
- https://github.com/rancher/terraform-provider-rancher2
- https://github.com/rancher/tf-rancher-up
- https://github.com/rancher/quickstart
Self-Scaling LLM Infrastructure Powered by Rancher by ademicev0
Self-Scaling LLM Infrastructure Powered by Rancher

Description
The Problem
Running LLMs can get expensive and complex pretty quickly.
Today there are typically two choices:
- Use cloud APIs like OpenAI or Anthropic. Easy to start with, but costs add up at scale.
- Self-host everything - set up Kubernetes, figure out GPU scheduling, handle scaling, manage model serving... it's a lot of work.
What if there was a middle ground?
What if infrastructure scaled itself instead of making you scale it?
Can we use existing Rancher capabilities like CAPI, autoscaling, and GitOps to make this simpler instead of building everything from scratch?
Project Repository: github.com/alexander-demicev/llmserverless
What This Project Does
A key feature is hybrid deployment: requests can be routed based on complexity or privacy needs. Simple or low-sensitivity queries can use public APIs (like OpenAI), while complex or private requests are handled in-house on local infrastructure. This flexibility allows balancing cost, privacy, and performance - using cloud for routine tasks and on-premises resources for sensitive or demanding workloads.
A complete, self-scaling LLM infrastructure that:
- Scales to zero when idle (no idle costs)
- Scales up automatically when requests come in
- Adds more nodes when needed, removes them when demand drops
- Runs on any infrastructure - laptop, bare metal, or cloud
Think of it as "serverless for LLMs" - focus on building, the infrastructure handles itself.
How It Works
A combination of open source tools working together:
Flow:
- Users interact with OpenWebUI (chat interface)
- Requests go to LiteLLM Gateway
- LiteLLM routes requests to:
- Ollama (Knative) for local model inference (auto-scales pods)
- Or cloud APIs for fallback
Kubernetes-Based ML Lifecycle Automation by lmiranda
Description
This project aims to build a complete end-to-end Machine Learning pipeline running entirely on Kubernetes, using Go, and containerized ML components.
The pipeline will automate the lifecycle of a machine learning model, including:
- Data ingestion/collection
- Model training as a Kubernetes Job
- Model artifact storage in an S3-compatible registry (e.g. Minio)
- A Go-based deployment controller that automatically deploys new model versions to Kubernetes using Rancher
- A lightweight inference service that loads and serves the latest model
- Monitoring of model performance and service health through Prometheus/Grafana
The outcome is a working prototype of an MLOps workflow that demonstrates how AI workloads can be trained, versioned, deployed, and monitored using the Kubernetes ecosystem.
Goals
By the end of Hack Week, the project should:
Produce a fully functional ML pipeline running on Kubernetes with:
- Data collection job
- Training job container
- Storage and versioning of trained models
- Automated deployment of new model versions
- Model inference API service
- Basic monitoring dashboards
Showcase a Go-based deployment automation component, which scans the model registry and automatically generates & applies Kubernetes manifests for new model versions.
Enable continuous improvement by making the system modular and extensible (e.g., additional models, metrics, autoscaling, or drift detection can be added later).
Prepare a short demo explaining the end-to-end process and how new models flow through the system.
Resources
Preparing KubeVirtBMC for project transfer to the KubeVirt organization by zchang
Description
KubeVirtBMC is preparing to transfer the project to the KubeVirt organization. One requirement is to enhance the modeling design's security. The current v1alpha1 API (the VirtualMachineBMC CRD) was designed during the proof-of-concept stage. It's immature and inherently insecure due to its cross-namespace object references, exposing security concerns from an RBAC perspective.
The other long-awaited feature is the ability to mount virtual media so that virtual machines can boot from remote ISO images.
Goals
- Deliver the v1beta1 API and its corresponding controller implementation
- Enable the Redfish virtual media mount function for KubeVirt virtual machines
Resources
- The KubeVirtBMC repo: https://github.com/starbops/kubevirtbmc
- The new v1beta1 API: https://github.com/starbops/kubevirtbmc/issues/83
- Redfish virtual media mount: https://github.com/starbops/kubevirtbmc/issues/44
Build a terminal user-interface (TUI) for Agama by IGonzalezSosa
Description
Officially, Agama offers two different user interfaces. On the one hand, we have the web-based interface, which is the one you see when you run the installation media. On the other hand, we have a command-line interface. In both cases, you can use them using a remote system, either using a browser or the agama CLI.
We would expect most of the cases to be covered by this approach. However, if you cannot use the web-based interface and, for some reason, you cannot access the system through the network, your only option is to use the CLI. This interface offers a mechanism to modify Agama's configuration using an editor (vim, by default), but perhaps you might want to have a more user-friendly way.
Goals
The main goal of this project is to built a minimal terminal user-interface for Agama. This interface will allow the user to install the system providing just a few settings (selecting a product, a storage device and a user password). Then it should report the installation progress.
Resources
- https://agama-project.github.io/
- https://ratatui.rs/
Conclusions
We have summarized our conclusions in a pull request. It includes screenshots ;-) We did not implement all the features we wanted, but we learn a lot during the process. We know that, if needed, we could write a TUI for Agama and we have an idea about how to build it. Good enough.
AI-Powered Unit Test Automation for Agama by joseivanlopez
The Agama project is a multi-language Linux installer that leverages the distinct strengths of several key technologies:
- Rust: Used for the back-end services and the core HTTP API, providing performance and safety.
- TypeScript (React/PatternFly): Powers the modern web user interface (UI), ensuring a consistent and responsive user experience.
- Ruby: Integrates existing, robust YaST libraries (e.g.,
yast-storage-ng) to reuse established functionality.
The Problem: Testing Overhead
Developing and maintaining code across these three languages requires a significant, tedious effort in writing, reviewing, and updating unit tests for each component. This high cost of testing is a drain on developer resources and can slow down the project's evolution.
The Solution: AI-Driven Automation
This project aims to eliminate the manual overhead of unit testing by exploring and integrating AI-driven code generation tools. We will investigate how AI can:
- Automatically generate new unit tests as code is developed.
- Intelligently correct and update existing unit tests when the application code changes.
By automating this crucial but monotonous task, we can free developers to focus on feature implementation and significantly improve the speed and maintainability of the Agama codebase.
Goals
- Proof of Concept: Successfully integrate and demonstrate an authorized AI tool (e.g.,
gemini-cli) to automatically generate unit tests. - Workflow Integration: Define and document a new unit test automation workflow that seamlessly integrates the selected AI tool into the existing Agama development pipeline.
- Knowledge Sharing: Establish a set of best practices for using AI in code generation, sharing the learned expertise with the broader team.
Contribution & Resources
We are seeking contributors interested in AI-powered development and improving developer efficiency. Whether you have previous experience with code generation tools or are eager to learn, your participation is highly valuable.
If you want to dive deep into AI for software quality, please reach out and join the effort!
- Authorized AI Tools: Tools supported by SUSE (e.g.,
gemini-cli) - Focus Areas: Rust, TypeScript, and Ruby components within the Agama project.
Interesting Links
Learn how to use the Relm4 Rust GUI crate by xiaoguang_wang
Relm4 is based on gtk4-rs and compatible with libadwaita. The gtk4-rs crate provides all the tools necessary to develop applications. Building on this foundation, Relm4 makes developing more idiomatic, simpler, and faster.
https://github.com/Relm4/Relm4
Learn a bit of embedded programming with Rust in a micro:bit v2 by aplanas
Description
micro:bit is a small single board computer with a ARM Cortex-M4 with the FPU extension, with a very constrain amount of memory and a bunch of sensors and leds.
The board is very well documented, with schematics and code for all the features available, so is an excellent platform for learning embedded programming.
Rust is a system programming language that can generate ARM code, and has crates (libraries) to access the micro:bit hardware. There is plenty documentation about how to make small programs that will run in the micro:bit.
Goals
Start learning about embedded programming in Rust, and maybe make some code to the small KS4036F Robot car from keyestudio.
Resources
- micro:bit
- KS4036F
- microbit technical documentation
- schematic
- impl Rust for micro:bit
- Rust Embedded MB2 Discovery Book
- nRF-HAL
- nRF Microbit-v2 BSP (blocking)
- knurling-rs
- C++ microbit codal
- microbit-bsp for Embassy
- Embassy
Diary
Day 1
- Start reading https://mb2.implrust.com/abstraction-layers.html
- Prepare the dev environment (cross compiler, probe-rs)
- Flash first code in the board (blinky led)
- Checking differences between BSP and HAL
- Compile and install a more complex example, with stack protection
- Reading about the simplicity of xtask, as alias for workspace execution
- Reading the CPP code of the official micro:bit libraries. They have a font!
Day 2
- There are multiple BSP for the microbit. One is using async code for non-blocking operations
- Download and study a bit the API for microbit-v2, the nRF official crate
- Take a look of the KS4036F programming, seems that the communication is multiplexed via I2C
- The motor speed can be selected via PWM (pulse with modulation): power it longer (high frequency), and it will increase the speed
- Scrolling some text
- Debug by printing! defmt is a crate that can be used with probe-rs to emit logs
- Start reading input from the board: buttons
- The logo can be touched and detected as a floating point value
Day 3
- A bit confused how to read the float value from a pin
OpenPlatform Self-Service Portal by tmuntan1
Description
In SUSE IT, we developed an internal developer platform for our engineers using SUSE technologies such as RKE2, SUSE Virtualization, and Rancher. While it works well for our existing users, the onboarding process could be better.
To improve our customer experience, I would like to build a self-service portal to make it easy for people to accomplish common actions. To get started, I would have the portal create Jira SD tickets for our customers to have better information in our tickets, but eventually I want to add automation to reduce our workload.
Goals
- Build a frontend website (Angular) that helps customers create Jira SD tickets.
- Build a backend (Rust with Axum) for the backend, which would do all the hard work for the frontend.
Resources