This is mostly a learning activity for myself, others may benefit from documentation.
Project Description
Practical setup of a k3s HA cluster
Goal for this Hackweek
Understand the concept, get the cluster up and running workloads. Create documentation that others can follow.
Resources
Use my workstation, or other available hardware. Probably utilize MicroOS.
Looking for hackers with the skills:
This project is part of:
Hack Week 22
Activity
Comments
Similar Projects
Preparing KubeVirtBMC for project transfer to the KubeVirt organization by zchang
Description
KubeVirtBMC is preparing to transfer the project to the KubeVirt organization. One requirement is to enhance the modeling design's security. The current v1alpha1 API (the VirtualMachineBMC CRD) was designed during the proof-of-concept stage. It's immature and inherently insecure due to its cross-namespace object references, exposing security concerns from an RBAC perspective.
The other long-awaited feature is the ability to mount virtual media so that virtual machines can boot from remote ISO images.
Goals
- Deliver the v1beta1 API and its corresponding controller implementation
- Enable the Redfish virtual media mount function for KubeVirt virtual machines
Resources
- The KubeVirtBMC repo: https://github.com/starbops/kubevirtbmc
- The new v1beta1 API: https://github.com/starbops/kubevirtbmc/issues/83
- Redfish virtual media mount: https://github.com/starbops/kubevirtbmc/issues/44
Exploring Modern AI Trends and Kubernetes-Based AI Infrastructure by jluo
Description
Build a solid understanding of the current landscape of Artificial Intelligence and how modern cloud-native technologies—especially Kubernetes—support AI workloads.
Goals
Use Gemini Learning Mode to guide the exploration, surface relevant concepts, and structure the learning journey:
- Gain insight into the latest AI trends, tools, and architectural concepts.
- Understand how Kubernetes and related cloud-native technologies are used in the AI ecosystem (model training, deployment, orchestration, MLOps).
Resources
Red Hat AI Topic Articles
- https://www.redhat.com/en/topics/ai
Kubeflow Documentation
- https://www.kubeflow.org/docs/
Q4 2025 CNCF Technology Landscape Radar report:
- https://www.cncf.io/announcements/2025/11/11/cncf-and-slashdata-report-finds-leading-ai-tools-gaining-adoption-in-cloud-native-ecosystems/
- https://www.cncf.io/wp-content/uploads/2025/11/cncfreporttechradar_111025a.pdf
Agent-to-Agent (A2A) Protocol
- https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/
Rancher/k8s Trouble-Maker by tonyhansen
Project Description
When studying for my RHCSA, I found trouble-maker, which is a program that breaks a Linux OS and requires you to fix it. I want to create something similar for Rancher/k8s that can allow for troubleshooting an unknown environment.
Goals for Hackweek 25
- Update to modern Rancher and verify that existing tests still work
- Change testing logic to populate secrets instead of requiring a secondary script
- Add new tests
Goals for Hackweek 24 (Complete)
- Create a basic framework for creating Rancher/k8s cluster lab environments as needed for the Break/Fix
- Create at least 5 modules that can be applied to the cluster and require troubleshooting
Resources
- https://github.com/celidon/rancher-troublemaker
- https://github.com/rancher/terraform-provider-rancher2
- https://github.com/rancher/tf-rancher-up
- https://github.com/rancher/quickstart
Technical talks at universities by agamez
Description
This project aims to empower the next generation of tech professionals by offering hands-on workshops on containerization and Kubernetes, with a strong focus on open-source technologies. By providing practical experience with these cutting-edge tools and fostering a deep understanding of open-source principles, we aim to bridge the gap between academia and industry.
For now, the scope is limited to Spanish universities, since we already have the contacts and have started some conversations.
Goals
- Technical Skill Development: equip students with the fundamental knowledge and skills to build, deploy, and manage containerized applications using open-source tools like Kubernetes.
- Open-Source Mindset: foster a passion for open-source software, encouraging students to contribute to open-source projects and collaborate with the global developer community.
- Career Readiness: prepare students for industry-relevant roles by exposing them to real-world use cases, best practices, and open-source in companies.
Resources
- Instructors: experienced open-source professionals with deep knowledge of containerization and Kubernetes.
- SUSE Expertise: leverage SUSE's expertise in open-source technologies to provide insights into industry trends and best practices.
Self-Scaling LLM Infrastructure Powered by Rancher by ademicev0
Self-Scaling LLM Infrastructure Powered by Rancher

Description
The Problem
Running LLMs can get expensive and complex pretty quickly.
Today there are typically two choices:
- Use cloud APIs like OpenAI or Anthropic. Easy to start with, but costs add up at scale.
- Self-host everything - set up Kubernetes, figure out GPU scheduling, handle scaling, manage model serving... it's a lot of work.
What if there was a middle ground?
What if infrastructure scaled itself instead of making you scale it?
Can we use existing Rancher capabilities like CAPI, autoscaling, and GitOps to make this simpler instead of building everything from scratch?
Project Repository: github.com/alexander-demicev/llmserverless
What This Project Does
A key feature is hybrid deployment: requests can be routed based on complexity or privacy needs. Simple or low-sensitivity queries can use public APIs (like OpenAI), while complex or private requests are handled in-house on local infrastructure. This flexibility allows balancing cost, privacy, and performance - using cloud for routine tasks and on-premises resources for sensitive or demanding workloads.
A complete, self-scaling LLM infrastructure that:
- Scales to zero when idle (no idle costs)
- Scales up automatically when requests come in
- Adds more nodes when needed, removes them when demand drops
- Runs on any infrastructure - laptop, bare metal, or cloud
Think of it as "serverless for LLMs" - focus on building, the infrastructure handles itself.
How It Works
A combination of open source tools working together:
Flow:
- Users interact with OpenWebUI (chat interface)
- Requests go to LiteLLM Gateway
- LiteLLM routes requests to:
- Ollama (Knative) for local model inference (auto-scales pods)
- Or cloud APIs for fallback
Advent of Code: The Diaries by amanzini
Description
It was the Night Before Compile Time ...
Hackweek 25 (December 1-5) perfectly coincides with the first five days of Advent of Code 2025. This project will leverage this overlap to participate in the event in real-time.
To add a layer of challenge and exploration (in the true spirit of Hackweek), the puzzles will be solved using a non-mainstream, modern language like Ruby, D, Crystal, Gleam or Zig.
The primary project intent is not just simply to solve the puzzles, but to exercise result sharing and documentation. I'd create a public-facing repository documenting the process. This involves treating each day's puzzle as a mini-project: solving it, then documenting the solution with detailed write-ups, analysis of the language's performance and ergonomics, and visualizations.
|
\ ' /
-- (*) --
>*<
>0<@<
>>>@<<*
>@>*<0<<<
>*>>@<<<@<<
>@>>0<<<*<<@<
>*>>0<<@<<<@<<<
>@>>*<<@<>*<<0<*<
\*/ >0>>*<<@<>0><<*<@<<
___\\U//___ >*>>@><0<<*>>@><*<0<<
|\\ | | \\| >@>>0<*<0>>@<<0<<<*<@<<
| \\| | _(UU)_ >((*))_>0><*<0><@<<<0<*<
|\ \| || / //||.*.*.*.|>>@<<*<<@>><0<<<
|\\_|_|&&_// ||*.*.*.*|_\\db//_
""""|'.'.'.|~~|.*.*.*| ____|_
|'.'.'.| ^^^^^^|____|>>>>>>|
~~~~~~~~ '""""`------'
------------------------------------------------
This ASCII pic can be found at
https://asciiart.website/art/1831
Goals
Code, Docs, and Memes: An AoC Story
Have fun!
Involve more people, play together
Solve Days 1-5: Successfully solve both parts of the Advent of Code 2025 puzzles for Days 1-5 using the chosen non-mainstream language.
Daily Documentation & Language Review: Publish a detailed write-up for each day. This documentation will include the solution analysis, the chosen algorithm, and specific commentary on the language's ergonomics, performance, and standard library for the given task.
Kubernetes-Based ML Lifecycle Automation by lmiranda
Description
This project aims to build a complete end-to-end Machine Learning pipeline running entirely on Kubernetes, using Go, and containerized ML components.
The pipeline will automate the lifecycle of a machine learning model, including:
- Data ingestion/collection
- Model training as a Kubernetes Job
- Model artifact storage in an S3-compatible registry (e.g. Minio)
- A Go-based deployment controller that automatically deploys new model versions to Kubernetes using Rancher
- A lightweight inference service that loads and serves the latest model
- Monitoring of model performance and service health through Prometheus/Grafana
The outcome is a working prototype of an MLOps workflow that demonstrates how AI workloads can be trained, versioned, deployed, and monitored using the Kubernetes ecosystem.
Goals
By the end of Hack Week, the project should:
Produce a fully functional ML pipeline running on Kubernetes with:
- Data collection job
- Training job container
- Storage and versioning of trained models
- Automated deployment of new model versions
- Model inference API service
- Basic monitoring dashboards
Showcase a Go-based deployment automation component, which scans the model registry and automatically generates & applies Kubernetes manifests for new model versions.
Enable continuous improvement by making the system modular and extensible (e.g., additional models, metrics, autoscaling, or drift detection can be added later).
Prepare a short demo explaining the end-to-end process and how new models flow through the system.
Resources
Updates
- Training pipeline and datasets
- Inference Service py