Description

As we delve deeper into the complexities of managing multiple CRD versions within a single Kubernetes cluster, I want to introduce "Bottles" - a proof of concept that aims to address these challenges.

Bottles proposes a novel approach to isolating and deploying different CRD versions in a self-contained environment. This would allow for greater flexibility and efficiency in managing diverse workloads.

Goals

Evaluate feasibility: determine whether this approach is technically viable, and identify possible obstacles and limitations.
Reuse existing technology: leverage existing products whenever possible, e.g. build on top of Kubewarden as admission controller.
Focus on Rancher's use case: the ultimate goal is to be able to use this approach to solve Rancher users' needs.

Resources

Core concepts:

ConfigMaps: Bottles could be defined and configured using ConfigMaps.
Admission Controller: An admission controller will detect "bottled" CRDs being installed and replace the resource name used to store them.
Aggregated API Server: By analyzing the author of a request, the aggregated API server will determine the correct bottle and route the request accordingly, making it transparent for the user.
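
The two halves of this idea (renaming a CRD when it is stored, and choosing the right copy based on who is asking) can be sketched as below. This is only an illustration under assumptions: the naming scheme that prefixes the API group with the bottle name, the `bottleOf` table, and the function names `storedName` and `resolve` are all hypothetical and do not come from the project.

```go
package main

import "fmt"

// bottleOf maps a requesting user to a bottle. Hypothetical data: the
// project does not specify how users are associated with bottles.
var bottleOf = map[string]string{
	"system:serviceaccount:team-a:operator": "team-a",
	"system:serviceaccount:team-b:operator": "team-b",
}

// storedName returns the CRD name used for storage inside a bottle.
// Assumed scheme: prefix the API group with the bottle name, so the
// result keeps the "<plural>.<group>" shape that CRD names must have.
func storedName(bottle, plural, group string) string {
	return plural + "." + bottle + "." + group
}

// resolve picks the stored CRD name for a request based on its author.
// Users without a bottle fall through to the original, unbottled name.
func resolve(user, plural, group string) string {
	if bottle, ok := bottleOf[user]; ok {
		return storedName(bottle, plural, group)
	}
	return plural + "." + group
}

func main() {
	fmt.Println(resolve("system:serviceaccount:team-a:operator", "widgets", "example.com"))
	fmt.Println(resolve("alice", "widgets", "example.com"))
}
```

In this sketch two teams can each install their own `widgets` CRD, and each one only ever sees the copy that belongs to its bottle.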

Looking for hackers with the skills:

rancher kubernetes poc

This project is part of:

Hack Week 24

Activity

about 1 year ago: rapetz joined this project.

about 1 year ago: rapetz liked this project.

about 1 year ago: moio liked this project.

over 1 year ago: aruiz started this project.

over 1 year ago: aruiz added keyword "rancher" to this project.

over 1 year ago: aruiz added keyword "kubernetes" to this project.

over 1 year ago: aruiz added keyword "poc" to this project.

over 1 year ago: aruiz originated this project.

Comments

about 1 year ago by aruiz | Reply

We started the week by creating a rough plan of the areas we wanted to explore, in order to divide the problem into smaller parts and identify further areas of work.

Rough Plan
- Create example CRDs that allowed experimenting in our local cluster without breaking it.
  - Nothing really fancy, manually crafted. E.g. copy an existing one from the cluster and just rename it.
- A Kubernetes controller skeleton to use as the base.
  - Likely based on Kubebuilder/controller-runtime
- Scripts for bringing a test/dev environment up and tearing it down.
- Explore admission controllers:
  - Can we use Kubewarden, at least for some parts?
  - Requirements and possible functional alternatives.
- Explore APIServices
  - What's the state-of-the-art for building APIServices? Does controller-runtime support it?
  - Is it possible to extract user information from requests? Hopefully without having to terminate auth here.
  - Does it support redirects?
I talked to Rafa about the project and went over the different areas of exploration.

about 1 year ago by aruiz | Reply

I started by exploring Kubewarden to check whether we could just use a policy to manage the mapping of CRDs being installed along with a Bottle. The idea was to perform this transformation before the CRD is stored in Kubernetes, so we would need to use a mutating admission webhook.

I concluded that Kubewarden was not a good fit for this because:
- Go policies are compiled with TinyGo, due to the limited support for WebAssembly in the official toolchain. TinyGo still cannot build parts of the Go standard library, which prevents us from using the k8s.io libraries. We would need those to perform the desired steps, as they require a Kubernetes client.
- CEL policies won't provide enough flexibility for our purpose.
- Rust-based policies could be an option, but would probably require a bigger effort to implement (and would differ from the language used in the rest of the project).
I confirmed that Kubebuilder has support for writing defaulting webhooks, so we could use it for modifying the CRDs before they get persisted.

This framework also creates scripts and resources for building the controller's image, as well as the manifests to install it in Kubernetes. Although it's very focused on adding your own APIs, with some tweaks we could use it to generate such boilerplate for a built-in type (CRDs).

These "difficulties" made me think of alternative approaches for our goal. Assuming that Helm charts are the selected installation mechanism, we could:
- Create an offline tool whose input is the complete set of manifests (including the Bottle spec) and whose output is the modified manifests, ready to be applied directly to Kubernetes.
- This could be a kubectl (krew) or helm plugin.
- An offline transformation is a simpler solution, since it moves the processing client-side (it allows dry-run, storing the definitive manifests for GitOps, etc.), although it makes the approach less transparent.
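
A minimal sketch of what such an offline transformation could look like, using only the Go standard library. It assumes the manifest is provided as JSON (to avoid a YAML dependency) and reuses a hypothetical scheme in which the bottle name is prefixed to the CRD's API group. The function name `bottleCRD` and the scheme are illustrative assumptions, not the project's design.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// bottleCRD rewrites a CRD manifest (JSON) so that it is stored under a
// bottle-specific API group. Hypothetical scheme: "<bottle>.<group>".
// Manifests of any other kind are returned unchanged.
func bottleCRD(manifest []byte, bottle string) ([]byte, error) {
	var obj map[string]any
	if err := json.Unmarshal(manifest, &obj); err != nil {
		return nil, err
	}
	if obj["kind"] != "CustomResourceDefinition" {
		return manifest, nil
	}
	meta, _ := obj["metadata"].(map[string]any)
	spec, _ := obj["spec"].(map[string]any)
	names, _ := spec["names"].(map[string]any)
	group, _ := spec["group"].(string)
	plural, _ := names["plural"].(string)
	if meta == nil || group == "" || plural == "" {
		return nil, errors.New("incomplete CRD manifest")
	}
	// CRD names must be "<plural>.<group>", so both fields change together.
	newGroup := bottle + "." + group
	spec["group"] = newGroup
	meta["name"] = plural + "." + newGroup
	return json.Marshal(obj)
}

func main() {
	in := []byte(`{"kind":"CustomResourceDefinition",
		"metadata":{"name":"widgets.example.com"},
		"spec":{"group":"example.com","names":{"plural":"widgets"}}}`)
	out, err := bottleCRD(in, "blue")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the output is a plain manifest, it can be reviewed, dry-run, or committed to a GitOps repository before it ever reaches the cluster. A real tool would also have to rewrite the custom resources that reference the renamed group, which this sketch leaves out.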

about 1 year ago by aruiz | Reply

Since we had already used up a few days, I decided to prioritize exploring the remaining areas instead of continuing to build the admission part, since that part was easier to fake so that the rest could keep working.

So I started looking into our options for implementing the aggregation layer in Kubernetes. Besides creating our own CRDs, which is not what we were looking for, the docs suggested 2 options (note that Kubebuilder/controller-runtime is not intended for this use case and has no support for it):
- Use kubernetes-incubator/apiserver-builder (now named kubernetes-sigs/apiserver-builder-alpha), which aims to provide a similar pattern to Kubebuilder.
  - However, it does not seem to be actively maintained (the latest release was more than 2 years ago, there is no Go modules support, and there is little activity in general).
- Use the sample-apiserver project, which seems to be the base for the apiserver-builder generator, but is more up-to-date.
- Both options seem to build on top of the apiserver-runtime framework, which, as I understand it, is the equivalent of controller-runtime for regular controllers. Nonetheless, its last commit was almost one year ago.
I put some effort into trying to make apiserver-builder work, including making the generated project Go-modules aware, but then faced many problems upgrading the Kubernetes dependencies to recent versions, so I gave up on that option.

In this situation, forking sample-apiserver and modifying it to our needs looked like the best way forward.

This sample project does work, but has very low-level requirements. In particular, it's meant to have access to etcd itself in order to serve the APIService. This option was not in our initial plans, as it would make it harder to run our controller.

Nonetheless, I also found that the interface it implements does not necessarily have to be backed by etcd, which brings the option of using a different storage (e.g. any database), as long as the interface methods are implemented. I decided not to pursue this route just yet, though, since it was out of the initial scope and I was running out of time.

For the sake of the experiment, I tweaked the endpoint to use a Kubernetes client to try to obtain the original data from the main API server and then transform it. However, this obviously produced an infinite loop, since the control plane would just redirect such requests back to our APIService.
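
To illustrate the point about storage not having to be etcd, here is a minimal sketch of an interface-backed store with an in-memory implementation. The `Store` interface is a hypothetical, heavily simplified stand-in; it is not the real storage interface from k8s.io/apiserver, which has more methods and works with typed objects.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNotFound is returned when no object is stored under a name.
var ErrNotFound = errors.New("object not found")

// Store is a hypothetical, simplified storage contract. Any backend
// (etcd, a SQL database, memory) could satisfy it.
type Store interface {
	Get(name string) ([]byte, error)
	Put(name string, obj []byte) error
	Delete(name string) error
}

// memStore keeps objects in a map guarded by a mutex.
type memStore struct {
	mu   sync.RWMutex
	objs map[string][]byte
}

func newMemStore() *memStore {
	return &memStore{objs: make(map[string][]byte)}
}

func (s *memStore) Get(name string) ([]byte, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	obj, ok := s.objs[name]
	if !ok {
		return nil, ErrNotFound
	}
	return obj, nil
}

func (s *memStore) Put(name string, obj []byte) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.objs[name] = obj
	return nil
}

func (s *memStore) Delete(name string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.objs[name]; !ok {
		return ErrNotFound
	}
	delete(s.objs, name)
	return nil
}

func main() {
	var s Store = newMemStore()
	if err := s.Put("widgets.blue.example.com", []byte(`{"kind":"CustomResourceDefinition"}`)); err != nil {
		panic(err)
	}
	obj, err := s.Get("widgets.blue.example.com")
	fmt.Println(string(obj), err)
}
```

A backend that owns its data like this would also sidestep the redirect loop described above, because serving a request never requires calling back into the main API server for the same resource.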

The last thing that time allowed me to experiment with was the authorization part, since we need to identify which user produced each request. Even though there is functionality for this, I couldn't make it available to my handler implementations, and I wasn't able to identify why. I need to read more docs about the workflows for aggregated APIs; maybe authorization is meant to be resolved by APIServices directly? Sadly, the apiserver-runtime library is not very well documented.
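
On identifying the requesting user: according to the Kubernetes aggregation layer documentation, the kube-apiserver authenticates the original request and then proxies it to the extension API server, forwarding the identity in request headers that are conventionally configured as X-Remote-User, X-Remote-Group and X-Remote-Extra-. The sketch below assumes that default user header name and uses only the Go standard library. The function `remoteUser` and the handler are hypothetical, and the client certificate verification that a real server must perform before trusting these headers is deliberately left out.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// remoteUser returns the user name forwarded by the proxying API server.
// Assumption: the header is named "X-Remote-User" (it is configurable) and
// the proxy's client certificate has already been verified; without that
// check any caller could spoof the header.
func remoteUser(h http.Header) (string, bool) {
	user := h.Get("X-Remote-User")
	return user, user != ""
}

// handler is a hypothetical endpoint that only reports who is asking.
func handler(w http.ResponseWriter, r *http.Request) {
	user, ok := remoteUser(r.Header)
	if !ok {
		http.Error(w, "no user forwarded", http.StatusUnauthorized)
		return
	}
	fmt.Fprintf(w, "request from %s\n", user)
}

func main() {
	// Exercise the handler in-process instead of starting a server.
	req := httptest.NewRequest(http.MethodGet, "/apis/example.com/v1/widgets", nil)
	req.Header.Set("X-Remote-User", "alice")
	rec := httptest.NewRecorder()
	handler(rec, req)
	fmt.Print(rec.Body.String())
}
```

With the user name in hand, the handler could look up the matching bottle and serve the corresponding copy of the resource.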

12 months ago by Jasbon | Reply

This looks like a detailed breakdown of the "Bottles" proof of concept for managing multi-version CRDs in Kubernetes. Are you looking for a summary, feedback, or help with a specific part of the project?

Similar Projects

rancher

The Agentic Rancher Experiment: Do Androids Dream of Electric Cattle? by moio

Rancher is a beast of a codebase. Let's investigate if the new 2025 generation of GitHub Autonomous Coding Agents and Copilot Workspaces can actually tame it.

The Plan

Create a sandbox GitHub Organization, clone in key Rancher repositories, and let the AI loose to see if it can handle real-world enterprise OSS maintenance - or if it just hallucinates new breeds of Kubernetes resources!

Specifically, throw "Agentic Coders" some typical tasks in a complex, long-lived open-source project, such as:

❥ The Grunt Work: generate missing GoDocs, unit tests, and refactorings. Rebase PRs.

❥ The Complex Stuff: fix actual (historical) bugs and feature requests to see if they can traverse the complexity without (too much) human hand-holding.

❥ Hunting Down Gaps: find areas lacking in docs, areas of improvement in code, dependency bumps, and so on.

If time allows, also experiment with Model Context Protocol (MCP) to give agents context on our specific build pipelines and CI/CD logs.

Why?

We know AI can write "Hello World." and also moderately complex programs from a green field. But can it rebase a 3-month-old PR with conflicts in rancher/rancher? I want to find the breaking point of current AI agents to determine if and how they can help us to reduce our technical debt, work faster and better. At the same time, find out about pitfalls and shortcomings.

The CONCLUSION!!!

A State of the Union document was compiled to summarize lessons learned this week. For more gory details, just read on the diary below!

Cluster API Provider for Harvester by rcase

Project Description

The Cluster API "infrastructure provider" for Harvester, also named CAPHV, makes it possible to use Harvester with Cluster API. This enables people and organisations to create Kubernetes clusters running on VMs created by Harvester using a declarative spec.

The project has been bootstrapped in HackWeek 23, and its code is available here.

Work done in HackWeek 2023

Have an early working version of the provider available on Rancher Sandbox: DONE
Demonstrated the created cluster can be imported using Rancher Turtles: DONE
Stretch goal - demonstrate using the new provider with CAPRKE2: DONE and the templates are available on the repo

DONE in HackWeek 24:

Add more Unit Tests
Improve Status Conditions for some phases
Add cloud provider config generation
Testing with Harvester v1.3.2
Template improvements
Issues creation

DONE in 2025 (out of Hackweek)

Support of ClusterClass
Add to clusterctl community providers, you can add it directly with clusterctl
Testing on newer versions of Harvester v1.4.X and v1.5.X
Support for clusterctl generate cluster ...
Improve Status Conditions to reflect current state of Infrastructure
Improve CI (some bugs for release creation)

Goals for HackWeek 2025

FIRST and FOREMOST, any topic that is important to you
Add e2e testing
Certify the provider for Rancher Turtles
Add Machine pool labeling
Add PCI-e passthrough capabilities.
Other improvement suggestions are welcome!

Thanks to @isim and Dominic Giebert for their contributions!

Resources

Looking for help from anyone interested in Cluster API (CAPI) or who wants to learn more about Harvester.

This will be an infrastructure provider for Cluster API. Some background reading for the CAPI aspect:

Rancher/k8s Trouble-Maker by tonyhansen

Project Description

When studying for my RHCSA, I found trouble-maker, which is a program that breaks a Linux OS and requires you to fix it. I want to create something similar for Rancher/k8s that can allow for troubleshooting an unknown environment.

Goals for Hackweek 25

Update to modern Rancher and verify that existing tests still work
Change testing logic to populate secrets instead of requiring a secondary script
Add new tests

Goals for Hackweek 24 (Complete)

Create a basic framework for creating Rancher/k8s cluster lab environments as needed for the Break/Fix
Create at least 5 modules that can be applied to the cluster and require troubleshooting

Resources

https://github.com/celidon/rancher-troublemaker
https://github.com/rancher/terraform-provider-rancher2
https://github.com/rancher/tf-rancher-up
https://github.com/rancher/quickstart

Rancher Cluster Lifecycle Visualizer by jferraz

Description

Rancher’s v2 provisioning system represents each downstream cluster with several Kubernetes custom resources across multiple API groups, such as clusters.provisioning.cattle.io and clusters.management.cattle.io. Understanding why a cluster is stuck in states like "Provisioning", "Updating", or "Unavailable" often requires jumping between these resources, reading conditions, and correlating them with agent connectivity and known failure modes. This project will build a Cluster Lifecycle Visualizer: a small, read-only controller that runs in the Rancher management cluster and generates a single, human-friendly view per cluster. It will watch Rancher cluster CRDs, derive a simplified lifecycle phase, keep a history of phase transitions from installation time onward, and attach a short, actionable recommendation string that hints at what the operator should check or do next.

Goals

Provide a compact lifecycle summary for each Rancher-managed cluster (e.g. Provisioning, WaitingForClusterAgent, Active, Updating, Error) derived from provisioning.cattle.io/v1 Cluster and management.cattle.io/v3 Cluster status and conditions.
Maintain a phase history for each cluster, allowing operators to see how its state evolved over time since the visualizer was installed.
Attach a recommended action to the current phase using a small ruleset based on common Rancher failure modes (for example, cluster agent not connected, cluster still stabilizing after an upgrade, or generic error states), to improve the day-to-day debugging experience.
Deliver an easy-to-install, read-only component (single YAML or small Helm chart) that Rancher users can deploy to their management cluster and inspect via kubectl get/describe, without UI changes or direct access to downstream clusters.
Use idiomatic Go, wrangler, and Rancher APIs.

Resources

Rancher Manager documentation on RKE2 and K3s cluster configuration and provisioning flows.
Rancher API Go types for provisioning.cattle.io/v1 and management.cattle.io/v3 (from the rancher/rancher repository or published Go packages).
Existing Rancher architecture docs and internal notes about cluster provisioning, cluster agents, and node agents.
A local Rancher management cluster (k3s or RKE2) with a few test downstream clusters to validate phase detection, history tracking, and recommendations.

A CLI for Harvester by mohamed.belgaied

Harvester does not officially come with a CLI tool; the user is supposed to interact with Harvester mostly through the UI. Though it is theoretically possible to use kubectl to interact with Harvester, manipulating Kubevirt YAML objects is not user friendly at all. Inspired by tools like multipass from Canonical, which easily and rapidly create one or multiple VMs, I began the development of Harvester CLI. Currently it works, but Harvester CLI needs some love to be up-to-date with Harvester v1.0.2, and it needs some bug fixes and improvements as well.

Project Description

Harvester CLI is a command line interface tool written in Go, designed to simplify interfacing with a Harvester cluster as a user. It is especially useful for testing purposes as you can easily and rapidly create VMs in Harvester by providing a simple command such as: harvester vm create my-vm --count 5 to create 5 VMs named my-vm-01 to my-vm-05.

Harvester CLI is functional but needs a number of improvements: up-to-date functionality with Harvester v1.0.2 (some minor issues right now), modifying the default behaviour to create an opensuse VM instead of an ubuntu VM, solve some bugs, etc.

Github Repo for Harvester CLI: https://github.com/belgaied2/harvester-cli

Done in previous Hackweeks

Create a Github actions pipeline to automatically integrate Harvester CLI to Homebrew repositories: DONE
Automatically package Harvester CLI for OpenSUSE / Redhat RPMs or DEBs: DONE

Goal for this Hackweek

The goal for this Hackweek is to bring Harvester CLI up-to-speed with latest Harvester versions (v1.3.X and v1.4.X), and improve the code quality as well as implement some simple features and bug fixes.

Some nice additions might be:

Improve handling of namespaced objects
Add features, such as network management or Load Balancer creation?
Add more unit tests and, why not, e2e tests
Improve CI
Improve the overall code quality
Test the program and create issues for it

Issue list is here: https://github.com/belgaied2/harvester-cli/issues

Resources

The project is written in Go and uses client-go, the Kubernetes Go client library, to communicate with the Harvester API (which is in fact Kubernetes). Welcome contributions are:

Testing it and creating issues
Documentation
Go code improvement

What you might learn

Harvester CLI might be interesting to you if you want to learn more about:

GitHub Actions
Harvester as a SUSE Product
Go programming language
Kubernetes API
Kubevirt API objects (Manipulating VMs and VM Configuration in Kubernetes using Kubevirt)

kubernetes

Self-Scaling LLM Infrastructure Powered by Rancher by ademicev0

Description

The Problem

Running LLMs can get expensive and complex pretty quickly.

Today there are typically two choices:

Use cloud APIs like OpenAI or Anthropic. Easy to start with, but costs add up at scale.
Self-host everything - set up Kubernetes, figure out GPU scheduling, handle scaling, manage model serving... it's a lot of work.

What if there was a middle ground?

What if infrastructure scaled itself instead of making you scale it?

Can we use existing Rancher capabilities like CAPI, autoscaling, and GitOps to make this simpler instead of building everything from scratch?

Project Repository: github.com/alexander-demicev/llmserverless

What This Project Does

A key feature is hybrid deployment: requests can be routed based on complexity or privacy needs. Simple or low-sensitivity queries can use public APIs (like OpenAI), while complex or private requests are handled in-house on local infrastructure. This flexibility allows balancing cost, privacy, and performance - using cloud for routine tasks and on-premises resources for sensitive or demanding workloads.

A complete, self-scaling LLM infrastructure that:

Scales to zero when idle (no idle costs)
Scales up automatically when requests come in
Adds more nodes when needed, removes them when demand drops
Runs on any infrastructure - laptop, bare metal, or cloud

Think of it as "serverless for LLMs" - focus on building, the infrastructure handles itself.

How It Works

A combination of open source tools working together:

Flow:

Users interact with OpenWebUI (chat interface)
Requests go to LiteLLM Gateway
LiteLLM routes requests to:
- Ollama (Knative) for local model inference (auto-scales pods)
- Or cloud APIs for fallback
