Currently, when Rancher tries to provision a Kubernetes cluster on vSphere, it needs to initiate API calls to the vSphere endpoint. In a hybrid cloud environment this often means that the Rancher server is not in the same network as the vSphere endpoint. Therefore inbound access is required to be added to a firewall so Rancher can reach the vSphere system. This naturally poses a security concern and creates administrative burden on our users who have to go through a security review to get this approved.
If instead of requiring direct API access, an agent could exist inside the network where the vSphere API lived, then this agent could broker the communication between the Rancher server and the downstream API. The agent would simply initiate an outbound API connection to the Rancher server (much like any node agent or cluster agent currently) and simultaneously proxy any API calls that Rancher needs to make to vSphere. This would also have the benefit of being able to be run through a HTTP proxy, which many security teams will appreciate as a less risky connectivity model.
No Hackers yet
This project is part of:
Hack Week 20
Activity
Comments
Be the first to comment!
Similar Projects
Self-Scaling LLM Infrastructure Powered by Rancher by ademicev0
Self-Scaling LLM Infrastructure Powered by Rancher

Description
The Problem
Running LLMs can get expensive and complex pretty quickly.
Today there are typically two choices:
- Use cloud APIs like OpenAI or Anthropic. Easy to start with, but costs add up at scale.
- Self-host everything - set up Kubernetes, figure out GPU scheduling, handle scaling, manage model serving... it's a lot of work.
What if there was a middle ground?
What if infrastructure scaled itself instead of making you scale it?
Can we use existing Rancher capabilities like CAPI, autoscaling, and GitOps to make this simpler instead of building everything from scratch?
Project Repository: github.com/alexander-demicev/llmserverless
What This Project Does
A key feature is hybrid deployment: requests can be routed based on complexity or privacy needs. Simple or low-sensitivity queries can use public APIs (like OpenAI), while complex or private requests are handled in-house on local infrastructure. This flexibility allows balancing cost, privacy, and performance - using cloud for routine tasks and on-premises resources for sensitive or demanding workloads.
A complete, self-scaling LLM infrastructure that:
- Scales to zero when idle (no idle costs)
- Scales up automatically when requests come in
- Adds more nodes when needed, removes them when demand drops
- Runs on any infrastructure - laptop, bare metal, or cloud
Think of it as "serverless for LLMs" - focus on building, the infrastructure handles itself.
How It Works
A combination of open source tools working together:
Flow:
- Users interact with OpenWebUI (chat interface)
- Requests go to LiteLLM Gateway
- LiteLLM routes requests to:
- Ollama (Knative) for local model inference (auto-scales pods)
- Or cloud APIs for fallback
The Agentic Rancher Experiment: Do Androids Dream of Electric Cattle? by moio
Rancher is a beast of a codebase. Let's investigate if the new 2025 generation of GitHub Autonomous Coding Agents and Copilot Workspaces can actually tame it. 
The Plan
Create a sandbox GitHub Organization, clone in key Rancher repositories, and let the AI loose to see if it can handle real-world enterprise OSS maintenance - or if it just hallucinates new breeds of Kubernetes resources!
Specifically, throw "Agentic Coders" some typical tasks in a complex, long-lived open-source project, such as:
❥ The Grunt Work: generate missing GoDocs, unit tests, and refactorings. Rebase PRs.
❥ The Complex Stuff: fix actual (historical) bugs and feature requests to see if they can traverse the complexity without (too much) human hand-holding.
❥ Hunting Down Gaps: find areas lacking in docs, areas of improvement in code, dependency bumps, and so on.
If time allows, also experiment with Model Context Protocol (MCP) to give agents context on our specific build pipelines and CI/CD logs.
Why?
We know AI can write "Hello World." and also moderately complex programs from a green field. But can it rebase a 3-month-old PR with conflicts in rancher/rancher? I want to find the breaking point of current AI agents to determine if and how they can help us to reduce our technical debt, work faster and better. At the same time, find out about pitfalls and shortcomings.
The CONCLUSION!!!
A
State of the Union
document was compiled to summarize lessons learned this week. For more gory details, just read on the diary below!
Rancher/k8s Trouble-Maker by tonyhansen
Project Description
When studying for my RHCSA, I found trouble-maker, which is a program that breaks a Linux OS and requires you to fix it. I want to create something similar for Rancher/k8s that can allow for troubleshooting an unknown environment.
Goals for Hackweek 25
- Update to modern Rancher and verify that existing tests still work
- Change testing logic to populate secrets instead of requiring a secondary script
- Add new tests
Goals for Hackweek 24 (Complete)
- Create a basic framework for creating Rancher/k8s cluster lab environments as needed for the Break/Fix
- Create at least 5 modules that can be applied to the cluster and require troubleshooting
Resources
- https://github.com/celidon/rancher-troublemaker
- https://github.com/rancher/terraform-provider-rancher2
- https://github.com/rancher/tf-rancher-up
- https://github.com/rancher/quickstart
Liz - Prompt autocomplete by ftorchia
Description
Liz is the Rancher AI assistant for cluster operations.
Goals
We want to help users when sending new messages to Liz, by adding an autocomplete feature to complete their requests based on the context.
Example:
- User prompt: "Can you show me the list of p"
- Autocomplete suggestion: "Can you show me the list of p...od in local cluster?"
Example:
- User prompt: "Show me the logs of #rancher-"
- Chat console: It shows a drop-down widget, next to the # character, with the list of available pod names starting with "rancher-".
Technical Overview
- The AI agent should expose a new ws/autocomplete endpoint to proxy autocomplete messages to the LLM.
- The UI extension should be able to display prompt suggestions and allow users to apply the autocomplete to the Prompt via keyboard shortcuts.
Resources
Cluster API Provider for Harvester by rcase
Project Description
The Cluster API "infrastructure provider" for Harvester, also named CAPHV, makes it possible to use Harvester with Cluster API. This enables people and organisations to create Kubernetes clusters running on VMs created by Harvester using a declarative spec.
The project has been bootstrapped in HackWeek 23, and its code is available here.
Work done in HackWeek 2023
- Have a early working version of the provider available on Rancher Sandbox : *DONE *
- Demonstrated the created cluster can be imported using Rancher Turtles: DONE
- Stretch goal - demonstrate using the new provider with CAPRKE2: DONE and the templates are available on the repo
DONE in HackWeek 24:
- Add more Unit Tests
- Improve Status Conditions for some phases
- Add cloud provider config generation
- Testing with Harvester v1.3.2
- Template improvements
- Issues creation
DONE in 2025 (out of Hackweek)
- Support of ClusterClass
- Add to
clusterctlcommunity providers, you can add it directly withclusterctl - Testing on newer versions of Harvester v1.4.X and v1.5.X
- Support for
clusterctl generate cluster ... - Improve Status Conditions to reflect current state of Infrastructure
- Improve CI (some bugs for release creation)
Goals for HackWeek 2025
- FIRST and FOREMOST, any topic is important to you
- Add e2e testing
- Certify the provider for Rancher Turtles
- Add Machine pool labeling
- Add PCI-e passthrough capabilities.
- Other improvement suggestions are welcome!
Thanks to @isim and Dominic Giebert for their contributions!
Resources
Looking for help from anyone interested in Cluster API (CAPI) or who wants to learn more about Harvester.
This will be an infrastructure provider for Cluster API. Some background reading for the CAPI aspect:
openQA log viewer by mpagot
Description
*** Warning: Are You at Risk for VOMIT? ***
Do you find yourself staring at a screen, your eyes glossing over as thousands of lines of text scroll by? Do you feel a wave of text-based nausea when someone asks you to "just check the logs"?
You may be suffering from VOMIT (Verbose Output Mental Irritation Toxicity).
This dangerous, work-induced ailment is triggered by exposure to an overwhelming quantity of log data, especially from parallel systems. The human brain, not designed to mentally process 12 simultaneous autoinst-log.txt files, enters a state of toxic shock. It rejects the "Verbose Output," making it impossible to find the one critical error line buried in a 50,000-line sea of "INFO: doing a thing."
Before you're forced to rm -rf /var/log in a fit of desperation, we present the digital antacid.
No panic: we have The openQA Log Visualizer
This is the UI antidote for handling toxic log environments. It bravely dives into the chaotic, multi-machine mess of your openQA test runs, finds all the related, verbose logs, and force-feeds them into a parser.
Goals
Work on the existing POC openqa-log-visualizer about few specific tasks:
- add support for more type of logs
- extend the configuration file syntax beyond the actual one
- work on log parsing performance
Find some beta-tester and collect feedback and ideas about features
If time allow for it evaluate other UI frameworks and solutions (something more simple to distribute and run, maybe more low level to gain in performance).
Resources
HTTP API for nftables by crameleon
Background
The idea originated in https://progress.opensuse.org/issues/164060 and is about building RESTful API which translates authorized HTTP requests to operations in nftables, possibly utilizing libnftables-json(5).
Originally, I started developing such an interface in Go, utilizing https://github.com/google/nftables. The conversion of string networks to nftables set elements was problematic (unfortunately no record of details), and I started a second attempt in Python, which made interaction much simpler thanks to native nftables Python bindings.
Goals
- Find and track the issue with google/nftables
- Revisit and polish the Go or Python code (prefer Go, but possibly depends on implementing missing functionality), primarily the server component
- Finish functionality to interact with nftables sets (retrieving and updating elements), which are of interest for the originating issue
- Align test suite
- Packaging
Resources
- https://git.netfilter.org/nftables/tree/py/src/nftables.py
- https://git.com.de/Georg/nftables-http-api (to be moved to GitHub)
- https://build.opensuse.org/package/show/home:crameleon:containers/pytest-nftables-container
Results
- Started new https://github.com/tacerus/nftables-http-api.
- First Go nftables issue was related to set elements needing to be added with different start and end addresses - coincidentally, this was recently discovered by someone else, who added a useful helper function for this: https://github.com/google/nftables/pull/342.
- Further improvements submitted: https://github.com/google/nftables/pull/347.
Side results
Upon starting to unify the structure and implementing more functionality, missing JSON output support was noticed for some subcommands in libnftables. Submitted patches here as well:
- https://lore.kernel.org/netfilter-devel/20251203131736.4036382-2-georg@syscid.com/T/#u