Description

Build a solid understanding of the current landscape of Artificial Intelligence and how modern cloud-native technologies—especially Kubernetes—support AI workloads.

Goals

Use Gemini Learning Mode to guide the exploration, surface relevant concepts, and structure the learning journey:

Gain insight into the latest AI trends, tools, and architectural concepts.
Understand how Kubernetes and related cloud-native technologies are used in the AI ecosystem (model training, deployment, orchestration, MLOps).

Resources

Red Hat AI Topic Articles
- https://www.redhat.com/en/topics/ai
Kubeflow Documentation
- https://www.kubeflow.org/docs/
Q4 2025 CNCF Technology Landscape Radar report:
- https://www.cncf.io/announcements/2025/11/11/cncf-and-slashdata-report-finds-leading-ai-tools-gaining-adoption-in-cloud-native-ecosystems/
- https://www.cncf.io/wp-content/uploads/2025/11/cncfreporttechradar_111025a.pdf
Agent-to-Agent (A2A) Protocol
- https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/

Looking for hackers with the skills:

ai aiops kubernetes mlops

This project is part of:

Hack Week 25

Activity

about 2 months ago: horon disliked this project.

about 2 months ago: horon liked this project.

2 months ago: jluo added keyword "ai" to this project.

2 months ago: jluo added keyword "aiops" to this project.

2 months ago: jluo added keyword "kubernetes" to this project.

2 months ago: jluo added keyword "mlops" to this project.

2 months ago: jluo started this project.

2 months ago: jluo originated this project.

Comments

2 months ago by jluo

A great summary from Gemini:

This is a rapidly expanding ecosystem. To keep it organized, I will break this list down by Lifecycle Stage (Training vs. Serving) and Infrastructure Layer (Compute vs. Data).

1. AI Platforms & Orchestration (The "Command Center")

These tools manage the end-to-end lifecycle, gluing everything else together.
- Kubeflow: The "Grandfather" of AI on K8s. It’s a massive suite including:
  - Kubeflow Pipelines: For building repeatable workflows (Data -> Train -> Deploy).
  - Kubeflow Notebooks: Spawns Jupyter servers as Pods for data scientists.
  - Katib: Automated hyperparameter tuning (finding the best learning rate).
- Ray (KubeRay): A major alternative to Kubeflow. It lets you write Python code that scales from a single machine to a cluster with few changes. It is well suited to both distributed training and serving.
- ZenML: An MLOps framework that sits above the infrastructure, letting you define pipelines in code that can run on Kubeflow, Ray, or simple Kubernetes batches.
2. Training & Scheduling (The "Heavy Lifters")

The default Kubernetes scheduler places Pods one at a time and has no built-in notion of a multi-Pod job, which makes it a poor fit for distributed AI training. These tools fix that.
- Volcano: A batch scheduler. It ensures "Gang Scheduling"—meaning if a job needs 50 GPUs but only 49 are available, it waits. (Standard K8s would start 49 and let them sit idle, wasting money).
- Kueue: A newer, lighter alternative to Volcano, maintained as a Kubernetes SIG (Special Interest Group) project under kubernetes-sigs. It adds job queueing and quota management on top of the default scheduler.
- Training Operator: A unified K8s operator that lets you run PyTorchJob, TFJob (TensorFlow), and XGBoostJob as native K8s objects.
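
The all-or-nothing admission idea behind gang scheduling can be illustrated with a minimal sketch. This is not Volcano's or Kueue's actual algorithm; the `Job` type and the `admit` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    gpus_needed: int  # GPUs the whole job needs before any worker is useful


def admit(queue: list[Job], free_gpus: int) -> tuple[list[str], int]:
    """Admit jobs in queue order, but only when a job fits entirely."""
    admitted = []
    for job in queue:
        if job.gpus_needed <= free_gpus:
            free_gpus -= job.gpus_needed
            admitted.append(job.name)
        # Otherwise the job waits as a unit; no partial set of workers starts.
    return admitted, free_gpus


admitted, remaining = admit(
    [Job("train-large", 50), Job("train-small", 8)], free_gpus=49
)
print(admitted, remaining)
```

In this sketch the 50-GPU job is held back as a whole while the smaller job is let through; real schedulers add priorities, fairness, and preemption on top of this basic rule.
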
3. Inference & Serving (The "Waiter")

Once a model is trained, these tools serve it to users.
- KServe: A widely used model-serving layer for Kubernetes. It handles "Scale-to-Zero" (via Knative in its serverless mode), canary rollouts, and provides a unified API for TensorFlow, PyTorch, and ONNX models.
- vLLM: The current king of LLM serving. It is highly optimized for GPU memory (PagedAttention) and is often run inside KServe or as a standalone Deployment.
- BentoML / Yatai: A developer-friendly framework. You package your model as a "Bento" (standard format), and Yatai orchestrates the deployment on K8s.
- Seldon Core: An enterprise-grade alternative to KServe with advanced features for compliance, audit trails, and complex inference graphs.
4. Agentic & LLM Ops (The "New Wave")

Tools specifically for the 2025 era of Autonomous Agents.
- LangFlow / Flowise: Low-code "drag-and-drop" UI tools for building LLM chains. They can be deployed on K8s via Helm charts to run agent backends.
- kagent / Agent Sandbox: Emerging tools (often cloud-specific or experimental) that provide secure, isolated environments (using gVisor or microVMs) for agents to execute code safely.
- Ollama: While often used locally, it is increasingly deployed on K8s (via Helm) as a lightweight way to serve open-source models like Llama 3 or Mistral inside a cluster.
5. Data & Memory (The "Brain")
- Vector Databases (with K8s Operators):
  - Milvus: A popular open-source vector DB built natively for K8s scalability.
  - Weaviate: Another strong option with a solid K8s operator.
  - Qdrant: Written in Rust, very fast, and easy to deploy on K8s.
- Feature Stores:
  - Feast: The open-source standard for serving features (e.g., "User's last 5 clicks") to models in real-time.
6. Observability & Cost (The "Watchtower")
- Prometheus & Grafana: The standard for metrics (GPU temperature, Request Latency).
- DCGM Exporter: The specific NVIDIA tool that pulls GPU metrics (utilization, memory) so Prometheus can see them.
- KEDA: An event-driven Pod autoscaler that scales workloads based on external signals such as event queue length.
- Karpenter: A node autoscaler. When new Pods cannot be scheduled (for example after KEDA scales a workload up), Karpenter provisions additional EC2/VM nodes from the cloud provider to fit them.
- OpenCost / Kubecost: Tools to track exactly how much money your AI team is spending on GPUs per namespace.

2 months ago by jluo

Interact with GitHub Copilot to evaluate its capability for assisting with daily work.

For the result PR, see https://github.com/rancher/rancher/pull/52943

Similar Projects

ai

Flaky Tests AI Finder for Uyuni and MLM Test Suites by oscar-barrios

Description

Our current Grafana dashboards provide a great overview of test suite health, including a panel for "Top failed tests." However, identifying which of these failures are due to legitimate bugs versus intermittent "flaky tests" is a manual, time-consuming process. These flaky tests erode trust in our test suites and slow down development.

This project aims to build a simple but powerful Python script that automates flaky test detection. The script will directly query our Prometheus instance for the historical data of each failed test, using the jenkins_build_test_case_failure_age metric. It will then format this data and send it to the Gemini API with a carefully crafted prompt, asking it to identify which tests show a flaky pattern.

The final output will be a clean JSON list of the most probable flaky tests, which can then be used to populate a new "Top Flaky Tests" panel in our existing Grafana test suite dashboard.

Goals

By the end of Hack Week, we aim to have a single, working Python script that:

Connects to Prometheus and executes a query to fetch detailed test failure history.
Processes the raw data into a format suitable for the Gemini API.
Successfully calls the Gemini API with the data and a clear prompt.
Parses the AI's response to extract a simple list of flaky tests.
Saves the list to a JSON file that can be displayed in Grafana.
Feeds a new panel in our dashboard that lists the flaky tests.
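
The processing step could look roughly like the sketch below. It assumes the standard Prometheus HTTP API range-query ("matrix") response shape and the `suite` and `case` labels of the `jenkins_build_test_case_failure_age` metric. The helper name `summarize_failures`, the sample payload, and the "streak restart" heuristic are illustrative assumptions, not the project's actual code.

```python
import json


def summarize_failures(prom_response: dict) -> list[dict]:
    """Flatten a Prometheus range-query (matrix) response into per-test records."""
    records = []
    for series in prom_response["data"]["result"]:
        labels = series["metric"]
        # Each sample is [unix_timestamp, "value-as-string"].
        ages = [float(value) for _, value in series["values"]]
        # Assumption: a drop in failure age means a new failure streak started.
        restarts = sum(1 for prev, cur in zip(ages, ages[1:]) if cur < prev)
        records.append({
            "suite": labels.get("suite", ""),
            "case": labels.get("case", ""),
            "samples": len(ages),
            "streak_restarts": restarts,
        })
    return records


# Hand-written sample payload, not real data.
sample = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [{
            "metric": {"suite": "core", "case": "login_test"},
            "values": [[1700000000, "1"], [1700003600, "2"], [1700007200, "1"]],
        }],
    },
}
records = summarize_failures(sample)
print(json.dumps(records, indent=2))
```

A compact summary like this keeps the prompt small; whether restart counts are a useful signal for flakiness would need to be checked against real history.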

Resources

Jenkins Prometheus Exporter: https://github.com/uyuni-project/jenkins-exporter/
Data Source: Our internal Prometheus server.
Key Metric: jenkins_build_test_case_failure_age{jobname, buildid, suite, case, status, failedsince}.
Existing Query for Reference: count by (suite) (max_over_time(jenkins_build_test_case_failure_age{status=~"FAILED|REGRESSION", jobname="$jobname"}[$__range])).
AI Model: The Google Gemini API.
Example about how to interact with Gemini API: https://github.com/srbarrios/FailTale/
Visualization: Our internal Grafana Dashboard.
Internal IaC: https://gitlab.suse.de/galaxy/infrastructure/-/tree/master/srv/salt/monitoring

Outcome

Background Coding Agent by mmanno

Description

I had only bad experiences with AI one-shots. However, monitoring agent work closely and interfering often did result in productivity gains.

Now, other companies are using agents in pipelines. That makes sense to me: just as with CI, we want to offload work to pipelines. Our engineering teams are consistently slowed down by "toil": low-impact, repetitive maintenance tasks. A simple linter rule change, a dependency bump, rebasing patch sets on top of newer releases, or an API deprecation requires dozens of manual PRs, draining time from feature development.

So far we have been writing deterministic, script-based automation for these tasks. And it turns out to be a common trap. These scripts are brittle, complex, and become a massive maintenance burden themselves.

Can we make prompts and workflows smart enough to succeed at background coding?

Goals

We will build a platform that allows engineers to execute complex code transformations using prompts.

By automating this toil, we accelerate large-scale migrations and allow teams to focus on high-value work.

Our platform will consist of three main components:

"Change" Definition: Engineers will define a transformation as a simple, declarative manifest:
- The target repositories.
- A wrapper to run a "coding agent", e.g., "gemini-cli".
- The task as a natural language prompt.
"Change" Management Service: A central service that orchestrates the jobs. It will receive Change definitions and be responsible for the job lifecycle.
Execution Runners: We could use existing sandboxed CI runners (like GitHub/GitLab runners) to execute each job or spawn a container.
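
As a rough illustration of what a "Change" definition might carry, here is a minimal sketch. The manifest format is still to be defined in the MVP, so the class name, field names, and validation rules below are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical manifest; the field names are assumptions, not a defined format."""
    repositories: list[str]   # target repositories
    agent_command: list[str]  # wrapper that runs the coding agent
    prompt: str               # the task as a natural language prompt

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the manifest is usable."""
        problems = []
        if not self.repositories:
            problems.append("at least one target repository is required")
        if not self.agent_command:
            problems.append("an agent wrapper command is required")
        if not self.prompt.strip():
            problems.append("the task prompt must not be empty")
        return problems


change = Change(
    repositories=["example-org/example-repo"],  # placeholder repository
    agent_command=["gemini-cli"],
    prompt="Apply the new linter rule and fix all resulting warnings.",
)
problems = change.validate()
print(problems)
```

Validating the manifest before queueing would let the management service reject malformed Changes early, before any runner is dispatched.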

MVP

Define the Change manifest format.
Build the core Management Service that can accept and queue a Change.
Connect management service and runners, dynamically dispatch jobs to runners.
Create a basic runner script that can run a hard-coded prompt against a test repo and open a PR.

Stretch Goals:

Multi-layered approach, Workflow Agents trigger Coding Agents:
1. Workflow Agent: Gather information about the task interactively from the user.
2. Coding Agent: Once the interactive agent has refined the task into a clear prompt, it hands this prompt off to the "coding agent." This background agent is responsible for executing the task and producing the actual pull request.
Use MCP:
1. Workflow Agent gathers context information from Slack, GitHub, etc.
2. Workflow Agent triggers a Coding Agent.
Create a "Standard Task" library with reliable prompts.
1. Rebasing rancher-monitoring to a new version of kube-prom-stack
2. Update charts to use new images
3. Apply changes to comply with a new linter
4. Bump complex Go dependencies, like k8s modules
5. Backport pull requests to other branches
Add “review agents” that review the generated PR.

Uyuni Health-check Grafana AI Troubleshooter by ygutierrez

Description

This project explores the feasibility of using the open-source Grafana LLM plugin to enhance the Uyuni Health-check tool with LLM capabilities. The idea is to integrate a chat-based "AI Troubleshooter" directly into existing dashboards, allowing users to ask natural-language questions about errors, anomalies, or performance issues.

Goals

Investigate if and how the grafana-llm-app plug-in can be used within the Uyuni Health-check tool.
Investigate if this plug-in can be used to query LLMs for troubleshooting scenarios.
Evaluate support for local LLMs and external APIs through the plugin.
Evaluate if and how the Uyuni MCP server could be integrated as another source of information.

Resources

Grafana LLM plug-in

Uyuni Health-check

MCP Trace Suite by r1chard-lyu

Description

This project plans to create an MCP Trace Suite, a system that consolidates commonly used Linux debugging tools such as bpftrace, perf, and ftrace.

The suite is implemented as an MCP Server. This architecture allows an AI agent to leverage the server to diagnose Linux issues and perform targeted system debugging by remotely executing and retrieving tracing data from these powerful tools.

Repo: https://github.com/r1chard-lyu/systracesuite
Demo: Slides

Goals

Build an MCP Server that can integrate various Linux debugging and tracing tools, including bpftrace, perf, ftrace, strace, and others, with support for future expansion of additional tools.
Perform testing by intentionally creating bugs or issues that impact system performance, allowing an AI agent to analyze the root cause and identify the underlying problem.

Resources

Gemini CLI: https://geminicli.com/
eBPF: https://ebpf.io/
bpftrace: https://github.com/bpftrace/bpftrace/
perf: https://perfwiki.github.io/main/
ftrace: https://github.com/r1chard-lyu/tracium/

Enable more features in mcp-server-uyuni by j_renner

Description

I would like to contribute to mcp-server-uyuni (the MCP server for Uyuni / Multi-Linux Manager) by exposing additional features as tools. There are lots of relevant features to be found throughout the API.

At the end of the week I managed to enable basic system group operations:

List all system groups visible to the user
Create new system groups
List systems assigned to a group
Add and remove systems from groups

Goals

Set up test environment locally with the MCP server and client + a recent MLM server [DONE]
Identify features and use cases offering a benefit with limited effort required for enablement [DONE]
Create a PR to the repo [DONE]

Resources

aiops

SUSE Observability MCP server by drutigliano

Description

The idea is to implement the SUSE Observability Model Context Protocol (MCP) Server as a specialized, middle-tier API designed to translate the complex, high-cardinality observability data from StackState (topology, metrics, and events) into highly structured, contextually rich, and LLM-ready snippets.

This MCP Server abstracts the StackState APIs. Its primary function is to serve as a Tool/Function Calling target for AI agents. When an AI receives an alert or a user query (e.g., "What caused the outage?"), the AI calls an MCP Server endpoint. The server then fetches the relevant operational facts, summarizes them, normalizes technical identifiers (like URNs and raw metric names) into natural language concepts, and returns a concise JSON or YAML payload. This payload is then injected directly into the LLM's prompt, ensuring the final diagnosis or action is grounded in real-time, accurate SUSE Observability data, effectively minimizing hallucinations.

Goals

Grounding AI Responses: Ensure that all AI diagnoses, root cause analyses, and action recommendations are strictly based on verifiable, real-time data retrieved from the SUSE Observability StackState platform.
Simplifying Data Access: Abstract the complexity of StackState's native APIs (e.g., Time Travel, 4T Data Model) into simple, semantic functions that can be easily invoked by LLM tool-calling mechanisms.
Data Normalization: Convert complex, technical identifiers (like component URNs, raw metric names, and proprietary health states) into standardized, natural language terms that an LLM can easily reason over.
Enabling Automated Remediation: Define clear, action-oriented MCP endpoints (e.g., execute_runbook) that allow the AI agent to initiate automated operational workflows (e.g., restarts, scaling) after a diagnosis, closing the loop on observability.

Hackweek STEP

Create a functional MCP endpoint exposing one or more tools to answer queries like "What is the health of service X?" by fetching, normalizing, and returning live StackState data in an LLM-ready format.

Scope

Implement read-only MCP server that can:
- Connect to a live SUSE Observability instance and authenticate (with API token)
- Use tools to fetch data for a specific component URN (e.g., current health state, metrics, possibly topology neighbors, ...).
- Normalize response fields (e.g., URN to "Service Name," health state DEVIATING to "Unhealthy", raw metrics).
- Return the data as a structured JSON payload compliant with the MCP specification.
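
The normalization step in this scope can be sketched as follows. The planned deliverable is a Go server, so this Python version is only an illustration. The function name `normalize_component`, the URN layout, and every health mapping other than DEVIATING to "Unhealthy" are assumptions.

```python
import json

# Only the DEVIATING -> "Unhealthy" mapping comes from the scope above;
# the other entries are illustrative assumptions.
HEALTH_LABELS = {
    "DEVIATING": "Unhealthy",
    "CLEAR": "Healthy",
    "CRITICAL": "Critical",
}


def normalize_component(raw: dict) -> dict:
    """Turn a raw component record into an LLM-friendly payload."""
    urn = raw["urn"]
    # Assumption: the readable name is the last path segment of the URN.
    service_name = urn.rsplit("/", 1)[-1]
    state = raw.get("health", "UNKNOWN")
    return {
        "service_name": service_name,
        "health": HEALTH_LABELS.get(state, "Unknown"),
        "source_urn": urn,  # kept so the agent can reference the original object
    }


payload = normalize_component(
    {"urn": "urn:example:/service/checkout", "health": "DEVIATING"}
)
print(json.dumps(payload))
```

Keeping the original URN next to the readable name lets a follow-up tool call address the same component without guessing.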

Deliverables

MCP Server v0.1: A running Go MCP server with at least one tool.
A README.md and a test script (e.g., curl commands or a simple notebook) showing how an AI agent would call the endpoint and the resulting JSON payload.

Outcome: A functional and testable API endpoint that proves the core concept: translating complex StackState data into a simple, LLM-ready format. This provides the foundation for developing AI-driven diagnostics and automated remediation.

Resources

https://www.honeycomb.io/blog/its-the-end-of-observability-as-we-know-it-and-i-feel-fine
https://www.datadoghq.com/blog/datadog-remote-mcp-server
https://modelcontextprotocol.io/specification/2025-06-18/index
https://modelcontextprotocol.io/docs/develop/build-server

Basic implementation

https://github.com/drutigliano19/suse-observability-mcp-server

Results

Successfully developed and delivered a fully functional SUSE Observability MCP Server that bridges language models with SUSE Observability's operational data. This project demonstrates how AI agents can perform intelligent troubleshooting and root cause analysis using structured access to real-time infrastructure data.

Example execution

Bugzilla goes AI - Phase 1 by nwalter

Description

This project, Bugzilla goes AI, aims to boost developer productivity by creating an autonomous AI bug agent during Hackweek. The primary goal is to reduce the time employees spend triaging bugs by integrating Ollama to summarize issues, recommend next steps, and push focused daily reports to a Web Interface.

Goals

To reduce employee time spent on Bugzilla by implementing an AI tool that triages and summarizes bug reports, providing actionable recommendations to the team via Web Interface.

Project Charter

Bugzilla goes AI Phase 1

Description

Project Achievements during Hackweek

In this file you can read about what we achieved during Hackweek.

Project Achievements

Explore LLM evaluation metrics by thbertoldi

Description

Learn the best practices for evaluating LLM performance with an open-source framework such as DeepEval.

Goals

Curate the knowledge learned during practice and present it to colleagues.

-> Maybe publish a blog post on SUSE's blog?

Resources

https://deepeval.com

https://docs.pactflow.io/docs/bi-directional-contract-testing

kubernetes

Preparing KubeVirtBMC for project transfer to the KubeVirt organization by zchang

Description

KubeVirtBMC is preparing to transfer the project to the KubeVirt organization. One requirement is to enhance the modeling design's security. The current v1alpha1 API (the VirtualMachineBMC CRD) was designed during the proof-of-concept stage. It's immature and inherently insecure due to its cross-namespace object references, exposing security concerns from an RBAC perspective.

The other long-awaited feature is the ability to mount virtual media so that virtual machines can boot from remote ISO images.

Goals

Deliver the v1beta1 API and its corresponding controller implementation
Enable the Redfish virtual media mount function for KubeVirt virtual machines

Resources

The Agentic Rancher Experiment: Do Androids Dream of Electric Cattle? by moio

Rancher is a beast of a codebase. Let's investigate if the new 2025 generation of GitHub Autonomous Coding Agents and Copilot Workspaces can actually tame it.

The Plan

Create a sandbox GitHub Organization, clone in key Rancher repositories, and let the AI loose to see if it can handle real-world enterprise OSS maintenance - or if it just hallucinates new breeds of Kubernetes resources!

Specifically, throw "Agentic Coders" some typical tasks in a complex, long-lived open-source project, such as:

❥ The Grunt Work: generate missing GoDocs, unit tests, and refactorings. Rebase PRs.

❥ The Complex Stuff: fix actual (historical) bugs and feature requests to see if they can traverse the complexity without (too much) human hand-holding.

❥ Hunting Down Gaps: find areas lacking in docs, areas of improvement in code, dependency bumps, and so on.

If time allows, also experiment with Model Context Protocol (MCP) to give agents context on our specific build pipelines and CI/CD logs.

Why?

We know AI can write "Hello World." and also moderately complex programs from a green field. But can it rebase a 3-month-old PR with conflicts in rancher/rancher? I want to find the breaking point of current AI agents to determine if and how they can help us to reduce our technical debt, work faster and better. At the same time, find out about pitfalls and shortcomings.

The CONCLUSION!!!

A State of the Union document was compiled to summarize lessons learned this week. For more gory details, just read on the diary below!

A CLI for Harvester by mohamed.belgaied

Harvester does not officially come with a CLI tool; the user is supposed to interact with Harvester mostly through the UI. Though it is theoretically possible to use kubectl to interact with Harvester, manipulating KubeVirt YAML objects is not user friendly at all. Inspired by tools like multipass from Canonical, which make it easy to rapidly create one or multiple VMs, I began the development of Harvester CLI. It currently works, but it needs some love to be up to date with Harvester v1.0.2, along with some bug fixes and improvements.

Project Description

Harvester CLI is a command line interface tool written in Go, designed to simplify interfacing with a Harvester cluster as a user. It is especially useful for testing purposes as you can easily and rapidly create VMs in Harvester by providing a simple command such as: harvester vm create my-vm --count 5 to create 5 VMs named my-vm-01 to my-vm-05.
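
The naming scheme in that example (my-vm-01 to my-vm-05) can be sketched in a few lines. Harvester CLI itself is written in Go, so this Python helper, including the name `vm_names`, is a hypothetical illustration of the pattern and not the tool's code. How the tool names a single VM is not covered here.

```python
def vm_names(base: str, count: int) -> list[str]:
    """Generate zero-padded, numbered VM names such as my-vm-01 ... my-vm-05."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return [f"{base}-{index:02d}" for index in range(1, count + 1)]


names = vm_names("my-vm", 5)
print(names)
```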

Harvester CLI is functional but needs a number of improvements: up-to-date functionality with Harvester v1.0.2 (some minor issues right now), modifying the default behaviour to create an openSUSE VM instead of an Ubuntu VM, fixing some bugs, etc.

Github Repo for Harvester CLI: https://github.com/belgaied2/harvester-cli

Done in previous Hackweeks

Create a Github actions pipeline to automatically integrate Harvester CLI to Homebrew repositories: DONE
Automatically package Harvester CLI for OpenSUSE / Redhat RPMs or DEBs: DONE

Goal for this Hackweek

The goal for this Hackweek is to bring Harvester CLI up-to-speed with latest Harvester versions (v1.3.X and v1.4.X), and improve the code quality as well as implement some simple features and bug fixes.

Some nice additions might be:
- Improve handling of namespaced objects
- Add features, such as network management or Load Balancer creation
- Add more unit tests and, why not, e2e tests
- Improve CI
- Improve the overall code quality
- Test the program and create issues for it

Issue list is here: https://github.com/belgaied2/harvester-cli/issues

Resources

The project is written in Go and uses client-go, the Kubernetes Go client library, to communicate with the Harvester API (which is in fact the Kubernetes API). Contributions are welcome, in particular:

Testing it and creating issues
Documentation
Go code improvement

What you might learn

Harvester CLI might be interesting to you if you want to learn more about:

GitHub Actions
Harvester as a SUSE Product
Go programming language
Kubernetes API
Kubevirt API objects (Manipulating VMs and VM Configuration in Kubernetes using Kubevirt)

Cluster API Provider for Harvester by rcase

Project Description

The Cluster API "infrastructure provider" for Harvester, also named CAPHV, makes it possible to use Harvester with Cluster API. This enables people and organisations to create Kubernetes clusters running on VMs created by Harvester using a declarative spec.

The project has been bootstrapped in HackWeek 23, and its code is available here.

Work done in HackWeek 2023

Have an early working version of the provider available on Rancher Sandbox: DONE
Demonstrated the created cluster can be imported using Rancher Turtles: DONE
Stretch goal - demonstrate using the new provider with CAPRKE2: DONE and the templates are available on the repo

DONE in HackWeek 24:

Add more Unit Tests
Improve Status Conditions for some phases
Add cloud provider config generation
Testing with Harvester v1.3.2
Template improvements
Issues creation

DONE in 2025 (out of Hackweek)

Support of ClusterClass
Add to clusterctl community providers, you can add it directly with clusterctl
Testing on newer versions of Harvester v1.4.X and v1.5.X
Support for clusterctl generate cluster ...
Improve Status Conditions to reflect current state of Infrastructure
Improve CI (some bugs for release creation)

Goals for HackWeek 2025

FIRST and FOREMOST, any topic that is important to you
Add e2e testing
Certify the provider for Rancher Turtles
Add Machine pool labeling
Add PCI-e passthrough capabilities.
Other improvement suggestions are welcome!

Thanks to @isim and Dominic Giebert for their contributions!

Resources

Looking for help from anyone interested in Cluster API (CAPI) or who wants to learn more about Harvester.

This will be an infrastructure provider for Cluster API. Some background reading for the CAPI aspect:

OpenPlatform Self-Service Portal by tmuntan1

Description

In SUSE IT, we developed an internal developer platform for our engineers using SUSE technologies such as RKE2, SUSE Virtualization, and Rancher. While it works well for our existing users, the onboarding process could be better.

To improve our customer experience, I would like to build a self-service portal to make it easy for people to accomplish common actions. To get started, I would have the portal create Jira SD tickets for our customers to have better information in our tickets, but eventually I want to add automation to reduce our workload.

Goals

Build a frontend website (Angular) that helps customers create Jira SD tickets.
Build a backend (Rust with Axum) that does all the hard work for the frontend.

Resources (SUSE VPN only)

development site: https://ui-dev.openplatform.suse.com/login?returnUrl=%2Fopenplatform%2Fforms
https://gitlab.suse.de/itpe/core/open-platform/op-portal/backend
https://gitlab.suse.de/itpe/core/open-platform/op-portal/frontend

mlops

Kubernetes-Based ML Lifecycle Automation by lmiranda

Description

This project aims to build a complete end-to-end Machine Learning pipeline running entirely on Kubernetes, using Go, and containerized ML components.

The pipeline will automate the lifecycle of a machine learning model, including:

Data ingestion/collection
Model training as a Kubernetes Job
Model artifact storage in an S3-compatible registry (e.g. Minio)
A Go-based deployment controller that automatically deploys new model versions to Kubernetes using Rancher
A lightweight inference service that loads and serves the latest model
Monitoring of model performance and service health through Prometheus/Grafana

The outcome is a working prototype of an MLOps workflow that demonstrates how AI workloads can be trained, versioned, deployed, and monitored using the Kubernetes ecosystem.

Goals

By the end of Hack Week, the project should:

Produce a fully functional ML pipeline running on Kubernetes with:
- Data collection job
- Training job container
- Storage and versioning of trained models
- Automated deployment of new model versions
- Model inference API service
- Basic monitoring dashboards
Showcase a Go-based deployment automation component, which scans the model registry and automatically generates & applies Kubernetes manifests for new model versions.
Enable continuous improvement by making the system modular and extensible (e.g., additional models, metrics, autoscaling, or drift detection can be added later).
Prepare a short demo explaining the end-to-end process and how new models flow through the system.
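
The deployment automation goal (scan the registry, then generate a manifest for the newest model) could be sketched as below. The project plans a Go component, so this Python version only illustrates the logic. The `models/v<N>/` key layout, the image name, and the `MODEL_VERSION` environment variable are assumptions.

```python
import json
import re
from typing import Optional


def latest_version(keys: list[str]) -> Optional[int]:
    """Find the highest version among keys shaped like models/v<N>/... (assumed layout)."""
    versions = [
        int(match.group(1))
        for key in keys
        if (match := re.match(r"models/v(\d+)/", key))
    ]
    return max(versions, default=None)


def render_deployment(version: int) -> dict:
    """Build an apps/v1 Deployment manifest pinned to one model version."""
    labels = {"app": "inference", "model-version": f"v{version}"}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "inference", "labels": labels},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": {"app": "inference"}},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": "inference",
                    # Placeholder image; the real image name is not given in the source.
                    "image": "registry.example.com/inference:latest",
                    "env": [{"name": "MODEL_VERSION", "value": f"v{version}"}],
                }]},
            },
        },
    }


keys = ["models/v1/model.pkl", "models/v10/model.pkl", "models/v2/model.pkl", "datasets/raw.csv"]
version = latest_version(keys)
manifest = render_deployment(version)
print(json.dumps(manifest, indent=2))
```

Parsing the version as an integer avoids the string-ordering trap where "v10" would sort before "v2".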

Resources

Project Repository

Updates

Training pipeline and datasets
Inference Service py
