Project Description

The goal is to have a language model, that is able to answer technical questions on Uyuni. Uyuni documentation is too large for in-context processing, so finetuning is the way to go.

Goal for this Hackweek

Finetune a model based on llama-2-7b.

Resources

github repo

Looking for hackers with the skills:

ai uyuni

This project is part of:

Hack Week 23

Activity

  • over 2 years ago: nadvornik added keyword "ai" to this project.
  • over 2 years ago: nadvornik added keyword "uyuni" to this project.
  • over 2 years ago: nadvornik originated this project.

  • Comments

    Be the first to comment!

    Similar Projects

    Song Search with CLAP by gcolangiuli

    Description

    Contrastive Language-Audio Pretraining (CLAP) is an open-source library that enables the training of a neural network on both Audio and Text descriptions, making it possible to search for Audio using a Text input. Several pre-trained models for song search are already available on huggingface

    SUSE Hackweek AI Song Search

    Goals

    Evaluate how CLAP can be used for song searching and determine which types of queries yield the best results by developing a Minimum Viable Product (MVP) in Python. Based on the results of this MVP, future steps could include:

    • Music Tagging;
    • Free text search;
    • Integration with an LLM (for example, with MCP or the OpenAI API) for music suggestions based on your own library.

    The code for this project will be entirely written using AI to better explore and demonstrate AI capabilities.

    Result

    In this MVP we implemented:

    • Async Song Analysis with Clap model
    • Free Text Search of the songs
    • Similar song search based on vector representation
    • Containerised version with web interface

    We also documented what went well and what can be improved in the use of AI.

    You can have a look at the result here:

    Future implementation can be related to performance improvement and stability of the analysis.

    References


    Is SUSE Trending? Popularity and Developer Sentiment Insight Using Native AI Capabilities by terezacerna

    Description

    This project aims to explore the popularity and developer sentiment around SUSE and its technologies compared to Red Hat and their technologies. Using publicly available data sources, I will analyze search trends, developer preferences, repository activity, and media presence. The final outcome will be an interactive Power BI dashboard that provides insights into how SUSE is perceived and discussed across the web and among developers.

    Goals

    1. Assess the popularity of SUSE products and brand compared to Red Hat using Google Trends.
    2. Analyze developer satisfaction and usage trends from the Stack Overflow Developer Survey.
    3. Use the GitHub API to compare SUSE and Red Hat repositories in terms of stars, forks, contributors, and issue activity.
    4. Perform sentiment analysis on GitHub issue comments to measure community tone and engagement using built-in Copilot capabilities.
    5. Perform sentiment analysis on Reddit comments related to SUSE technologies using built-in Copilot capabilities.
    6. Use Gnews.io to track and compare the volume of news articles mentioning SUSE and Red Hat technologies.
    7. Test the integration of Copilot (AI) within Power BI for enhanced data analysis and visualization.
    8. Deliver a comprehensive Power BI report summarizing findings and insights.
    9. Test the full potential of Power BI, including its AI features and native language Q&A.

    Resources

    1. Google Trends: Web scraping for search popularity data
    2. Stack Overflow Developer Survey: For technology popularity and satisfaction comparison
    3. GitHub API: For repository data (stars, forks, contributors, issues, comments).
    4. Gnews.io API: For article volume and mentions analysis.
    5. Reddit: SUSE related topics with comments.


    Liz - Prompt autocomplete by ftorchia

    Description

    Liz is the Rancher AI assistant for cluster operations.

    Goals

    We want to help users when sending new messages to Liz, by adding an autocomplete feature to complete their requests based on the context.

    Example:

    • User prompt: "Can you show me the list of p"
    • Autocomplete suggestion: "Can you show me the list of p...od in local cluster?"

    Example:

    • User prompt: "Show me the logs of #rancher-"
    • Chat console: It shows a drop-down widget, next to the # character, with the list of available pod names starting with "rancher-".

    Technical Overview

    1. The AI agent should expose a new ws/autocomplete endpoint to proxy autocomplete messages to the LLM.
    2. The UI extension should be able to display prompt suggestions and allow users to apply the autocomplete to the Prompt via keyboard shortcuts.

    Resources

    GitHub repository


    The Agentic Rancher Experiment: Do Androids Dream of Electric Cattle? by moio

    Rancher is a beast of a codebase. Let's investigate if the new 2025 generation of GitHub Autonomous Coding Agents and Copilot Workspaces can actually tame it. A GitHub robot mascot trying to lasso a blue bull with a Kubernetes logo tatooed on it


    The Plan

    Create a sandbox GitHub Organization, clone in key Rancher repositories, and let the AI loose to see if it can handle real-world enterprise OSS maintenance - or if it just hallucinates new breeds of Kubernetes resources!

    Specifically, throw "Agentic Coders" some typical tasks in a complex, long-lived open-source project, such as:


    The Grunt Work: generate missing GoDocs, unit tests, and refactorings. Rebase PRs.

    The Complex Stuff: fix actual (historical) bugs and feature requests to see if they can traverse the complexity without (too much) human hand-holding.

    Hunting Down Gaps: find areas lacking in docs, areas of improvement in code, dependency bumps, and so on.


    If time allows, also experiment with Model Context Protocol (MCP) to give agents context on our specific build pipelines and CI/CD logs.

    Why?

    We know AI can write "Hello World." and also moderately complex programs from a green field. But can it rebase a 3-month-old PR with conflicts in rancher/rancher? I want to find the breaking point of current AI agents to determine if and how they can help us to reduce our technical debt, work faster and better. At the same time, find out about pitfalls and shortcomings.

    The CONCLUSION!!!

    A add-emoji State of the Union add-emoji document was compiled to summarize lessons learned this week. For more gory details, just read on the diary below! add-emoji


    SUSE Observability MCP server by drutigliano

    Description

    The idea is to implement the SUSE Observability Model Context Protocol (MCP) Server as a specialized, middle-tier API designed to translate the complex, high-cardinality observability data from StackState (topology, metrics, and events) into highly structured, contextually rich, and LLM-ready snippets.

    This MCP Server abstract the StackState APIs. Its primary function is to serve as a Tool/Function Calling target for AI agents. When an AI receives an alert or a user query (e.g., "What caused the outage?"), the AI calls an MCP Server endpoint. The server then fetches the relevant operational facts, summarizes them, normalizes technical identifiers (like URNs and raw metric names) into natural language concepts, and returns a concise JSON or YAML payload. This payload is then injected directly into the LLM's prompt, ensuring the final diagnosis or action is grounded in real-time, accurate SUSE Observability data, effectively minimizing hallucinations.

    Goals

    • Grounding AI Responses: Ensure that all AI diagnoses, root cause analyses, and action recommendations are strictly based on verifiable, real-time data retrieved from the SUSE Observability StackState platform.
    • Simplifying Data Access: Abstract the complexity of StackState's native APIs (e.g., Time Travel, 4T Data Model) into simple, semantic functions that can be easily invoked by LLM tool-calling mechanisms.
    • Data Normalization: Convert complex, technical identifiers (like component URNs, raw metric names, and proprietary health states) into standardized, natural language terms that an LLM can easily reason over.
    • Enabling Automated Remediation: Define clear, action-oriented MCP endpoints (e.g., execute_runbook) that allow the AI agent to initiate automated operational workflows (e.g., restarts, scaling) after a diagnosis, closing the loop on observability.

     Hackweek STEP

    • Create a functional MCP endpoint exposing one (or more) tool(s) to answer queries like "What is the health of service X?") by fetching, normalizing, and returning live StackState data in an LLM-ready format.

     Scope

    • Implement read-only MCP server that can:
      • Connect to a live SUSE Observability instance and authenticate (with API token)
      • Use tools to fetch data for a specific component URN (e.g., current health state, metrics, possibly topology neighbors, ...).
      • Normalize response fields (e.g., URN to "Service Name," health state DEVIATING to "Unhealthy", raw metrics).
      • Return the data as a structured JSON payload compliant with the MCP specification.

    Deliverables

    • MCP Server v0.1 A running Golang MCP server with at least one tool.
    • A README.md and a test script (e.g., curl commands or a simple notebook) showing how an AI agent would call the endpoint and the resulting JSON payload.

    Outcome A functional and testable API endpoint that proves the core concept: translating complex StackState data into a simple, LLM-ready format. This provides the foundation for developing AI-driven diagnostics and automated remediation.

    Resources

    • https://www.honeycomb.io/blog/its-the-end-of-observability-as-we-know-it-and-i-feel-fine
    • https://www.datadoghq.com/blog/datadog-remote-mcp-server
    • https://modelcontextprotocol.io/specification/2025-06-18/index
    • https://modelcontextprotocol.io/docs/develop/build-server

     Basic implementation

    • https://github.com/drutigliano19/suse-observability-mcp-server

    Results

    Successfully developed and delivered a fully functional SUSE Observability MCP Server that bridges language models with SUSE Observability's operational data. This project demonstrates how AI agents can perform intelligent troubleshooting and root cause analysis using structured access to real-time infrastructure data.

    Example execution


    Uyuni Health-check Grafana AI Troubleshooter by ygutierrez

    Description

    This project explores the feasibility of using the open-source Grafana LLM plugin to enhance the Uyuni Health-check tool with LLM capabilities. The idea is to integrate a chat-based "AI Troubleshooter" directly into existing dashboards, allowing users to ask natural-language questions about errors, anomalies, or performance issues.

    Goals

    • Investigate if and how the grafana-llm-app plug-in can be used within the Uyuni Health-check tool.
    • Investigate if this plug-in can be used to query LLMs for troubleshooting scenarios.
    • Evaluate support for local LLMs and external APIs through the plugin.
    • Evaluate if and how the Uyuni MCP server could be integrated as another source of information.

    Resources

    Grafana LMM plug-in

    Uyuni Health-check


    Set Uyuni to manage edge clusters at scale by RDiasMateus

    Description

    Prepare a Poc on how to use MLM to manage edge clusters. Those cluster are normally equal across each location, and we have a large number of them.

    The goal is to produce a set of sets/best practices/scripts to help users manage this kind of setup.

    Goals

    step 1: Manual set-up

    Goal: Have a running application in k3s and be able to update it using System Update Controler (SUC)

    • Deploy Micro 6.2 machine
    • Deploy k3s - single node

      • https://docs.k3s.io/quick-start
    • Build/find a simple web application (static page)

      • Build/find a helmchart to deploy the application
    • Deploy the application on the k3s cluster

    • Install App updates through helm update

    • Install OS updates using MLM

    step 2: Automate day 1

    Goal: Trigger the application deployment and update from MLM

    • Salt states For application (with static data)
      • Deploy the application helmchart, if not present
      • install app updates through helmchart parameters
    • Link it to GIT
      • Define how to link the state to the machines (based in some pillar data? Using configuration channels by importing the state? Naming convention?)
      • Use git update to trigger helmchart app update
    • Recurrent state applying configuration channel?

    step 3: Multi-node cluster

    Goal: Use SUC to update a multi-node cluster.

    • Create a multi-node cluster
    • Deploy application
      • call the helm update/install only on control plane?
    • Install App updates through helm update
    • Prepare a SUC for OS update (k3s also? How?)
      • https://github.com/rancher/system-upgrade-controller
      • https://documentation.suse.com/cloudnative/k3s/latest/en/upgrades/automated.html
      • Update/deploy the SUC?
      • Update/deploy the SUC CRD with the update procedure


    Ansible to Salt integration by vizhestkov

    Description

    We already have initial integration of Ansible in Salt with the possibility to run playbooks from the salt-master on the salt-minion used as an Ansible Control node.

    In this project I want to check if it possible to make Ansible working on the transport of Salt. Basically run playbooks with Ansible through existing established Salt (ZeroMQ) transport and not using ssh at all.

    It could be a good solution for the end users to reuse Ansible playbooks or run Ansible modules they got used to with no effort of complex configuration with existing Salt (or Uyuni/SUSE Multi Linux Manager) infrastructure.

    Goals

    • [v] Prepare the testing environment with Salt and Ansible installed
    • [v] Discover Ansible codebase to figure out possible ways of integration
    • [v] Create Salt/Uyuni inventory module
    • [v] Make basic modules to work with no using separate ssh connection, but reusing existing Salt connection
    • [v] Test some most basic playbooks

    Resources

    GitHub page

    Video of the demo


    Enable more features in mcp-server-uyuni by j_renner

    Description

    I would like to contribute to mcp-server-uyuni, the MCP server for Uyuni / Multi-Linux Manager) exposing additional features as tools. There is lots of relevant features to be found throughout the API, for example:

    • System operations and infos
    • System groups
    • Maintenance windows
    • Ansible
    • Reporting
    • ...

    At the end of the week I managed to enable basic system group operations:

    • List all system groups visible to the user
    • Create new system groups
    • List systems assigned to a group
    • Add and remove systems from groups

    Goals

    • Set up test environment locally with the MCP server and client + a recent MLM server [DONE]
    • Identify features and use cases offering a benefit with limited effort required for enablement [DONE]
    • Create a PR to the repo [DONE]

    Resources


    Flaky Tests AI Finder for Uyuni and MLM Test Suites by oscar-barrios

    Description

    Our current Grafana dashboards provide a great overview of test suite health, including a panel for "Top failed tests." However, identifying which of these failures are due to legitimate bugs versus intermittent "flaky tests" is a manual, time-consuming process. These flaky tests erode trust in our test suites and slow down development.

    This project aims to build a simple but powerful Python script that automates flaky test detection. The script will directly query our Prometheus instance for the historical data of each failed test, using the jenkins_build_test_case_failure_age metric. It will then format this data and send it to the Gemini API with a carefully crafted prompt, asking it to identify which tests show a flaky pattern.

    The final output will be a clean JSON list of the most probable flaky tests, which can then be used to populate a new "Top Flaky Tests" panel in our existing Grafana test suite dashboard.

    Goals

    By the end of Hack Week, we aim to have a single, working Python script that:

    1. Connects to Prometheus and executes a query to fetch detailed test failure history.
    2. Processes the raw data into a format suitable for the Gemini API.
    3. Successfully calls the Gemini API with the data and a clear prompt.
    4. Parses the AI's response to extract a simple list of flaky tests.
    5. Saves the list to a JSON file that can be displayed in Grafana.
    6. New panel in our Dashboard listing the Flaky tests

    Resources

    Outcome