Description

Our current Grafana dashboards provide a great overview of test suite health, including a panel for "Top failed tests." However, identifying which of these failures are due to legitimate bugs versus intermittent "flaky tests" is a manual, time-consuming process. These flaky tests erode trust in our test suites and slow down development.

This project aims to build a simple but powerful Python script that automates flaky test detection. The script will directly query our Prometheus instance for the historical data of each failed test, using the jenkins_build_test_case_failure_age metric. It will then format this data and send it to the Gemini API with a carefully crafted prompt, asking it to identify which tests show a flaky pattern.
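
As a sketch of the first step, the failure history can be pulled straight from the Prometheus HTTP API. The following is a minimal example, assuming a Prometheus URL of http://prometheus.example.org:9090 and a 14-day window; only the metric name comes from the project itself, the rest is illustrative:

    import time
    import requests

    # Assumption: the Prometheus base URL; adjust to your instance.
    PROMETHEUS_URL = "http://prometheus.example.org:9090"

    def fetch_failure_history(days: int = 14) -> list:
        """Fetch the failure-age series of every test case for the last `days` days."""
        end = time.time()
        resp = requests.get(
            f"{PROMETHEUS_URL}/api/v1/query_range",
            params={
                "query": "jenkins_build_test_case_failure_age",
                "start": end - days * 86400,
                "end": end,
                "step": "1h",  # one sample per hour keeps the payload small
            },
            timeout=60,
        )
        resp.raise_for_status()
        data = resp.json()
        if data["status"] != "success":
            raise RuntimeError(f"Prometheus query failed: {data}")
        # Each entry carries the metric labels (test name, job, ...) plus
        # a list of [timestamp, value] samples.
        return data["data"]["result"]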

The final output will be a clean JSON list of the most probable flaky tests, which can then be used to populate a new "Top Flaky Tests" panel in our existing Grafana test suite dashboard.

Goals

By the end of Hack Week, we aim to have a single, working Python script that:

  1. Connects to Prometheus and executes a query to fetch detailed test failure history.
  2. Processes the raw data into a format suitable for the Gemini API.
  3. Successfully calls the Gemini API with the data and a clear prompt.
  4. Parses the AI's response to extract a simple list of flaky tests.
  5. Saves the list to a JSON file that can be displayed in Grafana.
  6. Add a new panel to our Grafana dashboard listing the flaky tests (steps 2-5 are sketched below).
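
For steps 2-5, a minimal sketch using the google-generativeai Python package could look as follows. The prompt wording, the model name, the "case" label used to name each test, and the output file name are all assumptions for illustration; the real implementation lives in the repository linked in the comments below:

    import json
    import os

    import google.generativeai as genai  # pip install google-generativeai

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-flash")  # assumption: any recent model

    PROMPT = (
        "Below is the recent failure history of Jenkins test cases, one per line. "
        "A value of 0 means the test passed in that build. Identify which tests "
        "show an intermittent (flaky) pattern rather than a persistent failure. "
        "Answer with a JSON array of test names only."
    )

    def detect_flaky(series: list, out_file: str = "flaky_tests.json") -> list:
        # Assumption: the test name lives in a "case" label; adapt to your metric.
        lines = [
            f"{s['metric'].get('case', '?')}: {[v for _, v in s['values']]}"
            for s in series
        ]
        response = model.generate_content(PROMPT + "\n\n" + "\n".join(lines))
        # Assumption: the model honors the "JSON array only" instruction; a robust
        # script should strip Markdown fences and retry on malformed output.
        flaky = json.loads(response.text)
        with open(out_file, "w") as f:
            json.dump(flaky, f, indent=2)
        return flaky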

Resources

Outcome

Looking for hackers with the skills:

uyuni prometheus grafana ai completed

This project is part of:

Hack Week 25


Comments

    • oscar-barrios
      about 1 month ago by oscar-barrios | Reply

      The code of the flaky detector is here: https://github.com/srbarrios/jenkins-flaky-tests-detector

      I also published a Docker container to use it here: https://github.com/srbarrios/jenkins-flaky-tests-detector/pkgs/container/jenkins-flaky-tests-detector

      The plan now is to write a Salt state for our MLM internal infra, so that it runs this container, the container exposes the results through a web server, and I can then parse them in Grafana.

    • oscar-barrios
      about 1 month ago by oscar-barrios | Reply

      I created the new Grafana dashboard for Uyuni here: https://grafana.mgr.suse.de/d/flaky-tests/flaky-tests-detection?orgId=1&from=now-6h&to=now&timezone=browser&refresh=1m

      The next step is to build it in a way that I can get the flaky tests for all the Jenkins job test results that we monitor in MLM.

    • oscar-barrios
      about 1 month ago by oscar-barrios | Reply

      Now we can select any of the running test suites, and get a list of the most probable flaky tests :)

    • oscar-barrios
      30 days ago by oscar-barrios | Reply

      I will consider this Hack Week project done for now, so I can move on to my second project. The outcome has been good; I must admit that I also vibe coded some parts using Gemini 3. Also, the script analyzing the Prometheus series does not rely only on an LLM call: it first does a triage based on a simple algorithm, saving resources by asking the AI only about ambiguous and complex test failures.
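
      As an illustration only (the real triage lives in the repository linked above), such a pre-filter could be as simple as counting pass/fail flips in each failure-age series, assuming a value of 0 means the test passed in that build:

          def triage(samples: list[float], flip_threshold: int = 3) -> str:
              """Cheap first-pass classification of one test's failure-age series."""
              failing = [v > 0 for v in samples]
              flips = sum(1 for a, b in zip(failing, failing[1:]) if a != b)
              if flips >= flip_threshold:
                  return "flaky"       # alternates between passing and failing
              if all(failing):
                  return "broken"      # consistently failing: likely a real bug
              if not any(failing):
                  return "healthy"
              return "ambiguous"       # unclear pattern: worth an LLM call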

    Similar Projects

    Uyuni Health-check Grafana AI Troubleshooter by ygutierrez

    Description

    This project explores the feasibility of using the open-source Grafana LLM plugin to enhance the Uyuni Health-check tool with LLM capabilities. The idea is to integrate a chat-based "AI Troubleshooter" directly into existing dashboards, allowing users to ask natural-language questions about errors, anomalies, or performance issues.

    Goals

    • Investigate if and how the grafana-llm-app plug-in can be used within the Uyuni Health-check tool.
    • Investigate if this plug-in can be used to query LLMs for troubleshooting scenarios.
    • Evaluate support for local LLMs and external APIs through the plugin.
    • Evaluate if and how the Uyuni MCP server could be integrated as another source of information.

    Resources

    Grafana LLM plug-in

    Uyuni Health-check


    Set Up an Ephemeral Uyuni Instance by mbussolotto

    Description

    To test, check, and verify the latest changes in the master branch, we want to easily set up an ephemeral environment.

    Goals

    • Create an ephemeral environment manually
    • Create an ephemeral environment automatically

    Resources

    • https://github.com/uyuni-project/uyuni

    • https://www.uyuni-project.org/uyuni-docs/en/uyuni/index.html


    Enable more features in mcp-server-uyuni by j_renner

    Description

    I would like to contribute to mcp-server-uyuni, the MCP server for Uyuni / Multi-Linux Manager, by exposing additional features as tools. There are lots of relevant features to be found throughout the API, for example:

    • System operations and infos
    • System groups
    • Maintenance windows
    • Ansible
    • Reporting
    • ...

    At the end of the week I managed to enable basic system group operations (a minimal sketch of such a tool follows the list):

    • List all system groups visible to the user
    • Create new system groups
    • List systems assigned to a group
    • Add and remove systems from groups
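
    For illustration, a minimal FastMCP server exposing one such group operation might look like the sketch below. This is not the actual mcp-server-uyuni code; the environment variables and the use of the classic XML-RPC API at /rpc/api are assumptions:

        import os
        from xmlrpc.client import ServerProxy

        from mcp.server.fastmcp import FastMCP  # pip install "mcp[cli]"

        mcp = FastMCP("uyuni-groups")
        # Assumption: credentials and server URL come from the environment.
        client = ServerProxy(os.environ["UYUNI_URL"] + "/rpc/api")
        session = client.auth.login(os.environ["UYUNI_USER"], os.environ["UYUNI_PASS"])

        @mcp.tool()
        def list_system_groups() -> list:
            """List all system groups visible to the logged-in user."""
            return client.systemgroup.listAllGroups(session)

        if __name__ == "__main__":
            mcp.run()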

    Goals

    • Set up test environment locally with the MCP server and client + a recent MLM server [DONE]
    • Identify features and use cases offering a benefit with limited effort required for enablement [DONE]
    • Create a PR to the repo [DONE]

    Resources


    Ansible to Salt integration by vizhestkov

    Description

    We already have initial integration of Ansible in Salt with the possibility to run playbooks from the salt-master on the salt-minion used as an Ansible Control node.

    In this project I want to check whether it is possible to make Ansible work over the Salt transport: basically, run Ansible playbooks through the existing, established Salt (ZeroMQ) transport, without using SSH at all.

    It could be a good solution for end users to reuse the Ansible playbooks and modules they are used to, without complex configuration effort, on top of an existing Salt (or Uyuni/SUSE Multi-Linux Manager) infrastructure.
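
    One possible shape for this, sketched under plenty of assumptions (the plugin name is hypothetical and file transfer is left out), is a custom Ansible connection plugin that forwards command execution to Salt's LocalClient over the existing ZeroMQ channel:

        import salt.client
        from ansible.plugins.connection import ConnectionBase

        class Connection(ConnectionBase):
            """Hypothetical 'saltwire' connection: run tasks over Salt's ZeroMQ bus."""

            transport = "saltwire"
            has_pipelining = False

            def __init__(self, *args, **kwargs):
                super().__init__(*args, **kwargs)
                self._salt = salt.client.LocalClient()

            def _connect(self):
                self._connected = True  # Salt maintains its own persistent channel
                return self

            def exec_command(self, cmd, in_data=None, sudoable=True):
                super().exec_command(cmd, in_data=in_data, sudoable=sudoable)
                minion = self._play_context.remote_addr  # inventory host == minion id
                out = self._salt.cmd(minion, "cmd.run_all", [cmd])[minion]
                return out["retcode"], out["stdout"], out["stderr"]

            def put_file(self, in_path, out_path):
                raise NotImplementedError("could reuse Salt's cp/file modules")

            def fetch_file(self, in_path, out_path):
                raise NotImplementedError

            def close(self):
                self._connected = False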

    Goals

    • [v] Prepare the testing environment with Salt and Ansible installed
    • [v] Discover Ansible codebase to figure out possible ways of integration
    • [v] Create Salt/Uyuni inventory module
    • [v] Make basic modules work without a separate SSH connection, reusing the existing Salt connection instead
    • [v] Test some of the most basic playbooks

    Resources

    GitHub page

    Video of the demo


    mgr-ansible-ssh - Intelligent, Lightweight CLI for Distributed Remote Execution by deve5h

    Description

    By the end of Hack Week, the target will be to deliver a minimal functional version 1 (MVP) of a custom command-line tool named mgr-ansible-ssh (a unified wrapper for BOTH ad-hoc shell & playbooks) that allows operators to:

    1. Execute arbitrary shell commands on thousands of remote machines simultaneously using Ansible Runner, with artifacts saved locally.
    2. Pass runtime options such as inventory file, remote command string / playbook execution, parallel forks, limits, dry-run mode, or no-std-ansible-output.
    3. Leverage existing SSH trust relationships without additional setup.
    4. Provide a clean, intuitive CLI interface with --help for ease of use. It should provide a consistent UX and a CI-friendly interface.
    5. Establish a foundation that can later be extended with advanced features such as logging, grouping, interactive shell mode, safe-command checks, and parallel execution tuning.

    The MVP should enable day-to-day operations to efficiently target thousands of machines with a single, consistent interface.

    Goals

    Primary Goals (MVP):

    Build a functional CLI tool (mgr-ansible-ssh) capable of executing shell commands on multiple remote hosts using Ansible Runner. Test the tool across a large distributed environment (1000+ machines) to validate its performance and reliability.

    Looking forward to significantly reducing the zypper deployment time across all 351 RMT VM servers in our MLM cluster by eliminating the dependency on the taskomatic service, bringing execution down to a fraction of the current duration. The tool should also support multiple runtime flags, such as:

    mgr-ansible-ssh: Remote command execution wrapper using Ansible Runner
    
    Usage: mgr-ansible-ssh [--help] [--version] [--inventory INVENTORY]
                       [--run RUN] [--playbook PLAYBOOK] [--limit LIMIT]
                       [--forks FORKS] [--dry-run] [--no-ansible-output]
    
    Required Arguments
    --inventory, -i      Path to Ansible inventory file to use
    
    Any One of the Arguments Is Required
    --run, -r            Execute the specified shell command on target hosts
    --playbook, -p       Execute the specified Ansible playbook on target hosts
    
    Optional Arguments
    --help, -h           Show the help message and exit
    --version, -v        Show the version and exit
    --limit, -l          Limit execution to specific hosts or groups
    --forks, -f          Number of parallel Ansible forks
    --dry-run            Run in Ansible check mode (requires -p or --playbook)
    --no-ansible-output  Suppress Ansible stdout output
    
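    As a rough sketch of how the core dispatch could be built on the ansible-runner Python API (covering only a subset of the flags above; structure and defaults are assumptions, not the final tool):

        import argparse

        import ansible_runner  # pip install ansible-runner

        def main() -> int:
            p = argparse.ArgumentParser(prog="mgr-ansible-ssh")
            p.add_argument("--inventory", "-i", required=True)
            g = p.add_mutually_exclusive_group(required=True)
            g.add_argument("--run", "-r", help="ad-hoc shell command")
            g.add_argument("--playbook", "-p", help="playbook to execute")
            p.add_argument("--limit", "-l", default="all")
            p.add_argument("--forks", "-f", type=int, default=50)
            p.add_argument("--no-ansible-output", action="store_true")
            args = p.parse_args()

            common = dict(
                private_data_dir=".",       # Runner saves artifacts under ./artifacts
                inventory=args.inventory,
                forks=args.forks,
                quiet=args.no_ansible_output,
            )
            if args.playbook:
                r = ansible_runner.run(playbook=args.playbook, limit=args.limit, **common)
            else:  # ad-hoc shell command on the matched hosts
                r = ansible_runner.run(host_pattern=args.limit, module="shell",
                                       module_args=args.run, **common)
            return r.rc

        if __name__ == "__main__":
            raise SystemExit(main())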

    Secondary/Stretch Goals (if time permits):

    1. Add pretty output formatting (success/failure summary per host).
    2. Implement basic logging of executed commands and results.
    3. Introduce safety checks for risky commands (shutdown, rm -rf, etc.).
    4. Package the tool so it can be installed with pip or stored internally.

    Resources

    Collaboration is welcome from anyone interested in CLI tooling, automation, or distributed systems. Skills that would be particularly valuable include:

    1. Python especially around CLI dev (argparse, click, rich)


    Try AI training with ROCm and LoRA by bmwiedemann

    Description

    I want to set up a Radeon RX 9060 XT 16 GB at home with ROCm on Slowroll.

    Goals

    I want to test how fast AI inference can get with the GPU and if I can use LoRA to re-train an existing free model for some task.

    Resources

    • https://rocm.docs.amd.com/en/latest/compatibility/compatibility-matrix.html
    • https://build.opensuse.org/project/show/science:GPU:ROCm
    • https://src.opensuse.org/ROCm/
    • https://www.suse.com/c/lora-fine-tuning-llms-for-text-classification/

    Results

    Got inference working with llama.cpp:

    # Build llama.cpp with ROCm/HIP support for the GPU
    export LLAMACPP_ROCM_ARCH=gfx1200
    HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
    cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=$LLAMACPP_ROCM_ARCH \
    -DCMAKE_BUILD_TYPE=Release -DLLAMA_CURL=ON \
    -Dhipblas_DIR=/usr/lib64/cmake/hipblaslt/ \
    && cmake --build build --config Release -j8
    # Serve the model, offloading all layers to the discrete GPU
    m=models/gpt-oss-20b-mxfp4.gguf
    cd $P/llama.cpp && build/bin/llama-server --model $m --threads 8 --port 8005 --host 0.0.0.0 --device ROCm0 --n-gpu-layers 999
    

    Without the --device option it faulted. Maybe because my APU also appears there?

    I updated/fixed various related packages: https://src.opensuse.org/ROCm/rocm-examples/pulls/1 https://src.opensuse.org/ROCm/hipblaslt/pulls/1 SR 1320959

    Benchmark

    I benchmarked inference with llama.cpp + gpt-oss-20b-mxfp4.gguf and ROCm offloading to a Radeon RX 9060 XT 16GB. I varied the number of layers that went to the GPU:

    • 0 layers 14.49 tokens/s (8 CPU cores)
    • 9 layers 17.79 tokens/s 34% VRAM
    • 15 layers 22.39 tokens/s 51% VRAM
    • 20 layers 27.49 tokens/s 64% VRAM
    • 24 layers 41.18 tokens/s 74% VRAM
    • 25+ layers 86.63 tokens/s 75% VRAM (only 200% CPU load)

    So there is a significant performance boost if the whole model fits into the GPU's VRAM.


    Update M2Crypto by mcepl

    There are a couple of projects I work on which need my attention, to put them into shape:

    Goal for this Hackweek

    • Put M2Crypto into better shape (most issues closed, all pull requests processed)
    • Have more fun learning jujutsu
    • Play more with Gemini, to see how much it helps (or not).
    • Perhaps also (just slightly related), help fix vis to work with LuaJIT, particularly to make vis-lspc work.


    Liz - Prompt autocomplete by ftorchia

    Description

    Liz is the Rancher AI assistant for cluster operations.

    Goals

    We want to help users when sending new messages to Liz, by adding an autocomplete feature to complete their requests based on the context.

    Example:

    • User prompt: "Can you show me the list of p"
    • Autocomplete suggestion: "Can you show me the list of p...od in local cluster?"

    Example:

    • User prompt: "Show me the logs of #rancher-"
    • Chat console: It shows a drop-down widget, next to the # character, with the list of available pod names starting with "rancher-".

    Technical Overview

    1. The AI agent should expose a new ws/autocomplete endpoint to proxy autocomplete messages to the LLM (roughly sketched after this list).
    2. The UI extension should be able to display prompt suggestions and allow users to apply the autocomplete to the Prompt via keyboard shortcuts.
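
    For illustration, item 1 could take roughly the following shape, assuming a Python agent built on FastAPI; only the ws/autocomplete path comes from the description, everything else is hypothetical:

        from fastapi import FastAPI, WebSocket, WebSocketDisconnect

        app = FastAPI()

        async def complete_with_llm(partial: str) -> str:
            # Placeholder: a real agent would forward `partial` to the backing LLM
            # with an instruction to continue the user's request.
            return ""

        @app.websocket("/ws/autocomplete")
        async def autocomplete(ws: WebSocket):
            await ws.accept()
            try:
                while True:
                    partial = await ws.receive_text()  # the prompt typed so far
                    await ws.send_text(await complete_with_llm(partial))
            except WebSocketDisconnect:
                pass  # client went away; nothing to clean up in this sketch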

    Resources

    GitHub repository


    Move Uyuni Test Framework from Selenium to Playwright + AI by oscar-barrios

    Description

    This project aims to migrate the existing Uyuni Test Framework from Selenium to Playwright. The move will improve the stability, speed, and maintainability of our end-to-end tests by leveraging Playwright's modern features. We'll be rewriting the current Selenium code in Ruby to Playwright code in TypeScript, which includes updating the test framework runner, step definitions, and configurations. This is also necessary because we're moving from Cucumber Ruby to CucumberJS.

    If you're still curious about the AI in the title, it was just a way to grab your attention. Thanks for your understanding.

    Nah, let's be honest: AI helped a lot to vibe code a good part of the Ruby methods of the test framework, moving them to TypeScript, along with the migration from Capybara to Playwright. I've been using "Cline" as a plugin for the WebStorm IDE, with the Gemini API behind it.


    Goals

    • Migrate Core tests including Onboarding of clients
    • Improve test reliability: measure and confirm a significant reduction in flakiness.
    • Implement a robust framework: establish a well-structured, reusable Playwright test framework using CucumberJS

    Resources