"salt-toaster" allows you to test multiple Salt package flavors across different operating systems via Docker containers. This project is heavily used on the SUSE Manager team to hardening the Salt package that is shipped on the openSUSE/SLE distributions. Link to GitHub repository

The "salt-toaster" execution is divided on different steps (image building, container spinning, salt key acceptance, tests execution, etc) but currently we only get the global results for the entire testsuite execution.

This hackweek projects wants to gather the timing profile of each execution step of the "salt-toaster" in order to export them to Prometheus (node_exporter) and vizualise them on Grafana.

Steps to follow:

  • Evaluate implementation alternatives. (accumulated value like CPU)
  • Implement timing profile inside "salt-toaster". The profile is saved in a json file collected by Prometheus "node_exporter".
  • Visualize the data, rate, trends, on Grafana.

UPDATE July 11. 2018: Goal achieved! add-emoji Exporting profile and metrics from salt-toaster to Prometheus: https://github.com/openSUSE/salt-toaster/pull/59

Looking for hackers with the skills:

python salt prometheus grafana testing saltstack

This project is part of:

Hack Week 17

Activity

  • over 7 years ago: mbologna liked this project.
  • over 7 years ago: dmaiocchi joined this project.
  • over 7 years ago: dmaiocchi liked this project.
  • over 7 years ago: PSuarezHernandez added keyword "salt" to this project.
  • over 7 years ago: PSuarezHernandez added keyword "prometheus" to this project.
  • over 7 years ago: PSuarezHernandez added keyword "grafana" to this project.
  • over 7 years ago: PSuarezHernandez added keyword "testing" to this project.
  • over 7 years ago: PSuarezHernandez added keyword "saltstack" to this project.
  • over 7 years ago: PSuarezHernandez added keyword "python" to this project.
  • over 7 years ago: PSuarezHernandez started this project.
  • over 7 years ago: PSuarezHernandez originated this project.

  • Comments

    • dmaiocchi
      over 7 years ago by dmaiocchi | Reply

      @PSuarezHernandez i would like to help add-emoji .

      We could create a separate github repo called "salt-toaster-metrics", and starting from there we can cordinate.

      I will do also my hackweek on elixir but i would like to help on this also. If we have github Repo we can create issue and dashboards for cordination.

      If we want at the end to push it back to salt-toaster this can be easy.

      What do you think? add-emoji

    Similar Projects

    Enhance git-sha-verify: A tool to checkout validated git hashes by gpathak

    Description

    git-sha-verify is a simple shell utility to verify and checkout trusted git commits signed using GPG key. This tool helps ensure that only authorized or validated commit hashes are checked out from a git repository, supporting better code integrity and security within the workflow.

    Supports:

    • Verifying commit authenticity signed using gpg key
    • Checking out trusted commits

    Ideal for teams and projects where the integrity of git history is crucial.

    Goals

    A minimal python code of the shell script exists as a pull request.

    The goal of this hackweek is to:

    • Add more unit tests
    • Make the python code modular
    • Add code coverage if possible

    Resources


    Improve chore and screen time doc generator script `wochenplaner` by gniebler

    Description

    I wrote a little Python script to generate PDF docs, which can be used to track daily chore completion and screen time usage for several people, with one page per person/week.

    I named this script wochenplaner and have been using it for a few months now.

    It needs some improvements and adjustments in how the screen time should be tracked and how chores are displayed.

    Goals

    • Fix chore field separation lines
    • Change screen time tracking logic from "global" (week-long) to daily subtraction and weekly addition of remainders (more intuitive than current "weekly time budget method)
    • Add logic to fill in chore fields/lines, ideally with pictures, falling back to text.

    Resources

    tbd (Gitlab repo)


    Update M2Crypto by mcepl

    There are couple of projects I work on, which need my attention and putting them to shape:

    Goal for this Hackweek

    • Put M2Crypto into better shape (most issues closed, all pull requests processed)
    • More fun to learn jujutsu
    • Play more with Gemini, how much it help (or not).
    • Perhaps, also (just slightly related), help to fix vis to work with LuaJIT, particularly to make vis-lspc working.


    Improve/rework household chore tracker `chorazon` by gniebler

    Description

    I wrote a household chore tracker named chorazon, which is meant to be deployed as a web application in the household's local network.

    It features the ability to set up different (so far only weekly) schedules per task and per person, where tasks may span several days.

    There are "tokens", which can be collected by users. Tasks can (and usually will) have rewards configured where they yield a certain amount of tokens. The idea is that they can later be redeemed for (surprise) gifts, but this is not implemented yet. (So right now one needs to edit the DB manually to subtract tokens when they're redeemed.)

    Days are not rolled over automatically, to allow for task completion control.

    We used it in my household for several months, with mixed success. There are many limitations in the system that would warrant a revisit.

    It's written using the Pyramid Python framework with URL traversal, ZODB as the data store and Web Components for the frontend.

    Goals

    • Add admin screens for users, tasks and schedules
    • Add models, pages etc. to allow redeeming tokens for gifts/surprises
    • …?

    Resources

    tbd (Gitlab repo)


    Bring to Cockpit + System Roles capabilities from YAST by miguelpc

    Bring to Cockpit + System Roles features from YAST

    Cockpit and System Roles have been added to SLES 16 There are several capabilities in YAST that are not yet present in Cockpit and System Roles We will follow the principle of "automate first, UI later" being System Roles the automation component and Cockpit the UI one.

    Goals

    The idea is to implement service configuration in System Roles and then add an UI to manage these in Cockpit. For some capabilities it will be required to have an specific Cockpit Module as they will interact with a reasource already configured.

    Resources

    A plan on capabilities missing and suggested implementation is available here: https://docs.google.com/spreadsheets/d/1ZhX-Ip9MKJNeKSYV3bSZG4Qc5giuY7XSV0U61Ecu9lo/edit

    Linux System Roles: https://linux-system-roles.github.io/


    Flaky Tests AI Finder for Uyuni and MLM Test Suites by oscar-barrios

    Description

    Our current Grafana dashboards provide a great overview of test suite health, including a panel for "Top failed tests." However, identifying which of these failures are due to legitimate bugs versus intermittent "flaky tests" is a manual, time-consuming process. These flaky tests erode trust in our test suites and slow down development.

    This project aims to build a simple but powerful Python script that automates flaky test detection. The script will directly query our Prometheus instance for the historical data of each failed test, using the jenkins_build_test_case_failure_age metric. It will then format this data and send it to the Gemini API with a carefully crafted prompt, asking it to identify which tests show a flaky pattern.

    The final output will be a clean JSON list of the most probable flaky tests, which can then be used to populate a new "Top Flaky Tests" panel in our existing Grafana test suite dashboard.

    Goals

    By the end of Hack Week, we aim to have a single, working Python script that:

    1. Connects to Prometheus and executes a query to fetch detailed test failure history.
    2. Processes the raw data into a format suitable for the Gemini API.
    3. Successfully calls the Gemini API with the data and a clear prompt.
    4. Parses the AI's response to extract a simple list of flaky tests.
    5. Saves the list to a JSON file that can be displayed in Grafana.
    6. New panel in our Dashboard listing the Flaky tests

    Resources


    Flaky Tests AI Finder for Uyuni and MLM Test Suites by oscar-barrios

    Description

    Our current Grafana dashboards provide a great overview of test suite health, including a panel for "Top failed tests." However, identifying which of these failures are due to legitimate bugs versus intermittent "flaky tests" is a manual, time-consuming process. These flaky tests erode trust in our test suites and slow down development.

    This project aims to build a simple but powerful Python script that automates flaky test detection. The script will directly query our Prometheus instance for the historical data of each failed test, using the jenkins_build_test_case_failure_age metric. It will then format this data and send it to the Gemini API with a carefully crafted prompt, asking it to identify which tests show a flaky pattern.

    The final output will be a clean JSON list of the most probable flaky tests, which can then be used to populate a new "Top Flaky Tests" panel in our existing Grafana test suite dashboard.

    Goals

    By the end of Hack Week, we aim to have a single, working Python script that:

    1. Connects to Prometheus and executes a query to fetch detailed test failure history.
    2. Processes the raw data into a format suitable for the Gemini API.
    3. Successfully calls the Gemini API with the data and a clear prompt.
    4. Parses the AI's response to extract a simple list of flaky tests.
    5. Saves the list to a JSON file that can be displayed in Grafana.
    6. New panel in our Dashboard listing the Flaky tests

    Resources


    Uyuni Health-check Grafana Troubleshooter by ygutierrez

    Description

    This project explores the feasibility of using the open-source Grafana LLM plugin to enhance the Uyuni Health-check tool with LLM capabilities. The idea is to integrate a chat-based "AI Troubleshooter" directly into existing dashboards, allowing users to ask natural-language questions about errors, anomalies, or performance issues.

    Goals

    • Investigate if and how the grafana-llm-app plug-in can be used within the Uyuni Health-check tool.
    • Investigate if this plug-in can be used to query LLMs for troubleshooting scenarios.
    • Evaluate support for local LLMs and external APIs through the plugin.

    Resources

    Grafana LMM plug-in

    Uyuni Health-check