Project Description

A supportconfig provides a lot of files and data from the system, but it is often hard to spot the real issue in it. The idea of this project is to get machine-readable output for the supportconfig data and analyze them.

Then we would try to provide hints using the tool about what is wrong.

The name of this tool is: uyuni-health-check.

GitHub repository: https://github.com/uyuni-project/poc-uyuni-health-check

Summary:

  • Research about machine learning log anomaly detectors: few alternatives out there.
  • Getting custom metrics for Salt and Uyuni via prometheus exporter from live server.
  • Setting up Loki to process relevant Uyuni logs from live server.
  • Allow data visualization with Grafana.
  • Really easy-to-use CLI tool to run "health checks" and get feedback.

Details:

  • Grafana, Loki, Uyuni prometheus exporter and all other components run on "containers"
  • The containers run on the Uyuni server. "podman" is required on the server.
  • CLI tool takes care of building and deploying the "container" image to the server, collect the metrics and provide output on the command line.
  • Prometheus / Grafana expose containers metrics.

Goals for Hackweek #23

  • Enhance and collect more Uyuni / Salt metrics.
  • Use "supportconfig" as source for logs/metrics instead of live server.

Achievements during HW #23

  • ...

Goals for Hackweek #22

  • Improve CLI and performance.
  • Fix memory leak on "uyuni-health-exporter".
  • Complete automated deployment of Loki and other containers.

Achievements during HW #22:

  • Fix memory leak on uyuni-health-exporter.
  • Fix python packaging and installation.
  • Deploy grafana and prometheus dashboard.
  • Fix loki and promtail deployments.
  • Run all containers in the same POD.
  • Unify console logging across deployment functions.
  • More friendly CLI with new functions.
  • Containers are not wiped by default after executions.
  • Minor and cosmetic changes.
  • Update README.md to reflect latest changes

Goals for this Hackweek #21

  • Getting a machine readable version of supportconfig
  • First analysis and tweaking

This project is part of:

Hack Week 21 Hack Week 22 Hack Week 23

Activity

  • about 1 year ago: pinvernizzi liked this project.
  • about 1 year ago: oscar-barrios liked this project.
  • about 1 year ago: juliogonzalezgil liked this project.
  • over 1 year ago: emendonca liked this project.
  • over 2 years ago: cbosdonnat added keyword "uyuni" to this project.
  • over 2 years ago: cbosdonnat added keyword "susemanager" to this project.
  • over 2 years ago: cbosdonnat added keyword "monitoring" to this project.
  • over 2 years ago: cbosdonnat added keyword "grafana" to this project.
  • over 2 years ago: cbosdonnat added keyword "loki" to this project.
  • over 2 years ago: cbosdonnat added keyword "prometheus" to this project.
  • over 2 years ago: cbosdonnat added keyword "python3" to this project.
  • over 2 years ago: rangelino liked this project.
  • over 2 years ago: ygutierrez liked this project.
  • over 2 years ago: cbbayburt liked this project.
  • over 2 years ago: j_renner liked this project.
  • over 2 years ago: mbussolotto liked this project.
  • over 2 years ago: firoyang liked this project.
  • over 2 years ago: PSuarezHernandez joined this project.
  • over 2 years ago: PSuarezHernandez liked this project.
  • over 2 years ago: cbosdonnat started this project.
  • over 2 years ago: cbosdonnat added keyword "supportconfig" to this project.
  • over 2 years ago: cbosdonnat added keyword "analysis" to this project.
  • over 2 years ago: cbosdonnat added keyword "tool" to this project.
  • over 2 years ago: cbosdonnat added keyword "dashboard" to this project.
  • over 2 years ago: cbosdonnat originated this project.

  • Comments

    • PSuarezHernandez
      almost 2 years ago by PSuarezHernandez | Reply

      I've updated project description to reflect latest changes after Hackweek 22!

    • PSuarezHernandez
      over 1 year ago by PSuarezHernandez | Reply

      Let's keep hacking on this project during upcoming Hackweek 23!

    Similar Projects

    suse-rancher-supportconfig by eminguez

    Description

    Update: Live at https://github.com/e-minguez/suse-rancher-supportconfig I finally didn't used golang but used gum instead add-emoji

    SUSE's supportconfig support tool collects data from the SUSE Operating system. Rancher's rancher2_logs_collector.sh support tool does the same for RKE2/K3s.

    Wouldn't be nice to have a way to run both and collect all data for SUSE based RKE2/K3s clusters? Wouldn't be even better with a fancy TUI tool like bubbletea?

    Ideally the output should be an html page where you can see the logs/data directly from the browser.

    Goals

    • Familiarize myself with both supportconfig and rancher2_logs_collector.sh tools
    • Refresh my golang knowledge
    • Have something that works at the end of the hackweek ("works" may vary add-emoji )
    • Be better in naming things

    Resources

    All links provided above as well as huh


    Jenny Static Site Generator by adam.pickering

    Description

    For my personal site I have been using hugo. It works, but I am not satisfied: every time I want to make a change (which is infrequently) I have to read through the documentation again to understand how hugo works. I don't find the documentation easy to use, and the structure of the repository that hugo requires is unintuitive/more complex than what I need. So, I have decided to write my own simple static site generator in Go. It is named Jenny, after my wife.

    Goals

    • Pages can be written in markdown (which is automatically converted to HTML), but other file types are also allowed
    • Easy to understand and use
      • Intuitive, simple design
      • Clear documentation
      • Hot reloading
      • Binaries provided for download
    • Future maintenance is easy
      • Automated releases

    Resources

    https://github.com/adamkpickering/jenny


    Saline (state deployment control and monitoring tool for SUSE Manager/Uyuni) by vizhestkov

    Project Description

    Saline is an addition for salt used in SUSE Manager/Uyuni aimed to provide better control and visibility for states deploymend in the large scale environments.

    In current state the published version can be used only as a Prometheus exporter and missing some of the key features implemented in PoC (not published). Now it can provide metrics related to salt events and state apply process on the minions. But there is no control on this process implemented yet.

    Continue with implementation of the missing features and improve the existing implementation:

    • authentication (need to decide how it should be/or not related to salt auth)

    • web service providing the control of states deployment

    Goal for this Hackweek

    • Implement missing key features

    • Implement the tool for state deployment control with CLI

    Resources

    https://github.com/openSUSE/saline


    Update my own python audio and video time-lapse and motion capture apps and publish by dmair

    Project Description

    Many years ago, in my own time, I wrote a Qt python application to periodically capture frames from a V4L2 video device (e.g. a webcam) and used it to create daily weather timelapse videos from windows at my home. I have maintained it at home in my own time and this year have added motion detection making it a functional video security tool but with no guarantees. I also wrote a linux audio monitoring app in python using Qt in my own time that captures live signal strength along with 24 hour history of audio signal level/range and audio spectrum. I recently added background noise filtering to the app. In due course I aim to include voice detection, currently I'm assuming via Google's public audio interface. Neither of these is a professional home security app but between them they permit a user to freely monitor video and audio data from a home in a manageable way. Both projects are on github but out-of-date with personal work, I would like to organize and update the github versions of these projects.

    Goal for this Hackweek

    It would probably help to migrate all the v4l2py module based video code to linuxpy.video based code and that looks like a re-write of large areas of the video code. It would also be good to remove a lot of python lint that is several years old to improve the projects with the main goal being to push the recent changes with better organized code to github. If there is enough time I'd like to take the in-line Qt QSettings persistent state code used per-app and write a python class that encapsulates the Qt QSettings class in a value_of(name)/name=value manner for shared use in projects so that persistent state can be accessed read or write anywhere within the apps using a simple interface.

    Resources

    I'm not specifically looking for help but welcome other input.


    Enhance UV openQA helper script by mdonis

    Description

    A couple months ago an UV openQA helper script was created to help/automate the searching phase inside openQA for a given MU to test. The script searches inside all our openQA job groups (qam-sle) related with a given MU and generates an output suitable to add (copy & paste) inside the update log.

    This is still a WIP and could use some enhancements.

    Goals

    • Move script from bash to python: this would be useful in case we want to include this into MTUI in the future. The script will be separate from MTUI for now. The idea is to have this as a CLI tool using the click library or something similar.
    • Add option to look for jobs in other sections inside aggregated updates: right now, when looking for regression tests under aggregated updates for a given MU, the script only looks inside the Core MU job group. This is where most of the regression tests we need are located, but some MUs have their regression tests under the YaST/Containers/Security MU job groups. We should keep the Core MU group as a default, but add an option to be able to look into other job groups under aggregated updates.
    • Remove the -a option: this option is used to indicate the update ID and is mandatory right now. This is a bit weird and goes against posix stardards. It was developed this way in order to avoid using positional parameters. This problem should be fixed if we move the script to python.

    Some other ideas to consider:

    • Look into the QAM dashboard API. This has more info on each MU, could use this to link general openQA build results, whether the related RR is approved or not, etc
    • Make it easier to see if there's regression tests for a package in an openQA test build. Check if there's a possibility to search for tests that have the package name in them inside each testsuite.
    • Unit testing?

    More ideas TBD

    Resources

    https://github.com/os-autoinst/scripts/blob/master/openqa-search-maintenance-core-jobs

    https://confluence.suse.com/display/maintenanceqa/Guide+on+how+to+test+Updates

    Post-Hackweek update

    All major features were implemented. Unit tests are still in progress, and project will be moved to the SUSE github org once everything's done. https://github.com/mjdonis/oqa-search


    Selenium with Python by xguo

    Description

    Try to create test case about Selenium base on Python

    Goals

    • Knowledge about Selenium with Python
    • Create new test case about Selenium

    Resources

    https://selenium-python.readthedocs.io/ https://www.selenium.dev/


    Small healthcheck tool for Longhorn by mbrookhuis

    Project Description

    We have often problems (e.g. pods not starting) that are related to PVCs not running, cluster (nodes) not all up or deployments not running or completely running. This all prevents administration activities. Having something that can regular be run to validate the status of the cluster would be helpful, and not as of today do a lot of manual tasks.

    As addition (read enough time), we could add changing reservation, adding new disks, etc. --> This didn't made it. But the scripts can easily be adopted.

    This tool would decrease troubleshooting time, giving admins rights to the rancher GUI and could be used in automation.

    Goal for this Hackweek

    At the end we should have a small python tool that is doing a (very) basic health check on nodes, deployments and PVCs. First attempt was to make it in golang, but that was taking to much time.

    Overview

    This tool will run a simple healthcheck on a kubernetes cluster. It will perform the following actions:

    • node check: This will check all nodes, and display the status and the k3s version. If the status of the nodes is not "Ready" (this should be only reported), the cluster will be reported as having problems

    • deployment check: This check will list all deployments, and display the number of expected replicas and the used replica. If there are unused replicas this will be displayed. The cluster will be reported as having problems.

    • pvc check: This check will list of all pvc's, and display the status and the robustness. If the robustness is not "Healthy", the cluster will be reported as having problems.

    If there is a problem registered in the checks, there will be a warning that the cluster is not healthy and the program will exit with 1.

    The script has 1 mandatory parameter and that is the kubeconf of the cluster or of a node off the cluster.

    The code is writen for Python 3.11, but will also work on 3.6 (the default with SLES15.x). There is a venv present that will contain all needed packages. Also, the script can be run on the cluster itself or any other linux server.

    Installation

    To install this project, perform the following steps:

    • Create the directory /opt/k8s-check

    mkdir /opt/k8s-check

    • Copy all the file to this directory and make the following changes:

    chmod +x k8s-check.py


    Symbol Relations by hli

    Description

    There are tools to build function call graphs based on parsing source code, for example, cscope.

    This project aims to achieve a similar goal by directly parsing the disasembly (i.e. objdump) of a compiled binary. The assembly code is what the CPU sees, therefore more "direct". This may be useful in certain scenarios, such as gdb/crash debugging.

    Detailed description and Demos can be found in the README file:

    Supports x86 for now (because my customers only use x86 machines), but support for other architectures can be added easily.

    Tested with python3.6

    Goals

    Any comments are welcome.

    Resources

    https://github.com/lhb-cafe/SymbolRelations

    symrellib.py: mplements the symbol relation graph and the disassembly parser

    symrel_tracer*.py: implements tracing (-t option)

    symrel.py: "cli parser"


    Uyuni developer-centric documentation by deneb_alpha

    Description

    While we currently have extensive documentation on user-oriented tasks such as adding minions, patching, fine-tuning, etc, there is a notable gap when it comes to centralizing and documenting core functionalities for developers.

    The number of functionalities and side tools we have in Uyuni can be overwhelming. It would be nice to have a centralized place with descriptive list of main/core functionalities.

    Goals

    Create, aggregate and review on the Uyuni wiki a set of resources, focused on developers, that include also some known common problems/troubleshooting.

    The documentation will be helpful not only for everyone who is trying to learn the functionalities with all their inner processes like newcomer developers or community enthusiasts, but also for anyone who need a refresh.

    Resources

    The resources are currently aggregated here: https://github.com/uyuni-project/uyuni/wiki


    Automated Test Report reviewer by oscar-barrios

    Description

    In SUMA/Uyuni team we spend a lot of time reviewing test reports, analyzing each of the test cases failing, checking if the test is a flaky test, checking logs, etc.

    Goals

    Speed up the review by automating some parts through AI, in a way that we can consume some summary of that report that could be meaningful for the reviewer.

    Resources

    No idea about the resources yet, but we will make use of:

    • HTML/JSON Report (text + screenshots)
    • The Test Suite Status GithHub board (via API)
    • The environment tested (via SSH)
    • The test framework code (via files)


    Edge Image Builder and mkosi for Uyuni by oholecek

    Description

    One part of Uyuni system management tool is ability to build custom images. Currently Uyuni supports only Kiwi image builder.

    Kiwi however is not the only image building system out there and with the goal to also become familiar with other systems, this projects aim to add support for Edge Image builder and systemd's mkosi systems.

    Goals

    Uyuni is able to

    • provision EIB and mkosi build hosts
    • build EIB and mkosi images and store them

    Resources

    • Uyuni - https://github.com/uyuni-project/uyuni
    • Edge Image builder - https://github.com/suse-edge/edge-image-builder
    • mkosi - https://github.com/systemd/mkosi


    Saline (state deployment control and monitoring tool for SUSE Manager/Uyuni) by vizhestkov

    Project Description

    Saline is an addition for salt used in SUSE Manager/Uyuni aimed to provide better control and visibility for states deploymend in the large scale environments.

    In current state the published version can be used only as a Prometheus exporter and missing some of the key features implemented in PoC (not published). Now it can provide metrics related to salt events and state apply process on the minions. But there is no control on this process implemented yet.

    Continue with implementation of the missing features and improve the existing implementation:

    • authentication (need to decide how it should be/or not related to salt auth)

    • web service providing the control of states deployment

    Goal for this Hackweek

    • Implement missing key features

    • Implement the tool for state deployment control with CLI

    Resources

    https://github.com/openSUSE/saline


    Improve Development Environment on Uyuni by mbussolotto

    Description

    Currently create a dev environment on Uyuni might be complicated. The steps are:

    • add the correct repo
    • download packages
    • configure your IDE (checkstyle, format rules, sonarlint....)
    • setup debug environment
    • ...

    The current doc can be improved: some information are hard to be find out, some others are completely missing.

    Dev Container might solve this situation.

    Goals

    Uyuni development in no time:

    • using VSCode:
      • setting.json should contains all settings (for all languages in Uyuni, with all checkstyle rules etc...)
      • dev container should contains all dependencies
      • setup debug environment
    • implement a GitHub Workspace solution
    • re-write documentation

    Lots of pieces are already implemented: we need to connect them in a consistent solution.

    Resources

    • https://github.com/uyuni-project/uyuni/wiki


    Improve Development Environment on Uyuni by mbussolotto

    Description

    Currently create a dev environment on Uyuni might be complicated. The steps are:

    • add the correct repo
    • download packages
    • configure your IDE (checkstyle, format rules, sonarlint....)
    • setup debug environment
    • ...

    The current doc can be improved: some information are hard to be find out, some others are completely missing.

    Dev Container might solve this situation.

    Goals

    Uyuni development in no time:

    • using VSCode:
      • setting.json should contains all settings (for all languages in Uyuni, with all checkstyle rules etc...)
      • dev container should contains all dependencies
      • setup debug environment
    • implement a GitHub Workspace solution
    • re-write documentation

    Lots of pieces are already implemented: we need to connect them in a consistent solution.

    Resources

    • https://github.com/uyuni-project/uyuni/wiki


    Saline (state deployment control and monitoring tool for SUSE Manager/Uyuni) by vizhestkov

    Project Description

    Saline is an addition for salt used in SUSE Manager/Uyuni aimed to provide better control and visibility for states deploymend in the large scale environments.

    In current state the published version can be used only as a Prometheus exporter and missing some of the key features implemented in PoC (not published). Now it can provide metrics related to salt events and state apply process on the minions. But there is no control on this process implemented yet.

    Continue with implementation of the missing features and improve the existing implementation:

    • authentication (need to decide how it should be/or not related to salt auth)

    • web service providing the control of states deployment

    Goal for this Hackweek

    • Implement missing key features

    • Implement the tool for state deployment control with CLI

    Resources

    https://github.com/openSUSE/saline


    Create SUSE Manager users from ldap/ad groups by mbrookhuis

    Description

    This tool is used to create users in SUSE Manager Server based on LDAP/AD groups. For each LDAP/AD group a role within SUSE Manager Server is defined. Also, the tool will check if existing users still have the role they should have, and, if not, it will be corrected. The same for if a user is disabled, it will be enabled again. If a users is not present in the LDAP/AD groups anymore, it will be disabled or deleted, depending on the configuration.

    The code is written for Python 3.6 (the default with SLES15.x), but will also work with newer versions. And works against SUSE Manger 4.3 and 5.x

    Goals

    Create a python and/or golang utility that will manage users in SUSE Manager based on LDAP/AD group-membership. In a configuration file is defined which roles the members of a group will get.

    Table of contents

    Installation

    To install this project, perform the following steps:

    • Be sure that python 3.6 is installed and also the module python3-PyYAML. Also the ldap3 module is needed:

    bash zypper in python3 python3-PyYAML pip install yaml

    • On the server or PC, where it should run, create a directory. On linux, e.g. /opt/sm-ldap-users

    • Copy all the file to this directory.

    • Edit the configsm.yaml. All parameters should be entered. Tip: for the ldap information, the best would be to use the same as for SSSD.

    • Be sure that the file sm-ldap-users.py is executable. It would be good to change the owner to root:root and only root can read and execute:

    bash chmod 600 * chmod 700 sm-ldap-users.py chown root:root *

    Usage

    This is very simple. Once the configsm.yaml contains the correct information, executing the following will do the magic:

    bash /sm-ldap-users.py

    repository link

    https://github.com/mbrookhuis/sm-ldap-users


    Testing and adding GNU/Linux distributions on Uyuni by juliogonzalezgil

    Join the Gitter channel! https://gitter.im/uyuni-project/hackweek

    Uyuni is a configuration and infrastructure management tool that saves you time and headaches when you have to manage and update tens, hundreds or even thousands of machines. It also manages configuration, can run audits, build image containers, monitor and much more!

    Currently there are a few distributions that are completely untested on Uyuni or SUSE Manager (AFAIK) or just not tested since a long time, and could be interesting knowing how hard would be working with them and, if possible, fix whatever is broken.

    For newcomers, the easiest distributions are those based on DEB or RPM packages. Distributions with other package formats are doable, but will require adapting the Python and Java code to be able to sync and analyze such packages (and if salt does not support those packages, it will need changes as well). So if you want a distribution with other packages, make sure you are comfortable handling such changes.

    No developer experience? No worries! We had non-developers contributors in the past, and we are ready to help as long as you are willing to learn. If you don't want to code at all, you can also help us preparing the documentation after someone else has the initial code ready, or you could also help with testing :-)

    The idea is testing Salt and Salt-ssh clients, but NOT traditional clients, which are deprecated.

    To consider that a distribution has basic support, we should cover at least (points 3-6 are to be tested for both salt minions and salt ssh minions):

    1. Reposync (this will require using spacewalk-common-channels and adding channels to the .ini file)
    2. Onboarding (salt minion from UI, salt minion from bootstrap scritp, and salt-ssh minion) (this will probably require adding OS to the bootstrap repository creator)
    3. Package management (install, remove, update...)
    4. Patching
    5. Applying any basic salt state (including a formula)
    6. Salt remote commands
    7. Bonus point: Java part for product identification, and monitoring enablement
    8. Bonus point: sumaform enablement (https://github.com/uyuni-project/sumaform)
    9. Bonus point: Documentation (https://github.com/uyuni-project/uyuni-docs)
    10. Bonus point: testsuite enablement (https://github.com/uyuni-project/uyuni/tree/master/testsuite)

    If something is breaking: we can try to fix it, but the main idea is research how supported it is right now. Beyond that it's up to each project member how much to hack :-)

    • If you don't have knowledge about some of the steps: ask the team
    • If you still don't know what to do: switch to another distribution and keep testing.

    This card is for EVERYONE, not just developers. Seriously! We had people from other teams helping that were not developers, and added support for Debian and new SUSE Linux Enterprise and openSUSE Leap versions :-)

    Pending

    FUSS

    FUSS is a complete GNU/Linux solution (server, client and desktop/standalone) based on Debian for managing an educational network.

    https://fuss.bz.it/

    Seems to be a Debian 12 derivative, so adding it could be quite easy.

    • [W] Reposync (this will require using spacewalk-common-channels and adding channels to the .ini file)
    • [W] Onboarding (salt minion from UI, salt minion from bootstrap script, and salt-ssh minion) (this will probably require adding OS to the bootstrap repository creator) --> Working for all 3 options (salt minion UI, salt minion bootstrap script and salt-ssh minion from the UI).
    • [W] Package management (install, remove, update...) --> Installing a new package works, needs to test the rest.
    • [I] Patching (if patch information is available, could require writing some code to parse it, but IIRC we have support for Ubuntu already). No patches detected. Do we support patches for Debian at all?
    • [W] Applying any basic salt state (including a formula)
    • [W] Salt remote commands
    • [ ] Bonus point: Java part for product identification, and monitoring enablement