an invention by cbosdonnat
Project Description
A supportconfig
provides a lot of files and data from the system, but it is often hard to spot the real issue in it. The idea of this project is to get machine-readable output for the supportconfig data and analyze them.
Then we would try to provide hints using the tool about what is wrong.
The name of this tool is: uyuni-health-check
.
GitHub repository: https://github.com/uyuni-project/poc-uyuni-health-check
Summary:
- Research about machine learning log anomaly detectors: few alternatives out there.
- Getting custom metrics for Salt and Uyuni via prometheus exporter from live server.
- Setting up Loki to process relevant Uyuni logs from live server.
- Allow data visualization with Grafana.
- Really easy-to-use CLI tool to run "health checks" and get feedback.
Details:
- Grafana, Loki, Uyuni prometheus exporter and all other components run on "containers"
- The containers run on the Uyuni server. "podman" is required on the server.
- CLI tool takes care of building and deploying the "container" image to the server, collect the metrics and provide output on the command line.
- Prometheus / Grafana expose containers metrics.
Goals for Hackweek #23
- Enhance and collect more Uyuni / Salt metrics.
- Use "supportconfig" as source for logs/metrics instead of live server.
Achievements during HW #23
- ...
Goals for Hackweek #22
- Improve CLI and performance.
- Fix memory leak on "uyuni-health-exporter".
- Complete automated deployment of Loki and other containers.
Achievements during HW #22:
- Fix memory leak on uyuni-health-exporter.
- Fix python packaging and installation.
- Deploy grafana and prometheus dashboard.
- Fix loki and promtail deployments.
- Run all containers in the same POD.
- Unify console logging across deployment functions.
- More friendly CLI with new functions.
- Containers are not wiped by default after executions.
- Minor and cosmetic changes.
- Update README.md to reflect latest changes
Goals for this Hackweek #21
- Getting a machine readable version of supportconfig
- First analysis and tweaking
Looking for hackers with the skills:
supportconfig analysis tool dashboard monitoring grafana loki prometheus python3 uyuni susemanager
This project is part of:
Hack Week 21 Hack Week 22 Hack Week 23
Activity
Comments
-
almost 2 years ago by PSuarezHernandez | Reply
I've updated project description to reflect latest changes after Hackweek 22!
-
about 1 year ago by PSuarezHernandez | Reply
Let's keep hacking on this project during upcoming Hackweek 23!
Similar Projects
suse-rancher-supportconfig by eminguez
Description
Update: Live at https://github.com/e-minguez/suse-rancher-supportconfig I finally didn't used golang but used gum instead
SUSE's supportconfig
support tool collects data from the SUSE Operating system. Rancher's rancher2_logs_collector.sh
support tool does the same for RKE2/K3s.
Wouldn't be nice to have a way to run both and collect all data for SUSE based RKE2/K3s clusters? Wouldn't be even better with a fancy TUI tool like bubbletea?
Ideally the output should be an html page where you can see the logs/data directly from the browser.
Goals
- Familiarize myself with both
supportconfig
andrancher2_logs_collector.sh
tools - Refresh my golang knowledge
- Have something that works at the end of the hackweek ("works" may vary )
- Be better in naming things
Resources
All links provided above as well as huh
Jenny Static Site Generator by adam.pickering
Description
For my personal site I have been using hugo. It works, but I am not satisfied: every time I want to make a change (which is infrequently) I have to read through the documentation again to understand how hugo works. I don't find the documentation easy to use, and the structure of the repository that hugo requires is unintuitive/more complex than what I need. So, I have decided to write my own simple static site generator in Go. It is named Jenny, after my wife.
Goals
- Pages can be written in markdown (which is automatically converted to HTML), but other file types are also allowed
- Easy to understand and use
- Intuitive, simple design
- Clear documentation
- Hot reloading
- Binaries provided for download
- Future maintenance is easy
- Automated releases
Resources
https://github.com/adamkpickering/jenny
Update my own python audio and video time-lapse and motion capture apps and publish by dmair
Project Description
Many years ago, in my own time, I wrote a Qt python application to periodically capture frames from a V4L2 video device (e.g. a webcam) and used it to create daily weather timelapse videos from windows at my home. I have maintained it at home in my own time and this year have added motion detection making it a functional video security tool but with no guarantees. I also wrote a linux audio monitoring app in python using Qt in my own time that captures live signal strength along with 24 hour history of audio signal level/range and audio spectrum. I recently added background noise filtering to the app. In due course I aim to include voice detection, currently I'm assuming via Google's public audio interface. Neither of these is a professional home security app but between them they permit a user to freely monitor video and audio data from a home in a manageable way. Both projects are on github but out-of-date with personal work, I would like to organize and update the github versions of these projects.
Goal for this Hackweek
It would probably help to migrate all the v4l2py module based video code to linuxpy.video based code and that looks like a re-write of large areas of the video code. It would also be good to remove a lot of python lint that is several years old to improve the projects with the main goal being to push the recent changes with better organized code to github. If there is enough time I'd like to take the in-line Qt QSettings persistent state code used per-app and write a python class that encapsulates the Qt QSettings class in a value_of(name)/name=value manner for shared use in projects so that persistent state can be accessed read or write anywhere within the apps using a simple interface.
Resources
I'm not specifically looking for help but welcome other input.
Saline (state deployment control and monitoring tool for SUSE Manager/Uyuni) by vizhestkov
Project Description
Saline is an addition for salt used in SUSE Manager/Uyuni aimed to provide better control and visibility for states deploymend in the large scale environments.
In current state the published version can be used only as a Prometheus exporter and missing some of the key features implemented in PoC (not published). Now it can provide metrics related to salt events and state apply process on the minions. But there is no control on this process implemented yet.
Continue with implementation of the missing features and improve the existing implementation:
authentication (need to decide how it should be/or not related to salt auth)
web service providing the control of states deployment
Goal for this Hackweek
Implement missing key features
Implement the tool for state deployment control with CLI
Resources
https://github.com/openSUSE/saline
Symbol Relations by hli
Description
There are tools to build function call graphs based on parsing source code, for example, cscope
.
This project aims to achieve a similar goal by directly parsing the disasembly (i.e. objdump) of a compiled binary. The assembly code is what the CPU sees, therefore more "direct". This may be useful in certain scenarios, such as gdb/crash debugging.
Detailed description and Demos can be found in the README file:
Supports x86 for now (because my customers only use x86 machines), but support for other architectures can be added easily.
Tested with python3.6
Goals
Any comments are welcome.
Resources
https://github.com/lhb-cafe/SymbolRelations
symrellib.py: mplements the symbol relation graph and the disassembly parser
symrel_tracer*.py: implements tracing (-t option)
symrel.py: "cli parser"
Small healthcheck tool for Longhorn by mbrookhuis
Project Description
We have often problems (e.g. pods not starting) that are related to PVCs not running, cluster (nodes) not all up or deployments not running or completely running. This all prevents administration activities. Having something that can regular be run to validate the status of the cluster would be helpful, and not as of today do a lot of manual tasks.
As addition (read enough time), we could add changing reservation, adding new disks, etc. --> This didn't made it. But the scripts can easily be adopted.
This tool would decrease troubleshooting time, giving admins rights to the rancher GUI and could be used in automation.
Goal for this Hackweek
At the end we should have a small python tool that is doing a (very) basic health check on nodes, deployments and PVCs. First attempt was to make it in golang, but that was taking to much time.
Overview
This tool will run a simple healthcheck on a kubernetes cluster. It will perform the following actions:
node check: This will check all nodes, and display the status and the k3s version. If the status of the nodes is not "Ready" (this should be only reported), the cluster will be reported as having problems
deployment check: This check will list all deployments, and display the number of expected replicas and the used replica. If there are unused replicas this will be displayed. The cluster will be reported as having problems.
pvc check: This check will list of all pvc's, and display the status and the robustness. If the robustness is not "Healthy", the cluster will be reported as having problems.
If there is a problem registered in the checks, there will be a warning that the cluster is not healthy and the program will exit with 1.
The script has 1 mandatory parameter and that is the kubeconf of the cluster or of a node off the cluster.
The code is writen for Python 3.11, but will also work on 3.6 (the default with SLES15.x). There is a venv present that will contain all needed packages. Also, the script can be run on the cluster itself or any other linux server.
Installation
To install this project, perform the following steps:
- Create the directory /opt/k8s-check
mkdir /opt/k8s-check
- Copy all the file to this directory and make the following changes:
chmod +x k8s-check.py
Enhance UV openQA helper script by mdonis
Description
A couple months ago an UV openQA helper script was created to help/automate the searching phase inside openQA for a given MU to test. The script searches inside all our openQA job groups (qam-sle) related with a given MU and generates an output suitable to add (copy & paste) inside the update log.
This is still a WIP and could use some enhancements.
Goals
- Move script from bash to python: this would be useful in case we want to include this into MTUI in the future. The script will be separate from MTUI for now. The idea is to have this as a CLI tool using the click library or something similar.
- Add option to look for jobs in other sections inside aggregated updates: right now, when looking for regression tests under aggregated updates for a given MU, the script only looks inside the Core MU job group. This is where most of the regression tests we need are located, but some MUs have their regression tests under the YaST/Containers/Security MU job groups. We should keep the Core MU group as a default, but add an option to be able to look into other job groups under aggregated updates.
- Remove the
-a
option: this option is used to indicate the update ID and is mandatory right now. This is a bit weird and goes against posix stardards. It was developed this way in order to avoid using positional parameters. This problem should be fixed if we move the script to python.
Some other ideas to consider:
- Look into the QAM dashboard API. This has more info on each MU, could use this to link general openQA build results, whether the related RR is approved or not, etc
- Make it easier to see if there's regression tests for a package in an openQA test build. Check if there's a possibility to search for tests that have the package name in them inside each testsuite.
- Unit testing?
More ideas TBD
Resources
https://github.com/os-autoinst/scripts/blob/master/openqa-search-maintenance-core-jobs
https://confluence.suse.com/display/maintenanceqa/Guide+on+how+to+test+Updates
Post-Hackweek update
All major features were implemented. Unit tests are still in progress, and project will be moved to the SUSE github org once everything's done. https://github.com/mjdonis/oqa-search
Selenium with Python by xguo
Description
Try to create test case about Selenium base on Python
Goals
- Knowledge about Selenium with Python
- Create new test case about Selenium
Resources
https://selenium-python.readthedocs.io/ https://www.selenium.dev/
Saltboot ability to deploy OEM images by oholecek
Description
Saltboot is a system deployment part of Uyuni. It is the mechanism behind deploying Kiwi built system images from central Uyuni server location.
System image is when the image is only of one partition and does not contain whole disk image and deployment system has to take care of partitioning, fstab on top of integrity validation.
However systems like Aeon, SUSE Linux Enterprise Micro and similar are distributed as disk images (also so called OEM images). Saltboot currently cannot deploy these systems.
The main problem to saltboot is however that currently saltboot support is built into the image itself. This step is not desired when using OEM images.
Goals
Saltboot needs to be standalone and be able to deploy OEM images. Responsibility of saltboot would then shrink to selecting correct image, image integrity validation, deployment and boot to deployed system.
Resources
- Saltboot - https://github.com/uyuni-project/retail/tree/master
- Uyuni - https://github.com/uyuni-project/uyuni
Uyuni developer-centric documentation by deneb_alpha
Description
While we currently have extensive documentation on user-oriented tasks such as adding minions, patching, fine-tuning, etc, there is a notable gap when it comes to centralizing and documenting core functionalities for developers.
The number of functionalities and side tools we have in Uyuni can be overwhelming. It would be nice to have a centralized place with descriptive list of main/core functionalities.
Goals
Create, aggregate and review on the Uyuni wiki a set of resources, focused on developers, that include also some known common problems/troubleshooting.
The documentation will be helpful not only for everyone who is trying to learn the functionalities with all their inner processes like newcomer developers or community enthusiasts, but also for anyone who need a refresh.
Resources
The resources are currently aggregated here: https://github.com/uyuni-project/uyuni/wiki
Enable the containerized Uyuni server to run on different host OS by j_renner
Description
The Uyuni server is provided as a container, but we still require it to run on Leap Micro? This is not how people expect to use containerized applications, so it would be great if we tested other host OSs and enabled them by providing builds of necessary tools for (e.g. mgradm). Interesting candidates should be:
- openSUSE Leap
- Cent OS 7
- Ubuntu
- ???
Goals
Make it really easy for anyone to run the Uyuni containerized server on whatever OS they want (with support for containers of course).
Improve Development Environment on Uyuni by mbussolotto
Description
Currently create a dev environment on Uyuni might be complicated. The steps are:
- add the correct repo
- download packages
- configure your IDE (checkstyle, format rules, sonarlint....)
- setup debug environment
- ...
The current doc can be improved: some information are hard to be find out, some others are completely missing.
Dev Container might solve this situation.
Goals
Uyuni development in no time:
- using VSCode:
- setting.json should contains all settings (for all languages in Uyuni, with all checkstyle rules etc...)
- dev container should contains all dependencies
- setup debug environment
- implement a GitHub Workspace solution
- re-write documentation
Lots of pieces are already implemented: we need to connect them in a consistent solution.
Resources
- https://github.com/uyuni-project/uyuni/wiki
Automated Test Report reviewer by oscar-barrios
Description
In SUMA/Uyuni team we spend a lot of time reviewing test reports, analyzing each of the test cases failing, checking if the test is a flaky test, checking logs, etc.
Goals
Speed up the review by automating some parts through AI, in a way that we can consume some summary of that report that could be meaningful for the reviewer.
Resources
No idea about the resources yet, but we will make use of:
- HTML/JSON Report (text + screenshots)
- The Test Suite Status GithHub board (via API)
- The environment tested (via SSH)
- The test framework code (via files)
Improve Development Environment on Uyuni by mbussolotto
Description
Currently create a dev environment on Uyuni might be complicated. The steps are:
- add the correct repo
- download packages
- configure your IDE (checkstyle, format rules, sonarlint....)
- setup debug environment
- ...
The current doc can be improved: some information are hard to be find out, some others are completely missing.
Dev Container might solve this situation.
Goals
Uyuni development in no time:
- using VSCode:
- setting.json should contains all settings (for all languages in Uyuni, with all checkstyle rules etc...)
- dev container should contains all dependencies
- setup debug environment
- implement a GitHub Workspace solution
- re-write documentation
Lots of pieces are already implemented: we need to connect them in a consistent solution.
Resources
- https://github.com/uyuni-project/uyuni/wiki
Create SUSE Manager users from ldap/ad groups by mbrookhuis
Description
This tool is used to create users in SUSE Manager Server based on LDAP/AD groups. For each LDAP/AD group a role within SUSE Manager Server is defined. Also, the tool will check if existing users still have the role they should have, and, if not, it will be corrected. The same for if a user is disabled, it will be enabled again. If a users is not present in the LDAP/AD groups anymore, it will be disabled or deleted, depending on the configuration.
The code is written for Python 3.6 (the default with SLES15.x), but will also work with newer versions. And works against SUSE Manger 4.3 and 5.x
Goals
Create a python and/or golang utility that will manage users in SUSE Manager based on LDAP/AD group-membership. In a configuration file is defined which roles the members of a group will get.
Table of contents
Installation
To install this project, perform the following steps:
- Be sure that python 3.6 is installed and also the module python3-PyYAML. Also the ldap3 module is needed:
bash
zypper in python3 python3-PyYAML
pip install yaml
On the server or PC, where it should run, create a directory. On linux, e.g. /opt/sm-ldap-users
Copy all the file to this directory.
Edit the configsm.yaml. All parameters should be entered. Tip: for the ldap information, the best would be to use the same as for SSSD.
Be sure that the file sm-ldap-users.py is executable. It would be good to change the owner to root:root and only root can read and execute:
bash
chmod 600 *
chmod 700 sm-ldap-users.py
chown root:root *
Usage
This is very simple. Once the configsm.yaml contains the correct information, executing the following will do the magic:
bash
/sm-ldap-users.py
repository link
https://github.com/mbrookhuis/sm-ldap-users
Saline (state deployment control and monitoring tool for SUSE Manager/Uyuni) by vizhestkov
Project Description
Saline is an addition for salt used in SUSE Manager/Uyuni aimed to provide better control and visibility for states deploymend in the large scale environments.
In current state the published version can be used only as a Prometheus exporter and missing some of the key features implemented in PoC (not published). Now it can provide metrics related to salt events and state apply process on the minions. But there is no control on this process implemented yet.
Continue with implementation of the missing features and improve the existing implementation:
authentication (need to decide how it should be/or not related to salt auth)
web service providing the control of states deployment
Goal for this Hackweek
Implement missing key features
Implement the tool for state deployment control with CLI
Resources
https://github.com/openSUSE/saline
Testing and adding GNU/Linux distributions on Uyuni by juliogonzalezgil
Join the Gitter channel! https://gitter.im/uyuni-project/hackweek
Uyuni is a configuration and infrastructure management tool that saves you time and headaches when you have to manage and update tens, hundreds or even thousands of machines. It also manages configuration, can run audits, build image containers, monitor and much more!
Currently there are a few distributions that are completely untested on Uyuni or SUSE Manager (AFAIK) or just not tested since a long time, and could be interesting knowing how hard would be working with them and, if possible, fix whatever is broken.
For newcomers, the easiest distributions are those based on DEB or RPM packages. Distributions with other package formats are doable, but will require adapting the Python and Java code to be able to sync and analyze such packages (and if salt does not support those packages, it will need changes as well). So if you want a distribution with other packages, make sure you are comfortable handling such changes.
No developer experience? No worries! We had non-developers contributors in the past, and we are ready to help as long as you are willing to learn. If you don't want to code at all, you can also help us preparing the documentation after someone else has the initial code ready, or you could also help with testing :-)
The idea is testing Salt and Salt-ssh clients, but NOT traditional clients, which are deprecated.
To consider that a distribution has basic support, we should cover at least (points 3-6 are to be tested for both salt minions and salt ssh minions):
- Reposync (this will require using spacewalk-common-channels and adding channels to the .ini file)
- Onboarding (salt minion from UI, salt minion from bootstrap scritp, and salt-ssh minion) (this will probably require adding OS to the bootstrap repository creator)
- Package management (install, remove, update...)
- Patching
- Applying any basic salt state (including a formula)
- Salt remote commands
- Bonus point: Java part for product identification, and monitoring enablement
- Bonus point: sumaform enablement (https://github.com/uyuni-project/sumaform)
- Bonus point: Documentation (https://github.com/uyuni-project/uyuni-docs)
- Bonus point: testsuite enablement (https://github.com/uyuni-project/uyuni/tree/master/testsuite)
If something is breaking: we can try to fix it, but the main idea is research how supported it is right now. Beyond that it's up to each project member how much to hack :-)
- If you don't have knowledge about some of the steps: ask the team
- If you still don't know what to do: switch to another distribution and keep testing.
This card is for EVERYONE, not just developers. Seriously! We had people from other teams helping that were not developers, and added support for Debian and new SUSE Linux Enterprise and openSUSE Leap versions :-)
Pending
FUSS
FUSS is a complete GNU/Linux solution (server, client and desktop/standalone) based on Debian for managing an educational network.
https://fuss.bz.it/
Seems to be a Debian 12 derivative, so adding it could be quite easy.
[W]
Reposync (this will require using spacewalk-common-channels and adding channels to the .ini file)[W]
Onboarding (salt minion from UI, salt minion from bootstrap script, and salt-ssh minion) (this will probably require adding OS to the bootstrap repository creator) --> Working for all 3 options (salt minion UI, salt minion bootstrap script and salt-ssh minion from the UI).[W]
Package management (install, remove, update...) --> Installing a new package works, needs to test the rest.[I]
Patching (if patch information is available, could require writing some code to parse it, but IIRC we have support for Ubuntu already). No patches detected. Do we support patches for Debian at all?[W]
Applying any basic salt state (including a formula)[W]
Salt remote commands[ ]
Bonus point: Java part for product identification, and monitoring enablement