Following trademark and licensing issues with Grafana, explore the possibility of debranding Grafana and use that in SUSE Manager (and maybe others)
Products are available from GitLab: https://gitlab.suse.de/susana
Screencast: https://youtu.be/h_-fk9QiS6U
This project is part of:
Hack Week 18
Activity
Comments
Similar Projects
Uyuni Health-check Grafana Troubleshooter by ygutierrez
Description
This project explores the feasibility of using the open-source Grafana LLM plugin to enhance the Uyuni Health-check tool with LLM capabilities. The idea is to integrate a chat-based "AI Troubleshooter" directly into existing dashboards, allowing users to ask natural-language questions about errors, anomalies, or performance issues.
Goals
- Investigate if and how the
grafana-llm-appplug-in can be used within the Uyuni Health-check tool. - Investigate if this plug-in can be used to query LLMs for troubleshooting scenarios.
- Evaluate support for local LLMs and external APIs through the plugin.
- Evaluate if and how the Uyuni MCP server could be integrated as another source of information.
Resources
Flaky Tests AI Finder for Uyuni and MLM Test Suites by oscar-barrios
Description
Our current Grafana dashboards provide a great overview of test suite health, including a panel for "Top failed tests." However, identifying which of these failures are due to legitimate bugs versus intermittent "flaky tests" is a manual, time-consuming process. These flaky tests erode trust in our test suites and slow down development.
This project aims to build a simple but powerful Python script that automates flaky test detection. The script will directly query our Prometheus instance for the historical data of each failed test, using the jenkins_build_test_case_failure_age metric. It will then format this data and send it to the Gemini API with a carefully crafted prompt, asking it to identify which tests show a flaky pattern.
The final output will be a clean JSON list of the most probable flaky tests, which can then be used to populate a new "Top Flaky Tests" panel in our existing Grafana test suite dashboard.
Goals
By the end of Hack Week, we aim to have a single, working Python script that:
- Connects to Prometheus and executes a query to fetch detailed test failure history.
- Processes the raw data into a format suitable for the Gemini API.
- Successfully calls the Gemini API with the data and a clear prompt.
- Parses the AI's response to extract a simple list of flaky tests.
- Saves the list to a JSON file that can be displayed in Grafana.
- New panel in our Dashboard listing the Flaky tests
Resources
- Jenkins Prometheus Exporter: https://github.com/uyuni-project/jenkins-exporter/
- Data Source: Our internal Prometheus server.
- Key Metric:
jenkins_build_test_case_failure_age{jobname, buildid, suite, case, status, failedsince}. - Existing Query for Reference:
count by (suite) (max_over_time(jenkins_build_test_case_failure_age{status=~"FAILED|REGRESSION", jobname="$jobname"}[$__range])). - AI Model: The Google Gemini API.
- Example about how to interact with Gemini API: https://github.com/srbarrios/FailTale/
- Visualization: Our internal Grafana Dashboard.
- Internal IaC: https://gitlab.suse.de/galaxy/infrastructure/-/tree/master/srv/salt/monitoring
go-git: unlocking SHA256-based repository cloning ahead of git v3 by pgomes
Description
The go-git library implements the git internals in pure Go, so that any Go application can handle not only Git repositories, but also lower-level primitives (e.g. packfiles, idxfiles, etc) without needing to shell out to the git binary.
The focus for this Hackweek is to fast track key improvements for the project ahead of the upstream release of Git V3, which may take place at some point next year.
Goals
- Add support for cloning SHA256 repositories.
- Decrease memory churn for very large repositories (e.g. Linux Kernel repository).
- Cut the first alpha version for
go-git/v6.
Stretch goals
- Review and update the official documentation.
- Optimise use of go-git in Fleet.
- Create RFC/example for go-git plugins to improve extensibility.
Resources
- https://github.com/go-git/go-git/
- https://go-git.github.io/docs/
Create a Cloud-Native policy engine with notifying capabilities to optimize resource usage by gbazzotti
Description
The goal of this project is to begin the initial phase of development of an all-in-one Cloud-Native Policy Engine that notifies resource owners when their resources infringe predetermined policies. This was inspired by a current issue in the CES-SRE Team where other solutions seemed to not exactly correspond to the needs of the specific workloads running on the Public Cloud Team space.
The initial architecture can be checked out on the Repository listed under Resources.
Goals
- Create a minimal working prototype following the workflow specified on the documentation
- Provide instructions on installation/usage
- Work on email notifying capabilities
Resources
- TBD
terraform-provider-feilong by e_bischoff
Project Description
People need to test operating systems and applications on s390 platform.
Installation from scratch solutions include:
- just deploy and provision manually
(with the help of ftpbootscript, if you are at SUSE) - use
s3270terminal emulation (used byopenQApeople?) - use
LXCfrom IBM to start CP commands and analyze the results - use
zPXEto do some PXE-alike booting (used by theorthosteam?) - use
tessiato install from scratch using autoyast - use
libvirtfor s390 to do some nested virtualization on some already deployed z/VM system - directly install a Linux kernel on a LPAR and use
kvm+libvirtfrom there
Deployment from image solutions include:
- use
ICICweb interface (openstackin disguise, contributed by IBM) - use
ICICfrom theopenstackterraformprovider (used byRancherQA) - use
zvm_ansibleto controlSMAPI - connect directly to
SMAPIlow-level socket interface
IBM Cloud Infrastructure Center (ICIC) harnesses the Feilong API, but you can use Feilong without installing ICIC, provided you set up a "z/VM cloud connector" into one of your VMs following this schema.
What about writing a terraform Feilong provider, just like we have the terraform libvirt provider? That would allow to transparently call Feilong from your main.tf files to deploy and destroy resources on your system/z.
Other Feilong-based solutions include:
- make
libvirtFeilong-aware - simply call
Feilongfrom shell scripts withcurl - use
zvmconnectorclient python library from Feilong - use
zthinpart of Feilong to directly commandSMAPI.
Goal for Hackweek 23
My final goal is to be able to easily deploy and provision VMs automatically on a z/VM system, in a way that people might enjoy even outside of SUSE.
My technical preference is to write a terraform provider plugin, as it is the approach that involves the least software components for our deployments, while remaining clean, and compatible with our existing development infrastructure.
Goals for Hackweek 24
Feilong provider works and is used internally by SUSE Manager team. Let's push it forward!
Let's add support for fiberchannel disks and multipath.
Possible goals for Hackweek 25
Modernization, maturity, and maintenance.
Rewrite Distrobox in go (POC) by fabriziosestito
Description
Rewriting Distrobox in Go.
Main benefits:
- Easier to maintain and to test
- Adapter pattern for different container backends (LXC, systemd-nspawn, etc.)
Goals
- Build a minimal starting point with core commands
- Keep the CLI interface compatible: existing users shouldn't notice any difference
- Use a clean Go architecture with adapters for different container backends
- Keep dependencies minimal and binary size small
- Benchmark against the original shell script
Resources
- Upstream project: https://github.com/89luca89/distrobox/
- Distrobox site: https://distrobox.it/
- ArchWiki: https://wiki.archlinux.org/title/Distrobox
Play with the userfaultfd(2) system call and download on demand using HTTP Range Requests with Golang by rbranco
Description
The userfaultfd(2) is a cool system call to handle page faults in user-space. This should allow me to list the contents of an ISO or similar archive without downloading the whole thing. The userfaultfd(2) part can also be done in theory with the PROT_NONE mprotect + SIGSEGV trick, for complete Unix portability, though reportedly being slower.
Goals
- Create my own library for userfaultfd(2) in Golang.
- Create my own library for HTTP Range Requests.
- Complete portability with Unix.
- Benchmarks.
- Contribute some tests to LTP.
Resources
- https://docs.kernel.org/admin-guide/mm/userfaultfd.html
- https://github.com/loopholelabs/userfaultfd-go
- https://github.com/DHowett/ranger
- https://www.cons.org/cracauer/cracauer-userfaultfd.html