SUSE Hack Week: Add engineering metrics to telegraf webhooks plugin

How it is

Currently, the telegraf webhooks plugin for GitHub produces rather basic measurements: it just extracts data from the webhook and writes it to the TSDB. You then have to calculate the interesting engineering metrics yourself.

How it should be

Given a set of input variables (a list of accounts and labels), the plugin should calculate:

cycle time (the time between a selected PR being opened and being deployed)
change failure rate (the number of PRs with a given label divided by the number of opened PRs)
change quality (the number of issues with priority labels)

How it would be nice

While doing this, think a bit more about the data models for other use cases.


Looking for hackers with the skills:

golang influxdb grafana metrics

This project is part of:

Hack Week 20

Activity

over 4 years ago: hennevogel added keyword "influxdb" to this project.

over 4 years ago: hennevogel added keyword "grafana" to this project.

over 4 years ago: hennevogel added keyword "metrics" to this project.

over 4 years ago: hennevogel removed keyword observeability from this project.

over 4 years ago: hennevogel added keyword "golang" to this project.

over 4 years ago: admehmood joined this project.

over 4 years ago: admehmood liked this project.

over 4 years ago: hennevogel added keyword "observeability" to this project.

over 4 years ago: hennevogel started this project.

over 4 years ago: hennevogel originated this project.

Comments

over 4 years ago by hennevogel | Reply

Hello World

Similar Projects

golang

Mammuthus - The NFS-Ganesha inside Kubernetes controller by vcheng

Description

As a user-space NFS provider, NFS-Ganesha is widely used by several projects, e.g. Longhorn and Rook. We want to create a Kubernetes controller that makes configuring NFS-Ganesha easy. This controller will let users configure NFS-Ganesha through different backends such as VFS and CephFS.

Goals

Create NFS-Ganesha Package on OBS: nfs-ganesha5, nfs-ganesha6
Create NFS-Ganesha Container Image on OBS: Image
Create a Kubernetes controller for NFS-Ganesha and support the VFS configuration on demand. Mammuthus

Resources

NFS-Ganesha

terraform-provider-feilong by e_bischoff

Project Description

People need to test operating systems and applications on the s390 platform.

Installation from scratch solutions include:

just deploy and provision manually (with the help of ftpboot script, if you are at SUSE)
use s3270 terminal emulation (used by openQA people?)
use LXC from IBM to start CP commands and analyze the results
use zPXE to do some PXE-alike booting (used by the orthos team?)
use tessia to install from scratch using autoyast
use libvirt for s390 to do some nested virtualization on some already deployed z/VM system
directly install a Linux kernel on a LPAR and use kvm + libvirt from there

Deployment from image solutions include:

use ICIC web interface (openstack in disguise, contributed by IBM)
use ICIC from the openstack terraform provider (used by Rancher QA)
use zvm_ansible to control SMAPI
connect directly to SMAPI low-level socket interface

IBM Cloud Infrastructure Center (ICIC) harnesses the Feilong API, but you can use Feilong without installing ICIC, provided you set up a "z/VM cloud connector" into one of your VMs following this schema.

What about writing a terraform Feilong provider, just like we have the terraform libvirt provider? That would let you call Feilong transparently from your main.tf files to deploy and destroy resources on your system/z.

Goal for Hackweek 23

My final goal is to be able to easily deploy and provision VMs automatically on a z/VM system, in a way that people might enjoy even outside of SUSE.

My technical preference is to write a terraform provider plugin, as it is the approach that involves the fewest software components for our deployments while remaining clean and compatible with our existing development infrastructure.

Goals for Hackweek 24

The Feilong provider works and is used internally by the SUSE Manager team. Let's push it forward!

Let's add support for Fibre Channel disks and multipath.

Possible goals for Hackweek 25

Modernization, maturity, and maintenance.

grafana

Flaky Tests AI Finder for Uyuni and MLM Test Suites by oscar-barrios

Description

Our current Grafana dashboards provide a great overview of test suite health, including a panel for "Top failed tests." However, identifying which of these failures are due to legitimate bugs versus intermittent "flaky tests" is a manual, time-consuming process. These flaky tests erode trust in our test suites and slow down development.

This project aims to build a simple but powerful Python script that automates flaky test detection. The script will directly query our Prometheus instance for the historical data of each failed test, using the jenkins_build_test_case_failure_age metric. It will then format this data and send it to the Gemini API with a carefully crafted prompt, asking it to identify which tests show a flaky pattern.

The final output will be a clean JSON list of the most probable flaky tests, which can then be used to populate a new "Top Flaky Tests" panel in our existing Grafana test suite dashboard.

Goals

By the end of Hack Week, we aim to have a single, working Python script that:

Connects to Prometheus and executes a query to fetch detailed test failure history.
Processes the raw data into a format suitable for the Gemini API.
Successfully calls the Gemini API with the data and a clear prompt.
Parses the AI's response to extract a simple list of flaky tests.
Saves the list to a JSON file that can be displayed in Grafana.
Feeds a new panel in our dashboard that lists the flaky tests.

Resources

Jenkins Prometheus Exporter: https://github.com/uyuni-project/jenkins-exporter/
Data Source: Our internal Prometheus server.
Key Metric: jenkins_build_test_case_failure_age{jobname, buildid, suite, case, status, failedsince}.
Existing Query for Reference: count by (suite) (max_over_time(jenkins_build_test_case_failure_age{status=~"FAILED|REGRESSION", jobname="$jobname"}[$__range])).
AI Model: The Google Gemini API.
Example about how to interact with Gemini API: https://github.com/srbarrios/FailTale/
Visualization: Our internal Grafana Dashboard.
Internal IaC: https://gitlab.suse.de/galaxy/infrastructure/-/tree/master/srv/salt/monitoring
