I once had a bad dream.

It started well, on a sunny day. I had just fixed an issue and pushed it to my fork in order to create a Pull Request. I was happy. It felt awesome to have found such an elegant fix. Two lines of code.

But then, something happened. A cloud appeared in the sky and partially hid the sun. GitHub triggered the end-to-end test suite. You could see the waiting icon, you could almost hear the engines starting... lightning and thunder broke out, the sky turned dark, and the e2e tests started on our private Jenkins server.

One hour passed... two hours... the e2e tests were still running... four hours, and not even 50% done. Finally, I had to go get some dinner and some rest, so I decided to look at it the next day.

Sleep was not good. I dreamt the tests would fail and the deadline would be missed... I couldn't sleep anymore, so I woke up early in the morning, before sunrise. The computer, still open, was lighting up the room. Outside, the storm had turned into heavy rain. And the tests had not passed! Four and a half hours after starting them, so half an hour after I had left for dinner, there was a failed test.

Finding out what went wrong took two hours... and it was not even related to the code, but to an infrastructure issue: a timeout when connecting to a mirror that was being restarted while the tests were running. Apparently we had hit a maintenance window.

Damn! So let's just "restart" the tests. This time, I decided to set an alarm for four hours later and switch to another task, just to avoid the anxiety of the previous evening, staring at the results. Four hours passed, and the tests were good. It was not raining anymore, a ray of sun was filtering through the window, hope was in the air. Four more hours, and the tests still had not failed. Off to dinner and bed again, with the alarm set for early in the morning. First thing in the morning I looked at the tests... still running, but feeling good: all tests had passed and only one final "cleanup" stage was left. Shower, breakfast, the sun was shining again!

Finally, everything was green, so let's go and push the merge button... oh oh... clouds hid the sun again... where is the merge button?? No way, there was a conflict with the code in master and I couldn't merge!!! I couldn't merge!!! Shocked, I looked into the history... John had merged a PR overnight that touched the same file... Hopeless, I started crying at my desk...

Scared?

This is not so different from what could happen if we ran the whole e2e test suite on SUSE Manager (Uyuni) Pull Requests. However, not running any e2e tests also has a bad consequence: we find the issues after the code has been merged into master, and then we spend days looking for what could have caused them, reverting commits, and starting again.

However, if we could run a subset of the e2e tests, we could shorten the time they take and thus run them at the PR level, so that no broken code could get merged.

The thing is, how do we select which tests to run?

This project is about using Machine Learning to do predictive test selection, that is, selecting which tests to run based on the history of previous test runs.

Inspired by: https://engineering.fb.com/2018/11/21/developer-tools/predictive-test-selection/
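To make the idea concrete, here is a minimal sketch of what predictive test selection could look like. Everything in it is an assumption for illustration: the features, the toy history, and the use of scikit-learn are hypothetical, not the actual Uyuni pipeline, which would build its training data from past Jenkins runs of the e2e suite.

```python
# Minimal sketch of predictive test selection (illustrative only): a
# classifier learns from past (change, test) outcomes which tests a new
# change is likely to break, and only the top-ranked tests run at PR time.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def featurize(changed_files, test):
    # Hypothetical features: change size, overlap with files the test has
    # historically failed on, and the test's recent failure rate.
    overlap = len(set(changed_files) & set(test["linked_files"]))
    return [len(changed_files), overlap, test["recent_failure_rate"]]

# Toy history of (changed_files, test, did_the_test_fail) tuples.
tests = [
    {"name": "salt_minion_e2e", "linked_files": ["salt.py"], "recent_failure_rate": 0.30},
    {"name": "ui_login_e2e", "linked_files": ["login.js"], "recent_failure_rate": 0.05},
]
history = [
    (["salt.py"], tests[0], 1),
    (["salt.py"], tests[1], 0),
    (["login.js"], tests[0], 0),
    (["login.js"], tests[1], 1),
]

X = np.array([featurize(files, t) for files, t, _ in history])
y = np.array([failed for _, _, failed in history])
model = GradientBoostingClassifier().fit(X, y)

def select_tests(changed_files, candidates, budget=1):
    """Return the `budget` tests most likely to fail for this change."""
    scores = model.predict_proba(
        np.array([featurize(changed_files, t) for t in candidates]))[:, 1]
    ranked = sorted(zip(scores, candidates), key=lambda pair: -pair[0])
    return [t["name"] for _, t in ranked[:budget]]

print(select_tests(["salt.py"], tests))  # likely ranks salt_minion_e2e first
```

The hard part is not the model but the training data: mining past runs for which changes broke which tests, and finding features that actually carry signal.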

Looking for hackers with the skills:

ml ci qa ai

This project is part of:

Hack Week 20

Activity

  • almost 5 years ago: PSuarezHernandez liked this project.
  • almost 5 years ago: llansky3 liked this project.
  • almost 5 years ago: ories liked this project.
  • almost 5 years ago: j_renner liked this project.
  • almost 5 years ago: RDiasMateus liked this project.
  • almost 5 years ago: pagarcia liked this project.
  • almost 5 years ago: jordimassaguerpla added keyword "ci" to this project.
  • almost 5 years ago: jordimassaguerpla added keyword "qa" to this project.
  • almost 5 years ago: jordimassaguerpla added keyword "ai" to this project.
  • almost 5 years ago: jordimassaguerpla added keyword "ml" to this project.
  • almost 5 years ago: jordimassaguerpla originated this project.

  • Comments

    • llansky3
      almost 5 years ago by llansky3 | Reply

      This could have some wider potential: clever picking of tests to shorten the time yet maximize the findings (and learning, over time, from test execution history to predict). I can see how that saves tons of money in cases where tests on real industrial machines need to be executed (e.g. the cost of running an engine on a test bed is $5-10k per day, so there is a huge difference between needing to run for 1 day or 3).

    Similar Projects

    Kubernetes-Based ML Lifecycle Automation by lmiranda

    Description

    This project aims to build a complete end-to-end Machine Learning pipeline running entirely on Kubernetes, using Go and containerized ML components.

    The pipeline will automate the lifecycle of a machine learning model, including:

    • Data ingestion/collection
    • Model training as a Kubernetes Job
    • Model artifact storage in an S3-compatible registry (e.g. Minio)
    • A Go-based deployment controller that automatically deploys new model versions to Kubernetes using Rancher
    • A lightweight inference service that loads and serves the latest model
    • Monitoring of model performance and service health through Prometheus/Grafana

    The outcome is a working prototype of an MLOps workflow that demonstrates how AI workloads can be trained, versioned, deployed, and monitored using the Kubernetes ecosystem.
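    As one illustration of the "model training as a Kubernetes Job" step, here is a sketch using the official Python Kubernetes client; the image, namespace, and bucket names are hypothetical placeholders, and the project's own deployment controller is planned in Go.

```python
# Sketch: submit one model-training run as a Kubernetes Job.
# Image, namespace, and env values are hypothetical placeholders.
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() when run in-cluster

job = client.V1Job(
    metadata=client.V1ObjectMeta(name="train-model-v2"),
    spec=client.V1JobSpec(
        backoff_limit=2,  # retry a failed training run at most twice
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[client.V1Container(
                    name="trainer",
                    image="registry.example.com/ml/train:latest",
                    env=[client.V1EnvVar(name="S3_BUCKET", value="models")],
                )],
            ),
        ),
    ),
)
client.BatchV1Api().create_namespaced_job(namespace="ml", body=job)
```

    The Go controller would do the analogous step for inference: watch the model registry and apply manifests for each new model version.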

    Goals

    By the end of Hack Week, the project should:

    1. Produce a fully functional ML pipeline running on Kubernetes with:

      • Data collection job
      • Training job container
      • Storage and versioning of trained models
      • Automated deployment of new model versions
      • Model inference API service
      • Basic monitoring dashboards
    2. Showcase a Go-based deployment automation component, which scans the model registry and automatically generates & applies Kubernetes manifests for new model versions.

    3. Enable continuous improvement by making the system modular and extensible (e.g., additional models, metrics, autoscaling, or drift detection can be added later).

    4. Prepare a short demo explaining the end-to-end process and how new models flow through the system.

    Resources

    Project Repository

    Updates

    1. Training pipeline and datasets
    2. Inference Service py


    issuefs: FUSE filesystem representing issues (e.g. JIRA) for the use with AI agents code-assistants by llansky3

    Description

    Creating a FUSE filesystem (issuefs) that mounts issues from various ticketing systems (GitHub, Jira, Bugzilla, Redmine) as files in your local file system.

    And why is this a good idea?

    • Users can use their favorite command-line tools to view and search tickets from various sources.
    • Users can use AI agent capabilities from their favorite IDE or CLI to ask questions about the issues, project, or functionality, with the relevant tickets provided as context at no extra effort.
    • Users can rely on it while developing new features, letting the AI agent jump-start the solution: issuefs gives the agent context about the bug or requested feature (it just reads a few more files), with no need to copy and paste issues into the prompt or to use extra MCP tools to access them. Those still work, but this approach is deliberately different.
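    A minimal read-only sketch of the idea, using the fusepy library with hardcoded issue data standing in for a real ticketing-system API:

```python
# issuefs sketch: expose issues as read-only files via FUSE (fusepy).
# The ISSUES dict stands in for data fetched from GitHub/Jira/etc.
import errno, stat, sys, time
from fuse import FUSE, FuseOSError, Operations

ISSUES = {
    "PROJ-1.md": "# PROJ-1: Login fails with SSO\nStatus: Open\n",
    "PROJ-2.md": "# PROJ-2: Mirror connection timeout\nStatus: Closed\n",
}

class IssueFS(Operations):
    def getattr(self, path, fh=None):
        now = time.time()
        if path == "/":
            return {"st_mode": stat.S_IFDIR | 0o555, "st_nlink": 2,
                    "st_ctime": now, "st_mtime": now, "st_atime": now}
        name = path.lstrip("/")
        if name in ISSUES:
            return {"st_mode": stat.S_IFREG | 0o444, "st_nlink": 1,
                    "st_size": len(ISSUES[name].encode()),
                    "st_ctime": now, "st_mtime": now, "st_atime": now}
        raise FuseOSError(errno.ENOENT)

    def readdir(self, path, fh):
        return [".", ".."] + list(ISSUES)

    def read(self, path, size, offset, fh):
        data = ISSUES[path.lstrip("/")].encode()
        return data[offset:offset + size]

if __name__ == "__main__":
    FUSE(IssueFS(), sys.argv[1], foreground=True, ro=True)  # issuefs.py /mnt/issues
```

    Once mounted, grep, less, or an AI agent can read the tickets like any other files.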

    Goals

    1. Add GitHub issue support
    2. Prove the concept by applying the approach to itself, using GitHub issues to track and develop its own new features
    3. Add support for Bugzilla and Redmine, using this approach in the process of doing it. Record a video of it.
    4. Clean up and test the implementation and create some documentation
    5. Create a blog post about this approach

    Resources

    There is a prototype implementation here. This currently sort of works with JIRA only.


    Uyuni Health-check Grafana AI Troubleshooter by ygutierrez

    Description

    This project explores the feasibility of using the open-source Grafana LLM plugin to enhance the Uyuni Health-check tool with LLM capabilities. The idea is to integrate a chat-based "AI Troubleshooter" directly into existing dashboards, allowing users to ask natural-language questions about errors, anomalies, or performance issues.

    Goals

    • Investigate if and how the grafana-llm-app plug-in can be used within the Uyuni Health-check tool.
    • Investigate if this plug-in can be used to query LLMs for troubleshooting scenarios.
    • Evaluate support for local LLMs and external APIs through the plugin.
    • Evaluate if and how the Uyuni MCP server could be integrated as another source of information.

    Resources

    Grafana LLM plug-in

    Uyuni Health-check


    Flaky Tests AI Finder for Uyuni and MLM Test Suites by oscar-barrios

    Description

    Our current Grafana dashboards provide a great overview of test suite health, including a panel for "Top failed tests." However, identifying which of these failures are due to legitimate bugs versus intermittent "flaky tests" is a manual, time-consuming process. These flaky tests erode trust in our test suites and slow down development.

    This project aims to build a simple but powerful Python script that automates flaky test detection. The script will directly query our Prometheus instance for the historical data of each failed test, using the jenkins_build_test_case_failure_age metric. It will then format this data and send it to the Gemini API with a carefully crafted prompt, asking it to identify which tests show a flaky pattern.

    The final output will be a clean JSON list of the most probable flaky tests, which can then be used to populate a new "Top Flaky Tests" panel in our existing Grafana test suite dashboard.
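    A compact sketch of what that script could look like; the Prometheus URL, the Gemini model name, and the prompt wording are assumptions:

```python
# Sketch: Prometheus failure history -> Gemini -> flaky_tests.json.
# PROM_URL, the model name, and the prompt are illustrative assumptions.
import json, os, requests
import google.generativeai as genai

PROM_URL = "http://prometheus.example.com"  # hypothetical endpoint

def fetch_failures():
    # The per-test failure-history metric named in this project.
    resp = requests.get(f"{PROM_URL}/api/v1/query",
                        params={"query": "jenkins_build_test_case_failure_age"})
    resp.raise_for_status()
    return resp.json()["data"]["result"]

def find_flaky(failures):
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-flash")
    prompt = ("From this test failure history, return a JSON array with the "
              "names of tests whose failures look intermittent (flaky) "
              "rather than consistent:\n" + json.dumps(failures)[:30000])
    return model.generate_content(prompt).text

if __name__ == "__main__":
    with open("flaky_tests.json", "w") as f:
        f.write(find_flaky(fetch_failures()))  # feeds the Grafana panel
```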

    Goals

    By the end of Hack Week, we aim to have a single, working Python script that:

    1. Connects to Prometheus and executes a query to fetch detailed test failure history.
    2. Processes the raw data into a format suitable for the Gemini API.
    3. Successfully calls the Gemini API with the data and a clear prompt.
    4. Parses the AI's response to extract a simple list of flaky tests.
    5. Saves the list to a JSON file that can be displayed in Grafana.
    6. Populates a new panel in our Grafana dashboard listing the flaky tests

    Resources

    Outcome


    Docs Navigator MCP: SUSE Edition by mackenzie.techdocs

    MCP Docs Navigator: SUSE Edition

    Description

    Docs Navigator MCP: SUSE Edition is an AI-powered documentation navigator that makes finding information across SUSE, Rancher, K3s, and RKE2 documentation effortless. Built as a Model Context Protocol (MCP) server, it enables semantic search, intelligent Q&A, and documentation summarization using 100% open-source AI models (no API keys required!). The project also lets you bring your own keys from Anthropic and OpenAI for parallel processing.

    Goals

    • [ X ] Build functional MCP server with documentation tools
    • [ X ] Implement semantic search with vector embeddings
    • [ X ] Create user-friendly web interface
    • [ X ] Optimize indexing performance (parallel processing)
    • [ X ] Add SUSE branding and polish UX
    • [ X ] Stretch Goal: Add more documentation sources
    • [ X ] Stretch Goal: Implement document change detection for auto-updates

    Coming Soon!

    • Community Feedback: Test with real users and gather improvement suggestions

    Resources


    Is SUSE Trending? Popularity and Developer Sentiment Insight Using Native AI Capabilities by terezacerna

    Description

    This project aims to explore the popularity and developer sentiment around SUSE and its technologies compared to Red Hat and their technologies. Using publicly available data sources, I will analyze search trends, developer preferences, repository activity, and media presence. The final outcome will be an interactive Power BI dashboard that provides insights into how SUSE is perceived and discussed across the web and among developers.

    Goals

    1. Assess the popularity of SUSE products and brand compared to Red Hat using Google Trends.
    2. Analyze developer satisfaction and usage trends from the Stack Overflow Developer Survey.
    3. Use the GitHub API to compare SUSE and Red Hat repositories in terms of stars, forks, contributors, and issue activity (see the sketch after this list).
    4. Perform sentiment analysis on GitHub issue comments to measure community tone and engagement using built-in Copilot capabilities.
    5. Perform sentiment analysis on Reddit comments related to SUSE technologies using built-in Copilot capabilities.
    6. Use Gnews.io to track and compare the volume of news articles mentioning SUSE and Red Hat technologies.
    7. Test the integration of Copilot (AI) within Power BI for enhanced data analysis and visualization.
    8. Deliver a comprehensive Power BI report summarizing findings and insights.
    9. Test the full potential of Power BI, including its AI features and native language Q&A.
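    For goal 3, a minimal sketch of pulling comparable repository metrics from the GitHub REST API; the repository sample is a hypothetical choice:

```python
# Sketch: compare repository popularity via the GitHub REST API.
# The repository list is a small hypothetical sample.
import requests

REPOS = ["uyuni-project/uyuni", "openSUSE/osc", "RedHatInsights/insights-core"]

for repo in REPOS:
    r = requests.get(f"https://api.github.com/repos/{repo}",
                     headers={"Accept": "application/vnd.github+json"})
    r.raise_for_status()
    d = r.json()
    print(f"{repo}: {d['stargazers_count']} stars, {d['forks_count']} forks, "
          f"{d['open_issues_count']} open issues")
```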

    Resources

    1. Google Trends: Web scraping for search popularity data
    2. Stack Overflow Developer Survey: For technology popularity and satisfaction comparison
    3. GitHub API: For repository data (stars, forks, contributors, issues, comments).
    4. Gnews.io API: For article volume and mentions analysis.
    5. Reddit: SUSE related topics with comments.