SUSE Hack Week: Artificial Intelligence playground for Data Scientists

Project page: https://confluence.suse.com/display/AAI/HackWeek19 I will keep working on it after Hack Week as a "best effort" personal project, to keep it evolving and to keep learning.

What is this project about?

Data Scientists often start working on their laptops before moving to company resources. As in many other cases, they have to solve many challenges by themselves before they can actually start working on "their stuff". The idea is to build a prototype, which we will eventually try to evolve into a product, that meets the following requirements:

Rapid time to work: I, as a Data Scientist or Data Engineer, need to install the playground quickly and be ready to work.
Everything in the right place: I, as a Data Scientist or Data Engineer, want an easy way to find things and use them.
No time to waste: I, as a Data Scientist or Data Engineer, want to be able to replicate the model by synchronizing it with another infrastructure through a "click and done" approach.
No complexity rule: I, as a Data Scientist or Data Engineer, want to avoid wasting time on complex configuration or debugging. Complexity needs to be hidden from me.

Project Team requirements

Because this is a first attempt at a prototype, I have to ask for some "unofficial" rules to be applied:

A maximum of 7 to 9 people in the team, with a maximum of 3 engineers.
If you apply, you have to make yourself available from 10 am to 5 pm CET (if you're in a different time zone, keep in mind that we'll have a lot of team discussions, so it could be challenging).
This is a 5-day sprint where everyone needs to be open, collaborative, bold, and creative.

FAQ

I'm not an engineer or an expert: Great, this project requires (if possible) at least 1 person from marketing, sales engineering, services, or support.
Am I required to code?: No, but you're required to share your ideas and views. While the end goal is to build a prototype (that's why we need a couple of engineers), the aim is to have something to show, and to demonstrate that we can build something useful for the Data Scientist community.
Whoa, this seems to be a super serious project: Nah, it's a fun experiment to learn how far we can push our limits through rapid prototyping and "being different".
So how do I sign up?: Easy, just join the team here on Hack Week and/or contact me at alessandro.festa@suse.com for further details.

Looking for hackers with the skills:

ai artificial-intelligence machinelearning prototype agile projectmanagement innovation

This project is part of:

Hack Week 19

Activity

over 5 years ago: jordimassaguerpla liked this project.

over 5 years ago: rsblendido joined this project.

over 5 years ago: jeffpr joined this project.

over 5 years ago: FSzekely liked this project.

over 5 years ago: bfilho left this project.

over 5 years ago: bfilho joined this project.

over 5 years ago: bfromme liked this project.

over 5 years ago: bfromme joined this project.

over 5 years ago: rsblendido liked this project.

over 5 years ago: gboiko liked this project.

almost 6 years ago: afesta added keyword "innovation" to this project.

almost 6 years ago: afesta added keyword "projectmanagement" to this project.

almost 6 years ago: afesta added keyword "ai" to this project.

almost 6 years ago: afesta added keyword "artificial-intelligence" to this project.

almost 6 years ago: afesta added keyword "machinelearning" to this project.

almost 6 years ago: afesta added keyword "prototype" to this project.

almost 6 years ago: afesta added keyword "agile" to this project.

almost 6 years ago: afesta liked this project.

almost 6 years ago: afesta started this project.

almost 6 years ago: afesta originated this project.

Comments

almost 6 years ago by hennevogel

Can you explain what kind of output you would expect? Like an application? A set of packages? Some IaC description?
- almost 6 years ago by afesta
  
  This is something we have to decide during Hack Week; usually it is a prototype based on a challenge target chosen by the team. Whether the result will be simple artifacts assembled from existing items, an application, or a set of packages still has to be decided. The aim is to foster innovation within a very fast cycle (5 days) and get a result that lets us learn whether it is doable, what we need to address to make it a real product, and how long that could take. Don't expect huge development efforts or impossible challenges: this is about pure innovation and ideas, and about building a way to demonstrate our idea.

over 5 years ago by bmwiedemann

If this project needs 2x NVIDIA Tesla T4 (16 GB), ping me.

over 5 years ago by afesta

So cool! To be honest, I'd rather use your brain for the project... willing to give me a chance and have fun for a week with this crazy PM?

over 5 years ago by rsblendido

Is this about Kubeflow?
- over 5 years ago by afesta
  
  Could be. The only "constraint" is that it should ideally work on a laptop, and Kubeflow runs on Kubernetes (K8s), but if you use something like MLRun you may overcome many challenges. The ultimate goal of the project is to provide data scientists with a playground, so that they do not need to learn, install, and configure everything; it should be easy enough to start on your laptop and eventually move to a server/cloud environment.

over 5 years ago by jeffpr

@afesta: I will be working with you on the SUSEcon demos, so I thought I would hop in here when I can.

over 5 years ago by afesta

Cool!

Similar Projects

ai

SUSE Observability MCP server by drutigliano

Description

The idea is to implement the SUSE Observability Model Context Protocol (MCP) Server as a specialized, middle-tier API designed to translate the complex, high-cardinality observability data from StackState (topology, metrics, and events) into highly structured, contextually rich, and LLM-ready snippets.

This MCP Server abstracts the StackState APIs. Its primary function is to serve as a Tool/Function Calling target for AI agents. When an AI receives an alert or a user query (e.g., "What caused the outage?"), the AI calls an MCP Server endpoint. The server then fetches the relevant operational facts, summarizes them, normalizes technical identifiers (like URNs and raw metric names) into natural language concepts, and returns a concise JSON or YAML payload. This payload is then injected directly into the LLM's prompt, ensuring the final diagnosis or action is grounded in real-time, accurate SUSE Observability data, effectively minimizing hallucinations.
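
As a rough illustration of the "normalize and summarize" step described above, here is a minimal Python sketch. It is not taken from the linked implementation: the input shape, the URN layout, the health-state mapping, and the function names (component_name, summarize) are all assumptions made for the example.

```python
import json

# Assumed mapping from raw health states to plain-language terms.
# The state names are illustrative, not taken from the project.
HEALTH_TERMS = {
    "CLEAR": "healthy",
    "DEVIATING": "degraded",
    "CRITICAL": "failing",
}


def component_name(urn: str) -> str:
    """Turn the last segment of a URN-like identifier into a readable name."""
    return urn.rsplit(":", 1)[-1].replace("-", " ")


def summarize(components: list[dict]) -> str:
    """Build a compact JSON payload that can be placed into an LLM prompt."""
    facts = []
    for comp in components:
        status = HEALTH_TERMS.get(comp["health"], "in an unknown state")
        facts.append({"component": component_name(comp["urn"]), "status": status})
    return json.dumps({"facts": facts}, indent=2)


if __name__ == "__main__":
    # Hypothetical raw records, as a backend might return them.
    raw = [
        {"urn": "urn:example:service:checkout-api", "health": "CRITICAL"},
        {"urn": "urn:example:service:payment-db", "health": "CLEAR"},
    ]
    print(summarize(raw))
```

The point of the sketch is only that the model receives short, named facts rather than raw identifiers.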

Goals

Grounding AI Responses: Ensure that all AI diagnoses, root cause analyses, and action recommendations are strictly based on verifiable, real-time data retrieved from the SUSE Observability StackState platform.
Simplifying Data Access: Abstract the complexity of StackState's native APIs (e.g., Time Travel, 4T Data Model) into simple, semantic functions that can be easily invoked by LLM tool-calling mechanisms.
Data Normalization: Convert complex, technical identifiers (like component URNs, raw metric names, and proprietary health states) into standardized, natural language terms that an LLM can easily reason over.
Enabling Automated Remediation: Define clear, action-oriented MCP endpoints (e.g., execute_runbook) that allow the AI agent to initiate automated operational workflows (e.g., restarts, scaling) after a diagnosis, closing the loop on observability.
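
The goals above can be pictured as a small registry of semantic functions that an agent calls by name. The sketch below is a simplified dispatcher in plain Python; it is not the MCP wire protocol and not the project's code. Only the name execute_runbook comes from the text; get_component_health, the argument names, and the dry_run flag are assumptions for illustration.

```python
import json
from typing import Callable

# Registry of callable tools, keyed by the name the agent uses.
TOOLS: dict[str, Callable[..., dict]] = {}


def tool(func: Callable[..., dict]) -> Callable[..., dict]:
    """Register a function so it can be invoked by name."""
    TOOLS[func.__name__] = func
    return func


@tool
def get_component_health(component: str) -> dict:
    # Stub: a real server would query the observability backend here.
    return {"component": component, "status": "degraded"}


@tool
def execute_runbook(name: str, target: str, dry_run: bool = True) -> dict:
    # Defaults to a dry run so nothing is changed unless explicitly requested.
    return {"runbook": name, "target": target, "executed": not dry_run}


def handle_call(request_json: str) -> str:
    """Dispatch a request of the form {"tool": ..., "arguments": {...}}."""
    request = json.loads(request_json)
    func = TOOLS.get(request["tool"])
    if func is None:
        return json.dumps({"error": f"unknown tool: {request['tool']}"})
    return json.dumps(func(**request.get("arguments", {})))


if __name__ == "__main__":
    call = {"tool": "execute_runbook", "arguments": {"name": "restart", "target": "checkout-api"}}
    print(handle_call(json.dumps(call)))
```

Keeping remediation behind an explicit flag is one possible design choice for closing the loop safely; the source does not say how the project handles it.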

Resources

https://www.honeycomb.io/blog/its-the-end-of-observability-as-we-know-it-and-i-feel-fine
https://www.datadoghq.com/blog/datadog-remote-mcp-server
https://modelcontextprotocol.io/specification/2025-06-18/index

Basic implementation

https://github.com/drutigliano19/suse-observability-mcp-server

Flaky Tests AI Finder for Uyuni and MLM Test Suites by oscar-barrios

Description

Our current Grafana dashboards provide a great overview of test suite health, including a panel for "Top failed tests." However, identifying which of these failures are due to legitimate bugs versus intermittent "flaky tests" is a manual, time-consuming process. These flaky tests erode trust in our test suites and slow down development.

This project aims to build a simple but powerful Python script that automates flaky test detection. The script will directly query our Prometheus instance for the historical data of each failed test, using the jenkins_build_test_case_failure_age metric. It will then format this data and send it to the Gemini API with a carefully crafted prompt, asking it to identify which tests show a flaky pattern.

The final output will be a clean JSON list of the most probable flaky tests, which can then be used to populate a new "Top Flaky Tests" panel in our existing Grafana test suite dashboard.

Goals

By the end of Hack Week, we aim to have a single, working Python script that:

Connects to Prometheus and executes a query to fetch detailed test failure history.
Processes the raw data into a format suitable for the Gemini API.
Successfully calls the Gemini API with the data and a clear prompt.
Parses the AI's response to extract a simple list of flaky tests.
Saves the list to a JSON file that can be displayed in Grafana.
Feeds a new panel in our dashboard that lists the flaky tests.
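
The middle steps of the list above (processing the raw data, building a prompt, parsing the reply) might look roughly like the following sketch. It assumes the standard Prometheus matrix result shape, numeric buildid labels, and a model that answers with a JSON array of test names. The function names, the prompt wording, and the "PASSED" status in the sample are made up for the example. The actual Gemini call is left out on purpose.

```python
import json
from collections import defaultdict


def group_history(prom_result: list[dict]) -> dict[str, list[str]]:
    """Map 'suite/case' to its statuses, ordered by build id."""
    per_test: dict[str, dict[int, str]] = defaultdict(dict)
    for series in prom_result:
        labels = series["metric"]
        key = f'{labels["suite"]}/{labels["case"]}'
        # Assumes buildid is numeric so that builds sort chronologically.
        per_test[key][int(labels["buildid"])] = labels["status"]
    return {t: [builds[b] for b in sorted(builds)] for t, builds in per_test.items()}


def build_prompt(history: dict[str, list[str]]) -> str:
    """Render one line per test, followed by the instruction for the model."""
    lines = [f"{test}: {' '.join(statuses)}" for test, statuses in sorted(history.items())]
    return (
        "Each line lists a test and its result per build, oldest first.\n"
        + "\n".join(lines)
        + "\nReply with a JSON array of the tests that look flaky."
    )


def parse_flaky_list(reply: str) -> list[str]:
    """Extract the JSON array from the reply, tolerating a Markdown fence."""
    fence = "`" * 3
    text = reply.strip()
    if text.startswith(fence):
        text = text.strip("`").removeprefix("json").strip()
    return json.loads(text)


if __name__ == "__main__":
    sample = [
        {"metric": {"suite": "core", "case": "login", "buildid": "2", "status": "FAILED"}},
        {"metric": {"suite": "core", "case": "login", "buildid": "1", "status": "PASSED"}},
    ]
    print(build_prompt(group_history(sample)))
```

The list returned by parse_flaky_list can then be written out with json.dump for Grafana to read.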

Resources

Jenkins Prometheus Exporter: https://github.com/uyuni-project/jenkins-exporter/
Data Source: Our internal Prometheus server.
Key Metric: jenkins_build_test_case_failure_age{jobname, buildid, suite, case, status, failedsince}.
Existing Query for Reference: count by (suite) (max_over_time(jenkins_build_test_case_failure_age{status=~"FAILED|REGRESSION", jobname="$jobname"}[$__range])).
AI Model: The Google Gemini API.
Example of how to interact with the Gemini API: https://github.com/srbarrios/FailTale/
Visualization: Our internal Grafana Dashboard.
Internal IaC: https://gitlab.suse.de/galaxy/infrastructure/-/tree/master/srv/salt/monitoring
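
The reference query above uses Grafana variables ($jobname, $__range), so a standalone script has to fill in concrete values itself. Below is a minimal sketch of building a request URL for the Prometheus HTTP API range-query endpoint (/api/v1/query_range) from the key metric. The host name, job name, and function name are placeholders, not values from the project.

```python
from urllib.parse import urlencode


def failure_history_url(base_url: str, jobname: str, start: int, end: int, step: str = "1h") -> str:
    """Build a query_range URL for the failure-age metric of one Jenkins job."""
    query = (
        "jenkins_build_test_case_failure_age"
        f'{{status=~"FAILED|REGRESSION", jobname="{jobname}"}}'
    )
    # start and end are Unix timestamps; step is the resolution of the range.
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


if __name__ == "__main__":
    # Placeholder host and job name, for illustration only.
    print(failure_history_url("http://prometheus.example:9090", "uyuni-ci", 1700000000, 1700086400))
```

The resulting URL can be fetched with any HTTP client; the series come back under data.result in the JSON response.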
