a project by PSuarezHernandez
Description
Using Ollama you can easily run different LLM models in your local computer. This project is about exploring Ollama, testing different LLMs and try to fine tune them. Also, explore potential ways of integration with Uyuni.
Goals
- Explore Ollama
- Test different models
- Fine tuning
- Explore possible integration in Uyuni
Resources
- https://ollama.com/
- https://huggingface.co/
- https://apeatling.com/articles/part-2-building-your-training-data-for-fine-tuning/
This project is part of:
Hack Week 24
Activity
Comments
-
about 1 year ago by PSuarezHernandez | Reply
Some conclusions after Hackweek 24:
- ollama + open-webui is a nice combo to allow running LLMs locally (tried also Local AI)
- open-webui allows you to add custom knoweldge bases (collections) to feed models.
- Uyuni documentation, Salt documentation can be used on this collections to make models to learn.
- Using a tailored documentation works better to feed models.
- Tried different models: llama3.1, mistral, mistral-nemo, gemma2, phi3,..
- Getting promising results, particularly with
mistral-nemo.. but also getting model hallutinations - model parameters can be adjusted to reduce them.
Takeaways
- Small models runs fairly well with CPU only.
- Making an expert assistance on Uyuni, with an extensive knowledge based on documentation, might be something to keep exploring.
Next steps
- Make the model to understand Uyuni API, so it is able to translate user requests to actual call to Uyuni API.
-
4 months ago by rudrakshkarpe | Reply
Hi @PSuarezHernandez ,
will this project be part of Hackweek 2025?
Similar Projects
Enhance setup wizard for Uyuni by PSuarezHernandez
Description
This project wants to enhance the intial setup on Uyuni after its installation, so it's easier for a user to start using with it.
Uyuni currently uses "uyuni-tools" (mgradm) as the installation entrypoint, to trigger the installation of Uyuni in the given host, but does not really perform an initial setup, for instance:
- user creation
- adding products / channels
- generating bootstrap repos
- create activation keys
- ...
Goals
- Provide initial setup wizard as part of mgradm uyuni installation
Resources
Move Uyuni Test Framework from Selenium to Playwright + AI by oscar-barrios

Description
This project aims to migrate the existing Uyuni Test Framework from Selenium to Playwright. The move will improve the stability, speed, and maintainability of our end-to-end tests by leveraging Playwright's modern features. We'll be rewriting the current Selenium code in Ruby to Playwright code in TypeScript, which includes updating the test framework runner, step definitions, and configurations. This is also necessary because we're moving from Cucumber Ruby to CucumberJS.
If you're still curious about the AI in the title, it was just a way to grab your attention. Thanks for your understanding.
Nah, let's be honest
AI helped a lot to vibe code a good part of the Ruby methods of the Test framework, moving them to Typescript, along with the migration from Capybara to Playwright. I've been using "Cline" as plugin for WebStorm IDE, using Gemini API behind it.
Goals
- Migrate Core tests including Onboarding of clients
- Improve test reliabillity: Measure and confirm a significant reduction of flakiness.
- Implement a robust framework: Establish a well-structured and reusable Playwright test framework using the CucumberJS
Resources
- Existing Uyuni Test Framework (Cucumber Ruby + Capybara + Selenium)
- My Template for CucumberJS + Playwright in TypeScript
- Started Hackweek Project
Testing and adding GNU/Linux distributions on Uyuni by juliogonzalezgil
Join the Gitter channel! https://gitter.im/uyuni-project/hackweek
Uyuni is a configuration and infrastructure management tool that saves you time and headaches when you have to manage and update tens, hundreds or even thousands of machines. It also manages configuration, can run audits, build image containers, monitor and much more!
Currently there are a few distributions that are completely untested on Uyuni or SUSE Manager (AFAIK) or just not tested since a long time, and could be interesting knowing how hard would be working with them and, if possible, fix whatever is broken.
For newcomers, the easiest distributions are those based on DEB or RPM packages. Distributions with other package formats are doable, but will require adapting the Python and Java code to be able to sync and analyze such packages (and if salt does not support those packages, it will need changes as well). So if you want a distribution with other packages, make sure you are comfortable handling such changes.
No developer experience? No worries! We had non-developers contributors in the past, and we are ready to help as long as you are willing to learn. If you don't want to code at all, you can also help us preparing the documentation after someone else has the initial code ready, or you could also help with testing :-)
The idea is testing Salt (including bootstrapping with bootstrap script) and Salt-ssh clients
To consider that a distribution has basic support, we should cover at least (points 3-6 are to be tested for both salt minions and salt ssh minions):
- Reposync (this will require using spacewalk-common-channels and adding channels to the .ini file)
- Onboarding (salt minion from UI, salt minion from bootstrap scritp, and salt-ssh minion) (this will probably require adding OS to the bootstrap repository creator)
- Package management (install, remove, update...)
- Patching
- Applying any basic salt state (including a formula)
- Salt remote commands
- Bonus point: Java part for product identification, and monitoring enablement
- Bonus point: sumaform enablement (https://github.com/uyuni-project/sumaform)
- Bonus point: Documentation (https://github.com/uyuni-project/uyuni-docs)
- Bonus point: testsuite enablement (https://github.com/uyuni-project/uyuni/tree/master/testsuite)
If something is breaking: we can try to fix it, but the main idea is research how supported it is right now. Beyond that it's up to each project member how much to hack :-)
- If you don't have knowledge about some of the steps: ask the team
- If you still don't know what to do: switch to another distribution and keep testing.
This card is for EVERYONE, not just developers. Seriously! We had people from other teams helping that were not developers, and added support for Debian and new SUSE Linux Enterprise and openSUSE Leap versions :-)
In progress/done for Hack Week 25
Guide
We started writin a Guide: Adding a new client GNU Linux distribution to Uyuni at https://github.com/uyuni-project/uyuni/wiki/Guide:-Adding-a-new-client-GNU-Linux-distribution-to-Uyuni, to make things easier for everyone, specially those not too familiar wht Uyuni or not technical.
openSUSE Leap 16.0
The distribution will all love!
https://en.opensuse.org/openSUSE:Roadmap#DRAFTScheduleforLeap16.0
Curent Status We started last year, it's complete now for Hack Week 25! :-D
[W]Reposync (this will require using spacewalk-common-channels and adding channels to the .ini file) NOTE: Done, client tools for SLMicro6 are using as those for SLE16.0/openSUSE Leap 16.0 are not available yet[W]Onboarding (salt minion from UI, salt minion from bootstrap scritp, and salt-ssh minion) (this will probably require adding OS to the bootstrap repository creator)[W]Package management (install, remove, update...). Works, even reboot requirement detection
Uyuni Saltboot rework by oholecek
Description
When Uyuni switched over to the containerized proxies we had to abandon salt based saltboot infrastructure we had before. Uyuni already had integration with a Cobbler provisioning server and saltboot infra was re-implemented on top of this Cobbler integration.
What was not obvious from the start was that Cobbler, having all it's features, woefully slow when dealing with saltboot size environments. We did some improvements in performance, introduced transactions, and generally tried to make this setup usable. However the underlying slowness remained.
Goals
This project is not something trying to invent new things, it is just finally implementing saltboot infrastructure directly with the Uyuni server core.
Instead of generating grub and pxelinux configurations by Cobbler for all thousands of systems and branches, we will provide a GET access point to retrieve grub or pxelinux file during the boot:
/saltboot/group/grub/$fqdn and similar for systems /saltboot/system/grub/$mac
Next we adapt our tftpd translator to query these points when asked for default or mac based config.
Lastly similar thing needs to be done on our apache server when HTTP UEFI boot is used.
Resources
Uyuni read-only replica by cbosdonnat
Description
For now, there is no possible HA setup for Uyuni. The idea is to explore setting up a read-only shadow instance of an Uyuni and make it as useful as possible.
Possible things to look at:
- live sync of the database, probably using the WAL. Some of the tables may have to be skipped or some features disabled on the RO instance (taskomatic, PXT sessions…)
- Can we use a load balancer that routes read-only queries to either instance and the other to the RW one? For example, packages or PXE data can be served by both, the API GET requests too. The rest would be RW.
Goals
- Prepare a document explaining how to do it.
- PR with the needed code changes to support it
Extended private brain - RAG my own scripts and data into offline LLM AI by tjyrinki_suse
Description
For purely studying purposes, I'd like to find out if I could teach an LLM some of my own accumulated knowledge, to use it as a sort of extended brain.
I might use qwen3-coder or something similar as a starting point.
Everything would be done 100% offline without network available to the container, since I prefer to see when network is needed, and make it so it's never needed (other than initial downloads).
Goals
- Learn something about RAG, LLM, AI.
- Find out if everything works offline as intended.
- As an end result have a new way to access my own existing know-how, but so that I can query the wisdom in them.
- Be flexible to pivot in any direction, as long as there are new things learned.
Resources
To be found on the fly.
Timeline
Day 1 (of 4)
- Tried out a RAG demo, expanded on feeding it my own data
- Experimented with qwen3-coder to add a persistent chat functionality, and keeping vectors in a pickle file
- Optimizations to keep everything within context window
- Learn and add a bit of PyTest
Day 2
- More experimenting and more data
- Study ChromaDB
- Add a Web UI that works from another computer even though the container sees network is down
Day 3
- The above RAG is working well enough for demonstration purposes.
- Pivot to trying out OpenCode, configuring local Ollama qwen3-coder there, to analyze the RAG demo.
- Figured out how to configure Ollama template to be usable under OpenCode. OpenCode locally is super slow to just running qwen3-coder alone.
Day 4 (final day)
- Battle with OpenCode that was both slow and kept on piling up broken things.
- Call it success as after all the agentic AI was working locally.
- Clean up the mess left behind a bit.
Blog Post
Summarized the findings at blog post.
SUSE Observability MCP server by drutigliano
Description
The idea is to implement the SUSE Observability Model Context Protocol (MCP) Server as a specialized, middle-tier API designed to translate the complex, high-cardinality observability data from StackState (topology, metrics, and events) into highly structured, contextually rich, and LLM-ready snippets.
This MCP Server abstract the StackState APIs. Its primary function is to serve as a Tool/Function Calling target for AI agents. When an AI receives an alert or a user query (e.g., "What caused the outage?"), the AI calls an MCP Server endpoint. The server then fetches the relevant operational facts, summarizes them, normalizes technical identifiers (like URNs and raw metric names) into natural language concepts, and returns a concise JSON or YAML payload. This payload is then injected directly into the LLM's prompt, ensuring the final diagnosis or action is grounded in real-time, accurate SUSE Observability data, effectively minimizing hallucinations.
Goals
- Grounding AI Responses: Ensure that all AI diagnoses, root cause analyses, and action recommendations are strictly based on verifiable, real-time data retrieved from the SUSE Observability StackState platform.
- Simplifying Data Access: Abstract the complexity of StackState's native APIs (e.g., Time Travel, 4T Data Model) into simple, semantic functions that can be easily invoked by LLM tool-calling mechanisms.
- Data Normalization: Convert complex, technical identifiers (like component URNs, raw metric names, and proprietary health states) into standardized, natural language terms that an LLM can easily reason over.
- Enabling Automated Remediation: Define clear, action-oriented MCP endpoints (e.g., execute_runbook) that allow the AI agent to initiate automated operational workflows (e.g., restarts, scaling) after a diagnosis, closing the loop on observability.
Hackweek STEP
- Create a functional MCP endpoint exposing one (or more) tool(s) to answer queries like "What is the health of service X?") by fetching, normalizing, and returning live StackState data in an LLM-ready format.
Scope
- Implement read-only MCP server that can:
- Connect to a live SUSE Observability instance and authenticate (with API token)
- Use tools to fetch data for a specific component URN (e.g., current health state, metrics, possibly topology neighbors, ...).
- Normalize response fields (e.g., URN to "Service Name," health state DEVIATING to "Unhealthy", raw metrics).
- Return the data as a structured JSON payload compliant with the MCP specification.
Deliverables
- MCP Server v0.1 A running Golang MCP server with at least one tool.
- A README.md and a test script (e.g., curl commands or a simple notebook) showing how an AI agent would call the endpoint and the resulting JSON payload.
Outcome A functional and testable API endpoint that proves the core concept: translating complex StackState data into a simple, LLM-ready format. This provides the foundation for developing AI-driven diagnostics and automated remediation.
Resources
- https://www.honeycomb.io/blog/its-the-end-of-observability-as-we-know-it-and-i-feel-fine
- https://www.datadoghq.com/blog/datadog-remote-mcp-server
- https://modelcontextprotocol.io/specification/2025-06-18/index
- https://modelcontextprotocol.io/docs/develop/build-server
Basic implementation
- https://github.com/drutigliano19/suse-observability-mcp-server
Results
Successfully developed and delivered a fully functional SUSE Observability MCP Server that bridges language models with SUSE Observability's operational data. This project demonstrates how AI agents can perform intelligent troubleshooting and root cause analysis using structured access to real-time infrastructure data.
Example execution
"what is it" file and directory analysis via MCP and local LLM, for console and KDE by rsimai
Description
Users sometimes wonder what files or directories they find on their local PC are good for. If they can't determine from the filename or metadata, there should an easy way to quickly analyze the content and at least guess the meaning. An LLM could help with that, through the use of a filesystem MCP and to-text-converters for typical file types. Ideally this is integrated into the desktop environment but works as well from a console. All data is processed locally or "on premise", no artifacts remain or leave the system.
Goals
- The user can run a command from the console, to check on a file or directory
- The filemanager contains the "analyze" feature within the context menu
- The local LLM could serve for other use cases where privacy matters
TBD
- Find or write capable one-shot and interactive MCP client
- Find or write simple+secure file access MCP server
- Create local LLM service with appropriate footprint, containerized
- Shell command with options
- KDE integration (Dolphin)
- Package
- Document
Resources
issuefs: FUSE filesystem representing issues (e.g. JIRA) for the use with AI agents code-assistants by llansky3
Description
Creating a FUSE filesystem (issuefs) that mounts issues from various ticketing systems (Github, Jira, Bugzilla, Redmine) as files to your local file system.
And why this is good idea?
- User can use favorite command line tools to view and search the tickets from various sources
- User can use AI agents capabilities from your favorite IDE or cli to ask question about the issues, project or functionality while providing relevant tickets as context without extra work.
- User can use it during development of the new features when you let the AI agent to jump start the solution. The issuefs will give the AI agent the context (AI agents just read few more files) about the bug or requested features. No need for copying and pasting issues to user prompt or by using extra MCP tools to access the issues. These you can still do but this approach is on purpose different.

Goals
- Add Github issue support
- Proof the concept/approach by apply the approach on itself using Github issues for tracking and development of new features
- Add support for Bugzilla and Redmine using this approach in the process of doing it. Record a video of it.
- Clean-up and test the implementation and create some documentation
- Create a blog post about this approach
Resources
There is a prototype implementation here. This currently sort of works with JIRA only.
Creating test suite using LLM on existing codebase of a solar router by fcrozat
Description
Two years ago, I evaluated solar routers as part of hackweek24, I've assembled one and it is running almost smoothly.
However, its code quality is not perfect and the codebase doesn't have any testcase (which is tricky, since it is embedded code and rely on getting external data to react).
Before improving the code itself, a testsuite should be created to ensure code additional don't cause regression.
Goals
Create a testsuite, allowing to test solar router code in a virtual environment. Using LLM to help to create this test suite.
If succesful, try to improve the codebase itself by having it reviewed by LLM.
Resources
Try out Neovim Plugins supporting AI Providers by enavarro_suse
Description
Experiment with several Neovim plugins that integrate AI model providers such as Gemini and Ollama.
Goals
Evaluate how these plugins enhance the development workflow, how they differ in capabilities, and how smoothly they integrate into Neovim for day-to-day coding tasks.
Resources
- Neovim 0.11.5
- AI-enabled Neovim plugins:
- avante.nvim: https://github.com/yetone/avante.nvim
- Gp.nvim: https://github.com/Robitx/gp.nvim
- parrot.nvim: https://github.com/frankroeder/parrot.nvim
- gemini.nvim: https://dotfyle.com/plugins/kiddos/gemini.nvim
- ...
- Accounts or API keys for AI model providers.
- Local model serving setup (e.g., Ollama)
- Test projects or codebases for practical evaluation:
- OBS: https://build.opensuse.org/
- OBS blog and landing page: https://openbuildservice.org/
- ...
Enhance git-sha-verify: A tool to checkout validated git hashes by gpathak
Description
git-sha-verify is a simple shell utility to verify and checkout trusted git commits signed using GPG key. This tool helps ensure that only authorized or validated commit hashes are checked out from a git repository, supporting better code integrity and security within the workflow.
Supports:
- Verifying commit authenticity signed using gpg key
- Checking out trusted commits
Ideal for teams and projects where the integrity of git history is crucial.
Goals
A minimal python code of the shell script exists as a pull request.
The goal of this hackweek is to:
- DONE: Add more unit tests
- New and more tests can be added later
- New and more tests can be added later
- Partially DONE: Make the python code modular
- DONE: Add code coverage if possible
Resources
- Link to GitHub Repository: https://github.com/openSUSE/git-sha-verify
Song Search with CLAP by gcolangiuli
Description
Contrastive Language-Audio Pretraining (CLAP) is an open-source library that enables the training of a neural network on both Audio and Text descriptions, making it possible to search for Audio using a Text input. Several pre-trained models for song search are already available on huggingface
Goals
Evaluate how CLAP can be used for song searching and determine which types of queries yield the best results by developing a Minimum Viable Product (MVP) in Python. Based on the results of this MVP, future steps could include:
- Music Tagging;
- Free text search;
- Integration with an LLM (for example, with MCP or the OpenAI API) for music suggestions based on your own library.
The code for this project will be entirely written using AI to better explore and demonstrate AI capabilities.
Result
In this MVP we implemented:
- Async Song Analysis with Clap model
- Free Text Search of the songs
- Similar song search based on vector representation
- Containerised version with web interface
We also documented what went well and what can be improved in the use of AI.
You can have a look at the result here:
Future implementation can be related to performance improvement and stability of the analysis.
References
- CLAP: The main model being researched;
- huggingface: Pre-trained models for CLAP;
- Free Music Archive: Creative Commons songs that can be used for testing;
Liz - Prompt autocomplete by ftorchia
Description
Liz is the Rancher AI assistant for cluster operations.
Goals
We want to help users when sending new messages to Liz, by adding an autocomplete feature to complete their requests based on the context.
Example:
- User prompt: "Can you show me the list of p"
- Autocomplete suggestion: "Can you show me the list of p...od in local cluster?"
Example:
- User prompt: "Show me the logs of #rancher-"
- Chat console: It shows a drop-down widget, next to the # character, with the list of available pod names starting with "rancher-".
Technical Overview
- The AI agent should expose a new ws/autocomplete endpoint to proxy autocomplete messages to the LLM.
- The UI extension should be able to display prompt suggestions and allow users to apply the autocomplete to the Prompt via keyboard shortcuts.
Resources
Improve/rework household chore tracker `chorazon` by gniebler
Description
I wrote a household chore tracker named chorazon, which is meant to be deployed as a web application in the household's local network.
It features the ability to set up different (so far only weekly) schedules per task and per person, where tasks may span several days.
There are "tokens", which can be collected by users. Tasks can (and usually will) have rewards configured where they yield a certain amount of tokens. The idea is that they can later be redeemed for (surprise) gifts, but this is not implemented yet. (So right now one needs to edit the DB manually to subtract tokens when they're redeemed.)
Days are not rolled over automatically, to allow for task completion control.
We used it in my household for several months, with mixed success. There are many limitations in the system that would warrant a revisit.
It's written using the Pyramid Python framework with URL traversal, ZODB as the data store and Web Components for the frontend.
Goals
- Add admin screens for users, tasks and schedules
- Add models, pages etc. to allow redeeming tokens for gifts/surprises
- …?
Resources
tbd (Gitlab repo)
Testing and adding GNU/Linux distributions on Uyuni by juliogonzalezgil
Join the Gitter channel! https://gitter.im/uyuni-project/hackweek
Uyuni is a configuration and infrastructure management tool that saves you time and headaches when you have to manage and update tens, hundreds or even thousands of machines. It also manages configuration, can run audits, build image containers, monitor and much more!
Currently there are a few distributions that are completely untested on Uyuni or SUSE Manager (AFAIK) or just not tested since a long time, and could be interesting knowing how hard would be working with them and, if possible, fix whatever is broken.
For newcomers, the easiest distributions are those based on DEB or RPM packages. Distributions with other package formats are doable, but will require adapting the Python and Java code to be able to sync and analyze such packages (and if salt does not support those packages, it will need changes as well). So if you want a distribution with other packages, make sure you are comfortable handling such changes.
No developer experience? No worries! We had non-developers contributors in the past, and we are ready to help as long as you are willing to learn. If you don't want to code at all, you can also help us preparing the documentation after someone else has the initial code ready, or you could also help with testing :-)
The idea is testing Salt (including bootstrapping with bootstrap script) and Salt-ssh clients
To consider that a distribution has basic support, we should cover at least (points 3-6 are to be tested for both salt minions and salt ssh minions):
- Reposync (this will require using spacewalk-common-channels and adding channels to the .ini file)
- Onboarding (salt minion from UI, salt minion from bootstrap scritp, and salt-ssh minion) (this will probably require adding OS to the bootstrap repository creator)
- Package management (install, remove, update...)
- Patching
- Applying any basic salt state (including a formula)
- Salt remote commands
- Bonus point: Java part for product identification, and monitoring enablement
- Bonus point: sumaform enablement (https://github.com/uyuni-project/sumaform)
- Bonus point: Documentation (https://github.com/uyuni-project/uyuni-docs)
- Bonus point: testsuite enablement (https://github.com/uyuni-project/uyuni/tree/master/testsuite)
If something is breaking: we can try to fix it, but the main idea is research how supported it is right now. Beyond that it's up to each project member how much to hack :-)
- If you don't have knowledge about some of the steps: ask the team
- If you still don't know what to do: switch to another distribution and keep testing.
This card is for EVERYONE, not just developers. Seriously! We had people from other teams helping that were not developers, and added support for Debian and new SUSE Linux Enterprise and openSUSE Leap versions :-)
In progress/done for Hack Week 25
Guide
We started writin a Guide: Adding a new client GNU Linux distribution to Uyuni at https://github.com/uyuni-project/uyuni/wiki/Guide:-Adding-a-new-client-GNU-Linux-distribution-to-Uyuni, to make things easier for everyone, specially those not too familiar wht Uyuni or not technical.
openSUSE Leap 16.0
The distribution will all love!
https://en.opensuse.org/openSUSE:Roadmap#DRAFTScheduleforLeap16.0
Curent Status We started last year, it's complete now for Hack Week 25! :-D
[W]Reposync (this will require using spacewalk-common-channels and adding channels to the .ini file) NOTE: Done, client tools for SLMicro6 are using as those for SLE16.0/openSUSE Leap 16.0 are not available yet[W]Onboarding (salt minion from UI, salt minion from bootstrap scritp, and salt-ssh minion) (this will probably require adding OS to the bootstrap repository creator)[W]Package management (install, remove, update...). Works, even reboot requirement detection
Self-Scaling LLM Infrastructure Powered by Rancher by ademicev0
Self-Scaling LLM Infrastructure Powered by Rancher

Description
The Problem
Running LLMs can get expensive and complex pretty quickly.
Today there are typically two choices:
- Use cloud APIs like OpenAI or Anthropic. Easy to start with, but costs add up at scale.
- Self-host everything - set up Kubernetes, figure out GPU scheduling, handle scaling, manage model serving... it's a lot of work.
What if there was a middle ground?
What if infrastructure scaled itself instead of making you scale it?
Can we use existing Rancher capabilities like CAPI, autoscaling, and GitOps to make this simpler instead of building everything from scratch?
Project Repository: github.com/alexander-demicev/llmserverless
What This Project Does
A key feature is hybrid deployment: requests can be routed based on complexity or privacy needs. Simple or low-sensitivity queries can use public APIs (like OpenAI), while complex or private requests are handled in-house on local infrastructure. This flexibility allows balancing cost, privacy, and performance - using cloud for routine tasks and on-premises resources for sensitive or demanding workloads.
A complete, self-scaling LLM infrastructure that:
- Scales to zero when idle (no idle costs)
- Scales up automatically when requests come in
- Adds more nodes when needed, removes them when demand drops
- Runs on any infrastructure - laptop, bare metal, or cloud
Think of it as "serverless for LLMs" - focus on building, the infrastructure handles itself.
How It Works
A combination of open source tools working together:
Flow:
- Users interact with OpenWebUI (chat interface)
- Requests go to LiteLLM Gateway
- LiteLLM routes requests to:
- Ollama (Knative) for local model inference (auto-scales pods)
- Or cloud APIs for fallback
SUSE Observability MCP server by drutigliano
Description
The idea is to implement the SUSE Observability Model Context Protocol (MCP) Server as a specialized, middle-tier API designed to translate the complex, high-cardinality observability data from StackState (topology, metrics, and events) into highly structured, contextually rich, and LLM-ready snippets.
This MCP Server abstract the StackState APIs. Its primary function is to serve as a Tool/Function Calling target for AI agents. When an AI receives an alert or a user query (e.g., "What caused the outage?"), the AI calls an MCP Server endpoint. The server then fetches the relevant operational facts, summarizes them, normalizes technical identifiers (like URNs and raw metric names) into natural language concepts, and returns a concise JSON or YAML payload. This payload is then injected directly into the LLM's prompt, ensuring the final diagnosis or action is grounded in real-time, accurate SUSE Observability data, effectively minimizing hallucinations.
Goals
- Grounding AI Responses: Ensure that all AI diagnoses, root cause analyses, and action recommendations are strictly based on verifiable, real-time data retrieved from the SUSE Observability StackState platform.
- Simplifying Data Access: Abstract the complexity of StackState's native APIs (e.g., Time Travel, 4T Data Model) into simple, semantic functions that can be easily invoked by LLM tool-calling mechanisms.
- Data Normalization: Convert complex, technical identifiers (like component URNs, raw metric names, and proprietary health states) into standardized, natural language terms that an LLM can easily reason over.
- Enabling Automated Remediation: Define clear, action-oriented MCP endpoints (e.g., execute_runbook) that allow the AI agent to initiate automated operational workflows (e.g., restarts, scaling) after a diagnosis, closing the loop on observability.
Hackweek STEP
- Create a functional MCP endpoint exposing one (or more) tool(s) to answer queries like "What is the health of service X?") by fetching, normalizing, and returning live StackState data in an LLM-ready format.
Scope
- Implement read-only MCP server that can:
- Connect to a live SUSE Observability instance and authenticate (with API token)
- Use tools to fetch data for a specific component URN (e.g., current health state, metrics, possibly topology neighbors, ...).
- Normalize response fields (e.g., URN to "Service Name," health state DEVIATING to "Unhealthy", raw metrics).
- Return the data as a structured JSON payload compliant with the MCP specification.
Deliverables
- MCP Server v0.1 A running Golang MCP server with at least one tool.
- A README.md and a test script (e.g., curl commands or a simple notebook) showing how an AI agent would call the endpoint and the resulting JSON payload.
Outcome A functional and testable API endpoint that proves the core concept: translating complex StackState data into a simple, LLM-ready format. This provides the foundation for developing AI-driven diagnostics and automated remediation.
Resources
- https://www.honeycomb.io/blog/its-the-end-of-observability-as-we-know-it-and-i-feel-fine
- https://www.datadoghq.com/blog/datadog-remote-mcp-server
- https://modelcontextprotocol.io/specification/2025-06-18/index
- https://modelcontextprotocol.io/docs/develop/build-server
Basic implementation
- https://github.com/drutigliano19/suse-observability-mcp-server
Results
Successfully developed and delivered a fully functional SUSE Observability MCP Server that bridges language models with SUSE Observability's operational data. This project demonstrates how AI agents can perform intelligent troubleshooting and root cause analysis using structured access to real-time infrastructure data.
Example execution
Multi-agent AI assistant for Linux troubleshooting by doreilly
Description
Explore multi-agent architecture as a way to avoid MCP context rot.
Having one agent with many tools bloats the context with low-level details about tool descriptions, parameter schemas etc which hurts LLM performance. Instead have many specialised agents, each with just the tools it needs for its role. A top level supervisor agent takes the user prompt and delegates to appropriate sub-agents.
Goals
Create an AI assistant with some sub-agents that are specialists at troubleshooting Linux subsystems, e.g. systemd, selinux, firewalld etc. The agents can get information from the system by implementing their own tools with simple function calls, or use tools from MCP servers, e.g. a systemd-agent can use tools from systemd-mcp.
Example prompts/responses:
user$ the system seems slow
assistant$ process foo with pid 12345 is using 1000% cpu ...
user$ I can't connect to the apache webserver
assistant$ the firewall is blocking http ... you can open the port with firewall-cmd --add-port ...
Resources
Language Python. The Python ADK is more mature than Golang.
https://google.github.io/adk-docs/
https://github.com/djoreilly/linux-helper
Try AI training with ROCm and LoRA by bmwiedemann
Description
I want to setup a Radeon RX 9600 XT 16 GB at home with ROCm on Slowroll.
Goals
I want to test how fast AI inference can get with the GPU and if I can use LoRA to re-train an existing free model for some task.
Resources
- https://rocm.docs.amd.com/en/latest/compatibility/compatibility-matrix.html
- https://build.opensuse.org/project/show/science:GPU:ROCm
- https://src.opensuse.org/ROCm/
- https://www.suse.com/c/lora-fine-tuning-llms-for-text-classification/
Results
got inference working with llama.cpp:
export LLAMACPP_ROCM_ARCH=gfx1200
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=$LLAMACPP_ROCM_ARCH \
-DCMAKE_BUILD_TYPE=Release -DLLAMA_CURL=ON \
-Dhipblas_DIR=/usr/lib64/cmake/hipblaslt/ \
&& cmake --build build --config Release -j8
m=models/gpt-oss-20b-mxfp4.gguf
cd $P/llama.cpp && build/bin/llama-server --model $m --threads 8 --port 8005 --host 0.0.0.0 --device ROCm0 --n-gpu-layers 999
Without the --device option it faulted. Maybe because my APU also appears there?
I updated/fixed various related packages: https://src.opensuse.org/ROCm/rocm-examples/pulls/1 https://src.opensuse.org/ROCm/hipblaslt/pulls/1 SR 1320959
benchmark
I benchmarked inference with llama.cpp + gpt-oss-20b-mxfp4.gguf and ROCm offloading to a Radeon RX 9060 XT 16GB. I varied the number of layers that went to the GPU:
- 0 layers 14.49 tokens/s (8 CPU cores)
- 9 layers 17.79 tokens/s 34% VRAM
- 15 layers 22.39 tokens/s 51% VRAM
- 20 layers 27.49 tokens/s 64% VRAM
- 24 layers 41.18 tokens/s 74% VRAM
- 25+ layers 86.63 tokens/s 75% VRAM (only 200% CPU load)
So there is a significant performance-boost if the whole model fits into the GPU's VRAM.
Is SUSE Trending? Popularity and Developer Sentiment Insight Using Native AI Capabilities by terezacerna
Description
This project aims to explore the popularity and developer sentiment around SUSE and its technologies compared to Red Hat and their technologies. Using publicly available data sources, I will analyze search trends, developer preferences, repository activity, and media presence. The final outcome will be an interactive Power BI dashboard that provides insights into how SUSE is perceived and discussed across the web and among developers.
Goals
- Assess the popularity of SUSE products and brand compared to Red Hat using Google Trends.
- Analyze developer satisfaction and usage trends from the Stack Overflow Developer Survey.
- Use the GitHub API to compare SUSE and Red Hat repositories in terms of stars, forks, contributors, and issue activity.
- Perform sentiment analysis on GitHub issue comments to measure community tone and engagement using built-in Copilot capabilities.
- Perform sentiment analysis on Reddit comments related to SUSE technologies using built-in Copilot capabilities.
- Use Gnews.io to track and compare the volume of news articles mentioning SUSE and Red Hat technologies.
- Test the integration of Copilot (AI) within Power BI for enhanced data analysis and visualization.
- Deliver a comprehensive Power BI report summarizing findings and insights.
- Test the full potential of Power BI, including its AI features and native language Q&A.
Resources
- Google Trends: Web scraping for search popularity data
- Stack Overflow Developer Survey: For technology popularity and satisfaction comparison
- GitHub API: For repository data (stars, forks, contributors, issues, comments).
- Gnews.io API: For article volume and mentions analysis.
- Reddit: SUSE related topics with comments.