Description
This project aims to fight the loneliness of support team members by giving them an AI assistant that is (hopefully) capable of scraping supportconfigs in a RAG fashion and answering specific questions about them.
Goals
- Set up an Ollama backend, spinning up one (or more?) code-focused LLMs, selected by license, performance and quality of results, from among:
- deepseek-coder-v2
- dolphin-mistral
- starcoder2
- (...others??)
- Set up a Web UI for it, choosing an easily extensible and customizable option between:
- Extend the solution in order to be able to:
- Add ZIU/Concord shared folders to its RAG context
- Add BZ cases, split into comments, to its RAG context
- A plus would be to log in to ghostwrAIter itself through the IDP portal and use the same credentials to query BZ
- Add specific packages, picking them from IBS repos
- A plus would be to log in to ghostwrAIter itself through the IDP portal and use the same credentials to query IBS
- A plus would be to infer the packages of interest, and the right channel and version to pick, from the added BZ cases
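The "BZ cases split into comments" idea can be sketched in a few lines. This is only an illustration under stated assumptions: the `embed` function is a toy bag-of-words counter standing in for a real embedding model served by Ollama, and the case ID, comments and function names are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy bag-of-words vector; an assumption standing in for a real embedding model."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def index_case(case_id, comments):
    """Create one chunk per comment, keeping its origin so an answer can cite it."""
    return [
        {"case": case_id, "comment": i, "text": text, "vec": embed(text)}
        for i, text in enumerate(comments)
    ]


def retrieve(index, question, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(index, key=lambda ch: cosine(q, ch["vec"]), reverse=True)[:k]


# Hypothetical case data, for illustration only.
comments = [
    "Customer reports that the multipath daemon fails to start after the update.",
    "Please attach a fresh supportconfig from the affected node.",
    "Root cause: stale multipath.conf; regenerating the config fixed the daemon.",
]
index = index_case("CASE-1", comments)
top = retrieve(index, "why does the multipath daemon fail?", k=2)
for chunk in top:
    print(chunk["case"], chunk["comment"], chunk["text"])
```

The retrieved chunks would then be pasted into the LLM prompt as context. Keeping the case and comment number on every chunk makes it possible to point the support engineer back to the original comment.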
This project is part of:
Hack Week 24
Comments
about 1 year ago by paolodepa
The project soon moved to a CLI, as integrating a web UI is not my cup of tea :-/
Its description and source code can be found at ghostwrAIter
I tested the listed LLMs and also the following embedding models: mxbai-embed-large, nomic-embed-text, all-minilm.
My impression is that the current state of the art for truly open-source LLMs and embedding models is not yet mature or production-ready, and that a big gap remains compared with the best-known commercial products.
Hopefully I will run a refresh at the next Hack Week.
Similar Projects
Background Coding Agent by mmanno
Description
I had only bad experiences with AI one-shots. However, monitoring agent work closely and interfering often did result in productivity gains.
Now, other companies are using agents in pipelines. That makes sense to me: just as with CI, we want to offload work to pipelines. Our engineering teams are consistently slowed down by "toil": low-impact, repetitive maintenance tasks. A simple linter rule change, a dependency bump, rebasing patch sets on top of newer releases, or an API deprecation can require dozens of manual PRs, draining time from feature development.
So far we have been writing deterministic, script-based automation for these tasks, which turns out to be a common trap: these scripts are brittle and complex, and they become a massive maintenance burden themselves.
Can we make prompts and workflows smart enough to succeed at background coding?
Goals
We will build a platform that allows engineers to execute complex code transformations using prompts.
By automating this toil, we accelerate large-scale migrations and allow teams to focus on high-value work.
Our platform will consist of three main components:
- "Change" Definition: Engineers will define a transformation as a simple, declarative manifest:
- The target repositories.
- A wrapper to run a "coding agent", e.g., "gemini-cli".
- The task as a natural language prompt.
- "Change" Management Service: A central service that orchestrates the jobs. It will receive Change definitions and be responsible for the job lifecycle.
- Execution Runners: We could use existing sandboxed CI runners (like GitHub/GitLab runners) to execute each job or spawn a container.
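To make the "Change" definition more concrete, here is a minimal sketch of what such a manifest could look like as a Python data structure. The class name, field names and repository names are assumptions made for illustration; the actual manifest format is still to be defined as part of the MVP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical manifest: target repositories, agent wrapper, and prompt."""

    name: str
    repositories: tuple[str, ...]
    agent: str   # wrapper that runs the coding agent, e.g. "gemini-cli"
    prompt: str  # the task as a natural language prompt

    def validate(self):
        """Return a list of problems; an empty list means the manifest is usable."""
        problems = []
        if not self.repositories:
            problems.append("no target repositories")
        if not self.agent.strip():
            problems.append("no coding agent wrapper")
        if not self.prompt.strip():
            problems.append("empty prompt")
        return problems

    def jobs(self):
        """Expand the manifest into one job description per target repository."""
        return [
            {"change": self.name, "repo": repo, "agent": self.agent, "prompt": self.prompt}
            for repo in self.repositories
        ]


# Hypothetical example values.
change = Change(
    name="bump-linter",
    repositories=("example-org/repo-a", "example-org/repo-b"),
    agent="gemini-cli",
    prompt="Apply the changes needed to comply with the new linter rule.",
)
print(change.validate())
print(change.jobs())
```

Expanding a manifest into one job per repository keeps the management service simple: it only has to track independent jobs, each of which ends in at most one pull request.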
MVP
- Define the Change manifest format.
- Build the core Management Service that can accept and queue a Change.
- Connect management service and runners, dynamically dispatch jobs to runners.
- Create a basic runner script that can run a hard-coded prompt against a test repo and open a PR.
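The MVP steps could be prototyped in-process before any real service or CI runner exists. The sketch below is an assumption-heavy toy: `ManagementService`, `fake_runner`, the status strings and the example URL are all hypothetical, and the runner only pretends to clone a repository, run the agent and open a PR.

```python
import queue


class ManagementService:
    """Toy in-memory service that queues jobs and tracks their lifecycle."""

    def __init__(self):
        self._pending = queue.Queue()
        self.status = {}

    def submit(self, change_name, repos, prompt):
        """Accept a change and queue one job per target repository."""
        for repo in repos:
            job_id = f"{change_name}:{repo}"
            self.status[job_id] = "queued"
            self._pending.put({"id": job_id, "repo": repo, "prompt": prompt})

    def dispatch(self, runner):
        """Hand every queued job to the runner and record the outcome."""
        while True:
            try:
                job = self._pending.get_nowait()
            except queue.Empty:
                break
            self.status[job["id"]] = "running"
            try:
                pr_url = runner(job)
                self.status[job["id"]] = f"done: {pr_url}"
            except Exception as exc:  # a failed job must not stop the others
                self.status[job["id"]] = f"failed: {exc}"


def fake_runner(job):
    """Stand-in for a real runner: clone the repo, run the agent, open a PR."""
    if "broken" in job["repo"]:
        raise RuntimeError("agent produced no diff")
    return f"https://example.invalid/{job['repo']}/pull/1"


service = ManagementService()
service.submit("bump-linter", ["example-org/repo-a", "example-org/broken-repo"], "Fix lint errors.")
service.dispatch(fake_runner)
for job_id, state in service.status.items():
    print(job_id, "->", state)
```

Passing the runner in as a plain callable keeps the dispatch logic independent of where jobs actually execute, so swapping the fake for a container or a GitHub/GitLab runner would not change the service.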
Stretch Goals:
- Multi-layered approach, Workflow Agents trigger Coding Agents:
- Workflow Agent: Gather information about the task interactively from the user.
- Coding Agent: Once the interactive agent has refined the task into a clear prompt, it hands this prompt off to the "coding agent." This background agent is responsible for executing the task and producing the actual pull request.
- Use MCP:
- Workflow Agent gathers context information from Slack, Github, etc.
- Workflow Agent triggers a Coding Agent.
- Create a "Standard Task" library with reliable prompts.
- Rebasing rancher-monitoring to a new version of kube-prom-stack
- Update charts to use new images
- Apply changes to comply with a new linter
- Bump complex Go dependencies, like k8s modules
- Backport pull requests to other branches
- Add “review agents” that review the generated PR.
See also
Extended private brain - RAG my own scripts and data into offline LLM AI by tjyrinki_suse
Description
For purely studying purposes, I'd like to find out if I could teach an LLM some of my own accumulated knowledge, to use it as a sort of extended brain.
I might use qwen3-coder or something similar as a starting point.
Everything would be done 100% offline without network available to the container, since I prefer to see when network is needed, and make it so it's never needed (other than initial downloads).
Goals
- Learn something about RAG, LLM, AI.
- Find out if everything works offline as intended.
- As an end result, have a new way to access my own existing know-how, one that lets me query the wisdom in it.
- Be flexible to pivot in any direction, as long as there are new things learned.
Resources
To be found on the fly.
Timeline
Day 1 (of 4)
- Tried out a RAG demo, expanded on feeding it my own data
- Experimented with qwen3-coder to add persistent chat functionality and to keep vectors in a pickle file
- Optimizations to keep everything within context window
- Learn and add a bit of PyTest
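For illustration, keeping vectors and chat history in a pickle file, and trimming the history to fit a context window, might look roughly like the sketch below. The store layout, the function names and the character-based budget (a crude stand-in for real token counting) are assumptions, not the code written during the week.

```python
import pickle
from pathlib import Path


def load_store(path):
    """Load the persisted store, or start an empty one if the file is missing."""
    path = Path(path)
    if path.exists():
        with path.open("rb") as fh:
            return pickle.load(fh)  # only unpickle files you created yourself
    return {"vectors": {}, "history": []}


def save_store(store, path):
    """Persist vectors and chat history between sessions."""
    with Path(path).open("wb") as fh:
        pickle.dump(store, fh)


def fit_to_window(history, max_chars):
    """Keep the most recent messages whose combined length fits the budget."""
    kept, used = [], 0
    for message in reversed(history):
        if used + len(message) > max_chars:
            break
        kept.append(message)
        used += len(message)
    return list(reversed(kept))
```

A typical session would call `load_store("brain.pkl")` at startup (the file name is hypothetical), append new vectors and messages, pass `fit_to_window(store["history"], budget)` to the model, and call `save_store` on exit. Pickle is convenient for a private offline experiment; a vector database such as ChromaDB is the sturdier option once the data grows.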
Day 2
- More experimenting and more data
- Study ChromaDB
- Add a Web UI that works from another computer even though the container sees network is down
Day 3
- The above RAG is working well enough for demonstration purposes.
- Pivot to trying out OpenCode, configuring local Ollama qwen3-coder there, to analyze the RAG demo.
- Figured out how to configure the Ollama template to be usable under OpenCode. Running OpenCode locally is super slow compared to just running qwen3-coder alone.
Day 4 (final day)
- Battle with OpenCode that was both slow and kept on piling up broken things.
- Call it a success since, after all, the agentic AI was working locally.
- Clean up the mess left behind a bit.
Blog Post
Summarized the findings in a blog post.
Local AI assistant with optional integrations and mobile companion by livdywan
Description
Set up a local AI assistant for research, brainstorming and proofreading. Look into SurfSense, Open WebUI and possibly alternatives. Explore integration with services like openQA. There should be no cloud dependencies. Mobile phone support or an additional companion app would be a bonus. The goal is not to develop everything from scratch.
User Story
- Allison Average wants a one-click local AI assistant on their openSUSE laptop.
- Ash Awesome wants AI on their phone without an expensive subscription.
Goals
- Evaluate a local SurfSense setup for day to day productivity
- Test opencode for vibe coding and tool calling
Timeline
Day 1
- Took a look at SurfSense and started setting up a local instance.
- Unfortunately the container setup did not work well. Though this was a great opportunity to learn some new podman commands and refresh my memory on how to recover a corrupted btrfs filesystem.
Day 2
- Due to its sheer size and complexity SurfSense seems to have triggered btrfs fragmentation. Naturally this was not visible in any podman-related errors or in the journal. So this took up much of my second day.
Day 3
- Trying out opencode with Qwen3-Coder and Qwen2.5-Coder.
Day 4
- Context size is a thing, and models are not equally usable for vibe coding.
- Through arduous browsing for ollama models I did find some like myaniu/qwen2.5-1m:7b with 1M context, but even then it is not obvious if they are meant for tool calls.
Day 5
- Whilst trying to make opencode usable I discovered ramalama which worked instantly and very well.
Outcomes
surfsense
I could not easily set this up completely. Maybe in part due to my filesystem issues. Was expecting this to be less of an effort.
opencode
Installing opencode and ollama in my distrobox container along with the following configs worked for me.
When preparing a new project from scratch it is a good idea to start out with a template.
opencode.json
```
{
```
Flaky Tests AI Finder for Uyuni and MLM Test Suites by oscar-barrios
Description
Our current Grafana dashboards provide a great overview of test suite health, including a panel for "Top failed tests." However, identifying which of these failures are due to legitimate bugs versus intermittent "flaky tests" is a manual, time-consuming process. These flaky tests erode trust in our test suites and slow down development.
This project aims to build a simple but powerful Python script that automates flaky test detection. The script will directly query our Prometheus instance for the historical data of each failed test, using the jenkins_build_test_case_failure_age metric. It will then format this data and send it to the Gemini API with a carefully crafted prompt, asking it to identify which tests show a flaky pattern.
The final output will be a clean JSON list of the most probable flaky tests, which can then be used to populate a new "Top Flaky Tests" panel in our existing Grafana test suite dashboard.
Goals
By the end of Hack Week, we aim to have a single, working Python script that:
- Connects to Prometheus and executes a query to fetch detailed test failure history.
- Processes the raw data into a format suitable for the Gemini API.
- Successfully calls the Gemini API with the data and a clear prompt.
- Parses the AI's response to extract a simple list of flaky tests.
- Saves the list to a JSON file that can be displayed in Grafana.
- Feeds a new panel in our dashboard listing the flaky tests.
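The middle steps of that pipeline (processing the raw data, building the prompt, parsing the reply) can be sketched without any network access. The sample below assumes the standard Prometheus HTTP API matrix response shape and the metric labels described in this project; the function names, the prompt wording and the sample values are hypothetical, and the actual Gemini call is deliberately left out.

```python
import json
from collections import defaultdict


def summarize(prom_response):
    """Group failure series by (suite, case) and collect builds and statuses."""
    history = defaultdict(lambda: {"builds": set(), "statuses": set()})
    for series in prom_response["data"]["result"]:
        labels = series["metric"]
        key = (labels.get("suite", "?"), labels.get("case", "?"))
        history[key]["builds"].add(labels.get("buildid", "?"))
        history[key]["statuses"].add(labels.get("status", "?"))
    return [
        {
            "suite": suite,
            "case": case,
            "failed_builds": sorted(entry["builds"]),
            "statuses": sorted(entry["statuses"]),
        }
        for (suite, case), entry in sorted(history.items())
    ]


def build_prompt(records):
    """Combine the instructions (hypothetical wording) with the compact history."""
    return (
        "Below is the failure history of several tests. Identify the tests that "
        "fail intermittently. Reply with a JSON list of 'suite/case' strings only.\n"
        + json.dumps(records, indent=2)
    )


def parse_reply(reply_text):
    """Extract a JSON list of strings from the reply, tolerating a markdown fence."""
    start, end = reply_text.find("["), reply_text.rfind("]")
    if start == -1 or end < start:
        return []
    try:
        items = json.loads(reply_text[start : end + 1])
    except json.JSONDecodeError:
        return []
    return [item for item in items if isinstance(item, str)]


# Hypothetical sample in the shape of a Prometheus range-query result.
sample = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {"metric": {"suite": "core", "case": "login", "buildid": "101", "status": "FAILED"},
             "values": [[1700000000, "1"]]},
            {"metric": {"suite": "core", "case": "login", "buildid": "103", "status": "REGRESSION"},
             "values": [[1700003600, "1"]]},
        ],
    },
}
records = summarize(sample)
print(build_prompt(records))
print(parse_reply('```json\n["core/login"]\n```'))
```

Compacting the series before prompting keeps the request small, and a defensive `parse_reply` matters because a model may wrap its answer in a fence or add prose even when asked for plain JSON.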
Resources
- Jenkins Prometheus Exporter: https://github.com/uyuni-project/jenkins-exporter/
- Data Source: Our internal Prometheus server.
- Key Metric: jenkins_build_test_case_failure_age{jobname, buildid, suite, case, status, failedsince}.
- Existing Query for Reference: count by (suite) (max_over_time(jenkins_build_test_case_failure_age{status=~"FAILED|REGRESSION", jobname="$jobname"}[$__range])).
- AI Model: The Google Gemini API.
- Example about how to interact with Gemini API: https://github.com/srbarrios/FailTale/
- Visualization: Our internal Grafana Dashboard.
- Internal IaC: https://gitlab.suse.de/galaxy/infrastructure/-/tree/master/srv/salt/monitoring
Outcome
- Jenkins Flaky Test Detector: https://github.com/srbarrios/jenkins-flaky-tests-detector and its container
- IaC on MLM Team: https://gitlab.suse.de/galaxy/infrastructure/-/tree/master/srv/salt/monitoring/jenkinsflakytestsdetector?reftype=heads, https://gitlab.suse.de/galaxy/infrastructure/-/blob/master/srv/salt/monitoring/grafana/dashboards/flaky-tests.json?ref_type=heads, and others.
- Grafana Dashboard: https://grafana.mgr.suse.de/d/flaky-tests/flaky-tests-detection
SUSE Observability MCP server by drutigliano
Description
The idea is to implement the SUSE Observability Model Context Protocol (MCP) Server as a specialized, middle-tier API designed to translate the complex, high-cardinality observability data from StackState (topology, metrics, and events) into highly structured, contextually rich, and LLM-ready snippets.
This MCP Server abstracts the StackState APIs. Its primary function is to serve as a Tool/Function Calling target for AI agents. When an AI receives an alert or a user query (e.g., "What caused the outage?"), the AI calls an MCP Server endpoint. The server then fetches the relevant operational facts, summarizes them, normalizes technical identifiers (like URNs and raw metric names) into natural language concepts, and returns a concise JSON or YAML payload. This payload is then injected directly into the LLM's prompt, ensuring the final diagnosis or action is grounded in real-time, accurate SUSE Observability data, effectively minimizing hallucinations.
Goals
- Grounding AI Responses: Ensure that all AI diagnoses, root cause analyses, and action recommendations are strictly based on verifiable, real-time data retrieved from the SUSE Observability StackState platform.
- Simplifying Data Access: Abstract the complexity of StackState's native APIs (e.g., Time Travel, 4T Data Model) into simple, semantic functions that can be easily invoked by LLM tool-calling mechanisms.
- Data Normalization: Convert complex, technical identifiers (like component URNs, raw metric names, and proprietary health states) into standardized, natural language terms that an LLM can easily reason over.
- Enabling Automated Remediation: Define clear, action-oriented MCP endpoints (e.g., execute_runbook) that allow the AI agent to initiate automated operational workflows (e.g., restarts, scaling) after a diagnosis, closing the loop on observability.
Hackweek STEP
- Create a functional MCP endpoint exposing one (or more) tool(s) to answer queries like "What is the health of service X?" by fetching, normalizing, and returning live StackState data in an LLM-ready format.
Scope
- Implement a read-only MCP server that can:
- Connect to a live SUSE Observability instance and authenticate (with API token)
- Use tools to fetch data for a specific component URN (e.g., current health state, metrics, possibly topology neighbors, ...).
- Normalize response fields (e.g., URN to "Service Name," health state DEVIATING to "Unhealthy", raw metrics).
- Return the data as a structured JSON payload compliant with the MCP specification.
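The normalization step in the scope above can be illustrated with a short sketch. It is written in Python purely for illustration (the deliverable is a Golang server), and several things are assumptions: the URN layout, the input field names, and every health mapping except DEVIATING to "Unhealthy", which is the one given above. The result wrapper follows the MCP tool result shape of a `content` list holding a text item.

```python
import json

# Only DEVIATING -> "Unhealthy" comes from the scope above; the rest are assumptions.
HEALTH_LABELS = {
    "CLEAR": "Healthy",
    "DEVIATING": "Unhealthy",
    "CRITICAL": "Critical",
}


def service_name(urn):
    """Use the last URN segment as a readable name (hypothetical URN layout)."""
    return urn.rstrip("/").split(":")[-1].split("/")[-1] or urn


def normalize(component):
    """Translate raw identifiers and states into terms an LLM can reason over."""
    return {
        "service_name": service_name(component["urn"]),
        "health": HEALTH_LABELS.get(component.get("health", ""), "Unknown"),
        "metrics": {
            name.replace("_", " "): value
            for name, value in component.get("metrics", {}).items()
        },
    }


def tool_result(component):
    """Wrap the normalized data as an MCP tool result with one text content item."""
    return {
        "content": [{"type": "text", "text": json.dumps(normalize(component))}],
        "isError": False,
    }


# Hypothetical raw component data.
raw = {
    "urn": "urn:service:/shop:checkout",
    "health": "DEVIATING",
    "metrics": {"http_error_rate": 0.07},
}
print(json.dumps(tool_result(raw), indent=2))
```

Mapping unknown health states to "Unknown" rather than passing them through avoids leaking proprietary state names into the prompt, where the model might guess at their meaning.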
Deliverables
- MCP Server v0.1: a running Golang MCP server with at least one tool.
- A README.md and a test script (e.g., curl commands or a simple notebook) showing how an AI agent would call the endpoint and the resulting JSON payload.
Outcome
A functional and testable API endpoint that proves the core concept: translating complex StackState data into a simple, LLM-ready format. This provides the foundation for developing AI-driven diagnostics and automated remediation.
Resources
- https://www.honeycomb.io/blog/its-the-end-of-observability-as-we-know-it-and-i-feel-fine
- https://www.datadoghq.com/blog/datadog-remote-mcp-server
- https://modelcontextprotocol.io/specification/2025-06-18/index
- https://modelcontextprotocol.io/docs/develop/build-server
Basic implementation
- https://github.com/drutigliano19/suse-observability-mcp-server
Results
Successfully developed and delivered a fully functional SUSE Observability MCP Server that bridges language models with SUSE Observability's operational data. This project demonstrates how AI agents can perform intelligent troubleshooting and root cause analysis using structured access to real-time infrastructure data.
Example execution