SUSE Hack Week: Background Coding Agent

Description

I have had only bad experiences with AI one-shots. However, monitoring agent work closely and intervening often did result in productivity gains.

Now, other companies are using agents in pipelines. That makes sense to me: just as with CI, we want to offload work to pipelines. Our engineering teams are consistently slowed down by "toil": low-impact, repetitive maintenance tasks. A simple linter rule change, a dependency bump, rebasing patch sets on top of newer releases, or an API deprecation can require dozens of manual PRs, draining time from feature development.

So far we have been writing deterministic, script-based automation for these tasks, and that turns out to be a common trap: these scripts are brittle and complex, and they become a massive maintenance burden themselves.

Can we make prompts and workflows smart enough to succeed at background coding?

Goals

We will build a platform that allows engineers to execute complex code transformations using prompts.

By automating this toil, we accelerate large-scale migrations and allow teams to focus on high-value work.

Our platform will consist of three main components:

"Change" Definition: Engineers will define a transformation as a simple, declarative manifest:
- The target repositories.
- A wrapper to run a "coding agent", e.g., "gemini-cli".
- The task as a natural language prompt.
"Change" Management Service: A central service that orchestrates the jobs. It will receive Change definitions and be responsible for the job lifecycle.
Execution Runners: We could use existing sandboxed CI runners (like GitHub/GitLab runners) to execute each job or spawn a container.
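The manifest described above could be modelled roughly as follows. This is a minimal sketch, not the project's actual format: the class name `Change`, the field names, and the helper `change_from_dict` are assumptions made for illustration, and only the three ingredients (target repositories, agent wrapper, prompt) come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical Change manifest; field names are illustrative only."""
    repositories: list[str]   # the target repositories
    agent: str                # wrapper that runs the coding agent, e.g. "gemini-cli"
    prompt: str               # the task as a natural language prompt

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the manifest is usable."""
        problems = []
        if not self.repositories:
            problems.append("at least one target repository is required")
        if not self.agent.strip():
            problems.append("an agent wrapper must be named")
        if not self.prompt.strip():
            problems.append("the prompt must not be empty")
        return problems


def change_from_dict(data: dict) -> Change:
    """Build a Change from parsed manifest data (e.g. loaded from YAML or JSON)."""
    return Change(
        repositories=list(data.get("repositories", [])),
        agent=str(data.get("agent", "")),
        prompt=str(data.get("prompt", "")),
    )


change = change_from_dict({
    "repositories": ["https://github.com/manno-test/demo-app"],
    "agent": "gemini-cli",
    "prompt": "Update the linter configuration and fix all new findings.",
})
print(change.validate())  # an empty list: nothing to complain about
```

Keeping validation separate from parsing means the Management Service could reject a malformed Change before any runner is started.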

MVP

Define the Change manifest format.
Build the core Management Service that can accept and queue a Change.
Connect management service and runners, dynamically dispatch jobs to runners.
Create a basic runner script that can run a hard-coded prompt against a test repo and open a PR.
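The accept, queue, and dispatch steps of the MVP can be sketched in a few lines. This is a toy in-memory stand-in under stated assumptions: the names `Job`, `ManagementService`, and `fake_runner`, the status strings, and the repository names are all hypothetical, and a real runner would check out the repository, run the agent, and open a PR instead of returning a canned result.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass
class Job:
    repository: str
    prompt: str
    status: str = "queued"   # queued -> running -> succeeded | failed


class ManagementService:
    """Toy in-memory stand-in for the Change Management Service."""

    def __init__(self) -> None:
        self.queue: deque[Job] = deque()
        self.finished: list[Job] = []

    def accept(self, repositories: list[str], prompt: str) -> None:
        """Fan a Change out into one job per target repository."""
        for repo in repositories:
            self.queue.append(Job(repository=repo, prompt=prompt))

    def dispatch(self, runner: Callable[[Job], bool]) -> None:
        """Hand queued jobs to a runner until the queue is empty."""
        while self.queue:
            job = self.queue.popleft()
            job.status = "running"
            job.status = "succeeded" if runner(job) else "failed"
            self.finished.append(job)


def fake_runner(job: Job) -> bool:
    # Placeholder: pretend every repository succeeds unless its name contains "broken".
    return "broken" not in job.repository


service = ManagementService()
service.accept(["org/app", "org/broken-charts"], "Bump the Go toolchain version.")
service.dispatch(fake_runner)
print([(job.repository, job.status) for job in service.finished])
```

Passing the runner in as a callable keeps the service independent of the backend, so a CI runner or a spawned container could be swapped in later.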

Stretch Goals:

Multi-layered approach, Workflow Agents trigger Coding Agents:
1. Workflow Agent: Gather information about the task interactively from the user.
2. Coding Agent: Once the interactive agent has refined the task into a clear prompt, it hands this prompt off to the "coding agent." This background agent is responsible for executing the task and producing the actual pull request.
Use MCP:
1. Workflow Agent gathers context information from Slack, Github, etc.
2. Workflow Agent triggers a Coding Agent.
Create a "Standard Task" library with reliable prompts.
1. Rebasing rancher-monitoring to a new version of kube-prom-stack
2. Update charts to use new images
3. Apply changes to comply with a new linter
4. Bump complex Go dependencies, like k8s modules
5. Backport pull requests to other branches
Add “review agents” that review the generated PR.
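One possible shape for the "Standard Task" library is a set of parameterised prompt templates. The sketch below is an assumption, not the project's design: the task names, the prompt wording, the parameter values, and the helper `render_task` are invented for illustration, and only the task themes come from the list above.

```python
from string import Template

# Hypothetical task names and prompt wording; only the themes come from the project page.
STANDARD_TASKS = {
    "bump-go-dependency": Template(
        "Bump the Go module $module to $version, run the tests, and fix any build failures."
    ),
    "backport-pr": Template(
        "Backport pull request $pr to the branch $branch and resolve conflicts."
    ),
}


def render_task(name: str, **params: str) -> str:
    """Fill a standard prompt; raises KeyError for unknown tasks or missing parameters."""
    return STANDARD_TASKS[name].substitute(**params)


prompt = render_task("bump-go-dependency", module="k8s.io/client-go", version="v0.34.0")
print(prompt)
```

Using `substitute` rather than `safe_substitute` makes a missing parameter fail loudly before a half-filled prompt ever reaches a coding agent.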

Resources

Hosting for runners
License for agents


Looking for hackers with the skills:

agents workflow ai

This project is part of:

Hack Week 25

Activity

2 months ago: mmanno started this project.

2 months ago: mmanno originated this project.

Comments

about 2 months ago by mmanno

Created Background Automated Coding Agent, a declarative, prompt-driven code transformation platform:
- https://github.com/manno/baca
  - potential workflow agent to hydrate prompts from MCP
  - GHA backend
  - https://github.com/manno-test/demo-app
  - https://github.com/manno-test/demo-helm-charts
And researched API aggregation a bit:
- https://github.com/manno/fleet/pull/213

Similar Projects

agents

SUSE Observability MCP server by drutigliano

Description

The idea is to implement the SUSE Observability Model Context Protocol (MCP) Server as a specialized, middle-tier API designed to translate the complex, high-cardinality observability data from StackState (topology, metrics, and events) into highly structured, contextually rich, and LLM-ready snippets.

This MCP Server abstracts the StackState APIs. Its primary function is to serve as a Tool/Function Calling target for AI agents. When an AI receives an alert or a user query (e.g., "What caused the outage?"), the AI calls an MCP Server endpoint. The server then fetches the relevant operational facts, summarizes them, normalizes technical identifiers (like URNs and raw metric names) into natural language concepts, and returns a concise JSON or YAML payload. This payload is then injected directly into the LLM's prompt, ensuring the final diagnosis or action is grounded in real-time, accurate SUSE Observability data, effectively minimizing hallucinations.

Goals

Grounding AI Responses: Ensure that all AI diagnoses, root cause analyses, and action recommendations are strictly based on verifiable, real-time data retrieved from the SUSE Observability StackState platform.
Simplifying Data Access: Abstract the complexity of StackState's native APIs (e.g., Time Travel, 4T Data Model) into simple, semantic functions that can be easily invoked by LLM tool-calling mechanisms.
Data Normalization: Convert complex, technical identifiers (like component URNs, raw metric names, and proprietary health states) into standardized, natural language terms that an LLM can easily reason over.
Enabling Automated Remediation: Define clear, action-oriented MCP endpoints (e.g., execute_runbook) that allow the AI agent to initiate automated operational workflows (e.g., restarts, scaling) after a diagnosis, closing the loop on observability.

Hackweek STEP

Create a functional MCP endpoint exposing one or more tools to answer queries like "What is the health of service X?" by fetching, normalizing, and returning live StackState data in an LLM-ready format.

Scope

Implement read-only MCP server that can:
- Connect to a live SUSE Observability instance and authenticate (with API token)
- Use tools to fetch data for a specific component URN (e.g., current health state, metrics, possibly topology neighbors, ...).
- Normalize response fields (e.g., URN to "Service Name," health state DEVIATING to "Unhealthy", raw metrics).
- Return the data as a structured JSON payload compliant with the MCP specification.
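The normalization step in the scope above could look roughly like this. It is a sketch under assumptions: only the mapping of DEVIATING to "Unhealthy" is named in the scope, while the other health states and labels, the URN layout, the input field names, and the function names are hypothetical. The deliverable itself is a Golang server; Python is used here only to illustrate the idea.

```python
import json

# Assumed mapping. Only DEVIATING -> "Unhealthy" is named in the scope;
# the other states and labels are illustrative.
HEALTH_LABELS = {
    "CLEAR": "Healthy",
    "DEVIATING": "Unhealthy",
    "CRITICAL": "Critical",
}


def service_name_from_urn(urn: str) -> str:
    """Use the last segment of the URN as a readable name (assumed URN layout)."""
    return urn.rsplit("/", 1)[-1].rsplit(":", 1)[-1]


def normalize_component(raw: dict) -> dict:
    """Turn a raw component record into a small, LLM-friendly summary."""
    return {
        "service_name": service_name_from_urn(raw["urn"]),
        "health": HEALTH_LABELS.get(raw["health_state"], "Unknown"),
    }


# Hypothetical raw record; real StackState responses will look different.
raw = {"urn": "urn:service:/shop:checkout", "health_state": "DEVIATING"}
print(json.dumps(normalize_component(raw)))
# prints {"service_name": "checkout", "health": "Unhealthy"}
```

Falling back to "Unknown" for unmapped states avoids leaking proprietary state names into the LLM prompt.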

Deliverables

MCP Server v0.1: A running Golang MCP server with at least one tool.
A README.md and a test script (e.g., curl commands or a simple notebook) showing how an AI agent would call the endpoint and the resulting JSON payload.

Outcome: A functional and testable API endpoint that proves the core concept: translating complex StackState data into a simple, LLM-ready format. This provides the foundation for developing AI-driven diagnostics and automated remediation.

Resources

https://www.honeycomb.io/blog/its-the-end-of-observability-as-we-know-it-and-i-feel-fine
https://www.datadoghq.com/blog/datadog-remote-mcp-server
https://modelcontextprotocol.io/specification/2025-06-18/index
https://modelcontextprotocol.io/docs/develop/build-server

Basic implementation

https://github.com/drutigliano19/suse-observability-mcp-server

Results

Successfully developed and delivered a fully functional SUSE Observability MCP Server that bridges language models with SUSE Observability's operational data. This project demonstrates how AI agents can perform intelligent troubleshooting and root cause analysis using structured access to real-time infrastructure data.

Example execution

ai

"what is it" file and directory analysis via MCP and local LLM, for console and KDE by rsimai

Description

Users sometimes wonder what the files or directories they find on their local PC are good for. If they can't tell from the filename or metadata, there should be an easy way to quickly analyze the content and at least guess its purpose. An LLM could help with that, through the use of a filesystem MCP and to-text converters for typical file types. Ideally this is integrated into the desktop environment but works from a console as well. All data is processed locally or "on premise"; no artifacts remain on or leave the system.

Goals

The user can run a command from the console, to check on a file or directory
The filemanager contains the "analyze" feature within the context menu
The local LLM could serve for other use cases where privacy matters

TBD

Find or write capable one-shot and interactive MCP client
Find or write simple+secure file access MCP server
Create local LLM service with appropriate footprint, containerized
Shell command with options
KDE integration (Dolphin)
Package
Document

Resources

GenAI-Powered Systemic Bug Evaluation and Management Assistant by rtsvetkov

Motivation

What is the decision-critical question one can ask about a bug? How does this question affect the decision on the bug, and why?

Let's make GenAI look at the bug from a systemic point of view and evaluate what we don't know. Which piece of information is missing to make a decision?

Description

Build a tool that takes a raw bug report (including error messages and context) and uses a large language model (LLM) to generate a series of structured, Socratic-style or systemic questions designed to guide integration and development toward the root cause, rather than just providing a direct, potentially incorrect fix.

Goals

Set up a Python environment

1. Set up the environment and get a Gemini API key.
2. Collect 5-10 realistic bug reports (from open-source projects, personal projects, or public forums like Stack Overflow; include the error message and the initial context).

Build the Dialogue Loop

Write a basic Python script using the Gemini API.
Implement a simple conversational loop: User Input (Bug) -> AI Output (Question) -> User Input (Answer to AI's question) -> AI Output (Next Question).
Code Implementation
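The conversational loop described above can be sketched without committing to a specific LLM client. In this sketch the model is passed in as a plain callable, so `ask_model`, `get_answer`, `canned_model`, and the `History` type are all hypothetical names; a real version would replace `canned_model` with a call to the Gemini API.

```python
from typing import Callable

# (speaker, text) pairs; speaker is "user" or "ai".
History = list[tuple[str, str]]


def dialogue_loop(
    bug_report: str,
    ask_model: Callable[[History], str],
    get_answer: Callable[[str], str],
    max_turns: int = 3,
) -> History:
    """Alternate AI questions and user answers, starting from the bug report."""
    history: History = [("user", bug_report)]
    for _ in range(max_turns):
        question = ask_model(history)      # AI Output (Question)
        history.append(("ai", question))
        answer = get_answer(question)      # User Input (Answer to AI's question)
        if not answer.strip():             # an empty answer ends the session
            break
        history.append(("user", answer))
    return history


def canned_model(history: History) -> str:
    # Placeholder for an LLM call; it only numbers its questions.
    asked = sum(1 for speaker, _ in history if speaker == "ai")
    return f"Question {asked + 1}: what do we not know yet about this bug?"


transcript = dialogue_loop(
    "Service crashes on startup with 'connection refused'.",
    canned_model,
    get_answer=lambda question: "The database pod was restarting at that time.",
    max_turns=2,
)
for speaker, text in transcript:
    print(f"{speaker}: {text}")
```

Because the full history is handed to the model on every turn, the next question can build on earlier answers, which is what a Socratic path needs.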

Socratic/Systemic Strategy Implementation

Refine the logic to ensure the questions follow a Socratic and Systemic path (e.g., from symptom-> context -> assumptions -> -> critical parts -> ).
Implement Function Calling (an advanced feature of the Gemini API) to suggest specific actions to the user, like "Run a ping test" or "Check the database logs."
Implement Bugzilla call to collect the
Implement Questioning Framework as LLM pre-conditioning
Define set of instructions
Assemble the Tool

Resources

What are Systemic Questions?

Systemic questions explore the relationships, patterns, and interactions within a system rather than focusing on isolated elements.
In IT, they help uncover hidden dependencies, feedback loops, assumptions, and side-effects during debugging or architecture analysis.

Gitlab Project

gitlab.suse.de/sle-prjmgr/BugDecisionCritical_Question

Bugzilla goes AI - Phase 1 by nwalter

Description

This project, Bugzilla goes AI, aims to boost developer productivity by creating an autonomous AI bug agent during Hackweek. The primary goal is to reduce the time employees spend triaging bugs by integrating Ollama to summarize issues, recommend next steps, and push focused daily reports to a Web Interface.

Goals

To reduce employee time spent on Bugzilla by implementing an AI tool that triages and summarizes bug reports, providing actionable recommendations to the team via Web Interface.

Project Charter

Bugzilla goes AI Phase 1

Description

Project Achievements during Hackweek

In this file you can read about what we achieved during Hackweek.

Project Achievements

Try AI training with ROCm and LoRA by bmwiedemann

Description

I want to set up a Radeon RX 9060 XT 16 GB at home with ROCm on Slowroll.

Goals

I want to test how fast AI inference can get with the GPU and if I can use LoRA to re-train an existing free model for some task.

Resources

https://rocm.docs.amd.com/en/latest/compatibility/compatibility-matrix.html
https://build.opensuse.org/project/show/science:GPU:ROCm
https://src.opensuse.org/ROCm/
https://www.suse.com/c/lora-fine-tuning-llms-for-text-classification/

Results

got inference working with llama.cpp:

# Build llama.cpp with HIP (ROCm) support for the gfx1200 GPU target
export LLAMACPP_ROCM_ARCH=gfx1200
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=$LLAMACPP_ROCM_ARCH \
-DCMAKE_BUILD_TYPE=Release -DLLAMA_CURL=ON \
-Dhipblas_DIR=/usr/lib64/cmake/hipblaslt/ \
&& cmake --build build --config Release -j8
# Serve the model on port 8005, offloading all layers to the ROCm0 device
# ($P is not defined in this snippet; presumably the directory that contains the llama.cpp checkout)
m=models/gpt-oss-20b-mxfp4.gguf
cd $P/llama.cpp && build/bin/llama-server --model $m --threads 8 --port 8005 --host 0.0.0.0 --device ROCm0 --n-gpu-layers 999

Without the --device option it faulted. Maybe because my APU also appears there?

I updated/fixed various related packages: https://src.opensuse.org/ROCm/rocm-examples/pulls/1 https://src.opensuse.org/ROCm/hipblaslt/pulls/1 SR 1320959

benchmark

I benchmarked inference with llama.cpp + gpt-oss-20b-mxfp4.gguf and ROCm offloading to a Radeon RX 9060 XT 16GB. I varied the number of layers that went to the GPU:

GPU layers   tokens/s   VRAM   Notes
0            14.49      -      8 CPU cores
9            17.79      34%
15           22.39      51%
20           27.49      64%
24           41.18      74%
25+          86.63      75%    only 200% CPU load

So there is a significant performance-boost if the whole model fits into the GPU's VRAM.
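To put a number on that boost, the figures can be turned into speedup factors relative to the CPU-only run. This is only arithmetic on the published results; the variable names are arbitrary and the "25+" row is represented as 25.

```python
# (GPU layers, tokens/s) taken from the benchmark figures above; "25+" is stored as 25.
results = [(0, 14.49), (9, 17.79), (15, 22.39), (20, 27.49), (24, 41.18), (25, 86.63)]

baseline = results[0][1]  # CPU-only throughput
speedups = {layers: round(tps / baseline, 2) for layers, tps in results}

for layers, factor in speedups.items():
    print(f"{layers:>2} layers: {factor:.2f}x")
```

Full offload comes out at roughly 6x the CPU-only throughput, and the final step from 24 to 25+ layers alone about doubles it (41.18 to 86.63 tokens/s).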

AI-Powered Unit Test Automation for Agama by joseivanlopez

The Agama project is a multi-language Linux installer that leverages the distinct strengths of several key technologies:

Rust: Used for the back-end services and the core HTTP API, providing performance and safety.
TypeScript (React/PatternFly): Powers the modern web user interface (UI), ensuring a consistent and responsive user experience.
Ruby: Integrates existing, robust YaST libraries (e.g., yast-storage-ng) to reuse established functionality.

The Problem: Testing Overhead

Developing and maintaining code across these three languages requires a significant, tedious effort in writing, reviewing, and updating unit tests for each component. This high cost of testing is a drain on developer resources and can slow down the project's evolution.

The Solution: AI-Driven Automation

This project aims to eliminate the manual overhead of unit testing by exploring and integrating AI-driven code generation tools. We will investigate how AI can:

Automatically generate new unit tests as code is developed.
Intelligently correct and update existing unit tests when the application code changes.

By automating this crucial but monotonous task, we can free developers to focus on feature implementation and significantly improve the speed and maintainability of the Agama codebase.

Goals

Proof of Concept: Successfully integrate and demonstrate an authorized AI tool (e.g., gemini-cli) to automatically generate unit tests.
Workflow Integration: Define and document a new unit test automation workflow that seamlessly integrates the selected AI tool into the existing Agama development pipeline.
Knowledge Sharing: Establish a set of best practices for using AI in code generation, sharing the learned expertise with the broader team.

Contribution & Resources

We are seeking contributors interested in AI-powered development and improving developer efficiency. Whether you have previous experience with code generation tools or are eager to learn, your participation is highly valuable.

If you want to dive deep into AI for software quality, please reach out and join the effort!

Authorized AI Tools: Tools supported by SUSE (e.g., gemini-cli)
Focus Areas: Rust, TypeScript, and Ruby components within the Agama project.

Interesting Links

goose
