Description
I had only bad experiences with AI one-shots. However, monitoring agent work closely and interfering often did result in productivity gains.
Now, other companies are using agents in pipelines. That makes sense to me, just like CI, we want to offload work to pipelines: Our engineering teams are consistently slowed down by "toil": low-impact, repetitive maintenance tasks. A simple linter rule change, a dependency bump, rebasing patch-sets on top of newer releases or API deprecation requires dozens of manual PRs, draining time from feature development.
So far we have been writing deterministic, script-based automation for these tasks. And it turns out to be a common trap. These scripts are brittle, complex, and become a massive maintenance burden themselves.
Can we make prompts and workflows smart enough to succeed at background coding?
Goals
We will build a platform that allows engineers to execute complex code transformations using prompts.
By automating this toil, we accelerate large-scale migrations and allow teams to focus on high-value work.
Our platform will consist of three main components:
- "Change" Definition: Engineers will define a transformation as a simple, declarative manifest:
- The target repositories.
- A wrapper to run a "coding agent", e.g., "gemini-cli".
- The task as a natural language prompt.
- The target repositories.
- "Change" Management Service: A central service that orchestrates the jobs. It will receive Change definitions and be responsible for the job lifecycle.
- Execution Runners: We could use existing sandboxed CI runners (like GitHub/GitLab runners) to execute each job or spawn a container.
MVP
- Define the Change manifest format.
- Build the core Management Service that can accept and queue a Change.
- Connect management service and runners, dynamically dispatch jobs to runners.
- Create a basic runner script that can run a hard-coded prompt against a test repo and open a PR.
Stretch Goals:
- Multi-layered approach, Workflow Agents trigger Coding Agents:
- Workflow Agent: Gather information about the task interactively from the user.
- Coding Agent: Once the interactive agent has refined the task into a clear prompt, it hands this prompt off to the "coding agent." This background agent is responsible for executing the task and producing the actual pull request.
- Workflow Agent: Gather information about the task interactively from the user.
- Use MCP:
- Workflow Agent gathers context information from Slack, Github, etc.
- Workflow Agent triggers a Coding Agent.
- Workflow Agent gathers context information from Slack, Github, etc.
- Create a "Standard Task" library with reliable prompts.
- Rebasing rancher-monitoring to a new version of kube-prom-stack
- Update charts to use new images
- Apply changes to comply with a new linter
- Bump complex Go dependencies, like k8s modules
- Backport pull requests to other branches
- Rebasing rancher-monitoring to a new version of kube-prom-stack
- Add “review agents” that review the generated PR.
See also
- https://engineering.atspotify.com/2025/11/spotifys-background-coding-agent-part-1
- https://agentspace.cc/
- https://ampcode.com/how-to-build-an-agent
- …
Resources
- Hosting for runners
- License for agents
This project is part of:
Hack Week 25
Activity
Comments
Be the first to comment!
Similar Projects
SUSE Observability MCP server by drutigliano
Description
The idea is to implement the SUSE Observability Model Context Protocol (MCP) Server as a specialized, middle-tier API designed to translate the complex, high-cardinality observability data from StackState (topology, metrics, and events) into highly structured, contextually rich, and LLM-ready snippets.
This MCP Server abstract the StackState APIs. Its primary function is to serve as a Tool/Function Calling target for AI agents. When an AI receives an alert or a user query (e.g., "What caused the outage?"), the AI calls an MCP Server endpoint. The server then fetches the relevant operational facts, summarizes them, normalizes technical identifiers (like URNs and raw metric names) into natural language concepts, and returns a concise JSON or YAML payload. This payload is then injected directly into the LLM's prompt, ensuring the final diagnosis or action is grounded in real-time, accurate SUSE Observability data, effectively minimizing hallucinations.
Goals
- Grounding AI Responses: Ensure that all AI diagnoses, root cause analyses, and action recommendations are strictly based on verifiable, real-time data retrieved from the SUSE Observability StackState platform.
- Simplifying Data Access: Abstract the complexity of StackState's native APIs (e.g., Time Travel, 4T Data Model) into simple, semantic functions that can be easily invoked by LLM tool-calling mechanisms.
- Data Normalization: Convert complex, technical identifiers (like component URNs, raw metric names, and proprietary health states) into standardized, natural language terms that an LLM can easily reason over.
- Enabling Automated Remediation: Define clear, action-oriented MCP endpoints (e.g., execute_runbook) that allow the AI agent to initiate automated operational workflows (e.g., restarts, scaling) after a diagnosis, closing the loop on observability.
Hackweek STEP
- Create a functional MCP endpoint exposing one (or more) tool(s) to answer queries like "What is the health of service X?") by fetching, normalizing, and returning live StackState data in an LLM-ready format.
Scope
- Implement read-only MCP server that can:
- Connect to a live SUSE Observability instance and authenticate (with API token)
- Use tools to fetch data for a specific component URN (e.g., current health state, metrics, possibly topology neighbors, ...).
- Normalize response fields (e.g., URN to "Service Name," health state DEVIATING to "Unhealthy", raw metrics).
- Return the data as a structured JSON payload compliant with the MCP specification.
Deliverables
- MCP Server v0.1 A running Python web server (e.g., using FastAPI) with at least one tool.
- A README.md and a test script (e.g., curl commands or a simple notebook) showing how an AI agent would call the endpoint and the resulting JSON payload.
Outcome A functional and testable API endpoint that proves the core concept: translating complex StackState data into a simple, LLM-ready format. This provides the foundation for developing AI-driven diagnostics and automated remediation.
Resources
- https://www.honeycomb.io/blog/its-the-end-of-observability-as-we-know-it-and-i-feel-fine
- https://www.datadoghq.com/blog/datadog-remote-mcp-server
- https://modelcontextprotocol.io/specification/2025-06-18/index
- https://modelcontextprotocol.io/docs/develop/build-server
Basic implementation
- https://github.com/drutigliano19/suse-observability-mcp-server
Bugzilla goes AI - Phase 1 by nwalter
Description
This project, Bugzilla goes AI, aims to boost developer productivity by creating an autonomous AI bug agent during Hackweek. The primary goal is to reduce the time employees spend triaging bugs by integrating Ollama to summarize issues, recommend next steps, and push focused daily reports to a Web Interface.
Goals
To reduce employee time spent on Bugzilla by implementing an AI tool that triages and summarizes bug reports, providing actionable recommendations to the team via Web Interface.
Project Charter
https://docs.google.com/document/d/1HbAvgrg8T3pd1FIx74nEfCObCljpO77zz5In_Jpw4as/edit?usp=sharing## Description
Multi-agent AI assistant for Linux troubleshooting by doreilly
Description
Explore multi-agent architecture as a way to avoid MCP context rot.
Having one agent with many tools bloats the context with low-level details about tool descriptions, parameter schemas etc which hurts LLM performance. Instead have many specialised agents, each with just the tools it needs for its role. A top level supervisor agent takes the user prompt and delegates to appropriate sub-agents.
Goals
Create an AI assistant with some sub-agents that are specialists at troubleshooting Linux subsystems, e.g. systemd, selinux, firewalld etc. The agents can get information from the system by implementing their own tools with simple function calls, or use tools from MCP servers, e.g. a systemd-agent can use tools from systemd-mcp.
Example prompts/responses:
user$ the system seems slow
assistant$ process foo with pid 12345 is using 1000% cpu ...
user$ I can't connect to the apache webserver
assistant$ the firewall is blocking http ... you can open the port with firewall-cmd --add-port ...
Resources
Language TBD - golang or python. Python ADK seems more mature, but golang is easier to package.
https://google.github.io/adk-docs/
Gemini-Powered Socratic Bug Evaluation and Management Assistant by rtsvetkov
Description
To build a tool or system that takes a raw bug report (including error messages and context) and uses a large language model (LLM) to generate a series of structured, Socratic-style questions designed to guide a the integration and development toward the root cause, rather than just providing a direct, potentially incorrect fix.
Goals
Set up a Python environment
Set the environment and get a Gemini API key. 2. Collect 5-10 realistic bug reports (from open-source projects, personal projects, or public forums like Stack Overflow—include the error message and the initial context).
Build the Dialogue Loop
- Write a basic Python script using the Gemini API.
- Implement a simple conversational loop: User Input (Bug) -> AI Output (Question) -> User Input (Answer to AI's question) -> AI Output (Next Question). Code Implementation
Socratic Strategy Implementation
- Refine the logic to ensure the questions follow a Socratic path (e.g., from symptom-> context -> assumptions -> root cause).
- Implement Function Calling (an advanced feature of the Gemini API) to suggest specific actions to the user, like "Run a ping test" or "Check the database logs."
Resources
SUSE Observability MCP server by drutigliano
Description
The idea is to implement the SUSE Observability Model Context Protocol (MCP) Server as a specialized, middle-tier API designed to translate the complex, high-cardinality observability data from StackState (topology, metrics, and events) into highly structured, contextually rich, and LLM-ready snippets.
This MCP Server abstract the StackState APIs. Its primary function is to serve as a Tool/Function Calling target for AI agents. When an AI receives an alert or a user query (e.g., "What caused the outage?"), the AI calls an MCP Server endpoint. The server then fetches the relevant operational facts, summarizes them, normalizes technical identifiers (like URNs and raw metric names) into natural language concepts, and returns a concise JSON or YAML payload. This payload is then injected directly into the LLM's prompt, ensuring the final diagnosis or action is grounded in real-time, accurate SUSE Observability data, effectively minimizing hallucinations.
Goals
- Grounding AI Responses: Ensure that all AI diagnoses, root cause analyses, and action recommendations are strictly based on verifiable, real-time data retrieved from the SUSE Observability StackState platform.
- Simplifying Data Access: Abstract the complexity of StackState's native APIs (e.g., Time Travel, 4T Data Model) into simple, semantic functions that can be easily invoked by LLM tool-calling mechanisms.
- Data Normalization: Convert complex, technical identifiers (like component URNs, raw metric names, and proprietary health states) into standardized, natural language terms that an LLM can easily reason over.
- Enabling Automated Remediation: Define clear, action-oriented MCP endpoints (e.g., execute_runbook) that allow the AI agent to initiate automated operational workflows (e.g., restarts, scaling) after a diagnosis, closing the loop on observability.
Hackweek STEP
- Create a functional MCP endpoint exposing one (or more) tool(s) to answer queries like "What is the health of service X?") by fetching, normalizing, and returning live StackState data in an LLM-ready format.
Scope
- Implement read-only MCP server that can:
- Connect to a live SUSE Observability instance and authenticate (with API token)
- Use tools to fetch data for a specific component URN (e.g., current health state, metrics, possibly topology neighbors, ...).
- Normalize response fields (e.g., URN to "Service Name," health state DEVIATING to "Unhealthy", raw metrics).
- Return the data as a structured JSON payload compliant with the MCP specification.
Deliverables
- MCP Server v0.1 A running Python web server (e.g., using FastAPI) with at least one tool.
- A README.md and a test script (e.g., curl commands or a simple notebook) showing how an AI agent would call the endpoint and the resulting JSON payload.
Outcome A functional and testable API endpoint that proves the core concept: translating complex StackState data into a simple, LLM-ready format. This provides the foundation for developing AI-driven diagnostics and automated remediation.
Resources
- https://www.honeycomb.io/blog/its-the-end-of-observability-as-we-know-it-and-i-feel-fine
- https://www.datadoghq.com/blog/datadog-remote-mcp-server
- https://modelcontextprotocol.io/specification/2025-06-18/index
- https://modelcontextprotocol.io/docs/develop/build-server
Basic implementation
- https://github.com/drutigliano19/suse-observability-mcp-server
"what is it" file and directory analysis via MCP and local LLM, for console and KDE by rsimai
Description
Users sometimes wonder what files or directories they find on their local PC are good for. If they can't determine from the filename or metadata, there should an easy way to quickly analyze the content and at least guess the meaning. An LLM could help with that, through the use of a filesystem MCP and to-text-converters for typical file types. Ideally this is integrated into the desktop environment but works as well from a console. All data is processed locally or "on premise", no artifacts remain or leave the system.
Goals
- The user can run a command from the console, to check on a file or directory
- The filemanager contains the "analyze" feature within the context menu
- The local LLM could serve for other use cases where privacy matters
TBD
- Find or write capable one-shot and interactive MCP client
- Create local LLM service with appropriate footprint, containerized
- Shell command with options
- KDE integration (Dolphin)
- Package
- Document
Resources