Description
Large language models like ChatGPT have demonstrated remarkable capabilities across a variety of applications. However, their potential for enhancing the Linux development and user ecosystem remains largely unexplored. This project seeks to bridge that gap by researching practical applications of LLMs to improve workflows in areas such as backporting, packaging, log analysis, system migration, and more. By identifying patterns that LLMs can leverage, we aim to uncover new efficiencies and automation strategies that can benefit developers, maintainers, and end users alike.
Goals
- Evaluate Existing LLM Capabilities: Research and document the current state of LLM usage in open-source and Linux development projects, noting successes and limitations.
- Prototype Tools and Scripts: Develop proof-of-concept scripts or tools that leverage LLMs to perform specific tasks like automated log analysis, assisting with backporting patches, or generating packaging metadata (a sketch follows this list).
- Assess Performance and Reliability: Test the tools' effectiveness on real-world Linux data and analyze their accuracy, speed, and reliability.
- Identify Best Use Cases: Pinpoint which tasks are most suitable for LLM support, distinguishing between high-impact and impractical applications.
- Document Findings and Recommendations: Summarize results with clear documentation and suggest next steps for potential integration or further development.
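To make the "Prototype Tools and Scripts" goal concrete, here is a minimal sketch of automated log analysis with a locally hosted model, assuming the Hugging Face Transformers pipeline API; the model ID and the log excerpt are placeholders, not project decisions. The same pattern would extend to backporting or packaging tasks by swapping the prompt template and input data.

```python
# Sketch only: automated log analysis with a locally hosted model via the
# Hugging Face Transformers pipeline. The model ID is a placeholder; any
# local instruct-tuned model that fits the available hardware would do.
from transformers import pipeline

MODEL_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # placeholder local model

generator = pipeline("text-generation", model=MODEL_ID)

log_excerpt = """\
kernel: BUG: soft lockup - CPU#3 stuck for 22s!
systemd[1]: Failed to start Apply Kernel Variables.
"""

prompt = (
    "You are a Linux log analysis assistant. Summarize the anomalies in the "
    "following journal excerpt and suggest what to investigate next:\n\n"
    + log_excerpt
)

# Deterministic generation keeps the summary reproducible for evaluation.
result = generator(prompt, max_new_tokens=200, do_sample=False)
print(result[0]["generated_text"])
```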
Resources
- Local LLM Implementations: Access to locally hosted LLMs such as LLaMA, GPT-J, or similar open-source models that can be run and fine-tuned on local hardware.
- Computing Resources: Workstations or servers capable of running LLMs locally, equipped with sufficient GPU power for training and inference.
- Sample Data: Logs, source code, patches, and packaging data from openSUSE or SUSE repositories for model training and testing.
- Public LLMs for Benchmarking: Access to APIs from platforms like OpenAI or Hugging Face for comparative testing and performance assessment.
- Existing NLP Tools: Libraries such as spaCy, Hugging Face Transformers, and PyTorch for building and interacting with local LLMs.
- Technical Documentation: Tutorials and resources focused on setting up and optimizing local LLMs for tasks relevant to Linux development.
- Collaboration: Engagement with community experts and teams experienced in AI and Linux for feedback and joint exploration.
This project is part of:
Hack Week 24
Activity
Comments
about 1 year ago by jiriwiesner
I would like to ask an LLM instance about the inner workings of the Linux kernel code. It is a common task of mine to look for a bug in a subsystem or a layer that can easily have tens of thousands of lines of code (e.g. bsc 1216813). I know having an understanding of the Linux code is what we do as developers, but my understanding and knowledge is always limited because I simply do not have the time to read all of the code possibly involved in an issue. If the LLM was trained to process the source code of a specific version of Linux, a developer could then ask involved questions about the code using the terms found in the code base. It should basically be something that allows a developer to find the interesting parts of the code better than just using grep.
about 1 year ago by anicka
Actually, it looks like off-the-shelf ChatGPT 4 can already be quite helpful in such tasks.
But training something like Code Llama on our kernels is something I indeed want to look into next time, because if there is any way to leverage LLMs in our bugfixing or backporting, this is it.
Similar Projects
SUSE Observability MCP server by drutigliano
Description
The idea is to implement the SUSE Observability Model Context Protocol (MCP) Server as a specialized, middle-tier API designed to translate the complex, high-cardinality observability data from StackState (topology, metrics, and events) into highly structured, contextually rich, and LLM-ready snippets.
This MCP Server abstracts the StackState APIs. Its primary function is to serve as a Tool/Function Calling target for AI agents. When an AI receives an alert or a user query (e.g., "What caused the outage?"), the AI calls an MCP Server endpoint. The server then fetches the relevant operational facts, summarizes them, normalizes technical identifiers (like URNs and raw metric names) into natural language concepts, and returns a concise JSON or YAML payload. This payload is then injected directly into the LLM's prompt, ensuring the final diagnosis or action is grounded in real-time, accurate SUSE Observability data, effectively minimizing hallucinations.
Goals
- Grounding AI Responses: Ensure that all AI diagnoses, root cause analyses, and action recommendations are strictly based on verifiable, real-time data retrieved from the SUSE Observability StackState platform.
- Simplifying Data Access: Abstract the complexity of StackState's native APIs (e.g., Time Travel, 4T Data Model) into simple, semantic functions that can be easily invoked by LLM tool-calling mechanisms.
- Data Normalization: Convert complex, technical identifiers (like component URNs, raw metric names, and proprietary health states) into standardized, natural language terms that an LLM can easily reason over.
- Enabling Automated Remediation: Define clear, action-oriented MCP endpoints (e.g., execute_runbook) that allow the AI agent to initiate automated operational workflows (e.g., restarts, scaling) after a diagnosis, closing the loop on observability.
Hackweek STEP
- Create a functional MCP endpoint exposing one (or more) tool(s) to answer queries like "What is the health of service X?" by fetching, normalizing, and returning live StackState data in an LLM-ready format.
Scope
- Implement a read-only MCP server that can:
- Connect to a live SUSE Observability instance and authenticate (with API token)
- Use tools to fetch data for a specific component URN (e.g., current health state, metrics, possibly topology neighbors, ...).
- Normalize response fields (e.g., URN to "Service Name", health state DEVIATING to "Unhealthy", raw metric names to readable labels); a sketch follows this list.
- Return the data as a structured JSON payload compliant with the MCP specification.
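To illustrate the tool shape, the sketch below uses the MCP Python SDK (FastMCP) even though the deliverable targets Golang; the StackState endpoint path, authentication header, and response field names are assumptions, not the real SUSE Observability API.

```python
# Illustrative sketch of a read-only MCP tool that fetches and normalizes
# component health. Endpoint path, auth header, and field names are assumed.
import os
import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("suse-observability")

# Map proprietary health states to natural-language terms.
HEALTH_LABELS = {"CLEAR": "Healthy", "DEVIATING": "Unhealthy", "CRITICAL": "Critical"}

@mcp.tool()
def get_service_health(component_urn: str) -> dict:
    """Return the normalized health state for a component identified by its URN."""
    base_url = os.environ["STACKSTATE_URL"]
    token = os.environ["STACKSTATE_API_TOKEN"]
    # Hypothetical endpoint; the real StackState API path will differ.
    resp = httpx.get(
        f"{base_url}/api/components/{component_urn}/health",
        headers={"Authorization": f"ApiToken {token}"},
    )
    resp.raise_for_status()
    raw = resp.json()
    return {
        "service_name": raw.get("name", component_urn),  # URN -> "Service Name"
        "health": HEALTH_LABELS.get(raw.get("healthState"), "Unknown"),
    }

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```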
Deliverables
- MCP Server v0.1: a running Golang MCP server with at least one tool.
- A README.md and a test script (e.g., curl commands or a simple notebook) showing how an AI agent would call the endpoint and the resulting JSON payload.
Outcome
A functional and testable API endpoint that proves the core concept: translating complex StackState data into a simple, LLM-ready format. This provides the foundation for developing AI-driven diagnostics and automated remediation.
Resources
- https://www.honeycomb.io/blog/its-the-end-of-observability-as-we-know-it-and-i-feel-fine
- https://www.datadoghq.com/blog/datadog-remote-mcp-server
- https://modelcontextprotocol.io/specification/2025-06-18/index
- https://modelcontextprotocol.io/docs/develop/build-server
Basic implementation
- https://github.com/drutigliano19/suse-observability-mcp-server
Results
Successfully developed and delivered a fully functional SUSE Observability MCP Server that bridges language models with SUSE Observability's operational data. This project demonstrates how AI agents can perform intelligent troubleshooting and root cause analysis using structured access to real-time infrastructure data.
Example execution
Background Coding Agent by mmanno
Description
I have had only bad experiences with AI one-shots. However, closely monitoring agent work and intervening often did result in productivity gains.
Now, other companies are using agents in pipelines. That makes sense to me: just like CI, we want to offload work to pipelines. Our engineering teams are consistently slowed down by "toil": low-impact, repetitive maintenance tasks. A simple linter rule change, a dependency bump, rebasing patch sets on top of newer releases, or an API deprecation requires dozens of manual PRs, draining time from feature development.
So far we have been writing deterministic, script-based automation for these tasks. And it turns out to be a common trap. These scripts are brittle, complex, and become a massive maintenance burden themselves.
Can we make prompts and workflows smart enough to succeed at background coding?
Goals
We will build a platform that allows engineers to execute complex code transformations using prompts.
By automating this toil, we accelerate large-scale migrations and allow teams to focus on high-value work.
Our platform will consist of three main components:
- "Change" Definition: Engineers will define a transformation as a simple, declarative manifest:
- The target repositories.
- A wrapper to run a "coding agent", e.g., "gemini-cli".
- The task as a natural language prompt.
- "Change" Management Service: A central service that orchestrates the jobs. It will receive Change definitions and be responsible for the job lifecycle.
- Execution Runners: We could use existing sandboxed CI runners (like GitHub/GitLab runners) to execute each job or spawn a container.
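As an illustration of the "Change" Definition component, a hypothetical manifest could carry fields like the following, sketched here as a Python dataclass; the field names and the gemini-cli wrapper are assumptions, not a finalized format.

```python
# Hypothetical "Change" manifest, sketched as a dataclass for illustration.
from dataclasses import dataclass

@dataclass
class Change:
    name: str                # human-readable label for the transformation
    target_repos: list[str]  # repositories the change applies to
    agent_wrapper: str       # wrapper used to run a coding agent, e.g. "gemini-cli"
    prompt: str              # the task as a natural-language prompt

example = Change(
    name="new-linter-rule",
    target_repos=["org/repo-a", "org/repo-b"],
    agent_wrapper="gemini-cli",
    prompt="Apply the new linter rule and fix all resulting findings.",
)
```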
MVP
- Define the Change manifest format.
- Build the core Management Service that can accept and queue a Change.
- Connect management service and runners, dynamically dispatch jobs to runners.
- Create a basic runner script that can run a hard-coded prompt against a test repo and open a PR.
Stretch Goals:
- Multi-layered approach, Workflow Agents trigger Coding Agents:
- Workflow Agent: Gather information about the task interactively from the user.
- Coding Agent: Once the interactive agent has refined the task into a clear prompt, it hands this prompt off to the "coding agent." This background agent is responsible for executing the task and producing the actual pull request.
- Use MCP:
- Workflow Agent gathers context information from Slack, Github, etc.
- Workflow Agent triggers a Coding Agent.
- Create a "Standard Task" library with reliable prompts.
- Rebasing rancher-monitoring to a new version of kube-prom-stack
- Update charts to use new images
- Apply changes to comply with a new linter
- Bump complex Go dependencies, like k8s modules
- Backport pull requests to other branches
- Add “review agents” that review the generated PR.
See also
AI-Powered Unit Test Automation for Agama by joseivanlopez
The Agama project is a multi-language Linux installer that leverages the distinct strengths of several key technologies:
- Rust: Used for the back-end services and the core HTTP API, providing performance and safety.
- TypeScript (React/PatternFly): Powers the modern web user interface (UI), ensuring a consistent and responsive user experience.
- Ruby: Integrates existing, robust YaST libraries (e.g., yast-storage-ng) to reuse established functionality.
The Problem: Testing Overhead
Developing and maintaining code across these three languages requires a significant, tedious effort in writing, reviewing, and updating unit tests for each component. This high cost of testing is a drain on developer resources and can slow down the project's evolution.
The Solution: AI-Driven Automation
This project aims to eliminate the manual overhead of unit testing by exploring and integrating AI-driven code generation tools. We will investigate how AI can:
- Automatically generate new unit tests as code is developed.
- Intelligently correct and update existing unit tests when the application code changes.
By automating this crucial but monotonous task, we can free developers to focus on feature implementation and significantly improve the speed and maintainability of the Agama codebase.
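To make the first investigation concrete, the sketch below drives an authorized coding agent non-interactively to draft tests for one source file; it assumes gemini-cli accepts a one-shot --prompt flag, and the prompt wording and file handling are illustrative only.

```python
# Sketch: ask a coding agent (assumed: gemini-cli with a one-shot --prompt flag)
# to draft unit tests for a single Agama source file.
import subprocess
import sys
from pathlib import Path

def generate_tests(source_file: str) -> str:
    code = Path(source_file).read_text()
    prompt = (
        "Write unit tests for the following Agama source file, following the "
        "test conventions already used in the repository. Return only the test code.\n\n"
        f"File: {source_file}\n\n{code}"
    )
    # Assumes gemini-cli supports non-interactive prompts; adjust to the real CLI.
    result = subprocess.run(
        ["gemini", "--prompt", prompt],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__":
    print(generate_tests(sys.argv[1]))
```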
Goals
- Proof of Concept: Successfully integrate and demonstrate an authorized AI tool (e.g., gemini-cli) to automatically generate unit tests.
- Workflow Integration: Define and document a new unit test automation workflow that seamlessly integrates the selected AI tool into the existing Agama development pipeline.
- Knowledge Sharing: Establish a set of best practices for using AI in code generation, sharing the learned expertise with the broader team.
Contribution & Resources
We are seeking contributors interested in AI-powered development and improving developer efficiency. Whether you have previous experience with code generation tools or are eager to learn, your participation is highly valuable.
If you want to dive deep into AI for software quality, please reach out and join the effort!
- Authorized AI Tools: Tools supported by SUSE (e.g., gemini-cli)
- Focus Areas: Rust, TypeScript, and Ruby components within the Agama project.
Interesting Links
MCP Server for SCC by digitaltomm
Description
Provide an MCP Server implementation for customers to access data on scc.suse.com via the MCP protocol. The core benefit of this MCP interface is that it has direct (read) access to customer data in SCC, so the AI agent gets enhanced knowledge about individual customer data, like subscriptions, orders, and registered systems.
Architecture

Goals
We want to demonstrate a proof of concept for connecting to the SCC MCP server with any AI agent, for example gemini-cli or codex, enabling the user to ask questions about their SCC inventory.
For this Hackweek, we target that users get proper responses to these example questions:
- Which of my currently active systems are running products that are out of support?
- Do I have ready-to-use registration codes for SLES?
- What are the latest 5 released patches for SLES 15 SP6? Output as a list with release date, patch name, affected package names and fixed CVEs.
- Which versions of kernel-default are available on SLES 15 SP6?
Technical Notes
Similar to the organization APIs, this can expose to customers data about their subscriptions, orders, systems, and products. Authentication should be done with organization credentials, similar to what needs to be provided to RMT/MLM. Customers can connect to the SCC MCP server from their own MCP-compatible client and Large Language Model (LLM), so no third party is involved.
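As a rough sketch of the data access such a server would wrap, the snippet below lists an organization's subscriptions using organization (mirroring) credentials; the endpoint path and response fields are assumptions based on the public SCC organization API used by RMT and may differ.

```python
# Sketch: read-only access to an organization's SCC subscriptions, normalized
# into a compact, LLM-friendly shape. Endpoint and fields are assumptions.
import os
import httpx

SCC_BASE = "https://scc.suse.com/connect/organizations"

def list_subscriptions() -> list[dict]:
    # Organization (mirroring) credentials, as also provided to RMT/MLM.
    auth = (os.environ["SCC_ORG_USER"], os.environ["SCC_ORG_PASSWORD"])
    resp = httpx.get(f"{SCC_BASE}/subscriptions", auth=auth)
    resp.raise_for_status()
    return [
        {
            "name": sub.get("name"),
            "regcode": sub.get("regcode"),
            "status": sub.get("status"),
            "expires_at": sub.get("expires_at"),
        }
        for sub in resp.json()
    ]
```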
Milestones
- [x] Basic MCP API setup
- MCP endpoints
  - [x] Products / Repositories
  - [x] Subscriptions / Orders
  - [x] Systems
  - [x] Packages
- [x] Document usage with Gemini CLI, Codex
Resources
Gemini CLI setup:
~/.gemini/settings.json:
GenAI-Powered Systemic Bug Evaluation and Management Assistant by rtsvetkov
Motivation
What is the decision-critical question one can ask about a bug? How does this question affect the decision on the bug, and why?
Let's make GenAI look at the bug from a systemic point of view and evaluate what we don't know. Which piece of information is missing to make a decision?
Description
To build a tool that takes a raw bug report (including error messages and context) and uses a large language model (LLM) to generate a series of structured, Socratic-style or systemic questions designed to guide integration and development toward the root cause, rather than just providing a direct, potentially incorrect fix.
Goals
1. Set up a Python environment: set up the environment and get a Gemini API key.
2. Collect 5-10 realistic bug reports (from open-source projects, personal projects, or public forums like Stack Overflow; include the error message and the initial context).
Build the Dialogue Loop
- Write a basic Python script using the Gemini API.
- Implement a simple conversational loop: User Input (Bug) -> AI Output (Question) -> User Input (Answer to AI's question) -> AI Output (Next Question). See the sketch below.
Code Implementation
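A minimal sketch of the loop, assuming the google-generativeai package and an API key in GEMINI_API_KEY; the model name and system instruction wording are illustrative.

```python
# Sketch of the Socratic dialogue loop with the Gemini API
# (assumes the google-generativeai package).
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

model = genai.GenerativeModel(
    "gemini-1.5-flash",  # illustrative model choice
    system_instruction=(
        "You are a debugging assistant. Never propose a fix directly. "
        "Ask one Socratic, systemic question at a time to narrow down the root cause."
    ),
)

chat = model.start_chat()
reply = chat.send_message(input("Paste the bug report: "))

while True:
    print("\nAI question:", reply.text)
    answer = input("\nYour answer (or 'quit'): ")
    if answer.strip().lower() == "quit":
        break
    reply = chat.send_message(answer)
```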
Socratic/Systemic Strategy Implementation
- Refine the logic to ensure the questions follow a Socratic and systemic path (e.g., from symptom -> context -> assumptions -> critical parts -> root cause).
- Implement Function Calling (an advanced feature of the Gemini API) to suggest specific actions to the user, like "Run a ping test" or "Check the database logs." (A sketch follows this list.)
- Implement a Bugzilla call to collect the bug report data.
- Implement the questioning framework as LLM pre-conditioning
- Define a set of instructions
- Assemble the Tool
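For the Function Calling item above, the sketch below uses google-generativeai's automatic function calling with two hypothetical diagnostic helpers; the helper names and commands are illustrative assumptions, not part of the project plan.

```python
# Sketch: let the model suggest and invoke simple diagnostic actions via
# automatic function calling. The helpers below are hypothetical examples.
import os
import subprocess
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

def run_ping_test(host: str) -> str:
    """Ping a host once and return the raw output."""
    result = subprocess.run(["ping", "-c", "1", host], capture_output=True, text=True)
    return result.stdout or result.stderr

def check_service_logs(service: str) -> str:
    """Return the last 20 journal lines for a systemd service."""
    result = subprocess.run(
        ["journalctl", "-u", service, "-n", "20", "--no-pager"],
        capture_output=True, text=True,
    )
    return result.stdout

model = genai.GenerativeModel(
    "gemini-1.5-flash",
    tools=[run_ping_test, check_service_logs],
)
chat = model.start_chat(enable_automatic_function_calling=True)
response = chat.send_message(
    "The web app intermittently times out when talking to db01. What should we check first?"
)
print(response.text)
```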
Resources
What are Systemic Questions?
Systemic questions explore the relationships, patterns, and interactions within a system rather than focusing on isolated elements.
In IT, they help uncover hidden dependencies, feedback loops, assumptions, and side-effects during debugging or architecture analysis.
Gitlab Project
gitlab.suse.de/sle-prjmgr/BugDecisionCritical_Question