Description

Backporting Linux kernel fixes (either for CVE issues or as part of general git-fixes workflow) is boring and mostly mechanical work (dealing with changes in context, renamed variables, new helper functions etc.). The idea of this project is to explore usage of LLM for backporting Linux kernel commits to SUSE kernels using LLM.

Goals

  • Create safe environment allowing LLM to run and backport patches without exposing the whole filesystem to it (for privacy and security reasons).
  • Write prompt that will guide LLM through the backporting process. Fine tune it based on experimental results.
  • Explore success rate of LLMs when backporting various patches.

Resources

  • Docker
  • Gemini CLI

Repository

Current version of the container with some instructions for use are at: https://gitlab.suse.de/jankara/gemini-cli-backporter

Looking for hackers with the skills:

ai llm kernel

This project is part of:

Hack Week 25

Activity

  • about 4 hours ago: jankara added keyword "ai" to this project.
  • about 4 hours ago: jankara added keyword "llm" to this project.
  • about 4 hours ago: jankara added keyword "kernel" to this project.
  • 1 day ago: jankara started this project.
  • 1 day ago: jankara originated this project.

  • Comments

    • jankara
      about 1 hour ago by jankara | Reply

      I've created Docker container for this purpose and some scripting around it to save some typing (see the repository for details). Gemini successfully backported upstream commit 8ecb790ea8c3 ("ext4: avoid potential buffer over-read in parseapplysbmountoptions()") to SLE12-SP5 without any manual intervention. That is a good success because the commit requires applying changes to different place (and different function - the code has been significantly refactored due to mount API conversion). Backport of follow up commit ee5a977b4e ("ext4: fix string copying in parseapplysbmountoptions()") required hint to Gemini that 8ecb790ea8c3 was already backported. I have updated prompt to instruct Gemini how to search for already applied commits to possibly figure this out itself. I didn't have time to test that. Backport of commit 1d3ad18394 ("ext4: detect invalid INLINE_DATA + EXTENTS flag combination") was smooth as well but there the problem was just slightly modified context.

      On the other hand backport of commit b86433721f4 ("blk-mq: fix potential deadlock while nr_requests grown") to SL-16.0 was too much. Gemini managed to create a patch that would apply (and likely compile) but the adaptation had multiple functional issues. The problem here was that the commit was part of a larger sequence of fixes to this area that significantly refactored the code and data structures. Also I've run out of credits for the PRO model during backporting so part of the backport was done by FLASH model which is apparently not clever enough.

      To summarize the prompt certainly needs more work to better handle situations where more commits need backporting (and more thought what Gemini should do in that case - when it should decide the backport is just too complex and bail out?). Also PRO model seems to be significantly better than FLASH model but backport of one or two patches is enough to burn all your free credits for the day which somewhat slows down experimentation.

      Another research direction is trying Claude Code which has CLI client as well and is deemed to be more advanced in coding tasks than Gemini. However a quick research seems to indicate that for Claude CLI access one needs a paid version so experiments require some investment...

    Similar Projects

    Uyuni Health-check Grafana AI Troubleshooter by ygutierrez

    Description

    This project explores the feasibility of using the open-source Grafana LLM plugin to enhance the Uyuni Health-check tool with LLM capabilities. The idea is to integrate a chat-based "AI Troubleshooter" directly into existing dashboards, allowing users to ask natural-language questions about errors, anomalies, or performance issues.

    Goals

    • Investigate if and how the grafana-llm-app plug-in can be used within the Uyuni Health-check tool.
    • Investigate if this plug-in can be used to query LLMs for troubleshooting scenarios.
    • Evaluate support for local LLMs and external APIs through the plugin.
    • Evaluate if and how the Uyuni MCP server could be integrated as another source of information.

    Resources

    Grafana LMM plug-in

    Uyuni Health-check


    MCP Trace Suite by r1chard-lyu

    Description

    This project plans to create an MCP Trace Suite, a system that consolidates commonly used Linux debugging tools such as bpftrace, perf, and ftrace.

    The suite is implemented as an MCP Server. This architecture allows an AI agent to leverage the server to diagnose Linux issues and perform targeted system debugging by remotely executing and retrieving tracing data from these powerful tools.

    • Repo: https://github.com/r1chard-lyu/systracesuite
    • Demo: Slides

    Goals

    1. Build an MCP Server that can integrate various Linux debugging and tracing tools, including bpftrace, perf, ftrace, strace, and others, with support for future expansion of additional tools.

    2. Perform testing by intentionally creating bugs or issues that impact system performance, allowing an AI agent to analyze the root cause and identify the underlying problem.

    Resources

    • Gemini CLI: https://geminicli.com/
    • eBPF: https://ebpf.io/
    • bpftrace: https://github.com/bpftrace/bpftrace/
    • perf: https://perfwiki.github.io/main/
    • ftrace: https://github.com/r1chard-lyu/tracium/


    Background Coding Agent by mmanno

    Description

    I had only bad experiences with AI one-shots. However, monitoring agent work closely and interfering often did result in productivity gains.

    Now, other companies are using agents in pipelines. That makes sense to me, just like CI, we want to offload work to pipelines: Our engineering teams are consistently slowed down by "toil": low-impact, repetitive maintenance tasks. A simple linter rule change, a dependency bump, rebasing patch-sets on top of newer releases or API deprecation requires dozens of manual PRs, draining time from feature development.

    So far we have been writing deterministic, script-based automation for these tasks. And it turns out to be a common trap. These scripts are brittle, complex, and become a massive maintenance burden themselves.

    Can we make prompts and workflows smart enough to succeed at background coding?

    Goals

    We will build a platform that allows engineers to execute complex code transformations using prompts.

    By automating this toil, we accelerate large-scale migrations and allow teams to focus on high-value work.

    Our platform will consist of three main components:

    • "Change" Definition: Engineers will define a transformation as a simple, declarative manifest:
      • The target repositories.
      • A wrapper to run a "coding agent", e.g., "gemini-cli".
      • The task as a natural language prompt.
    • "Change" Management Service: A central service that orchestrates the jobs. It will receive Change definitions and be responsible for the job lifecycle.
    • Execution Runners: We could use existing sandboxed CI runners (like GitHub/GitLab runners) to execute each job or spawn a container.

    MVP

    • Define the Change manifest format.
    • Build the core Management Service that can accept and queue a Change.
    • Connect management service and runners, dynamically dispatch jobs to runners.
    • Create a basic runner script that can run a hard-coded prompt against a test repo and open a PR.

    Stretch Goals:

    • Multi-layered approach, Workflow Agents trigger Coding Agents:
      1. Workflow Agent: Gather information about the task interactively from the user.
      2. Coding Agent: Once the interactive agent has refined the task into a clear prompt, it hands this prompt off to the "coding agent." This background agent is responsible for executing the task and producing the actual pull request.
    • Use MCP:
      1. Workflow Agent gathers context information from Slack, Github, etc.
      2. Workflow Agent triggers a Coding Agent.
    • Create a "Standard Task" library with reliable prompts.
      1. Rebasing rancher-monitoring to a new version of kube-prom-stack
      2. Update charts to use new images
      3. Apply changes to comply with a new linter
      4. Bump complex Go dependencies, like k8s modules
      5. Backport pull requests to other branches
    • Add “review agents” that review the generated PR.

    See also


    AI-Powered Unit Test Automation for Agama by joseivanlopez

    The Agama project is a multi-language Linux installer that leverages the distinct strengths of several key technologies:

    • Rust: Used for the back-end services and the core HTTP API, providing performance and safety.
    • TypeScript (React/PatternFly): Powers the modern web user interface (UI), ensuring a consistent and responsive user experience.
    • Ruby: Integrates existing, robust YaST libraries (e.g., yast-storage-ng) to reuse established functionality.

    The Problem: Testing Overhead

    Developing and maintaining code across these three languages requires a significant, tedious effort in writing, reviewing, and updating unit tests for each component. This high cost of testing is a drain on developer resources and can slow down the project's evolution.

    The Solution: AI-Driven Automation

    This project aims to eliminate the manual overhead of unit testing by exploring and integrating AI-driven code generation tools. We will investigate how AI can:

    1. Automatically generate new unit tests as code is developed.
    2. Intelligently correct and update existing unit tests when the application code changes.

    By automating this crucial but monotonous task, we can free developers to focus on feature implementation and significantly improve the speed and maintainability of the Agama codebase.

    Goals

    • Proof of Concept: Successfully integrate and demonstrate an authorized AI tool (e.g., gemini-cli) to automatically generate unit tests.
    • Workflow Integration: Define and document a new unit test automation workflow that seamlessly integrates the selected AI tool into the existing Agama development pipeline.
    • Knowledge Sharing: Establish a set of best practices for using AI in code generation, sharing the learned expertise with the broader team.

    Contribution & Resources

    We are seeking contributors interested in AI-powered development and improving developer efficiency. Whether you have previous experience with code generation tools or are eager to learn, your participation is highly valuable.

    If you want to dive deep into AI for software quality, please reach out and join the effort!

    • Authorized AI Tools: Tools supported by SUSE (e.g., gemini-cli)
    • Focus Areas: Rust, TypeScript, and Ruby components within the Agama project.

    Interesting Links


    Explore LLM evaluation metrics by thbertoldi

    Description

    Learn the best practices for evaluating LLM performance with an open-source framework such as DeepEval.

    Goals

    Curate the knowledge learned during practice and present it to colleagues.

    -> Maybe publish a blog post on SUSE's blog?

    Resources

    https://deepeval.com

    https://docs.pactflow.io/docs/bi-directional-contract-testing


    Explore LLM evaluation metrics by thbertoldi

    Description

    Learn the best practices for evaluating LLM performance with an open-source framework such as DeepEval.

    Goals

    Curate the knowledge learned during practice and present it to colleagues.

    -> Maybe publish a blog post on SUSE's blog?

    Resources

    https://deepeval.com

    https://docs.pactflow.io/docs/bi-directional-contract-testing


    Self-Scaling LLM Infrastructure Powered by Rancher by ademicev0

    Self-Scaling LLM Infrastructure Powered by Rancher

    logo


    Description

    The Problem

    Running LLMs can get expensive and complex pretty quickly.

    Today there are typically two choices:

    1. Use cloud APIs like OpenAI or Anthropic. Easy to start with, but costs add up at scale.
    2. Self-host everything - set up Kubernetes, figure out GPU scheduling, handle scaling, manage model serving... it's a lot of work.

    What if there was a middle ground?

    What if infrastructure scaled itself instead of making you scale it?

    Can we use existing Rancher capabilities like CAPI, autoscaling, and GitOps to make this simpler instead of building everything from scratch?

    Project Repository: github.com/alexander-demicev/llmserverless


    What This Project Does

    A key feature is hybrid deployment: requests can be routed based on complexity or privacy needs. Simple or low-sensitivity queries can use public APIs (like OpenAI), while complex or private requests are handled in-house on local infrastructure. This flexibility allows balancing cost, privacy, and performance - using cloud for routine tasks and on-premises resources for sensitive or demanding workloads.

    A complete, self-scaling LLM infrastructure that:

    • Scales to zero when idle (no idle costs)
    • Scales up automatically when requests come in
    • Adds more nodes when needed, removes them when demand drops
    • Runs on any infrastructure - laptop, bare metal, or cloud

    Think of it as "serverless for LLMs" - focus on building, the infrastructure handles itself.

    How It Works

    A combination of open source tools working together:

    Flow:

    • Users interact with OpenWebUI (chat interface)
    • Requests go to LiteLLM Gateway
    • LiteLLM routes requests to:
      • Ollama (Knative) for local model inference (auto-scales pods)
      • Or cloud APIs for fallback


    SUSE Observability MCP server by drutigliano

    Description

    The idea is to implement the SUSE Observability Model Context Protocol (MCP) Server as a specialized, middle-tier API designed to translate the complex, high-cardinality observability data from StackState (topology, metrics, and events) into highly structured, contextually rich, and LLM-ready snippets.

    This MCP Server abstract the StackState APIs. Its primary function is to serve as a Tool/Function Calling target for AI agents. When an AI receives an alert or a user query (e.g., "What caused the outage?"), the AI calls an MCP Server endpoint. The server then fetches the relevant operational facts, summarizes them, normalizes technical identifiers (like URNs and raw metric names) into natural language concepts, and returns a concise JSON or YAML payload. This payload is then injected directly into the LLM's prompt, ensuring the final diagnosis or action is grounded in real-time, accurate SUSE Observability data, effectively minimizing hallucinations.

    Goals

    • Grounding AI Responses: Ensure that all AI diagnoses, root cause analyses, and action recommendations are strictly based on verifiable, real-time data retrieved from the SUSE Observability StackState platform.
    • Simplifying Data Access: Abstract the complexity of StackState's native APIs (e.g., Time Travel, 4T Data Model) into simple, semantic functions that can be easily invoked by LLM tool-calling mechanisms.
    • Data Normalization: Convert complex, technical identifiers (like component URNs, raw metric names, and proprietary health states) into standardized, natural language terms that an LLM can easily reason over.
    • Enabling Automated Remediation: Define clear, action-oriented MCP endpoints (e.g., execute_runbook) that allow the AI agent to initiate automated operational workflows (e.g., restarts, scaling) after a diagnosis, closing the loop on observability.

     Hackweek STEP

    • Create a functional MCP endpoint exposing one (or more) tool(s) to answer queries like "What is the health of service X?") by fetching, normalizing, and returning live StackState data in an LLM-ready format.

     Scope

    • Implement read-only MCP server that can:
      • Connect to a live SUSE Observability instance and authenticate (with API token)
      • Use tools to fetch data for a specific component URN (e.g., current health state, metrics, possibly topology neighbors, ...).
      • Normalize response fields (e.g., URN to "Service Name," health state DEVIATING to "Unhealthy", raw metrics).
      • Return the data as a structured JSON payload compliant with the MCP specification.

    Deliverables

    • MCP Server v0.1 A running Python web server (e.g., using FastAPI) with at least one tool.
    • A README.md and a test script (e.g., curl commands or a simple notebook) showing how an AI agent would call the endpoint and the resulting JSON payload.

    Outcome A functional and testable API endpoint that proves the core concept: translating complex StackState data into a simple, LLM-ready format. This provides the foundation for developing AI-driven diagnostics and automated remediation.

    Resources

    • https://www.honeycomb.io/blog/its-the-end-of-observability-as-we-know-it-and-i-feel-fine
    • https://www.datadoghq.com/blog/datadog-remote-mcp-server
    • https://modelcontextprotocol.io/specification/2025-06-18/index
    • https://modelcontextprotocol.io/docs/develop/build-server

     Basic implementation

    • https://github.com/drutigliano19/suse-observability-mcp-server


    Extended private brain - RAG my own scripts and data into offline LLM AI by tjyrinki_suse

    Description

    For purely studying purposes, I'd like to find out if I could teach an LLM some of my own accumulated knowledge, to use it as a sort of extended brain.

    I might use qwen3-coder or something similar as a starting point.

    Everything would be done 100% offline without network available to the container, since I prefer to see when network is needed, and make it so it's never needed (other than initial downloads).

    Goals

    1. Learn something about RAG, LLM, AI.
    2. Find out if everything works offline as intended.
    3. As an end result have a new way to access my own existing know-how, but so that I can query the wisdom in them.
    4. Be flexible to pivot in any direction, as long as there are new things learned.

    Resources

    To be found on the fly.

    Timeline

    Day 1 (of 4)

    • Tried out a RAG demo, expanded on feeding it my own data
    • Experimented with qwen3-coder to add a persistent chat functionality, and keeping vectors in a pickle file
    • Optimizations to keep everything within context window
    • Learn and add a bit of PyTest

    Day 2

    • More experimenting and more data
    • Study ChromaDB
    • Add a Web UI that works from another computer even though the container sees network is down

    Day 3

    • The above RAG is working well enough for demonstration purposes.
    • Pivot to trying out OpenCode, configuring local Ollama qwen3-coder there, to analyze the RAG demo.
    • Figured out how to configure Ollama template to be usable under OpenCode. OpenCode locally is super slow to just running qwen3-coder alone.

    Day 4 (final day)

    • Battle with OpenCode that was both slow and kept on piling up broken things.
    • Call it success as after all the agentic AI was working locally.
    • Clean up the mess left behind a bit.

    Blog Post

    Summarized the findings at blog post.


    "what is it" file and directory analysis via MCP and local LLM, for console and KDE by rsimai

    Description

    Users sometimes wonder what files or directories they find on their local PC are good for. If they can't determine from the filename or metadata, there should an easy way to quickly analyze the content and at least guess the meaning. An LLM could help with that, through the use of a filesystem MCP and to-text-converters for typical file types. Ideally this is integrated into the desktop environment but works as well from a console. All data is processed locally or "on premise", no artifacts remain or leave the system.

    Goals

    • The user can run a command from the console, to check on a file or directory
    • The filemanager contains the "analyze" feature within the context menu
    • The local LLM could serve for other use cases where privacy matters

    TBD

    • Find or write capable one-shot and interactive MCP client
    • Find or write simple+secure file access MCP server
    • Create local LLM service with appropriate footprint, containerized
    • Shell command with options
    • KDE integration (Dolphin)
    • Package
    • Document

    Resources


    bpftrace contribution by mkoutny

    Description

    bpftrace is a great tool, no need to sing odes to it here. It can access any kernel data and process them in real time. It provides helpers for some common Linux kernel structures but not all.

    Goals

    • set up bpftrace toolchain
    • learn about bpftrace implementation and internals
    • implement support for percpu_counters
    • look into some of the first issues
    • send a refined PR (on Thu)

    Resources


    early stage kdump support by mbrugger

    Project Description

    When we experience a early boot crash, we are not able to analyze the kernel dump, as user-space wasn't able to load the crash system. The idea is to make the crash system compiled into the host kernel (think of initramfs) so that we can create a kernel dump really early in the boot process.

    Goal for the Hackweeks

    1. Investigate if this is possible and the implications it would have (done in HW21)
    2. Hack up a PoC (done in HW22 and HW23)
    3. Prepare RFC series (giving it's only one week, we are entering wishful thinking territory here).

    update HW23

    • I was able to include the crash kernel into the kernel Image.
    • I'll need to find a way to load that from init/main.c:start_kernel() probably after kcsan_init()
    • I workaround for a smoke test was to hack kexec_file_load() systemcall which has two problems:
      1. My initramfs in the porduction kernel does not have a new enough kexec version, that's not a blocker but where the week ended
      2. As the crash kernel is part of init.data it will be already stale once I can call kexec_file_load() from user-space.

    The solution is probably to rewrite the POC so that the invocation can be done from init.text (that's my theory) but I'm not sure if I can reuse the kexec infrastructure in the kernel from there, which I rely on heavily.

    update HW24

    • Day1
      • rebased on v6.12 with no problems others then me breaking the config
      • setting up a new compilation and qemu/virtme env
      • getting desperate as nothing works that used to work
    • Day 2
      • getting to call the invocation of loading the early kernel from __init after kcsan_init()
    • Day 3

      • fix problem of memdup not being able to alloc so much memory... use 64K page sizes for now
      • code refactoring
      • I'm now able to load the crash kernel
      • When using virtme I can boot into the crash kernel, also it doesn't boot completely (major milestone!), crash in elfcorehdr_read_notes()
    • Day 4

      • crash systems crashes (no pun intended) in copy_old_mempage() link; will need to understand elfcorehdr...
      • call path vmcore_init() -> parse_crash_elf_headers() -> elfcorehdr_read() -> read_from_oldmem() -> copy_oldmem_page() -> copy_to_iter()
    • Day 5

      • hacking arch/arm64/kernel/crash_dump.c:copy_old_mempage() to see if crash system really starts. It does.
      • fun fact: retested with more reserved memory and with UEFI FW, host kernel crashes in init but directly starts the crash kernel, so it works (somehow) \o/

    update HW25

    • Day 1
      • rebased crash-kernel on v6.12.59 (for now), still crashing


    Add Qualcomm Snapdragon 765G (SM7250) basic device tree to mainline linux kernel by pvorel

    Qualcomm Snapdragon 765G (SM7250) (smartphone SoC) has no support in the linux kernel, nor in u-boot. Try to add basic device tree support. The hardest part will be to create boot.img which will be accepted by phone.

    UART is available for smartphone :).


    Improve UML page fault handler by ptesarik

    Description

    Improve UML handling of segmentation faults in kernel mode. Although such page faults are generally caused by a kernel bug, it is annoying if they cause an infinite loop, or panic the kernel. More importantly, a robust implementation allows to write KUnit tests for various guard pages, preventing potential kernel self-protection regressions.

    Goals

    Convert the UML page fault handler to use oops_* helpers, go through a few review rounds and finally get my patch series merged in 6.14.

    Resources

    Wrong initial attempt: https://lore.kernel.org/lkml/20231215121431.680-1-petrtesarik@huaweicloud.com/T/


    pudc - A PID 1 process that barks to the internet by mssola

    Description

    As a fun exercise in order to dig deeper into the Linux kernel, its interfaces, the RISC-V architecture, and all the dragons in between; I'm building a blog site cooked like this:

    • The backend is written in a mixture of C and RISC-V assembly.
    • The backend is actually PID1 (for real, not within a container).
    • We poll and parse incoming HTTP requests ourselves.
    • The frontend is a mere HTML page with htmx.

    The project is meant to be Linux-specific, so I'm going to use io_uring, pidfs, namespaces, and Linux-specific features in order to drive all of this.

    I'm open for suggestions and so on, but this is meant to be a solo project, as this is more of a learning exercise for me than anything else.

    Goals

    • Have a better understanding of different Linux features from user space down to the kernel internals.
    • Most importantly: have fun.

    Resources