Description

Large language models like ChatGPT have demonstrated remarkable capabilities across a variety of applications. However, their potential for enhancing the Linux development and user ecosystem remains largely unexplored. This project seeks to bridge that gap by researching practical applications of LLMs to improve workflows in areas such as backporting, packaging, log analysis, system migration, and more. By identifying patterns that LLMs can leverage, we aim to uncover new efficiencies and automation strategies that can benefit developers, maintainers, and end users alike.

Goals

  • Evaluate Existing LLM Capabilities: Research and document the current state of LLM usage in open-source and Linux development projects, noting successes and limitations.
  • Prototype Tools and Scripts: Develop proof-of-concept scripts or tools that leverage LLMs to perform specific tasks like automated log analysis, assisting with backporting patches, or generating packaging metadata.
  • Assess Performance and Reliability: Test the tools' effectiveness on real-world Linux data and analyze their accuracy, speed, and reliability.
  • Identify Best Use Cases: Pinpoint which tasks are most suitable for LLM support, distinguishing between high-impact and impractical applications.
  • Document Findings and Recommendations: Summarize results with clear documentation and suggest next steps for potential integration or further development.

Resources

  • Local LLM Implementations: Access to locally hosted LLMs such as LLaMA, GPT-J, or similar open-source models that can be run and fine-tuned on local hardware.
  • Computing Resources: Workstations or servers capable of running LLMs locally, equipped with sufficient GPU power for training and inference.
  • Sample Data: Logs, source code, patches, and packaging data from openSUSE or SUSE repositories for model training and testing.
  • Public LLMs for Benchmarking: Access to APIs from platforms like OpenAI or Hugging Face for comparative testing and performance assessment.
  • Existing NLP Tools: Libraries such as spaCy, Hugging Face Transformers, and PyTorch for building and interacting with local LLMs.
  • Technical Documentation: Tutorials and resources focused on setting up and optimizing local LLMs for tasks relevant to Linux development.
  • Collaboration: Engagement with community experts and teams experienced in AI and Linux for feedback and joint exploration.

Looking for hackers with the skills:

ai

This project is part of:

Hack Week 24

Activity

  • about 1 year ago: PSuarezHernandez liked this project.
  • about 1 year ago: jiriwiesner liked this project.
  • about 1 year ago: anicka added keyword "ai" to this project.
  • about 1 year ago: moio liked this project.
  • about 1 year ago: livdywan liked this project.
  • about 1 year ago: mwilck liked this project.
  • about 1 year ago: bfilho liked this project.
  • about 1 year ago: vlefebvre liked this project.
  • about 1 year ago: wfrisch liked this project.
  • about 1 year ago: anicka started this project.
  • about 1 year ago: anicka originated this project.

  • Comments

    • wfrisch
      about 1 year ago by wfrisch | Reply

      If someone could recreate Google's Project Naptime, or at least something similar to it, that would be very interesting:

      Two key features:

      • Tool use in general
      • Tool-assisted verification of LLM results

    • jiriwiesner
      about 1 year ago by jiriwiesner | Reply

      I would like to ask an LLM instance about the inner workings on the Linux kernel code. It is a common task of mine to look for a bug in a subsystem or a layer that can easily have tens of thousands of lines of code (e.g. bsc 1216813). I know having an understanding of the Linux code is what we do as developers but my understanding and knowledge is always limited because I simply do not have the time to read all of the code possibly involved in an issue. If the LLM was trained to process the source code of a specific version of Linux a developer could then ask involved questions about the code using the terms found in the code base. It should basically be something that allows a developer find the interesting parts of the code better than when using just grep.

      • anicka
        about 1 year ago by anicka | Reply

        Actually, it looks like that off-the-shelf ChatGPT 4 can be already quite helpful in such tasks.

        But training something like code llama on our kernels is something I indeed want to look into next time because if there is any way how to leverage LLMs in our bugfixing or backporting, this is it.

    Similar Projects

    Liz - Prompt autocomplete by ftorchia

    Description

    Liz is the Rancher AI assistant for cluster operations.

    Goals

    We want to help users when sending new messages to Liz, by adding an autocomplete feature to complete their requests based on the context.

    Example:

    • User prompt: "Can you show me the list of p"
    • Autocomplete suggestion: "Can you show me the list of p...od in local cluster?"

    Example:

    • User prompt: "Show me the logs of #rancher-"
    • Chat console: It shows a drop-down widget, next to the # character, with the list of available pod names starting with "rancher-".

    Technical Overview

    1. The AI agent should expose a new ws/autocomplete endpoint to proxy autocomplete messages to the LLM.
    2. The UI extension should be able to display prompt suggestions and allow users to apply the autocomplete to the Prompt via keyboard shortcuts.

    Resources

    GitHub repository


    Explore LLM evaluation metrics by thbertoldi

    Description

    Learn the best practices for evaluating LLM performance with an open-source framework such as DeepEval.

    Goals

    Curate the knowledge learned during practice and present it to colleagues.

    -> Maybe publish a blog post on SUSE's blog?

    Resources

    https://deepeval.com

    https://docs.pactflow.io/docs/bi-directional-contract-testing


    The Agentic Rancher Experiment: Do Androids Dream of Electric Cattle? by moio

    Rancher is a beast of a codebase. Let's investigate if the new 2025 generation of GitHub Autonomous Coding Agents and Copilot Workspaces can actually tame it. A GitHub robot mascot trying to lasso a blue bull with a Kubernetes logo tatooed on it


    The Plan

    Create a sandbox GitHub Organization, clone in key Rancher repositories, and let the AI loose to see if it can handle real-world enterprise OSS maintenance - or if it just hallucinates new breeds of Kubernetes resources!

    Specifically, throw "Agentic Coders" some typical tasks in a complex, long-lived open-source project, such as:


    The Grunt Work: generate missing GoDocs, unit tests, and refactorings. Rebase PRs.

    The Complex Stuff: fix actual (historical) bugs and feature requests to see if they can traverse the complexity without (too much) human hand-holding.

    Hunting Down Gaps: find areas lacking in docs, areas of improvement in code, dependency bumps, and so on.


    If time allows, also experiment with Model Context Protocol (MCP) to give agents context on our specific build pipelines and CI/CD logs.

    Why?

    We know AI can write "Hello World." and also moderately complex programs from a green field. But can it rebase a 3-month-old PR with conflicts in rancher/rancher? I want to find the breaking point of current AI agents to determine if and how they can help us to reduce our technical debt, work faster and better. At the same time, find out about pitfalls and shortcomings.

    The CONCLUSION!!!

    A add-emoji State of the Union add-emoji document was compiled to summarize lessons learned this week. For more gory details, just read on the diary below! add-emoji


    Docs Navigator MCP: SUSE Edition by mackenzie.techdocs

    MCP Docs Navigator: SUSE Edition

    Description

    Docs Navigator MCP: SUSE Edition is an AI-powered documentation navigator that makes finding information across SUSE, Rancher, K3s, and RKE2 documentation effortless. Built as a Model Context Protocol (MCP) server, it enables semantic search, intelligent Q&A, and documentation summarization using 100% open-source AI models (no API keys required!). The project also allows you to bring your own keys from Anthropic and Open AI for parallel processing.

    Goals

    • [ X ] Build functional MCP server with documentation tools
    • [ X ] Implement semantic search with vector embeddings
    • [ X ] Create user-friendly web interface
    • [ X ] Optimize indexing performance (parallel processing)
    • [ X ] Add SUSE branding and polish UX
    • [ X ] Stretch Goal: Add more documentation sources
    • [ X ] Stretch Goal: Implement document change detection for auto-updates

    Coming Soon!

    • Community Feedback: Test with real users and gather improvement suggestions

    Resources


    SUSE Edge Image Builder MCP by eminguez

    Description

    Based on my other hackweek project, SUSE Edge Image Builder's Json Schema I would like to build also a MCP to be able to generate EIB config files the AI way.

    Realistically I don't think I'll be able to have something consumable at the end of this hackweek but at least I would like to start exploring MCPs, the difference between an API and MCP, etc.

    Goals

    • Familiarize myself with MCPs
    • Unrealistic: Have an MCP that can generate an EIB config file

    Resources

    Result

    https://github.com/e-minguez/eib-mcp

    I've extensively used antigravity and its agent mode to code this. This heavily uses https://hackweek.opensuse.org/25/projects/suse-edge-image-builder-json-schema for the MCP to be built.

    I've ended up learning a lot of things about "prompting", json schemas in general, some golang, MCPs and AI in general :)

    Example:

    Generate an Edge Image Builder configuration for an ISO image based on slmicro-6.2.iso, targeting x86_64 architecture. The output name should be 'my-edge-image' and it should install to /dev/sda. It should deploy a 3 nodes kubernetes cluster with nodes names "node1", "node2" and "node3" as: * hostname: node1, IP: 1.1.1.1, role: initializer * hostname: node2, IP: 1.1.1.2, role: agent * hostname: node3, IP: 1.1.1.3, role: agent The kubernetes version should be k3s 1.33.4-k3s1 and it should deploy a cert-manager helm chart (the latest one available according to https://cert-manager.io/docs/installation/helm/). It should create a user called "suse" with password "suse" and set ntp to "foo.ntp.org". The VIP address for the API should be 1.2.3.4

    Generates:

    ``` apiVersion: "1.0" image: arch: x86_64 baseImage: slmicro-6.2.iso imageType: iso outputImageName: my-edge-image kubernetes: helm: charts: - name: cert-manager repositoryName: jetstack