Description
Large language models like ChatGPT have demonstrated remarkable capabilities across a variety of applications. However, their potential for enhancing the Linux development and user ecosystem remains largely unexplored. This project seeks to bridge that gap by researching practical applications of LLMs to improve workflows in areas such as backporting, packaging, log analysis, system migration, and more. By identifying patterns that LLMs can leverage, we aim to uncover new efficiencies and automation strategies that can benefit developers, maintainers, and end users alike.
Goals
- Evaluate Existing LLM Capabilities: Research and document the current state of LLM usage in open-source and Linux development projects, noting successes and limitations.
- Prototype Tools and Scripts: Develop proof-of-concept scripts or tools that leverage LLMs to perform specific tasks like automated log analysis, assisting with backporting patches, or generating packaging metadata.
- Assess Performance and Reliability: Test the tools' effectiveness on real-world Linux data and analyze their accuracy, speed, and reliability.
- Identify Best Use Cases: Pinpoint which tasks are most suitable for LLM support, distinguishing between high-impact and impractical applications.
- Document Findings and Recommendations: Summarize results with clear documentation and suggest next steps for potential integration or further development.
Resources
- Local LLM Implementations: Access to locally hosted LLMs such as LLaMA, GPT-J, or similar open-source models that can be run and fine-tuned on local hardware.
- Computing Resources: Workstations or servers capable of running LLMs locally, equipped with sufficient GPU power for training and inference.
- Sample Data: Logs, source code, patches, and packaging data from openSUSE or SUSE repositories for model training and testing.
- Public LLMs for Benchmarking: Access to APIs from platforms like OpenAI or Hugging Face for comparative testing and performance assessment.
- Existing NLP Tools: Libraries such as spaCy, Hugging Face Transformers, and PyTorch for building and interacting with local LLMs.
- Technical Documentation: Tutorials and resources focused on setting up and optimizing local LLMs for tasks relevant to Linux development.
- Collaboration: Engagement with community experts and teams experienced in AI and Linux for feedback and joint exploration.
Looking for hackers with the skills:
This project is part of:
Hack Week 24
Activity
Comments
-
2 months ago by jiriwiesner | Reply
I would like to ask an LLM instance about the inner workings on the Linux kernel code. It is a common task of mine to look for a bug in a subsystem or a layer that can easily have tens of thousands of lines of code (e.g. bsc 1216813). I know having an understanding of the Linux code is what we do as developers but my understanding and knowledge is always limited because I simply do not have the time to read all of the code possibly involved in an issue. If the LLM was trained to process the source code of a specific version of Linux a developer could then ask involved questions about the code using the terms found in the code base. It should basically be something that allows a developer find the interesting parts of the code better than when using just grep.
-
2 months ago by anicka | Reply
Actually, it looks like that off-the-shelf ChatGPT 4 can be already quite helpful in such tasks.
But training something like code llama on our kernels is something I indeed want to look into next time because if there is any way how to leverage LLMs in our bugfixing or backporting, this is it.
-
Similar Projects
Gen-AI chatbots and test-automation of generated responses by mdati
Description
Start experimenting the generative SUSE-AI chat bot, asking questions on different areas of knowledge or science and possibly analyze the quality of the LLM model response, specific and comparative, checking the answers provided by different LLM models to a same query, using proper quality metrics or tools or methodologies.
Try to define basic guidelines and requirements for quality test automation of AI-generated responses.
First approach of investigation can be based on manual testing: methodologies, findings and data can be useful then to organize valid automated testing.
Goals
- Identify criteria and measuring scales for assessment of a text content.
- Define quality of an answer/text based on defined criteria .
- Identify some knowledge sectors and a proper list of problems/questions per sector.
- Manually run query session and apply evaluation criteria to answers.
- Draft requirements for test automation of AI answers.
Resources
- Announcement of SUSE-AI for Hack Week in Slack
- Openplatform and related 3 LLM models gemma:2b, llama3.1:8b, qwen2.5-coder:3b.
Notes
Foundation models (FMs):
are large deep learning neural networks, trained on massive datasets, that have changed the way data scientists approach machine learning (ML). Rather than develop artificial intelligence (AI) from scratch, data scientists use a foundation model as a starting point to develop ML models that power new applications more quickly and cost-effectively.Large language models (LLMs):
are a category of foundation models pre-trained on immense amounts of data acquiring abilities by learning statistical relationships from vast amounts of text during a self- and semi-supervised training process, making them capable of understanding and generating natural language and other types of content , to perform a wide range of tasks.
LLMs can be used for generative AI (artificial intelligence) to produce content based on input prompts in human language.
Validation of a AI-generated answer is not an easy task to perform, as manually as automated.
An LLM answer text shall contain a given level of informations: correcness, completeness, reasoning description etc.
We shall rely in properly applicable and measurable criteria of validation to get an assessment in a limited amount of time and resources.
SUSE AI Meets the Game Board by moio
Use tabletopgames.ai’s open source TAG and PyTAG frameworks to apply Statistical Forward Planning and Deep Reinforcement Learning to two board games of our own design. On an all-green, all-open source, all-AWS stack!
Results: Infrastructure Achievements
We successfully built and automated a containerized stack to support our AI experiments. This included:
- a Fully-Automated, One-Command, GPU-accelerated Kubernetes setup: we created an OpenTofu based script, tofu-tag, to deploy SUSE's RKE2 Kubernetes running on CUDA-enabled nodes in AWS, powered by openSUSE with GPU drivers and gpu-operator
- Containerization of the TAG and PyTAG frameworks: TAG (Tabletop AI Games) and PyTAG were patched for seamless deployment in containerized environments. We automated the container image creation process with GitHub Actions. Our forks (PRs upstream upcoming):
./deploy.sh
and voilà - Kubernetes running PyTAG (k9s
, above) with GPU acceleration (nvtop
, below)
Results: Game Design Insights
Our project focused on modeling and analyzing two card games of our own design within the TAG framework:
- Game Modeling: We implemented models for Dario's "Bamboo" and Silvio's "Totoro" and "R3" games, enabling AI agents to play thousands of games ...in minutes!
- AI-driven optimization: By analyzing statistical data on moves, strategies, and outcomes, we iteratively tweaked the game mechanics and rules to achieve better balance and player engagement.
- Advanced analytics: Leveraging AI agents with Monte Carlo Tree Search (MCTS) and random action selection, we compared performance metrics to identify optimal strategies and uncover opportunities for game refinement .
- more about Bamboo on Dario's site
- more about R3 on Silvio's site (italian, translation coming)
- more about Totoro on Silvio's site
A family picture of our card games in progress. From the top: Bamboo, Totoro, R3
Results: Learning, Collaboration, and Innovation
Beyond technical accomplishments, the project showcased innovative approaches to coding, learning, and teamwork:
- "Trio programming" with AI assistance: Our "trio programming" approach—two developers and GitHub Copilot—was a standout success, especially in handling slightly-repetitive but not-quite-exactly-copypaste tasks. Java as a language tends to be verbose and we found it to be fitting particularly well.
- AI tools for reporting and documentation: We extensively used AI chatbots to streamline writing and reporting. (Including writing this report! ...but this note was added manually during edit!)
- GPU compute expertise: Overcoming challenges with CUDA drivers and cloud infrastructure deepened our understanding of GPU-accelerated workloads in the open-source ecosystem.
- Game design as a learning platform: By blending AI techniques with creative game design, we learned not only about AI strategies but also about making games fun, engaging, and balanced.
Last but not least we had a lot of fun! ...and this was definitely not a chatbot generated line!
The Context: AI + Board Games
COOTWbot by ngetahun
Project Description
At SCC, we have a rotating task of COOTW (Commanding Office of the Week). This task involves responding to customer requests from jira and slack help channels, monitoring production systems and doing small chores. Usually, we have documentation to help the COOTW answer questions and quickly find fixes. Most of these are distributed across github, trello and SUSE Support documentation. The aim of this project is to explore the magic of LLMs and create a conversational bot.
Goal for this Hackweek
- Build data ingestion
Data source:
- SUSE KB docs
- scc github docs
- scc trello knowledge board
Test out new RAG architecture
https://gitlab.suse.de/ngetahun/cootwbot
Make more sense of openQA test results using AI by livdywan
Description
AI has the potential to help with something many of us spend a lot of time doing which is making sense of openQA logs when a job fails.
User Story
Allison Average has a puzzled look on their face while staring at log files that seem to make little sense. Is this a known issue, something completely new or maybe related to infrastructure changes?
Goals
- Leverage a chat interface to help Allison
- Create a model from scratch based on data from openQA
- Proof of concept for automated analysis of openQA test results
Bonus
- Use AI to suggest solutions to merge conflicts
- This would need a merge conflict editor that can suggest solving the conflict
- Use image recognition for needles
Resources
Timeline
Day 1
- Conversing with open-webui to teach me how to create a model based on openQA test results
- Asking for example code using TensorFlow in Python
- Discussing log files to explore what to analyze
- Drafting a new project called Testimony (based on Implementing a containerized Python action) - the project name was also suggested by the assistant
Day 2
- Using NotebookLLM (Gemini) to produce conversational versions of blog posts
- Researching the possibility of creating a project logo with AI
- Asking open-webui, persons with prior experience and conducting a web search for advice
Highlights
- I briefly tested compared models to see if they would make me more productive. Between llama, gemma and mistral there was no amazing difference in the results for my case.
- Convincing the chat interface to produce code specific to my use case required very explicit instructions.
- Asking for advice on how to use open-webui itself better was frustratingly unfruitful both in trivial and more advanced regards.
- Documentation on source materials used by LLM's and tools for this purpose seems virtually non-existent - specifically if a logo can be generated based on particular licenses
Outcomes
- Chat interface-supported development is providing good starting points and open-webui being open source is more flexible than Gemini. Although currently some fancy features such as grounding and generated podcasts are missing.
- Allison still has to be very experienced with openQA to use a chat interface for test review. Publicly available system prompts would make that easier, though.
Use AI tools to convert legacy perl scripts to bash by nadvornik
Description
Use AI tools to convert legacy perl scripts to bash
Goals
Uyuni project contains legacy perl scripts used for setup. The perl dependency could be removed, to reduce the container size. The goal of this project is to research use of AI tools for this task.
Resources
Results:
Aider is not the right tool for this. It works ok for small changes, but not for complete rewrite from one language to another.
I got better results with direct API use from script.