Description

Set up a local AI assistant for research, brainstorming and proofreading. Look into SurfSense, Open WebUI and possibly alternatives. Explore integration with services like openQA. There should be no cloud dependencies. Mobile phone support or an additional companion app would be a bonus. The goal is not to develop everything from scratch.

User Story

  • Allison Average wants a one-click local AI assistant on their openSUSE laptop.
  • Ash Awesome wants AI on their phone without an expensive subscription.

Goals

  • Evaluate a local SurfSense setup for day-to-day productivity
  • Test opencode for vibe coding and tool calling

Timeline

Day 1

  • Took a look at SurfSense and started setting up a local instance.
  • Unfortunately the container setup did not work well. Though this was a great opportunity to learn some new podman commands (a few of which are sketched below) and to refresh my memory on how to recover a corrupted btrfs filesystem.
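
The exact commands are not recorded here; a typical sequence for poking at a broken rootless container setup, using standard podman subcommands, looks like this:

podman ps -a                    # list all containers, including the ones that failed to start
podman logs <container>         # check why a particular container exited
podman system prune --volumes   # drop unused images, containers, networks and volumes
podman system reset             # last resort: wipe all podman storage and start over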

Day 2

  • Due to its sheer size and complexity, SurfSense seems to have triggered btrfs fragmentation. Naturally this was not visible in any podman-related errors or in the journal, so tracking it down took up much of my second day (see the commands below).
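
The recovery steps themselves are not written down above; on a default openSUSE btrfs layout, checking and defragmenting the rootless podman storage would look roughly like the following (the ~/.local/share/containers/storage path is the usual rootless location and an assumption here):

sudo btrfs filesystem df /                                                 # how full the data and metadata chunks are
sudo btrfs filesystem defragment -r -v ~/.local/share/containers/storage   # recursively defragment the container storage
sudo btrfs scrub start -B /                                                # verify checksums on the whole filesystem and wait for the result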

Day 3

Day 4

  • Context size matters, and models are not equally usable for vibe coding.
  • Through some arduous browsing of ollama models I did find a few, like myaniu/qwen2.5-1m:7b with a 1M token context, but even then it is not obvious whether they are meant for tool calls (see the check below).
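
One way to check is ollama show, which in recent ollama releases lists a model's context length and advertised capabilities such as tool support (whether a model handles tool calls well in practice is another matter):

ollama pull myaniu/qwen2.5-1m:7b   # fetch the model into the local ollama store
ollama show myaniu/qwen2.5-1m:7b   # prints parameters, context length and, in newer versions, a capabilities list (e.g. tools)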

Day 5

  • Whilst trying to make opencode usable I discovered ramalama, which worked instantly and very well.

Outcomes

SurfSense

I could not get this fully set up, maybe in part due to my filesystem issues. I was expecting it to be less of an effort.

opencode

Installing opencode and ollama in my distrobox container, along with the following configs, worked for me.
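
The individual install steps are not spelled out above; assuming a Tumbleweed-based distrobox and the upstream installers, the setup looks roughly like this (the container image name and the npm package name are assumptions, double-check against the respective documentation):

distrobox create --name ai --image registry.opensuse.org/opensuse/tumbleweed   # disposable container for the AI tooling
distrobox enter ai
curl -fsSL https://ollama.com/install.sh | sh   # upstream ollama install script
npm install -g opencode-ai                      # opencode CLI from npm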

When preparing a new project from scratch, it is a good idea to start out with a template.

opencode.json

{ "$schema": "https://opencode.ai/config.json", "theme": "catppuccin", "model": "ollama/qwen2.5-coder:1.5b", "mode": { "plan": { "temperature": 0.0 }, "build": { "temperature": 0.0 } }, "provider": { "ollama": { "npm": "[@ai-sdk](/users/ai-sdk)/openai-compatible", "name": "Ollama (local)", "options": { "baseURL": "http://localhost:11434/v1" }, "models": { "qwen2.5-coder:1.5b": { "name": "Qwem2.5-Coder" } } } }, "mcp": { "openqa": { "type": "remote", "enabled": true, "url": "https://openqa.opensuse.org/experimental/mcp", "headers": { "Authorization": "Bearer {env:OPENQA_USER}:{env:OPENQA_APIKEY}:{env:OPENQA_APISECRET}" } }, "gh_grep": { "type": "remote", "url": "https://mcp.grep.app" } } }

The models need to be pulled with ollama first, and ollama needs to be serving.
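
For the model referenced in the config above, that boils down to:

ollama pull qwen2.5-coder:1.5b   # download the model into the local ollama store
ollama serve &                   # start the API server on localhost:11434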

AGENTS.md

Agents can be instructed per project or globally, like so:

When you need to lookup openQA jobs or job groups, use `openqa` tools. If you are unsure how to do something, use `gh_grep` to search code examples from github.
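
Where the file lives determines the scope: an AGENTS.md in the project root applies to that project, while a global one goes into opencode's config directory (commonly ~/.config/opencode/AGENTS.md, though the exact path may vary between opencode versions):

./AGENTS.md                    # per-project instructions, picked up from the repository root
~/.config/opencode/AGENTS.md   # global instructions (path is an assumption, check the opencode docs)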

Note: My results varied a lot between models. Increasing the available context length, e.g. OLLAMA_CONTEXT_LENGTH=8192 ollama serve &, gives it more wiggle room, and lowering the temperature should also help, but I found myself tweaking the configuration a lot.

Horrible performance even with small models

Normally I don't hear the fan in this laptop much. opencode processed responses so slowly that I barely got anything done. Even figuring out why responses were unreliable took longer, because I had to wait a lot for useless answers.
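
One check that would have saved time (assuming a reasonably recent ollama): ollama ps reports whether a loaded model runs on the CPU or the GPU, which usually explains both the slowness and the fan noise.

ollama ps   # the PROCESSOR column shows e.g. "100% CPU" when nothing is offloaded to a GPU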

Airgapped models

While investigating the horrible performance of opencode I stumbled upon ramalama, which runs models in containers that are optimized for different CPUs and isolated from the host:

ramalama serve --ctx-size 8192 -p 8080 -d kirito1/qwen3-coder:1.7b

I could not get it to work with opencode, which just silently failed to communicate with it. Even so, ramalama is awesome.
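
To at least rule out the served model itself, ramalama typically fronts the model with llama.cpp's OpenAI-compatible HTTP server, so a manual request against the port from the serve command above should get an answer (the /v1/chat/completions path is assumed to be the standard OpenAI-compatible route):

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "kirito1/qwen3-coder:1.7b", "messages": [{"role": "user", "content": "Say hello"}]}'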

Looking for hackers with the skills:

ai

This project is part of:

Hack Week 25

Activity

  • about 1 month ago: livdywan added keyword "ai" to this project.
  • about 1 month ago: livdywan started this project.
  • about 2 months ago: rsimai liked this project.
  • 2 months ago: livdywan originated this project.

