Project Description

Phoeβe (/ˈfiːbi/) wants to add basic artificial intelligence capabilities to the Linux OS.

System-level tuning is a very complex activity, requiring the knowledge and expertise of several (all?) layers which compose the system itself, how they interact with each other and (quite often) it is required to also have an intimate knowledge of the implementation of the various layers.

Another big aspect of running systems is dealing with failure. Do not think of failure as a machine turning on fire rather as an overloaded system, caused by misconfiguration, which could lead to starvation of the available resources.

In many circumstances, operators are used to deal with telemetry, live charts, alerts, etc. which could help them identifying the offending machine(s) and (re)act to fix any potential issues.

However, one question comes to mind: wouldn't it be awesome if the machine could auto-tune itself and provide a self-healing capability to the user? Well, if that is enough to trigger your interest then this is what Phoeβe aims to provide.

Phoeβe uses system telemetry as the input to its brain and produces a big set of settings which get applied to the running system. The decision made by the brain is continuously reevaluated (considering the grace_period setting) to offer eventually the best possible setup.

Goal for this Hackweek

Work mostly on two main areas:

1) Rework the data engineering part of Phoebe to add tags/labels to individual data field to be used by the model;

2) Update the model according to the data re-engineering

3) Create a tool to assist Phoebe with data manipulation so to move away from CSV files

Stretch goal: have a proper lab setup to consistently test and validate Phoebe and generate data.

Resources

URL: https://github.com/SUSE/phoebe

Events in calendar

Monday 22nd March 2021 @ 10:00 AM CEST - Meeting with Prof. Nicola Strisciuglio

Every day @ 9:00 AM CEST - Sync up on progress, opens and... have a coffee together :)

This project is part of:

Hack Week 20

Activity

  • over 4 years ago: asmorodskyi joined this project.
  • over 4 years ago: llansky3 liked this project.
  • over 4 years ago: dfaggioli liked this project.
  • over 4 years ago: hennevogel liked this project.
  • over 4 years ago: ybonatakis joined this project.
  • over 4 years ago: ybonatakis liked this project.
  • over 4 years ago: mlnoga liked this project.
  • over 4 years ago: tjyrinki_suse liked this project.
  • over 4 years ago: mvarlese added keyword "reliability" to this project.
  • over 4 years ago: mvarlese added keyword "performance" to this project.
  • over 4 years ago: mvarlese added keyword "self-healing" to this project.
  • over 4 years ago: mvarlese added keyword "tuning" to this project.
  • over 4 years ago: dmulder joined this project.
  • over 4 years ago: dancermak liked this project.
  • over 4 years ago: shunghsiyu joined this project.
  • over 4 years ago: shunghsiyu liked this project.
  • over 4 years ago: mvarlese joined this project.
  • over 4 years ago: mslacken started this project.
  • over 4 years ago: mvarlese added keyword "c" to this project.
  • over 4 years ago: mvarlese added keyword "python" to this project.
  • over 4 years ago: mvarlese added keyword "meson" to this project.
  • over 4 years ago: mvarlese added keyword "ci/cd" to this project.
  • over 4 years ago: mvarlese added keyword "machinelearning" to this project.
  • over 4 years ago: mvarlese added keyword "linux" to this project.
  • over 4 years ago: mvarlese added keyword "artificial-intelligence" to this project.
  • All Activity

    Comments

    • dmulder
      over 4 years ago by dmulder | Reply

      I'd like to see if I could use Phoebe to tune samba settings (long term goal). So, this hackweek I think I'll just familiarize myself with Phoebe.

      • mvarlese
        over 4 years ago by mvarlese | Reply

        That's awesome!!! A can see a new plugin coming to Phoebe :)

    • mlnoga
      over 4 years ago by mlnoga | Reply

      Great one, Marco. Two points to consider:

      1. What are the top 4-5 settings to tune?
      2. What kind of data is available for the team in HackWeek to learn settings? (sourcing during Hackweek is probably not a viable option)

      • mvarlese
        over 4 years ago by mvarlese | Reply

        Thanks for the feedback Markus! Will keep both points in mind.

    Similar Projects

    pudc - A PID 1 process that barks to the internet by mssola

    Description

    As a fun exercise in order to dig deeper into the Linux kernel, its interfaces, the RISC-V architecture, and all the dragons in between; I'm building a blog site cooked like this:

    • The backend is written in a mixture of C and RISC-V assembly.
    • The backend is actually PID1 (for real, not within a container).
    • We poll and parse incoming HTTP requests ourselves.
    • The frontend is a mere HTML page with htmx.

    The project is meant to be Linux-specific, so I'm going to use io_uring, pidfs, namespaces, and Linux-specific features in order to drive all of this.

    I'm open for suggestions and so on, but this is meant to be a solo project, as this is more of a learning exercise for me than anything else.

    Goals

    • Have a better understanding of different Linux features from user space down to the kernel internals.
    • Most importantly: have fun.

    Resources


    SUSE Observability MCP server by drutigliano

    Description

    The idea is to implement the SUSE Observability Model Context Protocol (MCP) Server as a specialized, middle-tier API designed to translate the complex, high-cardinality observability data from StackState (topology, metrics, and events) into highly structured, contextually rich, and LLM-ready snippets.

    This MCP Server abstract the StackState APIs. Its primary function is to serve as a Tool/Function Calling target for AI agents. When an AI receives an alert or a user query (e.g., "What caused the outage?"), the AI calls an MCP Server endpoint. The server then fetches the relevant operational facts, summarizes them, normalizes technical identifiers (like URNs and raw metric names) into natural language concepts, and returns a concise JSON or YAML payload. This payload is then injected directly into the LLM's prompt, ensuring the final diagnosis or action is grounded in real-time, accurate SUSE Observability data, effectively minimizing hallucinations.

    Goals

    • Grounding AI Responses: Ensure that all AI diagnoses, root cause analyses, and action recommendations are strictly based on verifiable, real-time data retrieved from the SUSE Observability StackState platform.
    • Simplifying Data Access: Abstract the complexity of StackState's native APIs (e.g., Time Travel, 4T Data Model) into simple, semantic functions that can be easily invoked by LLM tool-calling mechanisms.
    • Data Normalization: Convert complex, technical identifiers (like component URNs, raw metric names, and proprietary health states) into standardized, natural language terms that an LLM can easily reason over.
    • Enabling Automated Remediation: Define clear, action-oriented MCP endpoints (e.g., execute_runbook) that allow the AI agent to initiate automated operational workflows (e.g., restarts, scaling) after a diagnosis, closing the loop on observability.

     Hackweek STEP

    • Create a functional MCP endpoint exposing one (or more) tool(s) to answer queries like "What is the health of service X?") by fetching, normalizing, and returning live StackState data in an LLM-ready format.

     Scope

    • Implement read-only MCP server that can:
      • Connect to a live SUSE Observability instance and authenticate (with API token)
      • Use tools to fetch data for a specific component URN (e.g., current health state, metrics, possibly topology neighbors, ...).
      • Normalize response fields (e.g., URN to "Service Name," health state DEVIATING to "Unhealthy", raw metrics).
      • Return the data as a structured JSON payload compliant with the MCP specification.

    Deliverables

    • MCP Server v0.1 A running Python web server (e.g., using FastAPI) with at least one tool.
    • A README.md and a test script (e.g., curl commands or a simple notebook) showing how an AI agent would call the endpoint and the resulting JSON payload.

    Outcome A functional and testable API endpoint that proves the core concept: translating complex StackState data into a simple, LLM-ready format. This provides the foundation for developing AI-driven diagnostics and automated remediation.

    Resources

    • https://www.honeycomb.io/blog/its-the-end-of-observability-as-we-know-it-and-i-feel-fine
    • https://www.datadoghq.com/blog/datadog-remote-mcp-server
    • https://modelcontextprotocol.io/specification/2025-06-18/index
    • https://modelcontextprotocol.io/docs/develop/build-server

     Basic implementation

    • https://github.com/drutigliano19/suse-observability-mcp-server


    pudc - A PID 1 process that barks to the internet by mssola

    Description

    As a fun exercise in order to dig deeper into the Linux kernel, its interfaces, the RISC-V architecture, and all the dragons in between; I'm building a blog site cooked like this:

    • The backend is written in a mixture of C and RISC-V assembly.
    • The backend is actually PID1 (for real, not within a container).
    • We poll and parse incoming HTTP requests ourselves.
    • The frontend is a mere HTML page with htmx.

    The project is meant to be Linux-specific, so I'm going to use io_uring, pidfs, namespaces, and Linux-specific features in order to drive all of this.

    I'm open for suggestions and so on, but this is meant to be a solo project, as this is more of a learning exercise for me than anything else.

    Goals

    • Have a better understanding of different Linux features from user space down to the kernel internals.
    • Most importantly: have fun.

    Resources


    Port OTPClient to GTK >= 4.18 by pstivanin

    Project Description

    OTPClient is currently using GTK3 and cannot easily be ported to GTK4. Since GTK4 came out, there have been quite some big changes. Also, there are now some new deprecation that will take effect with GTK5 (and are active starting from 4.10 as warnings), so I need to think ahead and port OTPClient without using any of those deprecated features.

    Goal for this Hackweek

    • fix the last 3 opened issues (https://github.com/paolostivanin/OTPClient/issues/402, https://github.com/paolostivanin/OTPClient/issues/404, https://github.com/paolostivanin/OTPClient/issues/406) and release a new version
    • continue the rewrite from where we left last year
    • if possible, finally close this 6 years old issue: https://github.com/paolostivanin/OTPClient/issues/123


    Bring to Cockpit + System Roles capabilities from YAST by miguelpc

    Bring to Cockpit + System Roles features from YAST

    Cockpit and System Roles have been added to SLES 16 There are several capabilities in YAST that are not yet present in Cockpit and System Roles We will follow the principle of "automate first, UI later" being System Roles the automation component and Cockpit the UI one.

    Goals

    The idea is to implement service configuration in System Roles and then add an UI to manage these in Cockpit. For some capabilities it will be required to have an specific Cockpit Module as they will interact with a reasource already configured.

    Resources

    A plan on capabilities missing and suggested implementation is available here: https://docs.google.com/spreadsheets/d/1ZhX-Ip9MKJNeKSYV3bSZG4Qc5giuY7XSV0U61Ecu9lo/edit

    Linux System Roles: https://linux-system-roles.github.io/


    Improve chore and screen time doc generator script `wochenplaner` by gniebler

    Description

    I wrote a little Python script to generate PDF docs, which can be used to track daily chore completion and screen time usage for several people, with one page per person/week.

    I named this script wochenplaner and have been using it for a few months now.

    It needs some improvements and adjustments in how the screen time should be tracked and how chores are displayed.

    Goals

    • Fix chore field separation lines
    • Change screen time tracking logic from "global" (week-long) to daily subtraction and weekly addition of remainders (more intuitive than current "weekly time budget method)
    • Add logic to fill in chore fields/lines, ideally with pictures, falling back to text.

    Resources

    tbd (Gitlab repo)


    Enhance git-sha-verify: A tool to checkout validated git hashes by gpathak

    Description

    git-sha-verify is a simple shell utility to verify and checkout trusted git commits signed using GPG key. This tool helps ensure that only authorized or validated commit hashes are checked out from a git repository, supporting better code integrity and security within the workflow.

    Supports:

    • Verifying commit authenticity signed using gpg key
    • Checking out trusted commits

    Ideal for teams and projects where the integrity of git history is crucial.

    Goals

    A minimal python code of the shell script exists as a pull request.

    The goal of this hackweek is to:

    • Add more unit tests
    • Make the python code modular
    • Add code coverage if possible

    Resources


    Improve/rework household chore tracker `chorazon` by gniebler

    Description

    I wrote a household chore tracker named chorazon, which is meant to be deployed as a web application in the household's local network.

    It features the ability to set up different (so far only weekly) schedules per task and per person, where tasks may span several days.

    There are "tokens", which can be collected by users. Tasks can (and usually will) have rewards configured where they yield a certain amount of tokens. The idea is that they can later be redeemed for (surprise) gifts, but this is not implemented yet. (So right now one needs to edit the DB manually to subtract tokens when they're redeemed.)

    Days are not rolled over automatically, to allow for task completion control.

    We used it in my household for several months, with mixed success. There are many limitations in the system that would warrant a revisit.

    It's written using the Pyramid Python framework with URL traversal, ZODB as the data store and Web Components for the frontend.

    Goals

    • Add admin screens for users, tasks and schedules
    • Add models, pages etc. to allow redeeming tokens for gifts/surprises
    • …?

    Resources

    tbd (Gitlab repo)


    Update M2Crypto by mcepl

    There are couple of projects I work on, which need my attention and putting them to shape:

    Goal for this Hackweek

    • Put M2Crypto into better shape (most issues closed, all pull requests processed)
    • More fun to learn jujutsu
    • Play more with Gemini, how much it help (or not).
    • Perhaps, also (just slightly related), help to fix vis to work with LuaJIT, particularly to make vis-lspc working.


    RMT.rs: High-Performance Registration Path for RMT using Rust by gbasso

    Description

    The SUSE Repository Mirroring Tool (RMT) is a critical component for managing software updates and subscriptions, especially for our Public Cloud Team (PCT). In a cloud environment, hundreds or even thousands of new SUSE instances (VPS/EC2) can be provisioned simultaneously. Each new instance attempts to register against an RMT server, creating a "thundering herd" scenario.

    We have observed that the current RMT server, written in Ruby, faces performance issues under this high-concurrency registration load. This can lead to request overhead, slow registration times, and outright registration failures, delaying the readiness of new cloud instances.

    This Hackweek project aims to explore a solution by re-implementing the performance-critical registration path in Rust. The goal is to leverage Rust's high performance, memory safety, and first-class concurrency handling to create an alternative registration endpoint that is fast, reliable, and can gracefully manage massive, simultaneous request spikes.

    The new Rust module will be integrated into the existing RMT Ruby application, allowing us to directly compare the performance of both implementations.

    Goals

    The primary objective is to build and benchmark a high-performance Rust-based alternative for the RMT server registration endpoint.

    Key goals for the week:

    1. Analyze & Identify: Dive into the SUSE/rmt Ruby codebase to identify and map out the exact critical path for server registration (e.g., controllers, services, database interactions).
    2. Develop in Rust: Implement a functionally equivalent version of this registration logic in Rust.
    3. Integrate: Explore and implement a method for Ruby/Rust integration to "hot-wire" the new Rust module into the RMT application. This may involve using FFI, or libraries like rb-sys or magnus.
    4. Benchmark: Create a benchmarking script (e.g., using k6, ab, or a custom tool) that simulates the high-concurrency registration load from thousands of clients.
    5. Compare & Present: Conduct a comparative performance analysis (requests per second, latency, success/error rates, CPU/memory usage) between the original Ruby path and the new Rust path. The deliverable will be this data and a summary of the findings.

    Resources

    • RMT Source Code (Ruby):
      • https://github.com/SUSE/rmt
    • RMT Documentation:
      • https://documentation.suse.com/sles/15-SP7/html/SLES-all/book-rmt.html
    • Tooling & Stacks:
      • RMT/Ruby development environment (for running the base RMT)
      • Rust development environment (rustup, cargo)
    • Potential Integration Libraries:
      • rb-sys: https://github.com/oxidize-rb/rb-sys
      • Magnus: https://github.com/matsadler/magnus
    • Benchmarking Tools:
      • k6 (https://k6.io/)
      • ab (ApacheBench)