Project Description

For a few months now, the openSUSE community has been able to host the openSUSE RPM repositories on a commercial CDN, and the rollout is slowly progressing. However, potential bottlenecks and optimisation opportunities remain. My goal for this Hack Week is to investigate them and make reasonable progress on resolving them.

Topics that are in scope and being investigated:

  • Switch repository metadata from gzip to the more modern zstd (faster to decompress, smaller files)
  • Investigate zypper ref performance overheads
  • Understand zsync and zchunk and benchmark their tradeoffs
  • Leverage the CDN for delivering repository metadata and a mirrorlist, so that round trips to download.o.org are reduced for users outside Europe
  • Investigate the performance of HTTP/3 to see whether it would benefit us
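
The idea behind chunk-based formats such as zsync and zchunk can be illustrated with a minimal sketch: the client compares per-chunk checksums of the metadata it already holds against those of the new file and downloads only the chunks it is missing. The function names, the fixed 4096-byte chunk size and the use of SHA-256 below are assumptions made for illustration only; the real tools choose chunk boundaries and checksums differently.

```python
import hashlib


def chunk_digests(data: bytes, chunk_size: int = 4096) -> list[str]:
    """Split data into fixed-size chunks and return one SHA-256 digest per chunk."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def bytes_to_fetch(old: bytes, new: bytes, chunk_size: int = 4096) -> int:
    """Return how many bytes of `new` sit in chunks the client does not already hold."""
    known = set(chunk_digests(old, chunk_size))
    missing = 0
    for i in range(0, len(new), chunk_size):
        chunk = new[i:i + chunk_size]
        if hashlib.sha256(chunk).hexdigest() not in known:
            missing += len(chunk)
    return missing


# Hypothetical metadata: ten distinct chunks, then the same file with one chunk changed.
old = b"".join(bytes([i]) * 4096 for i in range(10))
new = old[:4096 * 3] + b"\xff" * 4096 + old[4096 * 4:]
print(bytes_to_fetch(old, new))  # 4096
```

With fixed-size chunks, inserting a single byte near the start of the file shifts every later chunk boundary and defeats the reuse, which is one of the tradeoffs such a benchmark would need to look at.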

Goal for this Hackweek

Focus on openSUSE Tumbleweed

Looking for hackers with the skills:

opensuse cdn zypper rpm performance benchmark

This project is part of:

Hack Week 23

Activity

  • almost 2 years ago: mlschroe joined this project.
  • almost 2 years ago: favogt joined this project.
  • almost 2 years ago: favogt liked this project.
  • about 2 years ago: dirkmueller added keyword "cdn" to this project.
  • about 2 years ago: dirkmueller added keyword "zypper" to this project.
  • about 2 years ago: dirkmueller added keyword "rpm" to this project.
  • about 2 years ago: dirkmueller added keyword "performance" to this project.
  • about 2 years ago: dirkmueller added keyword "benchmark" to this project.
  • about 2 years ago: dirkmueller added keyword "opensuse" to this project.
  • about 2 years ago: dirkmueller started this project.
  • about 2 years ago: dirkmueller originated this project.

  • Comments

    • dirkmueller
      about 2 years ago by dirkmueller

      It turns out that createrepo_c was already preparing the switch to Zstd and Zchunk, so the bulk of the work has been to fix various places in the Open Build Service and product building logic to handle that. The submissions are planned to go live on Nov 13th.

      Overall, this provides a 10-30% reduction in download size for repository data and removes a few hundred ms of decompression time.

    • dirkmueller
      about 2 years ago by dirkmueller

      @mlschroe made a patch to libsolv that removes ~500 ms from the parsing time for the Tumbleweed repositories: https://github.com/openSUSE/libsolv/commit/23cbed3219bd07b5c3fa1ed8a6f2fa6c478c0fdb

    • dirkmueller
      about 2 years ago by dirkmueller

      I've spent several days profiling and tuning zchunk support for the Tumbleweed use case (updating from Tumbleweed snapshots) and made some upstream contributions to createrepo_c and zchunk to allow for these tunings.

    • dirkmueller
      about 2 years ago by dirkmueller

      @mlschroe worked on reviving zsync support in libzypp's default multicurl implementation.

    • dirkmueller
      about 2 years ago by dirkmueller

      I've submitted the changes necessary to enable curl with HTTP/3 support. However, this requires either switching to GnuTLS (which the security team doesn't like) or including the quictls patches for OpenSSL. I submitted the latter and am waiting for maintainer review.

    Similar Projects

    Create openSUSE images for Arm/RISC-V boards by avicenzi

    Project Description

    Create openSUSE images (or test generic EFI images) for Arm and/or RISC-V boards that are not yet supported.

    Goal for this Hackweek

    Create bootable images of Tumbleweed for SBCs that currently have no images available or are untested.

    Consider generic EFI images where possible, as some boards can hold a bootloader.

    Document in the openSUSE Wiki how to flash and use the image for a given board.

    Hack Week 22

    Hack Week 21

    Resources


    A CLI for Harvester by mohamed.belgaied

    Harvester does not officially come with a CLI tool; the user is expected to interact with Harvester mostly through the UI. Although it is theoretically possible to use kubectl to interact with Harvester, manipulating KubeVirt YAML objects is not user-friendly at all. Inspired by tools like Canonical's multipass, which easily and rapidly create one or multiple VMs, I began developing Harvester CLI. It currently works, but it needs some love to be up to date with Harvester v1.0.2, as well as some bug fixes and improvements.

    Project Description

    Harvester CLI is a command line interface tool written in Go, designed to simplify interfacing with a Harvester cluster as a user. It is especially useful for testing purposes as you can easily and rapidly create VMs in Harvester by providing a simple command such as: harvester vm create my-vm --count 5 to create 5 VMs named my-vm-01 to my-vm-05.
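
The naming scheme in that example (a base name plus a zero-padded, two-digit index) can be sketched as follows. This is an illustration only, not the tool's actual Go code; the function name `vm_names` is hypothetical, and whether the real CLI also appends a suffix when only one VM is requested is an assumption.

```python
def vm_names(base: str, count: int) -> list[str]:
    """Return `count` VM names of the form <base>-01, <base>-02, ...

    Hypothetical helper mirroring the naming shown for `--count 5`.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return [f"{base}-{index:02d}" for index in range(1, count + 1)]


print(vm_names("my-vm", 5))
```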

    Harvester CLI is functional but needs a number of improvements: bringing its functionality up to date with Harvester v1.0.2 (some minor issues right now), changing the default behaviour to create an openSUSE VM instead of an Ubuntu VM, fixing some bugs, etc.

    GitHub repo for Harvester CLI: https://github.com/belgaied2/harvester-cli

    Done in previous Hackweeks

    • Create a GitHub Actions pipeline to automatically integrate Harvester CLI into Homebrew repositories: DONE
    • Automatically package Harvester CLI as openSUSE / Red Hat RPMs or DEBs: DONE

    Goal for this Hackweek

    The goal for this Hackweek is to bring Harvester CLI up to speed with the latest Harvester versions (v1.3.X and v1.4.X), improve the code quality, and implement some simple features and bug fixes.

    Some nice additions might be:

    • Improve handling of namespaced objects
    • Add features, such as network management or Load Balancer creation?
    • Add more unit tests and, why not, e2e tests
    • Improve CI
    • Improve the overall code quality
    • Test the program and create issues for it

    Issue list is here: https://github.com/belgaied2/harvester-cli/issues

    Resources

    The project is written in Go and uses client-go, the Kubernetes Go client library, to communicate with the Harvester API (which is in fact Kubernetes). Contributions are welcome in the following areas:

    • Testing it and creating issues
    • Documentation
    • Go code improvement

    What you might learn

    Harvester CLI might be interesting to you if you want to learn more about:

    • GitHub Actions
    • Harvester as a SUSE Product
    • Go programming language
    • Kubernetes API
    • KubeVirt API objects (manipulating VMs and VM configuration in Kubernetes using KubeVirt)


    RMT.rs: High-Performance Registration Path for RMT using Rust by gbasso

    Description

    The SUSE Repository Mirroring Tool (RMT) is a critical component for managing software updates and subscriptions, especially for our Public Cloud Team (PCT). In a cloud environment, hundreds or even thousands of new SUSE instances (VPS/EC2) can be provisioned simultaneously. Each new instance attempts to register against an RMT server, creating a "thundering herd" scenario.

    We have observed that the current RMT server, written in Ruby, faces performance issues under this high-concurrency registration load. This can lead to request overhead, slow registration times, and outright registration failures, delaying the readiness of new cloud instances.

    This Hackweek project aims to explore a solution by re-implementing the performance-critical registration path in Rust. The goal is to leverage Rust's high performance, memory safety, and first-class concurrency handling to create an alternative registration endpoint that is fast, reliable, and can gracefully manage massive, simultaneous request spikes.

    The new Rust module will be integrated into the existing RMT Ruby application, allowing us to directly compare the performance of both implementations.

    Goals

    The primary objective is to build and benchmark a high-performance Rust-based alternative for the RMT server registration endpoint.

    Key goals for the week:

    1. Analyze & Identify: Dive into the SUSE/rmt Ruby codebase to identify and map out the exact critical path for server registration (e.g., controllers, services, database interactions).
    2. Develop in Rust: Implement a functionally equivalent version of this registration logic in Rust.
    3. Integrate: Explore and implement a method for Ruby/Rust integration to "hot-wire" the new Rust module into the RMT application. This may involve using FFI, or libraries like rb-sys or magnus.
    4. Benchmark: Create a benchmarking script (e.g., using k6, ab, or a custom tool) that simulates the high-concurrency registration load from thousands of clients.
    5. Compare & Present: Conduct a comparative performance analysis (requests per second, latency, success/error rates, CPU/memory usage) between the original Ruby path and the new Rust path. The deliverable will be this data and a summary of the findings.
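
As a rough illustration of the "custom tool" option in steps 4 and 5, the sketch below fires many concurrent calls at a registration function and summarises success rate, throughput and latency. It is a minimal sketch under stated assumptions: `run_load` and `fake_register` are hypothetical names, and `fake_register` is a stand-in that sleeps briefly instead of calling a real RMT endpoint. A real benchmark would replace it with an HTTP request and would more likely use k6 or ab for very high client counts.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(register, clients: int, workers: int = 64) -> dict:
    """Call `register(client_id)` once per simulated client and summarise the run."""

    def timed_call(client_id: int):
        # Time a single call and record whether it raised an exception.
        start = time.perf_counter()
        try:
            register(client_id)
            ok = True
        except Exception:
            ok = False
        return ok, time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(timed_call, range(clients)))
    wall = time.perf_counter() - wall_start

    latencies = sorted(elapsed for _, elapsed in results)
    successes = sum(1 for ok, _ in results if ok)
    return {
        "requests": clients,
        "success_rate": successes / clients,
        "requests_per_second": clients / wall,
        "median_latency_s": statistics.median(latencies),
        # Simple index-based approximation of the 95th percentile.
        "p95_latency_s": latencies[int(0.95 * (len(latencies) - 1))],
    }


def fake_register(client_id: int) -> None:
    """Hypothetical stand-in for a registration request; every tenth client fails."""
    time.sleep(0.001)
    if client_id % 10 == 0:
        raise RuntimeError("simulated registration failure")


if __name__ == "__main__":
    print(run_load(fake_register, clients=200, workers=32))
```

Running the same harness against the Ruby path and the Rust path with identical client counts would give directly comparable numbers for the metrics listed in step 5, apart from CPU and memory usage, which need to be measured on the server side.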

    Resources

    • RMT Source Code (Ruby):
      • https://github.com/SUSE/rmt
    • RMT Documentation:
      • https://documentation.suse.com/sles/15-SP7/html/SLES-all/book-rmt.html
    • Tooling & Stacks:
      • RMT/Ruby development environment (for running the base RMT)
      • Rust development environment (rustup, cargo)
    • Potential Integration Libraries:
      • rb-sys: https://github.com/oxidize-rb/rb-sys
      • Magnus: https://github.com/matsadler/magnus
    • Benchmarking Tools:
      • k6 (https://k6.io/)
      • ab (ApacheBench)