This is a follow up to https://hackweek.suse.com/projects/architecting-a-machine-learning-project-with-suse-caasp.

In the last Hack Week I learned that the missing piece for running machine learning workflows on top of SUSE CaaSP is packaging libnvidia-container and nvidia-container-runtime-hook.

Since then, NVIDIA has added Leap 15 builds to libnvidia-container and nvidia-container-runtime.

However, neither package has been released in the libnvidia-container or nvidia-container-runtime repositories.

This project is about packaging those two projects in the openSUSE Build Service for openSUSE Leap 15.1.

Looking for hackers with the skills:

nvidia machinelearning containers

This project is part of:

Hack Week 19

Activity

  • almost 6 years ago: drdavis liked this project.
  • almost 6 years ago: afesta liked this project.
  • almost 6 years ago: jordimassaguerpla added keyword "nvidia" to this project.
  • almost 6 years ago: jordimassaguerpla added keyword "machinelearning" to this project.
  • almost 6 years ago: jordimassaguerpla added keyword "containers" to this project.
  • almost 6 years ago: a_faerber liked this project.
  • almost 6 years ago: jordimassaguerpla started this project.
  • almost 6 years ago: jordimassaguerpla originated this project.

  • Comments

    • jordimassaguerpla
      almost 6 years ago by jordimassaguerpla | Reply

      First package ready: https://build.opensuse.org/package/show/home:jordimassaguerpla:nvidia_container/libnvidia-container

      And a pull request upstream: https://github.com/NVIDIA/libnvidia-container/pull/77

    • jordimassaguerpla
      almost 6 years ago by jordimassaguerpla | Reply

      Second package ready: https://build.opensuse.org/package/show/home:jordimassaguerpla:nvidia_container/nvidia-container-runtime-toolkit

    • jordimassaguerpla
      almost 6 years ago by jordimassaguerpla | Reply

      Proof that this worked:

      On a workstation with Quadro K2000 with SLE15SP1:

      Installing nvidia graphics driver kernel module

      zypper ar https://download.nvidia.com/suse/sle15sp1/ nvidia
      zypper ref
      zypper install nvidia-gfxG05-kmp-default
      modprobe nvidia
      lsmod | grep nvidia
      

      Expected output:

      nvidia_drm             49152  0
      nvidia_modeset       1114112  1 nvidia_drm
      drm_kms_helper        204800  1 nvidia_drm
      drm                   536576  3 nvidia_drm,drm_kms_helper
      nvidia_uvm           1036288  0
      nvidia              20414464  2 nvidia_modeset,nvidia_uvm
      ipmi_msghandler       110592  2 nvidia,ipmi_devintf
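
The module check above can be automated. The following is a minimal sketch, not part of the original walkthrough: the helper name `missing_nvidia_modules` is hypothetical, and the list of required modules is an assumption taken from the expected output shown here.

```shell
# Print every required module that is absent from lsmod-style text on stdin.
# The module list is an assumption based on the expected output above.
missing_nvidia_modules() {
    awk -v required="nvidia nvidia_uvm nvidia_modeset nvidia_drm" '
        { loaded[$1] = 1 }            # first column of lsmod is the module name
        END {
            n = split(required, mods, " ")
            for (i = 1; i <= n; i++)
                if (!(mods[i] in loaded)) print mods[i]
        }'
}

# Usage on a real system:
#   lsmod | missing_nvidia_modules
```

If nothing is printed, all four modules were found in the input.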
      

      Installing NVIDIA driver for computing with GPUs using CUDA

      zypper install nvidia-computeG05
      nvidia-smi
      

      Expected output:

      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 440.59       Driver Version: 440.59       CUDA Version: 10.2     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |===============================+======================+======================|
      |   0  Quadro K2000        Off  | 00000000:05:00.0 Off |                  N/A |
      | 30%   43C    P0    N/A /  N/A |      0MiB /  1997MiB |      0%      Default |
      +-------------------------------+----------------------+----------------------+
      
      +-----------------------------------------------------------------------------+
      | Processes:                                                       GPU Memory |
      |  GPU       PID   Type   Process name                             Usage      |
      |=============================================================================|
      |  No running processes found                                                 |
      +-----------------------------------------------------------------------------+
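
For scripted checks it can help to pull the driver version out of the banner. This is a small sketch that parses banner text like the one above; the function name `driver_version` is hypothetical, and the parsing assumes the banner format shown here.

```shell
# Extract the "Driver Version" value from nvidia-smi banner text on stdin.
driver_version() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage on a real system:
#   nvidia-smi | driver_version
```

On systems where it is available, `nvidia-smi --query-gpu=driver_version --format=csv,noheader` reports the same value without any text parsing.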
      

      Installing libnvidia-container

      zypper ar https://download.opensuse.org/repositories/home:/jordimassaguerpla:/nvidia_container/SLE_15_SP1/ nvidia_container
      zypper install libnvidia-container
      # -a appends; without it, each -G call replaces the previous supplementary groups
      usermod -a -G root,video USER
      

      USER should be a non-root user on your system.
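
Before running the test it may be worth confirming that the group change took effect. This is a minimal sketch, assuming the standard `id -nG` output format (group names separated by spaces); the helper name `has_group` is hypothetical.

```shell
# Succeed if group $1 appears in the space-separated group list $2.
has_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage on a real system (USER is a placeholder, as above):
#   has_group video "$(id -nG USER)" && echo "USER is in the video group"
```

Note that group changes made with `usermod` only apply to new login sessions, which is one reason the test below uses `su -`.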

      su - USER -c "nvidia-container-cli info"
      

      Expected output:

      NVRM version:   440.59
      CUDA version:   10.2
      
      Device Index:   0
      Device Minor:   0
      Model:          Quadro K2000
      Brand:          Quadro
      GPU UUID:       GPU-6a04b812-c20e-aeb6-9047-6382930eef7d
      Bus Location:   00000000:05:00.0
      Architecture:   3.0
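
To use single values from this report in a script, the "Key:   value" layout can be parsed with awk. This is a sketch under the assumption that the output keeps the layout shown above; `info_field` is a hypothetical helper name.

```shell
# Print the value of field $1 from "Key:   value" text on stdin.
info_field() {
    awk -F': +' -v key="$1" '
        { sub(/^[ \t]+/, "") }           # tolerate indented input
        $1 == key { print $2; exit }     # first match wins
    '
}

# Usage on a real system:
#   su - USER -c "nvidia-container-cli info" | info_field "CUDA version"
```

The separator is a colon followed by at least one space, so values that contain bare colons, such as the bus location, stay intact.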
      

      NOTE: this test must run as a non-root user because the root user does not run with the video group by default. We will fix this later when installing the toolkit. If you use root, you will see this error message:

      nvidia-container-cli: initialization error: cuda error: no cuda-capable device is detected
      

      Installing nvidia-container-toolkit

      zypper install nvidia-container-toolkit
      

      Test with podman

      zypper install podman podman-cni-config
      podman run nvidia/cuda nvidia-smi
      

      Expected output:

      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 440.59       Driver Version: 440.59       CUDA Version: 10.2     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |===============================+======================+======================|
      |   0  Quadro K2000        Off  | 00000000:05:00.0 Off |                  N/A |
      | 30%   43C    P0    N/A /  N/A |      0MiB /  1997MiB |      0%      Default |
      +-------------------------------+----------------------+----------------------+
      
      +-----------------------------------------------------------------------------+
      | Processes:                                                       GPU Memory |
      |  GPU       PID   Type   Process name                             Usage      |
      |=============================================================================|
      |  No running processes found                                                 |
      +-----------------------------------------------------------------------------+
      

      So it works!

    • jordimassaguerpla
      almost 6 years ago by jordimassaguerpla | Reply

      As a result, I updated the docs: https://github.com/jordimassaguerpla/SUSEhackweek18/commit/5fca6c12034b4df34c403f14276be754e809b086#diff-2df0241dfedf44f37dcafae751ab29ae

    • jordimassaguerpla
      almost 6 years ago by jordimassaguerpla | Reply

      The previous link got broken ... damn markdown ;) docs

    • jordimassaguerpla
      almost 6 years ago by jordimassaguerpla | Reply

      Upstream (NVIDIA) uses Dockerfiles to build the packages for the other distros.

      Here is a small experiment in building the openSUSE Leap RPM with a Dockerfile within OBS:

      https://build.opensuse.org/package/show/home:jordimassaguerpla:branches:openSUSE:Templates:Images:15.1/libnvidia-containers

    • jordimassaguerpla
      almost 6 years ago by jordimassaguerpla | Reply

      Result of the experiment: using a Dockerfile works very well, because you can develop and debug using "docker build" and then commit that to OBS to have a build in a central location, store the sources, etc.

      The issue is that the result is an image; it can't be the RPM. There is no "-v" option to mount a volume during the build. Thus, even though you can build the image in OBS, you then have to run the image to extract the RPM.

      OBS = build.opensuse.org.
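
One way to do that extraction step, sketched here and not taken from the original experiment, is `docker create` followed by `docker cp`, which copies a file out of the image without starting its entrypoint. The helper names, the default container name, and all paths in the usage line are hypothetical. The sketch also assumes the image defines a default command, which `docker create` needs. This still happens outside OBS, so it does not remove the limitation described above.

```shell
# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy path $2 from image $1 to local path $3.
# $4 is an optional temporary container name (hypothetical default).
extract_from_image() {
    image=$1 src=$2 dest=$3
    name=${4:-rpm-extract-$$}
    run docker create --name "$name" "$image" || return 1
    run docker cp "$name:$src" "$dest"
    status=$?
    run docker rm "$name"              # always clean up the container
    return "$status"
}

# Usage (image name and paths are placeholders):
#   extract_from_image my-build-image /dist/package.rpm ./out
```

Setting `DRY_RUN=1` prints the three docker commands instead of running them, which makes it easy to review them first.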

    Similar Projects

    Song Search with CLAP by gcolangiuli

    Description

    Contrastive Language-Audio Pretraining (CLAP) is an open-source library that enables training a neural network on both audio and text descriptions, making it possible to search for audio using text input. Several pre-trained models for song search are already available on Hugging Face.

    SUSE Hackweek AI Song Search

    Goals

    Evaluate how CLAP can be used for song searching and determine which types of queries yield the best results by developing a Minimum Viable Product (MVP) in Python. Based on the results of this MVP, future steps could include:

    • Music Tagging;
    • Free text search;
    • Integration with an LLM (for example, with MCP or the OpenAI API) for music suggestions based on your own library.

    The code for this project will be entirely written using AI to better explore and demonstrate AI capabilities.

    Result

    In this MVP we implemented:

    • Async song analysis with the CLAP model
    • Free Text Search of the songs
    • Similar song search based on vector representation
    • Containerised version with web interface

    We also documented what went well and what can be improved in the use of AI.

    You can have a look at the result here:

    Future work could focus on improving the performance and stability of the analysis.


    Help Create A Chat Control Resistant Turnkey Chatmail/Deltachat Relay Stack - Rootless Podman Compose, OpenSUSE BCI, Hardened, & SELinux by 3nd5h1771fy

    Description

    The Mission: Decentralized & Sovereign Messaging

    FYI: If you have never heard of "Chatmail", you can visit their site here, but simply put, it can be thought of as the underlying protocol/platform that decentralized messengers like DeltaChat use for their communications. Do not confuse it with the honeypot-looking, non-open-source, paid product with better SEO that directs you to chatmailsecure(dot)com.

    In an era of increasing centralized surveillance by unaccountable bad actors (aka BigTech), "Chat Control," and the erosion of digital privacy, the need for sovereign communication infrastructure is critical. Chatmail is a pioneering initiative that bridges the gap between classic email and modern instant messaging, offering metadata-minimized, end-to-end encrypted (E2EE) communication that is interoperable and open.

    However, unless you are a seasoned sysadmin, the current recommended deployment method of a Chatmail relay is rigid, fragile, difficult to properly secure, and effectively takes over the entire host the "relay" is deployed on.

    Why This Matters

    A simple, host-agnostic, reproducible deployment lowers the entry cost for anyone wanting to run a privacy‑preserving, decentralized messaging relay. In an era of perpetually resurrected chat‑control legislation threats, EU digital‑sovereignty drives, and the many dangers of using big‑tech messaging platforms (Apple iMessage, WhatsApp, FB Messenger, Instagram, SMS, Google Messages, etc.) for any type of communication, providing an easy‑to‑use alternative empowers:

    • Censorship resistance - No single entity controls the relay; operators can spin up new nodes quickly.
    • Surveillance mitigation - End‑to‑end OpenPGP encryption ensures relay operators never see plaintext.
    • Digital sovereignty - Communities can host their own infrastructure under local jurisdiction, aligning with national data‑policy goals.

    By turning the Chatmail relay into a plug‑and‑play container stack, we enable broader adoption, foster a resilient messaging fabric, and give developers, activists, and hobbyists a concrete tool to defend privacy online.

    Goals

    As indicated earlier, this project aims to drastically simplify the deployment of a Chatmail relay. By converting this architecture into a portable, containerized stack using Podman and openSUSE base container images, we can allow anyone to deploy their own censorship-resistant, privacy-preserving communications node in minutes.

    Our goal for Hack Week: package every component into containers built on openSUSE/MicroOS base images, initially orchestrated with a single container-compose.yml (podman-compose compatible). The stack will:

    • Run on any host that supports Podman (including optimizations and enhancements for SELinux‑enabled systems).
    • Allow network decoupling by refactoring configurations to move from file-system-constrained Unix sockets to internal TCP networking, allowing containers to achieve stricter isolation.
    • Utilize enhanced security with SELinux: with purpose-built utilities such as udica, we can quickly generate custom SELinux policies for the container stack, ensuring strict confinement superior to typical Docker deployments.
    • Allow the use of bind or remote mounted volumes for shared data (/var/vmail, DKIM keys, TLS certs, etc.).
    • Replace the local DNS server requirement with a remote DNS‑provider API for DKIM/TXT record publishing.

    By delivering a turnkey, host-agnostic, reproducible deployment, we lower the barrier for individuals and small communities to launch their own Chatmail relays, fostering a decentralized, censorship‑resistant messaging ecosystem that can serve DeltaChat users and/or future services adopting this protocol.


    Rewrite Distrobox in go (POC) by fabriziosestito

    Description

    Rewriting Distrobox in Go.

    Main benefits:

    • Easier to maintain and to test
    • Adapter pattern for different container backends (LXC, systemd-nspawn, etc.)

    Goals

    • Build a minimal starting point with core commands
    • Keep the CLI interface compatible: existing users shouldn't notice any difference
    • Use a clean Go architecture with adapters for different container backends
    • Keep dependencies minimal and binary size small
    • Benchmark against the original shell script

    Resources

    • Upstream project: https://github.com/89luca89/distrobox/
    • Distrobox site: https://distrobox.it/
    • ArchWiki: https://wiki.archlinux.org/title/Distrobox


    Port the classic browser game HackTheNet to PHP 8 by dgedon

    Description

    The classic browser game HackTheNet from 2004 still runs on PHP 4/5 and MySQL 5 and needs a port to PHP 8 and e.g. MariaDB.

    Goals

    • Port the game to PHP 8 and MariaDB 11
    • Create a container where the game server can simply be started/stopped

    Resources

    • https://github.com/nodeg/hackthenet


    Technical talks at universities by agamez

    Description

    This project aims to empower the next generation of tech professionals by offering hands-on workshops on containerization and Kubernetes, with a strong focus on open-source technologies. By providing practical experience with these cutting-edge tools and fostering a deep understanding of open-source principles, we aim to bridge the gap between academia and industry.

    For now, the scope is limited to Spanish universities, since we already have the contacts and have started some conversations.

    Goals

    • Technical Skill Development: equip students with the fundamental knowledge and skills to build, deploy, and manage containerized applications using open-source tools like Kubernetes.
    • Open-Source Mindset: foster a passion for open-source software, encouraging students to contribute to open-source projects and collaborate with the global developer community.
    • Career Readiness: prepare students for industry-relevant roles by exposing them to real-world use cases, best practices, and the use of open source in companies.

    Resources

    • Instructors: experienced open-source professionals with deep knowledge of containerization and Kubernetes.
    • SUSE Expertise: leverage SUSE's expertise in open-source technologies to provide insights into industry trends and best practices.