Introduction

TensorFlow™ is an open-source software library for Machine Intelligence written in Python. It was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well. (https://www.tensorflow.org/)

Using values recorded by SUSE Manager, it should be possible to predict the outcome of certain operations if machine learning is applied. We are especially interested in the time it takes to apply patches to systems. A neural network should be trained on historical values to predict this for future operations. We need to find out which values can and should be provided, which classifier(s) to use, and so on.

Goals:

  • Monday:

    • Learn about TensorFlow: definitions, how to create a model, the different frameworks, etc.
    • Define the set of features that can be gathered from the SUSE Manager DB to create our dataset.
    • Explore the values of the dataset: min-max values, boundaries, type of data (categorical, continuous).
    • Define crossed relations between features (crossed columns); see the feature-column sketch after this list.
    • Is our dataset good enough?
  • Tuesday:

    • Create and test different TensorFlow models: DNNLinearCombinedClassifier, DNNClassifier, etc.
    • Are those models' estimations good enough?
    • Is TensorFlow suitable for achieving the project goal? Are the estimations good enough for us?
    • Upload a working example.
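
As a concrete illustration of the feature-column work above, here is a minimal sketch using the TensorFlow 1.x tf.feature_column API. The column names come from the dataset described in the outcomes below; the hash bucket sizes are illustrative assumptions, not values taken from the project.

    import tensorflow as tf

    # Categorical identifiers, hashed into a fixed number of buckets.
    # Bucket sizes are guesses; they should be tuned to the real id ranges.
    server_id = tf.feature_column.categorical_column_with_hash_bucket(
        "server_id", hash_bucket_size=1000)
    errata_id = tf.feature_column.categorical_column_with_hash_bucket(
        "errata_id", hash_bucket_size=10000)
    package_id = tf.feature_column.categorical_column_with_hash_bucket(
        "package_id", hash_bucket_size=100000)

    # Continuous hardware features.
    nrcpu = tf.feature_column.numeric_column("nrcpu")
    mhz = tf.feature_column.numeric_column("mhz")
    ram = tf.feature_column.numeric_column("ram")

    # A crossed column models the interaction between server and package:
    # the same package may take a different time on different machines.
    server_x_package = tf.feature_column.crossed_column(
        ["server_id", "package_id"], hash_bucket_size=1000000)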

Outcomes:

  • The initial dataset was not good enough, so we modified the SQL query to also collect package IDs.
  • At first we restricted the dataset to actions for erratas containing only one package, but the resulting dataset was not big enough.
  • We implemented a DNNRegressor (a sketch follows this list).
  • Dataset: COLUMNS = ["server_id","errata_id","nrcpu","mhz","ram","package_id","size","time"] (currently we only use server_id, errata_id and package_id).
  • The dataset is currently based on patch installation actions that contain a single errata, although that errata can have multiple packages associated.
  • We don't know the installation time per package, because the "time" value we have covers the complete action, so we make a rough estimate by dividing the total time by the number of packages the errata contains.
  • The estimations seem to be good enough, although the dataset still needs to be improved, as does the model itself, where the feature column definitions can be adjusted to get better results.
  • Current estimations are at least good enough to tell whether the action you are planning is going to take ~10 seconds, ~30 seconds, ~1 minute, ~5 minutes, etc.
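
The following is a minimal, hypothetical sketch of what such a DNNRegressor setup could look like with the TensorFlow 1.x estimator API, including the rough per-package time estimate described above. The file name, the grouping used to count packages per action, the embedding dimensions and the hidden layer sizes are all assumptions; the project's actual code lives in the repository linked below.

    import pandas as pd
    import tensorflow as tf

    COLUMNS = ["server_id", "errata_id", "nrcpu", "mhz", "ram",
               "package_id", "size", "time"]

    # Hypothetical CSV export of the SQL query results.
    df = pd.read_csv("dataset.csv", names=COLUMNS)

    # Rough per-package estimate: split the action time evenly over the
    # packages of the errata (one action = one errata in this dataset).
    packages_per_action = df.groupby(
        ["server_id", "errata_id"])["package_id"].transform("count")
    df["time"] = df["time"] / packages_per_action

    # Hashed categorical ids, wrapped in embeddings so the DNN can use them.
    def embedded(name, buckets, dim=8):
        col = tf.feature_column.categorical_column_with_hash_bucket(name, buckets)
        return tf.feature_column.embedding_column(col, dimension=dim)

    feature_columns = [
        embedded("server_id", 1000),
        embedded("errata_id", 10000),
        embedded("package_id", 100000),
    ]

    # Only the three id columns are used for now, as noted above.
    input_fn = tf.estimator.inputs.pandas_input_fn(
        x=df[["server_id", "errata_id", "package_id"]].astype(str),
        y=df["time"], num_epochs=None, shuffle=True)

    regressor = tf.estimator.DNNRegressor(
        feature_columns=feature_columns, hidden_units=[64, 32])
    regressor.train(input_fn=input_fn, steps=2000)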

Some sample estimations:

expected -> estimated

0.233874837557475 -> 0.230502188205719
0.233874837557475 -> 0.25423765182495117
0.233874837557475 -> 0.1823016107082367
0.979458148662861 -> 0.8299890756607056
0.979458148662861 -> 0.8462812900543213
0.211660345395406 -> 0.22346541285514832
1.70577935377757 -> 1.9606330394744873
2.60000002384186 -> 2.39455509185791
0.976182460784912 -> 0.1866598129272461
0.976182460784912 -> 0.614652693271637
2.80241966247559 -> 1.0975050926208496
0.6621074676513671 -> 0.6865990161895752
0.0968895809991019 -> 0.041620612144470215
0.0968895809991019 -> 0.1236574649810791
0.0968895809991019 -> 0.05707252025604248
1.3669094741344499 -> 2.2393956184387207
1.3669094741344499 -> 2.2393956184387207

"Actual" vs "Predicted" screenshots:

[Screenshot: actual vs. predicted values; full graph omitted]

Next steps:

  • Refine the model and the dataset.
  • Add actions with multiple errata to the dataset.
  • Also implement a DNNClassifier to classify directly into buckets (possible classes: seconds, minutes, hours) instead of producing a float number; see the sketch after this list.
  • POC of integration with the SUSE Manager UI.
  • Feed the actual results of new actions on SUSE Manager back into the neural network.
  • Replace package_id with something consistent across customers (e.g. package name).
  • Try to find a way to avoid averaging the time per package for erratas that point to multiple packages.
  • Estimate the actual action (not per package).
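
A hypothetical sketch of that classifier variant follows, assuming the dataset's time column is expressed in minutes; the class thresholds, bucket size and layer sizes are illustrative guesses, not project code.

    import tensorflow as tf

    CLASSES = ["seconds", "minutes", "hours"]

    def time_to_class(minutes):
        # Map a duration to a coarse class index:
        # < 1 minute -> "seconds", < 60 minutes -> "minutes", else "hours".
        if minutes < 1.0:
            return 0
        if minutes < 60.0:
            return 1
        return 2

    # Same style of feature column as in the regressor sketch above.
    package_id = tf.feature_column.embedding_column(
        tf.feature_column.categorical_column_with_hash_bucket(
            "package_id", hash_bucket_size=100000),
        dimension=8)

    classifier = tf.estimator.DNNClassifier(
        feature_columns=[package_id],
        hidden_units=[64, 32],
        n_classes=len(CLASSES))
    # Training would reuse an input_fn like the regressor's, with the labels
    # mapped through time_to_class instead of the raw float durations.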

Code repository: Internal GitLab

Looking for hackers with the skills:

tensorflow python machinelearning susemanager

This project is part of:

Hack Week 16
