an invention by PSuarezHernandez
Introduction
TensorFlow™ is an open-source software library for Machine Intelligence, most commonly used through its Python API. It was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well. (https://www.tensorflow.org/)
Using values recorded by SUSE Manager, it should be possible to predict the outcome of certain operations by applying machine learning. We are especially interested in the time it takes to apply patches to systems. A neural network should be trained on historical (anecdotal) values to predict this for future operations. We need to find out which values can and should be provided, which classifier(s) to use, and so on.
Goals:
Monday:
- Learn about TensorFlow: definitions, how to create a model, different frameworks, etc.
- Define set of features that can be gathered from the SUSE Manager DB to create our dataset.
- Explore the values of the dataset: Know about min-max values, boundaries, type of data (categorical, continuous).
- Define cross relations between features (crossed columns).
- Is our dataset good enough?
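The exploration and crossing steps in the Monday list could look roughly like the sketch below. It uses only the Python standard library. The sample rows are made-up placeholders rather than real SUSE Manager data, `summarize` and `cross` are hypothetical helper names, and treating the ID columns as categorical is an assumption.

```python
import zlib

# Made-up placeholder rows; real rows would come from the SUSE Manager DB query.
ROWS = [
    {"server_id": 1001, "errata_id": 77, "ram": 2048, "time": 12.5},
    {"server_id": 1001, "errata_id": 78, "ram": 2048, "time": 3.0},
    {"server_id": 1002, "errata_id": 77, "ram": 8192, "time": 9.0},
]

# Assumption: IDs are labels, not magnitudes, so they count as categorical.
CATEGORICAL = {"server_id", "errata_id"}


def summarize(rows):
    """Return per-column stats: distinct count if categorical, min/max otherwise."""
    summary = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        if column in CATEGORICAL:
            summary[column] = {"kind": "categorical", "distinct": len(set(values))}
        else:
            summary[column] = {"kind": "continuous", "min": min(values), "max": max(values)}
    return summary


def cross(row, columns, buckets=1000):
    """Hash the combination of several column values into a fixed number of buckets."""
    key = "_X_".join(str(row[c]) for c in columns)
    return zlib.crc32(key.encode("utf-8")) % buckets
```

A crossed value such as `cross(ROWS[0], ["server_id", "errata_id"])` lets a model learn something specific to one server/erratum pair instead of treating the two IDs independently. The bucket count of 1000 is arbitrary.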
Tuesday:
- Create and test different TensorFlow models: DNNLinearCombinedClassifier, DNNClassifier, etc.
- Are those models' estimations good enough?
- Is TensorFlow suitable for achieving the project goal? Are the estimations good enough for us?
- Upload working example.
Outcomes:
- The initial dataset was not really good. We modified the SQL query to also collect package IDs.
- Previously we restricted the dataset to actions for errata that contain only one package, but the resulting dataset was not big enough.
- We implemented a DNNRegressor.
- Dataset:
COLUMNS = ["server_id","errata_id","nrcpu","mhz","ram","package_id","size","time"] (we currently only use server_id, errata_id, package_id)
- Currently the dataset is based on patch installation actions that contain only a single erratum, but this erratum can have multiple packages associated.
- We don't know the installation time for a single package, because the "time" value we have is for the complete action, so we make a very rough estimate by dividing the total time by the number of packages the erratum contains.
- The estimations seem to be good enough. Of course, the database still needs to be improved, as does the model itself, where the feature column definitions can be adjusted to get better results.
- Current estimations are good enough to at least indicate whether the action you're planning is going to take less than ~10 seconds, ~30 seconds, ~1 minute, ~5 minutes, etc.
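The rough per-package estimate described above (total action time divided by the number of packages in the erratum) can be sketched as follows. The function names and the row layout are illustrative assumptions, not the project's actual code. Only the column names come from the COLUMNS list.

```python
def per_package_time(total_action_time, package_count):
    """Split the time of a whole action evenly across its packages."""
    if package_count <= 0:
        raise ValueError("an erratum must contain at least one package")
    return total_action_time / package_count


def expand_action(server_id, errata_id, package_ids, total_action_time):
    """Turn one recorded action into one training row per package."""
    share = per_package_time(total_action_time, len(package_ids))
    return [
        {
            "server_id": server_id,
            "errata_id": errata_id,
            "package_id": package_id,
            "time": share,  # averaged value, not a measured per-package time
        }
        for package_id in package_ids
    ]
```

For example, `expand_action(1, 2, [10, 11, 12, 13], 8.0)` yields four rows, each with a `time` of 2.0. Every package in an erratum gets the same target value regardless of its size, which is the limitation the bullet above points out.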
Some samples of estimations:
expected -> estimated
0.233874837557475 -> 0.230502188205719
0.233874837557475 -> 0.25423765182495117
0.233874837557475 -> 0.1823016107082367
0.979458148662861 -> 0.8299890756607056
0.979458148662861 -> 0.8462812900543213
0.211660345395406 -> 0.22346541285514832
1.70577935377757 -> 1.9606330394744873
2.60000002384186 -> 2.39455509185791
0.976182460784912 -> 0.1866598129272461
0.976182460784912 -> 0.614652693271637
2.80241966247559 -> 1.0975050926208496
0.6621074676513671 -> 0.6865990161895752
0.0968895809991019 -> 0.041620612144470215
0.0968895809991019 -> 0.1236574649810791
0.0968895809991019 -> 0.05707252025604248
1.3669094741344499 -> 2.2393956184387207
1.3669094741344499 -> 2.2393956184387207
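One way to put a number on "good enough" is to compute the mean absolute error and the worst case over the pairs listed above. This is a small sketch, not part of the original project. `mean_absolute_error` and `worst_case` are hypothetical helper names, and the result is in the same unit as the listed values, which the page does not state.

```python
# (expected, estimated) pairs copied from the list above.
SAMPLES = [
    (0.233874837557475, 0.230502188205719),
    (0.233874837557475, 0.25423765182495117),
    (0.233874837557475, 0.1823016107082367),
    (0.979458148662861, 0.8299890756607056),
    (0.979458148662861, 0.8462812900543213),
    (0.211660345395406, 0.22346541285514832),
    (1.70577935377757, 1.9606330394744873),
    (2.60000002384186, 2.39455509185791),
    (0.976182460784912, 0.1866598129272461),
    (0.976182460784912, 0.614652693271637),
    (2.80241966247559, 1.0975050926208496),
    (0.6621074676513671, 0.6865990161895752),
    (0.0968895809991019, 0.041620612144470215),
    (0.0968895809991019, 0.1236574649810791),
    (0.0968895809991019, 0.05707252025604248),
    (1.3669094741344499, 2.2393956184387207),
    (1.3669094741344499, 2.2393956184387207),
]


def mean_absolute_error(pairs):
    """Average of |expected - estimated| over all pairs."""
    return sum(abs(expected - estimated) for expected, estimated in pairs) / len(pairs)


def worst_case(pairs):
    """Return the pair with the largest absolute error."""
    return max(pairs, key=lambda pair: abs(pair[0] - pair[1]))


if __name__ == "__main__":
    print("MAE:", mean_absolute_error(SAMPLES))
    print("Worst pair:", worst_case(SAMPLES))
```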
[Figure: "Actual" vs "Predicted" screenshots, with a link to the full graph]
Next steps:
- Refinement of model and dataset
- Add actions with multiple errata to the dataset
- Also implement a DNNClassifier to classify directly instead of producing a float (possible classes: seconds, minutes, hours).
- POC of integration with the SUSE Manager UI
- Refeed the neural network with the actual results of the new actions on SUSE Manager.
- Replace package_id with something consistent across customers (e.g. package name)
- Try to find a way to avoid averaging the time per package on errata that point to multiple packages
- Estimate the actual action (not per package)
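For the classifier idea in the list above, the float targets would first have to be turned into class labels. The sketch below assumes durations are expressed in seconds and uses 60 and 3600 as cut-offs. Both the unit and the thresholds are assumptions, and `duration_class` and `class_index` are hypothetical names.

```python
CLASSES = ["seconds", "minutes", "hours"]  # class names taken from the list above


def duration_class(seconds):
    """Map a duration (assumed to be in seconds) to a coarse class name."""
    if seconds < 0:
        raise ValueError("duration cannot be negative")
    if seconds < 60:
        return "seconds"
    if seconds < 3600:
        return "minutes"
    return "hours"


def class_index(seconds):
    """Integer label for the class, as classifiers usually expect."""
    return CLASSES.index(duration_class(seconds))
```

Training on `class_index(time)` instead of `time` would let the model answer "seconds, minutes or hours?" directly, at the cost of losing the finer-grained estimate the regressor provides.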
Code repository: Internal GitLab
This project is part of:
Hack Week 16
Comments
about 8 years ago by PSuarezHernandez
The outcomes from this HW project have been published! The project page has been updated to include the results.