SUSE Hack Week: Anomaly analyser, predictor for Kubernetes (Rancher)

Project Description

Most customers today are looking for multi-cloud and container solutions. What matters most to their business is providing a better service and keeping their own customers happy, and the efficiency of the IT Ops team is key to that customer experience. In most cases, support fixes an issue only after a customer reports it: support is not aware of problems in the multi-container environment (such as node failures or resource crunches and limits) until then. The monitoring and alerting systems on the market today only raise alerts once an issue has occurred. We need smarter solutions that analyze existing systems and predict future anomalies.

The proposed system will:

Collect data (unstructured) from k8s components across the environments.
Identify the common patterns that occur in failure cases.
Create a knowledge base (structured data) of the identified patterns and their related components.
Use a specific data model for the prediction.
Use the output of the data model to predict anomalies.
Send alerts and reports.
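
The first three steps above (collect unstructured data, identify failure patterns, build a structured knowledge base) could look roughly like the following. This is only a minimal sketch: the log line format, the `LogRecord` fields and the helper names are assumptions made for illustration, not the output of the Rancher log collector or a design chosen by the project.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Assumed (hypothetical) line format: "<timestamp> <component> <LEVEL> <message>".
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<component>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


@dataclass(frozen=True)
class LogRecord:
    timestamp: str
    component: str
    level: str
    message: str


def parse_line(line: str) -> Optional[LogRecord]:
    """Turn one unstructured log line into a structured record, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return LogRecord(match["ts"], match["component"], match["level"], match["message"])


def normalize(message: str) -> str:
    """Replace digit runs so messages that differ only in IDs or counts share one pattern."""
    return re.sub(r"\d+", "<n>", message)


def build_knowledge_base(lines: list[str]) -> Counter:
    """Count (component, normalized message) patterns seen in ERROR records."""
    patterns: Counter = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None and record.level == "ERROR":
            patterns[(record.component, normalize(record.message))] += 1
    return patterns


# Made-up sample lines, not real cluster logs.
sample_lines = [
    "2021-03-22T10:00:01Z kubelet ERROR node worker-1 not ready after 30s",
    "2021-03-22T10:05:42Z kubelet ERROR node worker-2 not ready after 45s",
    "2021-03-22T10:06:00Z etcd INFO compaction finished",
    "garbage line",
]
knowledge_base = build_knowledge_base(sample_lines)
```

Counting normalized patterns per component is the simplest possible knowledge base; it stands in here for whatever structured store and data model the project ends up using.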

These steps are grouped into three main components in the proposed architecture:

Data collection
Data Prediction
Alerts & Reports

Resources that can be considered for the analysis and prediction:

Storage devices: capacity, state
Network devices (LB, firewalls): link status, packet drops
Compute nodes: CPU, memory, I/O, storage
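
As a starting point for analysing such metrics, a rolling z-score can flag samples that deviate sharply from recent history. The sketch below is a detection baseline under stated assumptions, not the predictive model the project aims for: the window size, the threshold and the CPU samples are all hypothetical.

```python
from statistics import mean, pstdev


def find_anomalies(samples: list[float], window: int = 5, threshold: float = 3.0) -> list[int]:
    """Return indices whose value deviates from the preceding window
    by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # A flat history gives no scale; treat any change as anomalous.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU utilisation samples (percent) for one compute node.
cpu_percent = [41.0, 43.0, 40.0, 42.0, 44.0, 43.0, 97.0, 42.0]
anomalous_indices = find_anomalies(cpu_percent)
```

The same function could be applied to any of the numeric series listed above, such as packet drops or storage capacity, as long as each series is sampled at a regular interval.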

Solution Approach:

Create data model
Scan & filter data
Extract entity
Annotate data and input to model
Process output from model
Notify / recommend / self-heal

Goal for this Hackweek

Use the existing log collector to collect data from Rancher k8s clusters and come up with an appropriate data model.

https://support.rancher.com/hc/en-us/articles/360039113911-The-Rancher-v2-x-log-collector-script

Resources

ML engineer

ML, Python, Kubernetes, data model, monitoring tools.

No Hackers yet

Looking for hackers with the skills:

python3 machinelearning

This project is part of:

Hack Week 20

Activity

almost 5 years ago: sbabusadhu added keyword "python3" to this project.

almost 5 years ago: sbabusadhu added keyword "machinelearning" to this project.

almost 5 years ago: sbabusadhu originated this project.

Similar Projects

python3

Improve/rework household chore tracker `chorazon` by gniebler

Description

I wrote a household chore tracker named chorazon, which is meant to be deployed as a web application in the household's local network.

It features the ability to set up different (so far only weekly) schedules per task and per person, where tasks may span several days.

There are "tokens", which can be collected by users. Tasks can (and usually will) have rewards configured where they yield a certain amount of tokens. The idea is that they can later be redeemed for (surprise) gifts, but this is not implemented yet. (So right now one needs to edit the DB manually to subtract tokens when they're redeemed.)

Days are not rolled over automatically, to allow for task completion control.

We used it in my household for several months, with mixed success. There are many limitations in the system that would warrant a revisit.

It's written using the Pyramid Python framework with URL traversal, ZODB as the data store and Web Components for the frontend.

Goals

Add admin screens for users, tasks and schedules
Add models, pages etc. to allow redeeming tokens for gifts/surprises
…?

Resources

tbd (Gitlab repo)

mgr-ansible-ssh - Intelligent, Lightweight CLI for Distributed Remote Execution by deve5h

Description

By the end of Hack Week, the target will be to deliver a minimal functional version 1 (MVP) of a custom command-line tool named mgr-ansible-ssh (a unified wrapper for BOTH ad-hoc shell & playbooks) that allows operators to:

Execute arbitrary shell commands on thousands of remote machines simultaneously using Ansible Runner, with artifacts saved locally.
Pass runtime options such as the inventory file, a remote command string or a playbook to execute, parallel forks, limits, dry-run mode, or suppression of the standard Ansible output.
Leverage existing SSH trust relationships without additional setup.
Provide a clean, intuitive CLI interface with --help for ease of use. It should provide a consistent UX and a CI-friendly interface.
Establish a foundation that can later be extended with advanced features such as logging, grouping, interactive shell mode, safe-command checks, and parallel execution tuning.

The MVP should enable day-to-day operations to efficiently target thousands of machines with a single, consistent interface.

Goals

Primary Goals (MVP):

Build a functional CLI tool (mgr-ansible-ssh) capable of executing shell commands on multiple remote hosts using Ansible Runner. Test the tool across a large distributed environment (1000+ machines) to validate its performance and reliability.

The aim is to significantly reduce the zypper deployment time across all 351 RMT VM servers in our MLM cluster by eliminating the dependency on the taskomatic service, bringing execution down to a fraction of the current duration. The tool should also support multiple runtime flags, such as:

mgr-ansible-ssh: Remote command execution wrapper using Ansible Runner

Usage: mgr-ansible-ssh [--help] [--version] [--inventory INVENTORY]
                   [--run RUN] [--playbook PLAYBOOK] [--limit LIMIT]
                   [--forks FORKS] [--dry-run] [--no-ansible-output]

Required Arguments
--inventory, -i      Path to Ansible inventory file to use

Any One of the Arguments Is Required
--run, -r            Execute the specified shell command on target hosts
--playbook, -p       Execute the specified Ansible playbook on target hosts

Optional Arguments
--help, -h           Show the help message and exit
--version, -v        Show the version and exit
--limit, -l          Limit execution to specific hosts or groups
--forks, -f          Number of parallel Ansible forks
--dry-run            Run in Ansible check mode (requires -p or --playbook)
--no-ansible-output  Suppress Ansible stdout output
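
One way to express this usage text with Python's argparse is sketched below. It is not the project's actual implementation: treating --run and --playbook as mutually exclusive, the placeholder version string, and the `build_parser` and `parse_args` names are assumptions made for illustration. The --help flag is added by argparse automatically.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="mgr-ansible-ssh",
        description="Remote command execution wrapper using Ansible Runner",
    )
    # Placeholder version string; the real value is not given in the text.
    parser.add_argument("--version", "-v", action="version", version="%(prog)s 0.0.0")
    parser.add_argument("--inventory", "-i", required=True,
                        help="Path to Ansible inventory file to use")

    # Assumption: exactly one of --run / --playbook must be given.
    action = parser.add_mutually_exclusive_group(required=True)
    action.add_argument("--run", "-r",
                        help="Execute the specified shell command on target hosts")
    action.add_argument("--playbook", "-p",
                        help="Execute the specified Ansible playbook on target hosts")

    parser.add_argument("--limit", "-l",
                        help="Limit execution to specific hosts or groups")
    parser.add_argument("--forks", "-f", type=int,
                        help="Number of parallel Ansible forks")
    parser.add_argument("--dry-run", action="store_true",
                        help="Run in Ansible check mode (requires -p or --playbook)")
    parser.add_argument("--no-ansible-output", action="store_true",
                        help="Suppress Ansible stdout output")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Enforce the documented constraint that --dry-run only applies to playbooks.
    if args.dry_run and args.playbook is None:
        parser.error("--dry-run requires -p/--playbook")
    return args


# Example invocation with a hypothetical inventory file name.
args = parse_args(["-i", "hosts.ini", "-r", "uptime", "-f", "50"])
```

Leaving `--forks` without a default means the value is `None` unless given, so the decision about the default level of parallelism stays with whatever runs Ansible.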

Secondary/Stretch Goals (if time permits):

Add pretty output formatting (success/failure summary per host).
Implement basic logging of executed commands and results.
Introduce safety checks for risky commands (shutdown, rm -rf, etc.).
Package the tool so it can be installed with pip or stored internally.
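
A safety check for risky commands could begin as a simple deny-list applied before anything is sent to the hosts. The sketch below is illustrative only: the `RISKY_EXECUTABLES` set and the `is_risky` helper are hypothetical, and the rules do not look inside pipelines, `sudo` invocations or commands chained with `;`.

```python
import shlex

# Illustrative deny-list; a real one would need review by the operators.
RISKY_EXECUTABLES = {"shutdown", "reboot", "halt", "poweroff"}


def is_risky(command: str) -> bool:
    """Return True if the command looks destructive under these illustrative rules."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: refuse rather than guess what was meant.
        return True
    if not tokens:
        return False
    # Treat /sbin/shutdown the same as shutdown.
    executable = tokens[0].rsplit("/", 1)[-1]
    if executable in RISKY_EXECUTABLES:
        return True
    if executable == "rm":
        # Collect short flags such as -rf or -R and look for a recursive one.
        short_flags = "".join(
            t[1:] for t in tokens[1:] if t.startswith("-") and not t.startswith("--")
        )
        return "r" in short_flags or "R" in short_flags or "--recursive" in tokens
    return False
```

A wrapper could call `is_risky` on the value passed to --run and ask for confirmation or abort when it returns True.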

Resources

Collaboration is welcome from anyone interested in CLI tooling, automation, or distributed systems. Skills that would be particularly valuable include:

Python, especially CLI development (argparse, click, rich)

openQA log viewer by mpagot

Description

*** Warning: Are You at Risk for VOMIT? ***

Do you find yourself staring at a screen, your eyes glossing over as thousands of lines of text scroll by? Do you feel a wave of text-based nausea when someone asks you to "just check the logs"?

You may be suffering from VOMIT (Verbose Output Mental Irritation Toxicity).

This dangerous, work-induced ailment is triggered by exposure to an overwhelming quantity of log data, especially from parallel systems. The human brain, not designed to mentally process 12 simultaneous autoinst-log.txt files, enters a state of toxic shock. It rejects the "Verbose Output," making it impossible to find the one critical error line buried in a 50,000-line sea of "INFO: doing a thing."

Before you're forced to rm -rf /var/log in a fit of desperation, we present the digital antacid.

No panic: we have The openQA Log Visualizer

This is the UI antidote for handling toxic log environments. It bravely dives into the chaotic, multi-machine mess of your openQA test runs, finds all the related, verbose logs, and force-feeds them into a parser.

Goals

Work on a few specific tasks in the existing POC, openqa-log-visualizer:

add support for more types of logs
extend the configuration file syntax beyond the current one
work on log parsing performance

Find some beta testers and collect feedback and ideas about features.

If time allows, evaluate other UI frameworks and solutions (something simpler to distribute and run, maybe lower level to gain performance).

Resources

openqa-log-visualizer

Improve chore and screen time doc generator script `wochenplaner` by gniebler

Description

I wrote a little Python script to generate PDF docs, which can be used to track daily chore completion and screen time usage for several people, with one page per person/week.

I named this script wochenplaner and have been using it for a few months now.

It needs some improvements and adjustments in how the screen time should be tracked and how chores are displayed.

Goals

Fix chore field separation lines
Change screen time tracking logic from "global" (week-long) to daily subtraction and weekly addition of remainders (more intuitive than the current "weekly time budget" method)
Add logic to fill in chore fields/lines, ideally with pictures, falling back to text.

Resources

tbd (Gitlab repo)

machinelearning

Song Search with CLAP by gcolangiuli

Description

Contrastive Language-Audio Pretraining (CLAP) is an open-source library that enables the training of a neural network on both audio and text descriptions, making it possible to search for audio using a text input. Several pre-trained models for song search are already available on Hugging Face.

Goals

Evaluate how CLAP can be used for song searching and determine which types of queries yield the best results by developing a Minimum Viable Product (MVP) in Python. Based on the results of this MVP, future steps could include:

Music Tagging;
Free text search;
Integration with an LLM (for example, with MCP or the OpenAI API) for music suggestions based on your own library.

The code for this project will be entirely written using AI to better explore and demonstrate AI capabilities.

Result

In this MVP we implemented:

Async song analysis with the CLAP model
Free Text Search of the songs
Similar song search based on vector representation
Containerised version with web interface
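
Similar-song search over vector representations is commonly done by ranking embeddings by cosine similarity. The sketch below assumes that approach; the text does not say which metric the MVP uses, and the three-dimensional vectors and song names are made up, standing in for real CLAP embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query: list[float], library: dict[str, list[float]], top_k: int = 3):
    """Return the top_k (title, score) pairs, highest similarity first."""
    scored = [(title, cosine_similarity(query, vector)) for title, vector in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Made-up embeddings for illustration only.
library = {
    "song_a": [1.0, 0.0, 0.0],
    "song_b": [0.9, 0.1, 0.0],
    "song_c": [0.0, 1.0, 0.0],
}
results = most_similar([1.0, 0.0, 0.0], library, top_k=2)
```

The same ranking serves free text search when the query vector comes from the text encoder instead of from another song.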

We also documented what went well and what can be improved in the use of AI.

You can have a look at the result here:

Future work could focus on improving the performance and stability of the analysis.

References

CLAP: The main model being researched;
huggingface: Pre-trained models for CLAP;
Free Music Archive: Creative Commons songs that can be used for testing;