Project Description

The first purpose of this project is to research whether a Python API already exists that scripts can use to query the status and details of a running cluster. If no such API exists, the goal is to produce a draft version of one. If it already exists, the second purpose is to implement some useful examples for SAP workloads.

The API we are referring to here is not the Python API used by resource agents. There are many situations where a script has to call multiple crm* tools to figure out the state of the cluster.

For SAP workloads (and we assume not only for SAP), it is helpful to have a stable way to obtain cluster information without calling multiple tools and parsing their output and return codes, as is done in bash scripts.
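As a rough illustration of what such an API could look like, the sketch below wraps a single `crm_mon` call and parses its XML output instead of scraping text from several tools. It assumes a Pacemaker version whose `crm_mon` supports `--output-as=xml`; the function names, the returned dictionary layout, and the simplified sample XML (including the node and resource names) are hypothetical and not part of any existing library.

```python
import subprocess
import xml.etree.ElementTree as ET

# Simplified, made-up sample of crm_mon XML output, used only for illustration.
SAMPLE_XML = """<pacemaker-result>
  <nodes>
    <node name="hana01" online="true" standby="false"/>
    <node name="hana02" online="false" standby="false"/>
  </nodes>
  <resources>
    <resource id="rsc_ip" resource_agent="ocf::heartbeat:IPaddr2" role="Started">
      <node name="hana01"/>
    </resource>
  </resources>
</pacemaker-result>"""


def parse_cluster_status(xml_text):
    """Turn crm_mon XML into a plain dictionary (hypothetical layout)."""
    root = ET.fromstring(xml_text)
    # Only the <node> elements directly below <nodes> describe cluster members.
    nodes = {
        node.get("name"): node.get("online") == "true"
        for node in root.findall("./nodes/node")
    }
    resources = {}
    # iter() also finds resources nested inside groups or clones.
    for res in root.iter("resource"):
        resources[res.get("id")] = {
            "role": res.get("role"),
            "nodes": [n.get("name") for n in res.findall("node")],
        }
    return {"nodes": nodes, "resources": resources}


def query_cluster_status():
    """Run crm_mon once on a cluster node and parse the result."""
    result = subprocess.run(
        ["crm_mon", "-1", "--output-as=xml"],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_cluster_status(result.stdout)


status = parse_cluster_status(SAMPLE_XML)
print(status["nodes"])
print(status["resources"]["rsc_ip"])
```

Keeping the parser separate from the subprocess call means the parsing logic can be tested against saved XML without a running cluster.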

Goal for this Hackweek

TBD

Resources

TBD

Looking for hackers with the skills:

python3

This project is part of:

Hack Week 20

Activity

  • almost 4 years ago: AngelaBriel joined this project.
  • almost 4 years ago: fmherschel added keyword "python3" to this project.
  • almost 4 years ago: fmherschel liked this project.
  • almost 4 years ago: fmherschel started this project.
  • almost 4 years ago: fmherschel originated this project.

  • Comments

    • pagarcia
      almost 4 years ago by pagarcia

      Uyuni implemented the concept of clusters last year. Adding a new cluster type is a matter of implementing an API interface in a Salt module, which tells Uyuni how to work with that kind of cluster, e.g. do not reboot all the nodes at once, update servers sequentially, drain nodes before updating and/or rebooting, etc.

      Here are the details: https://github.com/uyuni-project/uyuni/wiki/Cluster-Provider-development

    Similar Projects

    Selenium with Python by xguo

    Description

    Try to create test cases with Selenium based on Python

    Goals

    • Knowledge about Selenium with Python
    • Create new test case about Selenium

    Resources

    https://selenium-python.readthedocs.io/

    https://www.selenium.dev/


    Enhance UV openQA helper script by mdonis

    Description

    A couple of months ago, a UV openQA helper script was created to help automate the search phase inside openQA for a given MU to test. The script searches all our openQA job groups (qam-sle) related to a given MU and generates output suitable for adding (copy & paste) to the update log.

    This is still a WIP and could use some enhancements.

    Goals

    • Move script from bash to python: this would be useful in case we want to include this into MTUI in the future. The script will be separate from MTUI for now. The idea is to have this as a CLI tool using the click library or something similar.
    • Add option to look for jobs in other sections inside aggregated updates: right now, when looking for regression tests under aggregated updates for a given MU, the script only looks inside the Core MU job group. This is where most of the regression tests we need are located, but some MUs have their regression tests under the YaST/Containers/Security MU job groups. We should keep the Core MU group as a default, but add an option to be able to look into other job groups under aggregated updates.
    • Remove the -a option: this option is used to indicate the update ID and is mandatory right now. This is a bit weird and goes against POSIX standards. It was developed this way in order to avoid using positional parameters. This problem should be fixed if we move the script to Python.
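The goals above can be sketched as a small command-line interface. This sketch swaps in the standard-library `argparse` module rather than click, so that it runs without extra packages. The program name, the `--group` option, the group names, and the example update ID are all hypothetical and are not taken from the real script.

```python
import argparse

# Hypothetical job group names; the real script may use different identifiers.
DEFAULT_GROUP = "core"
KNOWN_GROUPS = ("core", "yast", "containers", "security")


def build_parser():
    parser = argparse.ArgumentParser(
        prog="oqa-search-sketch",
        description="Search openQA job groups for jobs related to an update.",
    )
    # The update ID is a positional argument, so no mandatory -a option is needed.
    parser.add_argument("update_id", help="ID of the maintenance update to look up")
    # The option may be repeated to search several aggregated job groups.
    parser.add_argument(
        "-g",
        "--group",
        dest="groups",
        action="append",
        choices=KNOWN_GROUPS,
        help="aggregated job group to search (default: core)",
    )
    return parser


def parse_args(argv=None):
    args = build_parser().parse_args(argv)
    if not args.groups:
        # Fall back to the Core MU group when no group is requested.
        args.groups = [DEFAULT_GROUP]
    return args


args = parse_args(["EXAMPLE-ID", "--group", "yast", "--group", "security"])
print(args.update_id, args.groups)
```

Applying the default after parsing avoids a known quirk of `action="append"`, where a list default would be extended by user-supplied values instead of being replaced.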

    Some other ideas to consider:

    • Look into the QAM dashboard API. It has more information on each MU and could be used to link general openQA build results, show whether the related RR is approved or not, etc.
    • Make it easier to see if there are regression tests for a package in an openQA test build. Check whether it is possible to search each test suite for tests that have the package name in them.
    • Unit testing?

    More ideas TBD

    Resources

    https://github.com/os-autoinst/scripts/blob/master/openqa-search-maintenance-core-jobs

    https://confluence.suse.com/display/maintenanceqa/Guide+on+how+to+test+Updates

    Post-Hackweek update

    All major features were implemented. Unit tests are still in progress, and the project will be moved to the SUSE GitHub org once everything is done. https://github.com/mjdonis/oqa-search


    Symbol Relations by hli

    Description

    There are tools to build function call graphs based on parsing source code, for example, cscope.

    This project aims to achieve a similar goal by directly parsing the disassembly (i.e. objdump output) of a compiled binary. The assembly code is what the CPU sees and is therefore more "direct". This may be useful in certain scenarios, such as gdb/crash debugging.

    Detailed description and Demos can be found in the README file:

    Supports x86 for now (because my customers only use x86 machines), but support for other architectures can be added easily.

    Tested with python3.6
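To illustrate the general idea of deriving a call graph from disassembly, here is a minimal sketch that scans text in the style of `objdump -d` output on x86-64. It is not the project's implementation: the regular expressions, the function name `build_call_graph`, and the sample listing are assumptions made for this example, and indirect calls (such as `call *%rax`) are ignored.

```python
import re
from collections import defaultdict

# Function header line, e.g. "0000000000001139 <helper>:"
FUNC_RE = re.compile(r"^[0-9a-f]+ <([^>]+)>:$")
# Direct call line, e.g. "call   1139 <helper>" or "callq  1139 <helper+0x4>"
CALL_RE = re.compile(r"\bcallq?\s+[0-9a-f]+ <([^>+]+)(?:\+0x[0-9a-f]+)?>")

# Made-up sample in the style of objdump -d output.
SAMPLE = """\
0000000000001139 <helper>:
    1139:\t55                   \tpush   %rbp
    113d:\tc3                   \tret

000000000000113e <main>:
    113e:\t55                   \tpush   %rbp
    1142:\te8 f2 ff ff ff       \tcall   1139 <helper>
    1147:\te8 e4 fe ff ff       \tcall   1030 <puts@plt>
    114c:\tc3                   \tret
"""


def build_call_graph(disassembly):
    """Map each function name to the set of symbols it calls directly."""
    graph = defaultdict(set)
    current = None
    for line in disassembly.splitlines():
        header = FUNC_RE.match(line.strip())
        if header:
            current = header.group(1)
            graph[current]  # register the function even if it calls nothing
            continue
        call = CALL_RE.search(line)
        if call and current is not None:
            graph[current].add(call.group(1))
    return dict(graph)


for caller, callees in build_call_graph(SAMPLE).items():
    print(caller, "->", sorted(callees))
```

On a real binary, the input text would come from running `objdump -d` on the file. Supporting another architecture would mainly mean changing the call-instruction pattern.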

    Goals

    Any comments are welcome.

    Resources

    https://github.com/lhb-cafe/SymbolRelations

    symrellib.py: implements the symbol relation graph and the disassembly parser

    symrel_tracer*.py: implements tracing (-t option)

    symrel.py: "cli parser"


    Small healthcheck tool for Longhorn by mbrookhuis

    Project Description

    We often have problems (e.g. pods not starting) that are related to PVCs not running, cluster nodes not all being up, or deployments not running or only partially running. All of this prevents administration activities. Having something that can be run regularly to validate the status of the cluster would be helpful, instead of performing a lot of manual tasks as we do today.

    As an addition (read: given enough time), we could add changing reservations, adding new disks, etc. --> This didn't make it, but the scripts can easily be adapted.

    This tool would decrease troubleshooting time, giving admins rights to the rancher GUI and could be used in automation.

    Goal for this Hackweek

    At the end we should have a small Python tool that does a (very) basic health check on nodes, deployments and PVCs. The first attempt was to write it in golang, but that was taking too much time.

    Overview

    This tool will run a simple healthcheck on a kubernetes cluster. It will perform the following actions:

    • node check: This check will go through all nodes and display their status and k3s version. If the status of a node is not "Ready" (this should only be reported), the cluster will be reported as having problems.

    • deployment check: This check will list all deployments and display the number of expected replicas and the number of replicas in use. If there are unused replicas, this will be displayed and the cluster will be reported as having problems.

    • pvc check: This check will list all PVCs and display their status and robustness. If the robustness is not "Healthy", the cluster will be reported as having problems.

    If there is a problem registered in the checks, there will be a warning that the cluster is not healthy and the program will exit with 1.
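The three checks and the exit code described above can be sketched as follows. To stay self-contained, the sketch works on plain dictionaries instead of querying a live cluster. The dictionary keys, function names, and sample values are hypothetical, and a real tool would fill these structures from the Kubernetes and Longhorn APIs.

```python
def check_nodes(nodes):
    """Report every node whose status is not "Ready"."""
    return [
        "node {} is {}".format(n["name"], n["status"])
        for n in nodes
        if n["status"] != "Ready"
    ]


def check_deployments(deployments):
    """Report deployments with fewer ready replicas than expected."""
    return [
        "deployment {}: {}/{} replicas".format(d["name"], d["ready"], d["expected"])
        for d in deployments
        if d["ready"] < d["expected"]
    ]


def check_pvcs(pvcs):
    """Report PVCs whose robustness is not "Healthy"."""
    return [
        "pvc {} robustness is {}".format(p["name"], p["robustness"])
        for p in pvcs
        if p["robustness"] != "Healthy"
    ]


def health_exit_code(nodes, deployments, pvcs):
    """Print all problems and return 1 if any were found, otherwise 0."""
    problems = check_nodes(nodes) + check_deployments(deployments) + check_pvcs(pvcs)
    for problem in problems:
        print("PROBLEM:", problem)
    if problems:
        print("WARNING: the cluster is not healthy")
        return 1
    return 0


# Made-up sample data for illustration only.
sample_nodes = [
    {"name": "node1", "status": "Ready"},
    {"name": "node2", "status": "NotReady"},
]
sample_deployments = [{"name": "web", "expected": 3, "ready": 3}]
sample_pvcs = [{"name": "data", "status": "Bound", "robustness": "Degraded"}]

exit_code = health_exit_code(sample_nodes, sample_deployments, sample_pvcs)
```

In a script, the return value would typically be passed to `sys.exit()` so that automation can react to the result.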

    The script has one mandatory parameter: the kubeconf of the cluster or of a node of the cluster.

    The code is written for Python 3.11, but will also work on 3.6 (the default with SLES15.x). There is a venv present that will contain all needed packages. Also, the script can be run on the cluster itself or on any other Linux server.

    Installation

    To install this project, perform the following steps:

    • Create the directory /opt/k8s-check

    mkdir /opt/k8s-check

    • Copy all the files to this directory and make the following changes:

    chmod +x k8s-check.py