The supportconfig tool is a great resource for troubleshooting common system issues on SLES, but its functionality might not be enough to troubleshoot issues specific to cloud solutions. I would like to invite you to contribute to this project by creating new plugins and tools that complement supportconfig and ease the troubleshooting process for the SUSE OpenStack Cloud product.

Main goal:

This project will be considered successful if we are able to develop the new features listed below and include them in the main supportconfig tool:

  • Develop some sort of "hb_report" tool for cloud, which could include the following:

    • Organize the collected information into directories and subdirectories instead of one huge file containing everything. We have some "splitter" tools (scsplitter.py) that recreate the original directory structure on the server, but it would be interesting to make this split structure the default.
    • Include a way to "trim" or "toggle" the supportconfig so that it collects only the information relevant to errors that occurred on specific components or dates. This way we would avoid huge files containing data we don't necessarily need. The idea is to have a nice and easy way to filter information by instance ID, request ID, timestamp, or any other attribute passed to the "supportconfig" command.
    • Include commands like "openstack (...) list" and "openstack (...) show $id"
    • HA-specific checks (pacemaker and pacemaker-remote if any)
    • Services report (up or in an error state): check the status reported by the openstack command, by systemctl status, and by the resource status in the cluster. I had a case where a neutron agent (if I remember correctly) was in the down ":-(" state while systemctl and crm_mon reported that the service was up and running.
    • Database dump
    • Switch a selected component to debug mode and collect logs from customer actions
    • Collect storage background and configuration
    • Query APIs and generate a report on the activities/requests
    • Ping endpoints and resolve hostnames as a check
    • Add /var/lib/neutron to supportconfig (Bogdano in Rocket Chat)
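
As a rough illustration of the filtering idea in the list above (by request ID and timestamp), here is a minimal Python sketch. It assumes each log entry starts with a "YYYY-MM-DD HH:MM:SS" timestamp, as OpenStack service logs usually do. The function name filter_log_lines, the sample lines, and the request IDs are made up for illustration and are not part of supportconfig.

```python
import re
from datetime import datetime

# Assumption: entries begin with "YYYY-MM-DD HH:MM:SS" (fractional seconds ignored).
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def filter_log_lines(lines, needle=None, start=None, end=None):
    """Yield lines that contain `needle` and fall inside [start, end].

    Lines without a timestamp (for example traceback continuations)
    inherit the decision made for the last timestamped line.
    """
    keep = False
    for line in lines:
        match = TIMESTAMP_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            in_window = (start is None or stamp >= start) and (
                end is None or stamp <= end
            )
            keep = in_window and (needle is None or needle in line)
        if keep:
            yield line


# Hypothetical sample data, only to show how the filter behaves.
sample = [
    "2018-07-10 09:59:59.100 4321 INFO nova.api [req-aaa] GET /servers",
    "2018-07-10 10:00:01.200 4321 ERROR nova.compute [req-bbb] Build failed",
    "Traceback (most recent call last):",
    "2018-07-10 10:05:00.000 4321 INFO nova.api [req-bbb] DELETE /servers/1",
    "2018-07-10 11:30:00.000 4321 INFO nova.api [req-bbb] GET /servers",
]

selected = list(
    filter_log_lines(
        sample,
        needle="req-bbb",
        start=datetime(2018, 7, 10, 10, 0, 0),
        end=datetime(2018, 7, 10, 11, 0, 0),
    )
)

for line in selected:
    print(line)
```

The same function could be applied to an instance ID or any other substring, since the match is a plain "contains" check; a real plugin would probably need per-service log formats and proper option handling.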

Optional Goals:

  • A tree-like graphical tool (or ASCII art) that shows the complete infrastructure and allows breaking each node down by component/service to review its config and logs

  • Gather info from supportconfig as part of a "Best Practice" document.

  • Compare versions: versions in the supportconfig against current versions in the SCC repos
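
The version comparison goal could start from something as simple as the sketch below. It assumes the package versions from the supportconfig and from the repositories have already been parsed into two dictionaries; the function name diff_versions, the package names, and the version strings are hypothetical. It only reports versions that differ and does not attempt RPM version ordering.

```python
def diff_versions(installed, available):
    """Return {package: (installed_version, available_version)} for mismatches.

    Only packages present in both mappings are compared. Equality is a
    plain string comparison, so this cannot tell "older" from "newer".
    """
    return {
        name: (version, available[name])
        for name, version in sorted(installed.items())
        if name in available and available[name] != version
    }


# Hypothetical data: versions from a supportconfig vs. versions in a repo.
installed = {"openstack-nova": "14.0.10-1.1", "crowbar": "4.0-2.3"}
available = {"openstack-nova": "14.0.11-3.2", "crowbar": "4.0-2.3"}

mismatches = diff_versions(installed, available)
for name, (have, repo) in mismatches.items():
    print(f"{name}: supportconfig has {have}, repo has {repo}")
```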

Currently identified tools which could be included:

  • sosreport: https://github.com/sosreport/sos. Sos is an extensible, portable support data collection tool primarily aimed at Linux distributions and other UNIX-like operating systems. Perhaps we should consider a well-established tool with plugins for every possible situation before reinventing the wheel.

  • https://github.com/search?utf8=%E2%9C%93&q=supportconfig&type=

  • ELK Tool: https://github.com/denisok/elk_supportconfig

  • Support Config Utils from A. Spiers: https://build.opensuse.org/package/show/home:aspiers/supportconfig-utils

  • Crowbar Macs: https://github.com/aspiers/SUSE-dist/blob/master/bin/crowbar-macs

  • scsplitter (no link known)

  • lnav monitoring: https://software.opensuse.org/download.html?project=server:monitoring&package=lnav

Looking for hackers with the skills:

bash python

This project is part of:

Hack Week 17

Activity

  • over 7 years ago: sandonov joined this project.
  • over 7 years ago: calmeidadeoliveira liked this project.
  • over 7 years ago: vuntz liked this project.
  • over 7 years ago: aspiers joined this project.
  • over 7 years ago: aspiers liked this project.
  • over 7 years ago: barendartchuk joined this project.
  • over 7 years ago: rsimai liked this project.
  • over 7 years ago: pedrivo started this project.
  • over 7 years ago: pedrivo added keyword "bash" to this project.
  • over 7 years ago: pedrivo added keyword "python" to this project.
  • over 7 years ago: pedrivo originated this project.

  • Comments

    • aspiers
      over 7 years ago by aspiers

      Please see https://github.com/aspiers/SUSE-dist/tree/master/bin for several other tools in this space. Unfortunately I will be away on FTO for this hackweek but it would be good to share my thoughts and maybe demo everything I have built before I leave (end of next week).

    • sandonov
      over 7 years ago by sandonov

      DB dump plugin for supportconfig https://github.com/sandonovsuse/supportutils-plugin-suse-openstack-cloud/blob/master/suseopenstackclouddatabasedump

    • sandonov
      over 7 years ago by sandonov

      correct_link

    Similar Projects

    OS self documentation, health check and troubleshooting by roseswe

    Project Description

    The aim of this hackweek project is to improve the utility "cfg2html" so that it is even more usable under SLES and perhaps also under Rancher.

    cfg2html (see also https://github.com/cfg2html/cfg2html) is a very mature utility for collecting and documenting information about operating systems such as Linux, AIX, HP-UX and others.

    Goal for this Hackweek

    The aim is to extend cfg2html

    • for SLES and SLES-for-SAP apps, high availability
    • Improve code for MicroOS 5.x, SUMA, Edge and k8s environments
    • fix shellbeauity warnings
    • possibly add more plugins
    • SUMA/Salt integration to collect.

    Resources

    Required skills: Bash, shell script and the SUSE products mentioned.

    https://github.com/cfg2html/cfg2html

    https://www.cfg2html.com/


    SUSE Health Check Tools by roseswe

    SUSE HC Tools Overview

    A collection of tools written in Bash or Go 1.24++ that make it easier to handle a bunch of tar.xz archives created by supportconfig.

    Background: For SUSE HC we receive a bunch of supportconfig tarballs and check them for misconfiguration, areas for improvement, or future changes.

    The main focus of these HCs is High Availability (pacemaker), SLES itself, and SAP workloads, especially around the SUSE best practices.

    Goals

    • Overall improvement of the tools
    • Adding new collectors
    • Add support for SLES16

    Resources

    csv2xls* example.sh go.mod listprodids.txt sumtext* trails.go README.md csv2xls.go exceltest.go go.sum m.sh* sumtext.go vercheck.py* config.ini csvfiles/ getrpm* listprodids* rpmdate.sh* sumxls* verdriver* credtest.go example.py getrpm.go listprodids.go sccfixer.sh* sumxls.go verdriver.go

    docollall.sh* extracthtml.go gethostnamectl* go.sum numastat.go cpuvul* extractcluster.go firmwarebug* gethostnamectl.go m.sh* numastattest.go cpuvul.go extracthtml* firmwarebug.go go.mod numastat* xtr_cib.sh*


    Bring to Cockpit + System Roles capabilities from YAST by miguelpc

    Bring to Cockpit + System Roles features from YAST

    Cockpit and System Roles have been added to SLES 16. There are several capabilities in YAST that are not yet present in Cockpit and System Roles. We will follow the principle of "automate first, UI later", with System Roles being the automation component and Cockpit the UI one.

    Goals

    The idea is to implement service configuration in System Roles and then add a UI to manage it in Cockpit. Some capabilities will require a specific Cockpit module, as they will interact with a resource that is already configured.

    Resources

    A plan of the missing capabilities and suggested implementations is available here: https://docs.google.com/spreadsheets/d/1ZhX-Ip9MKJNeKSYV3bSZG4Qc5giuY7XSV0U61Ecu9lo/edit

    Linux System Roles: https://linux-system-roles.github.io/