The supportconfig tool is a great resource for troubleshooting common system issues on SLES, but its functionality may not be enough to troubleshoot issues specific to cloud solutions. I would like to invite you to contribute to this project by creating new plugins/tools that complement supportconfig and ease the troubleshooting process for the SUSE OpenStack Cloud product.

Main goal:

This project will be considered "successful" if we are able to develop the new features listed below and include them in the main supportconfig tool:

  • Develop a kind of "hb_report" tool for cloud, which could include the following:

    • Organize the collected information into directories and subdirectories instead of a single huge file containing everything. We already have some "splitter" tools (scsplitter.py) that recreate the server's original directory structure, but it would be better to make this split layout the default.
    • Include a way to "trim" or "toggle" the supportconfig so that it collects only the information relevant to errors that occurred on specific components or dates. This way we would avoid huge files containing data we don't necessarily need. The idea is to have a simple way to filter information by instance ID, request ID, timestamp, or any other attribute passed to the "supportconfig" command
    • Include commands like "openstack (...) list" and "openstack (...) show $id"
    • HA-specific checks (pacemaker and pacemaker-remote if any)
    • Services report (up or in an error state), comparing the status reported by the openstack command, by systemctl status, and by the cluster resource status. I had a case where a neutron agent (if I remember correctly) was shown as down (":-(") while systemctl and crm_mon reported the service as up and running
    • Database dump
    • Switch a selected component to debug mode and collect the logs produced by customer actions
    • Collect storage background and configuration
    • Query APIs and generate a report on the activities/requests
    • Ping endpoints and resolve hostnames as a check
    • Adding /var/lib/neutron to supportconfig (Bogdano in Rocket Chat)
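
As an illustration of the "trim" idea above, here is a minimal Python sketch that filters log lines by request ID and time window. It is not part of supportconfig: the function name `filter_lines` is hypothetical, and the sketch assumes each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp and carries a `req-...` identifier, as OpenStack service logs commonly do.

```python
import re
from datetime import datetime

# Assumption: each log line begins with "YYYY-MM-DD HH:MM:SS" (fractional
# seconds, if present, are ignored).
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def filter_lines(lines, request_id=None, since=None, until=None):
    """Yield lines that contain request_id and fall inside [since, until].

    Every filter is optional. When a time window is given, lines without a
    leading timestamp are dropped because they cannot be placed in it.
    """
    for line in lines:
        if request_id and request_id not in line:
            continue
        if since or until:
            match = TIMESTAMP_RE.match(line)
            if not match:
                continue
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            if since and stamp < since:
                continue
            if until and stamp > until:
                continue
        yield line


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        "2018-11-05 10:15:32.123 1234 INFO nova.api [req-aaa] start",
        "2018-11-05 10:20:00.000 1234 ERROR nova.api [req-bbb] failed",
        "2018-11-05 11:00:00.000 1234 INFO nova.api [req-aaa] done",
    ]
    for hit in filter_lines(sample, request_id="req-aaa"):
        print(hit)
```

Filtering by instance ID would work the same way, by passing the instance ID as the substring to match.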

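The "ping endpoints and resolve hostnames" check from the list above could be sketched as follows. This is only one possible approach: the function name `check_endpoint` and the shape of the returned dictionary are hypothetical, and a TCP connect is used in place of an ICMP ping because it needs no special privileges.

```python
import socket
from urllib.parse import urlsplit


def check_endpoint(url, timeout=3.0):
    """Resolve the hostname in url and try a TCP connection to its port."""
    parts = urlsplit(url)
    host = parts.hostname
    result = {"url": url, "addresses": [], "reachable": False, "error": None}
    if not host:
        result["error"] = "no hostname in URL"
        return result
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["addresses"] = sorted({info[4][0] for info in infos})
        with socket.create_connection((host, port), timeout=timeout):
            result["reachable"] = True
    except OSError as exc:
        # Covers resolution failures, refused connections and timeouts.
        result["error"] = str(exc)
    return result


if __name__ == "__main__":
    # Assumption: an identity endpoint listening locally on port 5000.
    print(check_endpoint("http://localhost:5000/v3", timeout=1.0))
```

In a real plugin the URL list would come from the service catalog rather than being hard-coded.
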
Optional Goals:

  • A tree-like graphical tool (or ASCII art) that shows the complete infrastructure and lets you break each node down by component/service, then review its config/logs

  • Getting info from supportconfig as part of a "Best Practice" document.

  • Compare versions: versions in the supportconfig against the current versions in the SCC repos
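
A rough Python sketch of the version comparison idea: given two name-to-version mappings, report packages whose versions differ or that are missing from the repository data. The function name, the input format, and the package names are all hypothetical; how the mappings would be extracted from a supportconfig archive or from SCC is left open. The sketch only tests for inequality and does not attempt RPM-style version ordering.

```python
def compare_versions(installed, available):
    """Report packages whose installed version string differs from the repo's.

    Both arguments map package name -> version string. Only inequality is
    detected; no attempt is made to decide which version is newer.
    """
    report = {"different": {}, "not_in_repo": []}
    for name, version in sorted(installed.items()):
        if name not in available:
            report["not_in_repo"].append(name)
        elif available[name] != version:
            report["different"][name] = (version, available[name])
    return report


if __name__ == "__main__":
    # Made-up package names and versions, for illustration only.
    installed = {"example-pkg-a": "1.0-1.1", "example-pkg-b": "2.3-4.2"}
    available = {"example-pkg-a": "1.0-1.1", "example-pkg-b": "2.3-5.1"}
    print(compare_versions(installed, available))
```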

Currently identified tools which could be included:

Looking for hackers with the skills:

bash python

This project is part of:

Hack Week 17

Activity

  • over 5 years ago: sandonov joined this project.
  • over 5 years ago: calmeidadeoliveira liked this project.
  • over 5 years ago: vuntz liked this project.
  • almost 6 years ago: aspiers joined this project.
  • almost 6 years ago: aspiers liked this project.
  • almost 6 years ago: barendartchuk joined this project.
  • almost 6 years ago: rsimai liked this project.
  • almost 6 years ago: pedrivo started this project.
  • almost 6 years ago: pedrivo added keyword "bash" to this project.
  • almost 6 years ago: pedrivo added keyword "python" to this project.
  • almost 6 years ago: pedrivo originated this project.

  • Comments

    • aspiers
      almost 6 years ago by aspiers

      Please see https://github.com/aspiers/SUSE-dist/tree/master/bin for several other tools in this space. Unfortunately I will be away on FTO for this hackweek but it would be good to share my thoughts and maybe demo everything I have built before I leave (end of next week).

    • sandonov
      over 5 years ago by sandonov

      correct_link