Project Description

The sar(1) tool, from the openSUSE package "sysstat", provides a comprehensive method for collecting performance data on a running system.

There isn't, however, a satisfactory way to display historical sar data. It is not uncommon for users who experience a performance degradation to include a few days' worth of sar archives in their bug reports. The experts looking into the report often resort to scanning these archives with text-oriented utilities such as sed and awk, which may fail to reveal the story hidden behind the data.

We aim to devise a method and a tool to visualize large historical sar datasets.

Goal for this Hackweek

Take one or two sample sar datasets and feed them into Grafana[LINK], the Perfetto trace viewer[LINK] and Performance Co-Pilot (PCP)[LINK]. The solution of choice will need to compare favorably with my two previous sar visualization attempts made with ad-hoc scripts: [LINK-1][LINK-2]. Evaluate the outcome, write a report and select one single tool to focus future efforts on.

Prior art

To the best of my knowledge, these are the methods currently available for plotting sar data:

  • kSar[MAYBE-LINK-1][MAYBE-LINK-2][MAYBE-LINK-3]
    kSar is a self-contained Java GUI. This tool is advertised in our openSUSE Tuning Guide, chapter 2 "System monitoring utilities", section 2.1.3.2 "Visualizing sar data"[LINK], although we don't have a package for it. Its drawback is that it can display only one sar datafile at a time: as sar writes one datafile per day, visualizing a week's worth of archives requires multiple kSar windows to be open. The resulting charts will have different Y axis scales, which makes them difficult to compare. You can find a set of kSar screenshots illustrating this usability problem here: [LINK] (requires SUSE confluence login).
  • sadf -g
    sadf -g your_datafile [ -- sar_options ] > output.svg
    sadf(1) can emit SVG files that can be viewed in web browsers. The same limitation as kSar applies here: multiple days require multiple plots.
  • sar2pcp[LINK]
    There exists a Performance Co-Pilot (PCP) plugin to import sar data, which we package in openSUSE as "pcp-import-sar2pcp". Last time I checked, sar2pcp had to be invoked with LD_PRELOAD=/usr/lib64/libpcp_import.so because that shared library wasn't correctly linked in the build of the package. That can easily be fixed, but it's unclear to me how well the sar + PCP combination works, as I haven't tried it yet. It could very well suffer from the one-day-per-plot limitation of the previous two tools. The feature is advertised in both the sar and the PCP documentation:
    • sar: CTRL+F for "pcp" at the sar homepage[LINK]
    • PCP: "Visualizing iostat and sar data" at the PCP documentation[LINK]
  • sadf -j
    sadf -j your_datafile [ -- sar_options ] > output.json
    Convert the sar data to JSON, then write an ad-hoc script to produce the charts. This is what I've done in the past, but writing special-purpose code every time takes energy away from actually analyzing the data and debugging the problem at hand. The experience was valuable though, as it showed what these charts should look like (see [LINK-1][LINK-2]).
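
As an illustration of the "sadf -j plus ad-hoc script" approach in the last item, here is a minimal Python sketch that merges several daily JSON exports into one time-ordered series, so that a whole week can share a single plot and a single Y axis. It is not the script behind the charts linked above. The JSON layout it relies on (sysstat > hosts > statistics, a "timestamp" block with "date" and "time", and a "cpu-load" list with "cpu" and "user" keys) is an assumption about sadf's output that may differ between sysstat versions; the function names are made up for this example.

```python
import json
import sys
from datetime import datetime


def load_samples(sadf_json_text, host_index=0):
    """Return the per-interval samples of one `sadf -j` document.

    Assumed layout: {"sysstat": {"hosts": [{"statistics": [...]}]}}.
    """
    doc = json.loads(sadf_json_text)
    return doc["sysstat"]["hosts"][host_index]["statistics"]


def sample_time(sample):
    """Build a datetime from the sample's (assumed) "timestamp" block."""
    ts = sample["timestamp"]
    return datetime.strptime(ts["date"] + " " + ts["time"], "%Y-%m-%d %H:%M:%S")


def merge_cpu_series(json_texts, field="user", cpu="all"):
    """Merge one CPU metric from several daily exports into a single series.

    Returns a list of (datetime, value) pairs sorted by time, regardless of
    the order in which the daily documents are passed in.
    """
    series = []
    for text in json_texts:
        for sample in load_samples(text):
            if "timestamp" not in sample:
                continue  # skip entries that carry no timestamp
            for entry in sample.get("cpu-load", []):
                if entry.get("cpu") == cpu and field in entry:
                    series.append((sample_time(sample), entry[field]))
    series.sort(key=lambda pair: pair[0])
    return series


if __name__ == "__main__":
    # Usage: python merge_sar.py day1.json day2.json ...
    texts = []
    for path in sys.argv[1:]:
        with open(path) as handle:
            texts.append(handle.read())
    for when, value in merge_cpu_series(texts):
        print(when.isoformat(), value)
```

The merged series can then be handed to any plotting library or imported into a charting tool. Because all days end up in one series, the Y axis scale is shared by construction, which is exactly what the one-file-per-plot tools above cannot offer.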

Tentative plan

It seems sensible to leverage an existing graphing tool and adjust it so that it's tailored to sar data (e.g. create a plug-in or a new plot type within an existing framework). Candidates are Grafana (web tool), the Perfetto trace visualizer (web tool) and Performance Co-Pilot (desktop app). I'm slightly biased towards interactive charts as opposed to static images, since zooming in and out and moving around the data range may help in exploring the dataset. Interactivity is best achieved with web-based tools, which also make it possible to share a visualization via a URL, without the recipient having to install new software locally.

This project is part of:

Hack Week 22 Hack Week 23

Activity

  • almost 2 years ago: ggherdovich started this project.
  • almost 2 years ago: ggherdovich originated this project.

  • Comments

    • heikkiyp
      almost 2 years ago by heikkiyp

      And to add one more issue for PCP: on TW the libpcp_import1 package does not create the link from the libpcp_import.so.1 file to libpcp_import.so. Also, the perl-XML-TokeParser dependency was missing.

    • ggherdovich
      almost 2 years ago by ggherdovich

      Hello Heikki, thanks for commenting. What I'm taking from the status of sar+pcp interoperability in openSUSE is that no one has ever used it. The problem you mention, plus the ones I already knew about, are failures that you'd notice immediately as you launch the tool. So on one hand those are fixable with one-line edits in the spec file of the package, but on the other they show that the sar+pcp combo is, in my estimation, essentially uncharted territory.

    • heikkiyp
      almost 2 years ago by heikkiyp

      I also noted that with a later sysstat version you can convert directly to a PCP-supported format, without the need for the sar2pcp tool. The newer sysstat is not available for the SLE12 branch; you have to use the SLE15, Leap or TW version. sa files from the SLE12 branch need conversion anyway, as sar2pcp will complain about the format.

      The most usable way to work with PCP is on SLE15, TW or Leap. Convert the sa files with a command along these lines:

      sadf -l -O pcparchive=sample02 sa20230125 -- -A

      and then use the pmchart tool.

      Combining archives is easy:

      pmlogextract sample01 sample02 looongsample

    • ggherdovich
      almost 2 years ago by ggherdovich

      That's fantastic information Heikki! Thanks for sharing! I haven't tried the sar->pcp conversion technique you suggest (I'll do it soon), but I imagine the limitation you mention (sadf must be from SLE-15 or later) is not so severe, because I suspect the user can still collect sar data using a SLE-12 sar daemon, it's only the person analyzing the data that must have a recent sysstat to do the format conversion. I'd expect the new sadf can still read old archives. Again, thanks for pointing this out!

    • paolodepa
      about 1 year ago by paolodepa

      Hey Giovanni, have a look at https://github.com/sargraph/sargraph.github.io. They also have an online version, https://sarchart.dotsuresh.com/, but I'd suggest using it only as a preview, and not loading personal (or customers') data there.

    • ggherdovich
      about 1 year ago by ggherdovich

      Hi Paolo, thanks for the link. Looks like a node app that can be deployed on premises. I need to try hosting a local instance of that "sargraph" and see how it does on some sample datasets I have.
