SUSE Hack Week: L3: Improve crash-setup, develop a core-setup

The current crash-setup source is located here. It works pretty well, but it does not fetch the debug source, which makes it impossible to use the crash> gdb list * command right away. This is bug 997558, which should be worked on.

A further script would be really useful for building the analysis environment on l3slave for core dumps of crashed user-space applications. The MF Open Enterprise Server (OES) has a tool called NetIQ GetCore Utility, which prepares such an environment consisting of the files opencore.sh and opencore.ini, the binaries of the executable and its libraries, and the core dump itself. The files it generates can serve as a base, and they prove that it is possible to build a core dump analysis environment decoupled from the host system. What it lacks is the setup of the debug symbols and the debug source. I have already done that manually, so it is possible and should be automated.

Both topics are efficiency improvements for the L3 work.


Looking for hackers with the skills:

shell c

This project is part of:

Hack Week 17

Activity

over 7 years ago: mkubecek liked this project.

over 7 years ago: sparschauer liked this project.

over 7 years ago: sparschauer added keyword "shell" to this project.

over 7 years ago: sparschauer added keyword "c" to this project.

over 7 years ago: mkoutny liked this project.

over 7 years ago: mbrugger liked this project.

over 7 years ago: sparschauer started this project.

over 7 years ago: sparschauer originated this project.

Comments

over 7 years ago by sparschauer | Reply

The core-setup project should not depend on l3slave and should work with openSUSE for non-employees as well. It should have different modes for:
- analysis on the system which collected the dump:
  - auto-get debuginfo and debugsource packages with zypper - packages are latest? (recommended for reproducible segfaults)
  - include paths to custom code like e.g. scanmem which I compile from source as an upstream maintainer
- analysis on another system
  - auto-get and extract RPMs from SUSE servers like e.g. http://download.opensuse.org/update/leap/42.3/oss/x86_64/, http://download.opensuse.org/debug/update/leap/42.3/oss/x86_64/
- SLES
  - get packages from NFS mounts
  - get packages from *.suse.de servers
- openSUSE
  - get packages from download.opensuse.org or mirrors
Key features:
- read coredump, detect crashed process and libs
- convert build-id to package name + version with the help of SUSE repos
- maybe use rpm.txt from supportconfig if exact package info cannot be gathered from the dump
- get and extract the required binary, debuginfo, and debugsource packages
- build up opencore.sh and opencore.ini - maybe with support for custom settings
Initial tasks:
- check for, evaluate, and analyze similar tools
- analyze coredump structure - What is included and can be used?
Main goals:
- prevent gdb from picking up wrong code/source/debuginfo or wrong packages
- take needed code parts from other FOSS projects to avoid duplicate efforts
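
The "analysis on another system" mode listed above could start from something like the following shell sketch. The function names debug_repo_url and extract_rpm are hypothetical and not part of core-setup. The URL rewrite only mirrors the two example URLs given above and is an assumption about how other repositories are laid out. Extraction uses the standard rpm2cpio and cpio tools, so nothing gets installed on the analysis host.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of core-setup.

# Derive the debug repository URL from an update repository URL by
# inserting "debug/" right after the host name (assumed layout, taken
# from the two example URLs in the comment above).
debug_repo_url() {
    printf '%s\n' "$1" | sed 's|^\(https\{0,1\}://[^/]*/\)|\1debug/|'
}

# Unpack an RPM below ./root without installing it, so that gdb can be
# pointed at this tree instead of the files of the analysis host.
# $1: path to the downloaded RPM file
extract_rpm() {
    rpm_file=$1
    mkdir -p root &&
    rpm2cpio "$rpm_file" | (cd root && cpio -idm --quiet)
}
```

For example, debug_repo_url http://download.opensuse.org/update/leap/42.3/oss/x86_64/ prints the matching debug repository URL from the list above.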

over 7 years ago by sparschauer | Reply

For crash-setup -d vmcore, crash-setup uses kdumpid which uses libkdumpfile.

Example output of kdumpid:

Format: compressed kdump
Arch: x86_64
Version: 4.4.138-94.39-default
- over 7 years ago by sparschauer | Reply
  
  crash-setup uses kernel-source.git to convert the version into a git tag, then to a commit, and then to the oldest git branch containing it. This step is extremely slow as it is done by comparing the commit id in the git log of all the configured git branches from old to new. The SLE release is gathered from this. I'm sure this can be sped up.
  
  Example: 4.4.138-94.39 -> rpm-4.4.138-94.39 -> baa07f9df91b -> SLE12-SP3
  - over 7 years ago by sparschauer | Reply
    
    git describe is even slower. Only configuring major.minor per branch would help.
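
The conversion chain from this thread can be sketched in shell as follows. All function names are hypothetical and the real crash-setup code may differ. version_to_tag assumes the version string ends in a kernel flavor such as "-default", as in the kdumpid output above. branches_for_tag has to run inside a kernel-source.git clone and only lists the branches that contain the commit; picking the oldest one is left out.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; the real crash-setup may differ.

# "4.4.138-94.39-default" -> "rpm-4.4.138-94.39"
# Strips the trailing flavor suffix and prepends "rpm-". Assumes the
# input always carries a flavor, as in the kdumpid output.
version_to_tag() {
    printf 'rpm-%s\n' "${1%-*}"
}

# "4.4.138-94.39-default" -> "4.4"
# Could serve as a key to limit the branches that get searched.
major_minor() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Resolve a tag to its commit and list the remote branches containing it.
# Must be called from within a kernel-source.git clone.
branches_for_tag() {
    commit=$(git rev-list -n 1 "$1") || return 1
    git branch -r --contains "$commit"
}
```

A call such as branches_for_tag "$(version_to_tag 4.4.138-94.39-default)" would then cover the tag, commit, and branch steps in one go.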

over 7 years ago by sparschauer | Reply

I've introduced a new crash-setup option '-s' or '--source' to get the debugsource package automatically as well. With the crash cd command I can get to the right directory so that the gdb list command will work. The idea is to create a file opencrash.sh containing crash -i ./opencrash.ini vmlinux.gz vmlinux.debug vmcore and a file opencrash.ini containing e.g. cd ./root/usr/src/debug/kernel-default-4.4.138/linux-4.4/kernel automatically.
- over 7 years ago by sparschauer | Reply
  
  Autocreation of opencrash.sh/.ini implemented. Since SLE12 there are different paths to source files. So cd root is used for SLE11 and before.
  
  Commits can be found here.
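
Generating the two files described in this thread could look roughly like this. It is a minimal sketch, not the code from the commits. The function name write_opencrash is hypothetical, while the file names and the crash command line are taken from the comment above.

```shell
#!/bin/sh
# Hypothetical sketch; not the actual crash-setup implementation.

# Write opencrash.ini and opencrash.sh into the current directory.
# $1: source directory that crash should cd into at startup
write_opencrash() {
    src_dir=$1
    # crash executes the commands in the -i file before accepting input.
    printf 'cd %s\n' "$src_dir" > opencrash.ini
    cat > opencrash.sh <<'EOF'
#!/bin/sh
crash -i ./opencrash.ini vmlinux.gz vmlinux.debug vmcore
EOF
    chmod +x opencrash.sh
}
```

Usage with the example path from above: write_opencrash ./root/usr/src/debug/kernel-default-4.4.138/linux-4.4/kernel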

over 7 years ago by sparschauer | Reply

Merge request for crash-setup submitted.

over 7 years ago by sparschauer | Reply

I have already worked on core-setup. gdb -ex "quit" -c core can get the build id of the crashed main program. Then we can look into repodata/*primary.xml.gz and search for it. The nearest "location href" line above the match shows the name of the debuginfo package.

Example:
$ ./core-setup ./core.hald.2341
main build id from core.hald.2341: da45f02baff8ef519840d5a17fd15926f2c802e2
looking up main build id...
x86_64/hal-debuginfo-0.5.12-23.76.1.x86_64.rpm

Now we can remove "-debuginfo" from the RPM name and get the standard package name "hal-0.5.12-23.76.1.x86_64.rpm". With these two packages it is possible to run gdb regularly and to gather the build ids of the libraries.
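
The lookup and the rename described above can be sketched in shell like this. Both function names are hypothetical. The sketch assumes that, in the decompressed primary.xml, the "location href" line of a package comes before the line that mentions its build id, and it simply remembers the last location seen.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not the actual core-setup code.

# Read a decompressed primary.xml on stdin and print the href of the
# package whose metadata mentions the build id given as $1.
lookup_build_id() {
    awk -v id="$1" '
        match($0, /<location href="[^"]*"/) {
            # Remember the most recent package location seen so far.
            href = substr($0, RSTART + 16, RLENGTH - 17)
        }
        # First line containing the build id: report its package and stop.
        index($0, id) { print href; exit }
    '
}

# "x86_64/hal-debuginfo-0.5.12-23.76.1.x86_64.rpm"
#   -> "x86_64/hal-0.5.12-23.76.1.x86_64.rpm"
strip_debuginfo() {
    printf '%s\n' "$1" | sed 's/-debuginfo-/-/'
}
```

A possible call is zcat repodata/*primary.xml.gz | lookup_build_id da45f02baff8ef519840d5a17fd15926f2c802e2, followed by strip_debuginfo on the result.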

over 7 years ago by sparschauer | Reply

I made a lot of progress with core-setup. It handles the hald example core dump pretty well already. The CPU arch, the crashed executable, and the main build id are gathered from gdb -batch -ex "info auxv" -c core. Besides the packages for the crashed executable, the debug and standard packages for 20 of 22 libraries are fetched and extracted.
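
Extracting the CPU arch and the crashed executable from the auxiliary vector could be sketched as follows. The helper name auxv_string is hypothetical. The sketch assumes that gdb prints string entries such as AT_PLATFORM and AT_EXECFN with the tag in the second column and the value in double quotes at the end of the line; this should be verified against the gdb version in use.

```shell
#!/bin/sh
# Hypothetical helper for illustration; assumes `info auxv` lines like
#   31   AT_EXECFN   File name of executable   0x7ffc... "/path/to/exe"

# Read `info auxv` output on stdin and print the quoted value of tag $1.
auxv_string() {
    awk -v tag="$1" '
        $2 == tag && match($0, /"[^"]*"$/) {
            # Drop the surrounding double quotes.
            print substr($0, RSTART + 1, RLENGTH - 2)
            exit
        }
    '
}
```

For example, gdb -batch -ex "info auxv" -c core 2>/dev/null | auxv_string AT_EXECFN would print the path of the crashed executable, and AT_PLATFORM the CPU arch.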

Unfortunately, just removing "-debuginfo" from the package name is not enough. Several standard packages can be required for the same debuginfo package. The biggest issue is that the build ids of the libraries are not stored in the coredump, so what gdb reports only reflects the standard packages of the host system. As a result, the wrong package versions are gathered.

It looks like I have to parse rpm.txt from the supportconfig for the right versions, check which package provides each library file (which has to be extracted first), and then make sure that gdb really picks up the correct files.
- over 7 years ago by sparschauer | Reply
  
  At least this method works pretty well with the NetIQ GetCore Utility which gathers all loaded ELF files from the crashed system. This way the correct build ids and therefore debuginfo packages are picked up.

over 7 years ago by michalnowak | Reply

You may find Fedora's ABRT project aligned with yours.

almost 7 years ago by sparschauer | Reply

@alnovak ignored my merge request for crash-setup. I am using the change as a custom enhancement if nobody else is interested in it. The core-setup idea proved to work when I picked up the RPMs listed in the supportconfig rpm.txt manually for L3 bug 1105883.

Similar Projects

shell

OS self documentation, health check and troubleshooting by roseswe

Project Description

The aim of this hackweek project is to improve the utility "cfg2html" so that it is even more usable under SLES and perhaps also under Rancher.

cfg2html (see also https://github.com/cfg2html/cfg2html) itself is a very mature utility for collecting and documenting information of an operating system like Linux, AIX, HP-UX and others.

Goal for this Hackweek

The aim is to extend cfg2html

for SLES and SLES-for-SAP apps, high availability
Improve code for MicroOS 5.x, SUMA, Edge and k8s environments
fix shellbeauity warnings
possibly add more plugins
SUMA/Salt integration to collect.

Resources

Required skills: Bash, shell script and the SUSE products mentioned.

https://github.com/cfg2html/cfg2html

https://www.cfg2html.com/

c

Smart lighting with Pico 2 by jmodak

Description

I am trying to create a smart-lighting project with a Raspberry Pi Pico that reacts to a movie's visuals and audio. It combines two distinct functions: ambient screen lighting (visual response) and sound-reactive lighting (audio response).

Goals

Visuals: Capturing the screen's colour requires an external device to analyse screen content and send colour data to the MCU via serial communication.
Audio: A sound sensor module connected directly to the Pico that can detect sound volume.
Pico 2W: The MCU receives data from both inputs and controls an LED strip.

Resources

Raspberry Pi Pico 2 W
RGB LED strip
Sound detecting sensor
Power supply
breadboard and wires

Port OTPClient to GTK >= 4.18 by pstivanin

Project Description

OTPClient is currently using GTK3 and cannot easily be ported to GTK4. Since GTK4 came out, there have been quite a few big changes. Also, there are now some new deprecations that will take effect with GTK5 (and are active as warnings starting from 4.10), so I need to think ahead and port OTPClient without using any of those deprecated features.

Goal for this Hackweek

fix the last 3 opened issues (https://github.com/paolostivanin/OTPClient/issues/402, https://github.com/paolostivanin/OTPClient/issues/404, https://github.com/paolostivanin/OTPClient/issues/406) and release a new version
continue the rewrite from where we left last year
if possible, finally close this 6-year-old issue: https://github.com/paolostivanin/OTPClient/issues/123

Improve the picotm Transaction Manager by tdz

Picotm is a system-level transaction manager. It provides transactional semantics to low-level C operations, such as

memory access,
modifying data structures,
(some) file I/O, and
common interfaces from the C Standard Library and POSIX.

Picotm also handles error detection and recovery for all its functionality. It's fully modular, so new functionality can be added.

For the Hackweek, I want to dedicate some time to picotm. I want to finish some of the refactoring work that I have been working on. If there's time left, I'd like to investigate two-phase commits and how to support them in picotm.

Picotm is available at http://picotm.org/.

pudc - A PID 1 process that barks to the internet by mssola

Description

As a fun exercise to dig deeper into the Linux kernel, its interfaces, the RISC-V architecture, and all the dragons in between, I'm building a blog site cooked like this:

The backend is written in a mixture of C and RISC-V assembly.
The backend is actually PID1 (for real, not within a container).
We poll and parse incoming HTTP requests ourselves.
The frontend is a mere HTML page with htmx.

The project is meant to be Linux-specific, so I'm going to use io_uring, pidfs, namespaces, and Linux-specific features in order to drive all of this.

I'm open for suggestions and so on, but this is meant to be a solo project, as this is more of a learning exercise for me than anything else.

Goals

Have a better understanding of different Linux features from user space down to the kernel internals.
Most importantly: have fun.

Resources

https://github.com/mssola/pudc

Add a machine-readable output to dmidecode by jdelvare

Description

There have been repeated requests for a machine-friendly dmidecode output over the last decade. During Hack Week 19, 5 years ago, I prepared the code to support alternative output formats, but didn't have the time to go further. Last year, Jiri Hnidek from Red Hat Linux posted a proof-of-concept implementation to add JSON output support. This is a fairly large pull request which needs to be carefully reviewed and tested.

Goals

Review Jiri's work and provide constructive feedback. Merge the code if acceptable. Evaluate the costs and benefits of using a library such as json-c.
