The current crash-setup source is located here.
It works nicely for the most part, but it does not handle the debug source, which makes it impossible to use the crash> gdb list *<symbol> command right away. This is tracked as bug 997558 and should be worked on.
A further script would be really useful to build up the analysis environment on l3slave for core dumps of crashed user-space applications. The MF Open Enterprise Server (OES) has a tool called NetIQ GetCore Utility, which prepares such an environment with the files opencore.sh and opencore.ini, the binaries of the executable and its libs, and the core dump itself. The files it generates can be used as a base, and they prove that it is possible to build up a core dump analysis environment decoupled from the host system.
What it lacks is the setup of the debug symbols and the debug source. I have already done that manually, so it is possible and should be automated.
Both these topics are efficiency improvement tasks for the L3 work.
This project is part of:
Hack Week 17
Activity
Comments
-
about 6 years ago by sparschauer
The core-setup project should not depend on l3slave and should work with openSUSE for non-employees as well. It should have different modes for:
- analysis on the system which collected the dump:
  - auto-get debuginfo and debugsource packages with zypper - packages are latest? (recommended for reproducible segfaults)
  - include paths to custom code, e.g. scanmem, which I compile from source as an upstream maintainer
- analysis on another system:
  - auto-get and extract RPMs from SUSE servers, e.g. http://download.opensuse.org/update/leap/42.3/oss/x86_64/ and http://download.opensuse.org/debug/update/leap/42.3/oss/x86_64/
  - SLES
    - get packages from NFS mounts
    - get packages from *.suse.de servers
  - openSUSE
    - get packages from download.opensuse.org or mirrors
Key features:
- read coredump, detect crashed process and libs
- convert build-id to package name + version with the help of SUSE repos
- maybe use rpm.txt from supportconfig if exact package info cannot be gathered from the dump
- get and extract the required binary, debuginfo, and debugsource packages
- build up opencore.sh and opencore.ini - maybe with support for custom settings
Initial tasks:
- check for, evaluate, and analyze similar tools
- analyze coredump structure - What is included and can be used?
Main goals:
- prevent gdb from picking up wrong code/source/debuginfo or wrong packages
- take needed code parts from other FOSS projects to avoid duplicate efforts
-
about 6 years ago by sparschauer
For crash-setup -d vmcore, crash-setup uses kdumpid, which uses libkdumpfile. Example output of kdumpid:
Format: compressed kdump
Arch: x86_64
Version: 4.4.138-94.39-default
-
about 6 years ago by sparschauer
crash-setup uses kernel-source.git to convert the version into a git tag, then to a commit, and then to the oldest git branch containing it. This step is extremely slow because it compares the commit id against the git log of every configured git branch, from old to new. The SLE release is derived from the result. I'm sure this can be sped up.
Example: 4.4.138-94.39 -> rpm-4.4.138-94.39 -> baa07f9df91b -> SLE12-SP3
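The first step of the chain above, from kernel version to git tag, can be sketched in Python. This is only an illustration and not the crash-setup code: the helper name version_to_tag and the regular expression are assumptions, based solely on the one example shown (4.4.138-94.39 becomes rpm-4.4.138-94.39) and on the kdumpid version string with its trailing flavor.

```python
import re

# Assumed shape of a kernel release string: <version>-<release>[-<flavor>],
# e.g. "4.4.138-94.39-default". Derived from the single example in this thread.
_RELEASE_RE = re.compile(r"^(\d+(?:\.\d+)+-\d+(?:\.\d+)*)(?:-[A-Za-z_]\w*)?$")


def version_to_tag(release):
    """Map a kernel release string to a kernel-source.git tag name.

    Hypothetical helper: strips an optional flavor suffix such as
    "-default" and prepends "rpm-".
    """
    match = _RELEASE_RE.match(release.strip())
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return "rpm-" + match.group(1)


if __name__ == "__main__":
    print(version_to_tag("4.4.138-94.39-default"))  # rpm-4.4.138-94.39
```

Resolving the tag to a commit could then be left to git itself, for example with git rev-list -n 1 <tag>; the slow part described above is the branch search that follows.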
-
about 6 years ago by sparschauer
git describe is even slower. Only configuring major.minor per branch would help.
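A minimal Python sketch of the major.minor idea, assuming a hand-maintained table that maps a kernel series to the branches worth searching. The names candidate_branches and BRANCHES_BY_SERIES are hypothetical, and the single table entry only mirrors the 4.4 / SLE12-SP3 example from this thread.

```python
# Hypothetical configuration: kernel series (major.minor) -> git branches.
# Only the 4.4 entry is backed by the example in this thread.
BRANCHES_BY_SERIES = {"4.4": ["SLE12-SP3"]}


def candidate_branches(release, branches_by_series):
    """Return only the branches configured for the release's major.minor series."""
    major, minor = release.split(".")[:2]
    return branches_by_series.get(major + "." + minor, [])


if __name__ == "__main__":
    print(candidate_branches("4.4.138-94.39", BRANCHES_BY_SERIES))
```

With such a table, the commit search would only have to walk the branches of one kernel series instead of every configured branch.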
-
about 6 years ago by sparschauer
I've introduced a new crash-setup option '-s' or '--source' to get the debugsource package automatically as well. With the crash cd command I can get to the right directory so that the gdb list command will work. The idea is to automatically create a file opencrash.sh containing
crash -i ./opencrash.ini vmlinux.gz vmlinux.debug vmcore
and a file opencrash.ini containing e.g.
cd ./root/usr/src/debug/kernel-default-4.4.138/linux-4.4/kernel
-
about 6 years ago by sparschauer
Autocreation of opencrash.sh/.ini implemented. Since SLE12 there are different paths to source files. So cd root is used for SLE11 and before.
Commits can be found here.
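The two generated files could be produced along these lines. This is a Python sketch under assumptions, not the actual crash-setup implementation: the function name opencrash_files, its parameters, and the #!/bin/sh line are illustrative; only the crash command line and the cd line come from the comments above.

```python
def opencrash_files(source_dir, kernel="vmlinux.gz", debuginfo="vmlinux.debug",
                    dump="vmcore"):
    """Return the contents of opencrash.sh and opencrash.ini as a dict.

    source_dir is the directory the crash "cd" command should enter so that
    "gdb list" finds the sources (for SLE11 and before, simply "root").
    """
    # The shebang is an assumption; the crash invocation is quoted from above.
    script = "#!/bin/sh\ncrash -i ./opencrash.ini %s %s %s\n" % (
        kernel, debuginfo, dump)
    init = "cd %s\n" % source_dir
    return {"opencrash.sh": script, "opencrash.ini": init}


if __name__ == "__main__":
    files = opencrash_files(
        "./root/usr/src/debug/kernel-default-4.4.138/linux-4.4/kernel")
    for name, content in files.items():
        print("== %s ==\n%s" % (name, content))
```

Returning the contents instead of writing them keeps the sketch free of side effects; a caller would write both files next to the dump and mark opencrash.sh executable.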
-
about 6 years ago by sparschauer
I worked on the core-setup already. gdb -ex "quit" -c core can get the build id of the main program that was running. Then we can look into repodata/*primary.xml.gz and search for it. The nearest "location href" line above the match shows the name of the debuginfo package.
Example:
$ ./core-setup ./core.hald.2341
main build id from core.hald.2341: da45f02baff8ef519840d5a17fd15926f2c802e2
looking up main build id...
x86_64/hal-debuginfo-0.5.12-23.76.1.x86_64.rpm
Now we can remove "-debuginfo" from the RPM name and get the standard package name "hal-0.5.12-23.76.1.x86_64.rpm". With these two packages it is possible to run gdb regularly and to gather the build ids of the libraries.
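The lookup described above can be sketched in Python as a line-oriented scan. This is an illustration under assumptions, not the core-setup code: the function names are hypothetical, and it assumes the build id and the location href sit on separate lines of the uncompressed primary.xml, with the location line above the match.

```python
import re

_LOCATION_RE = re.compile(r'<location\s+href="([^"]+)"')


def find_debuginfo_rpm(lines, build_id):
    """Return the href of the nearest <location> line above the build id match.

    Returns None if the build id is not found, or if no location line
    precedes it.
    """
    last_href = None
    for line in lines:
        match = _LOCATION_RE.search(line)
        if match:
            last_href = match.group(1)
        if build_id in line:
            return last_href
    return None


def standard_rpm_name(debuginfo_href):
    """Drop the directory part and the first "-debuginfo" from an RPM path."""
    filename = debuginfo_href.rsplit("/", 1)[-1]
    return filename.replace("-debuginfo", "", 1)


if __name__ == "__main__":
    print(standard_rpm_name("x86_64/hal-debuginfo-0.5.12-23.76.1.x86_64.rpm"))
```

The lines argument can be any iterable of strings, for instance a file opened with gzip.open(path, "rt"). The name rewrite is the simple approach from this comment and covers only the one-to-one case.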
-
about 6 years ago by sparschauer
I made a lot of progress with core-setup. It handles the hald example core dump pretty well already. The CPU arch, the crashed executable, and the main build id are gathered from gdb -batch -ex "info auxv" -c core. Besides the packages for the crashed executable, the debug and standard packages for 20 of 22 libraries are fetched and extracted.
Unfortunately, just removing "-debuginfo" from the package name is not enough: more than one standard package can be required for the same debuginfo package. The biggest issue is that the build ids of the libraries are not stored in the coredump and only reflect the standard packages of the host system. So the wrong package versions are gathered.
It looks like I have to parse rpm.txt from the supportconfig for the right versions, check which package provides the library file, which has to be extracted first, and then make sure that gdb really picks up the correct files.
-
about 6 years ago by sparschauer
At least this method works pretty well with the NetIQ GetCore Utility, which gathers all loaded ELF files from the crashed system. This way the correct build ids and therefore the correct debuginfo packages are picked up.
-
over 5 years ago by sparschauer
@alnovak ignored my merge request for crash-setup. I'm using the change as a custom enhancement if nobody else is interested in it. The core-setup idea proved to work when I manually picked up the RPMs listed in the supportconfig rpm.txt for L3 bug 1105883.
Similar Projects