SUSE Hack Week: Improve UML page fault handler

Description

Improve UML handling of segmentation faults in kernel mode. Although such page faults are generally caused by a kernel bug, it is annoying if they cause an infinite loop or panic the kernel. More importantly, a robust implementation makes it possible to write KUnit tests for various guard pages, preventing potential kernel self-protection regressions.
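
To illustrate what a guard-page test checks, here is a minimal user-space analogy in C. It is not UML or KUnit code, and it assumes Linux with POSIX signals. The helper name guard_page_write_faults() is hypothetical. The sketch maps two pages, turns the second into a guard page, and verifies that a write to it is caught by a fault handler instead of killing the process.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;

/* Fault handler: jump back to the test instead of dying. */
static void on_fault(int sig)
{
	(void)sig;
	siglongjmp(fault_env, 1);
}

/*
 * Hypothetical helper (user-space analogy only).
 * Returns 1 if writing to the guard page faulted, 0 if the write
 * went through, and -1 if the test could not be set up.
 */
int guard_page_write_faults(void)
{
	long page = sysconf(_SC_PAGESIZE);
	struct sigaction sa, old_segv, old_bus;
	volatile char *p;
	char *buf;
	int faulted = 0;

	if (page <= 0)
		return -1;

	/* Two pages: the first stays writable, the second becomes the guard. */
	buf = mmap(NULL, 2 * (size_t)page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;
	if (mprotect(buf + page, (size_t)page, PROT_NONE) != 0) {
		munmap(buf, 2 * (size_t)page);
		return -1;
	}

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_fault;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, &old_segv);
	sigaction(SIGBUS, &sa, &old_bus);

	p = buf;
	if (sigsetjmp(fault_env, 1) == 0) {
		p[page - 1] = 'x';	/* last byte of the writable page: must succeed */
		assert(p[page - 1] == 'x');
		p[page] = 'x';		/* first byte of the guard page: should fault */
	} else {
		faulted = 1;		/* reached via siglongjmp() from on_fault() */
	}

	sigaction(SIGSEGV, &old_segv, NULL);
	sigaction(SIGBUS, &old_bus, NULL);
	munmap(buf, 2 * (size_t)page);
	return faulted;
}
```

A kernel-side test would follow the same pattern (touch a guard page, expect a recoverable fault), but it would need the page fault handler itself to report the fault cleanly rather than loop or panic, which is what this project is about.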

Goals

Convert the UML page fault handler to use oops_* helpers, go through a few review rounds and finally get my patch series merged in 6.14.

Resources

Wrong initial attempt: https://lore.kernel.org/lkml/20231215121431.680-1-petrtesarik@huaweicloud.com/T/


Looking for hackers with the skills:

kernel

This project is part of:

Hack Week 24

Activity

8 months ago: michals liked this project.

8 months ago: ptesarik added keyword "kernel" to this project.

8 months ago: ptesarik started this project.

8 months ago: ptesarik originated this project.

Comments

8 months ago by ptesarik

I intend to work on this project whenever I get stuck with https://hackweek.opensuse.org/24/projects/kill-dma-and-dma32-memory-zones. So, if you like the idea, feel free to grab it; just add a comment here.

Similar Projects

kernel

early stage kdump support by mbrugger

Project Description

When we experience an early boot crash, we are not able to analyze the kernel dump, because user space has not yet been able to load the crash system. The idea is to compile the crash system into the host kernel (think of initramfs) so that we can create a kernel dump really early in the boot process.

Goal for the Hackweeks

Investigate if this is possible and the implications it would have (done in HW21)
Hack up a PoC (done in HW22 and HW23)
Prepare RFC series (given it's only one week, we are entering wishful thinking territory here).

update HW23

I was able to include the crash kernel into the kernel Image.
I'll need to find a way to load that from init/main.c:start_kernel(), probably after kcsan_init().
A workaround for a smoke test was to hack the kexec_file_load() system call, which has two problems:
1. My initramfs in the production kernel does not have a new enough kexec version. That's not a blocker, but it is where the week ended.
2. As the crash kernel is part of init.data it will be already stale once I can call kexec_file_load() from user-space.

The solution is probably to rewrite the PoC so that the invocation can be done from init.text (that's my theory), but I'm not sure whether I can reuse the in-kernel kexec infrastructure from there, which I rely on heavily.

update HW24

Day 1
- rebased on v6.12 with no problems other than me breaking the config
- setting up a new compilation and qemu/virtme env
- getting desperate as nothing works that used to work
Day 2
- got the loading of the early kernel invoked from __init code after kcsan_init()
Day 3
- fix the problem of memdup not being able to allocate that much memory... use 64K page size for now
- code refactoring
- I'm now able to load the crash kernel
- When using virtme I can boot into the crash kernel, although it doesn't boot completely (major milestone!); it crashes in elfcorehdr_read_notes()
Day 4
- crash system crashes (no pun intended) in copy_oldmem_page(); will need to understand elfcorehdr...
- call path vmcore_init() -> parse_crash_elf_headers() -> elfcorehdr_read() -> read_from_oldmem() -> copy_oldmem_page() -> copy_to_iter()
Day 5
- hacking arch/arm64/kernel/crash_dump.c:copy_oldmem_page() to see if the crash system really starts. It does.
- fun fact: retested with more reserved memory and with UEFI FW, host kernel crashes in init but directly starts the crash kernel, so it works (somehow) \o/
TODOs
- fix elfcorehdr so that we actually can make use of all this...
- test where in the boot __init() chain we can/should call kexec_early_dump()
