an invention by pvorel
Project Description
Qualcomm concentrates on supporting recent SoCs; older ones aren't supported (they use a very old downstream kernel, e.g. 3.10).
Goal for this Hackweek
1. Boot mainline kernel to initramfs
I managed to boot a mainline kernel (5.9.0-rc1 at the time) on the msm89xx in my phone, but the kernel crashes. Find out why.
$ fastboot -c "debug ignore_loglevel earlycon" boot boot.img
[74500] Continuous splash enabled, keeping panel alive.
[74500] booting linux @ 0x80000, ramdisk @ 0x2700000 (1236022), tags/device tree @ 0x2500000
[74510] Jumping to kernel via monitor
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd032]
[ 0.000000] Linux version 5.9.0-rc1-00020-gabea2a011c54 (pmos@localhost) (aarch64-alpine-linux-musl-gcc (Alpine 10.2.0) 10.2.0, GNU ld (GNU Binutils) 2.35.1) #3 SMP PREEMPT Tue Oct 6 12:24:55 UTC 2020
[ 0.000000] printk: debug: ignoring loglevel setting.
[ 0.000000] efi: UEFI not found.
[ 0.000000] [Firmware Bug]: Kernel image misaligned at boot, please fix your bootloader!
[ 0.000000] cma: Reserved 32 MiB at 0x00000000de000000
[ 0.000000] earlycon: msm_serial_dm0 at MMIO 0x00000000f991e000 (options '115200n8')
[ 0.000000] printk: bootconsole [msm_serial_dm0] enabled
...
[ 0.185155] pinctrl core: initialized pinctrl subsystem
[ 0.191761] DMI not present or invalid.
[ 0.196193] NET: Registered protocol family 16
[ 0.200769] DMA: preallocated 4096 KiB GFP_KERNEL pool for atomic allocations
[ 0.204611] DMA: preallocated 4096 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
[ 0.211812] DMA: preallocated 4096 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[ 0.218962] audit: initializing netlink subsys (disabled)
[ 0.228072] thermal_sys: Registered thermal governor 'step_wise'
[ 0.228076] thermal_sys: Registered thermal governor 'power_allocator'
[ 0.232293] audit: type=2000 audit(0.144:1): state=initialized audit_enabled=0 res=1
[ 0.244954] cpuidle: using governor menu
[ 0.253102] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[ 0.256526] ASID allocator initialised with 32768 entries
[ 0.264698] Serial: AMBA PL011 UART driver
... RESET AND DOWNSTREAM KERNEL continues :(
D - 15524 - pm_driver_init, Delta
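The "[Firmware Bug]: Kernel image misaligned at boot" line in the log above can be illustrated with a small check. This is only a sketch of the arm64 boot protocol rule (the image must sit text_offset bytes above a 2 MiB aligned base), not the kernel's own code. It assumes that text_offset is 0 for this kernel, as I believe is the case for recent arm64 kernels, while the bootloader still uses the older 0x80000 offset; the helper name kimage_aligned() is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_2M 0x200000ULL

/*
 * Hypothetical helper: true if load_addr sits text_offset bytes above
 * a 2 MiB aligned base, which is what the arm64 boot protocol asks for.
 */
bool kimage_aligned(uint64_t load_addr, uint64_t text_offset)
{
    return ((load_addr - text_offset) & (SZ_2M - 1)) == 0;
}
```

With the load address from the log, kimage_aligned(0x80000, 0) is false, while kimage_aligned(0x80000, 0x80000) is true. Under the assumptions above, that would explain why a bootloader written for the old offset triggers the warning.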
2. Explore the current Qualcomm kernel mainlining effort
Resources
Results
- https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git/commit/?h=for-next&id=f890f89d9a80fffbfa7ca791b78927e5b8aba869
- https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git/commit/?h=for-next&id=9d1fc2e4f5a94a492c7dd1ca577c66fdb7571c84
- https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git/commit/?h=for-next&id=3cb6a271f4b04f11270111638c24fa5c0b846dec
- https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git/commit/?h=for-next&id=0e5ded926f2a0f8b57dfa7f0d69a30767e1ea2ce
This project is part of:
Hack Week 20
Activity
Comments
about 4 years ago by pvorel
My fix, posted to the linux-arm-msm mailing list, was accepted: https://lore.kernel.org/linux-arm-msm/20210415193913.1836153-1-petr.vorel@gmail.com/ (commit: https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git/commit/?h=for-next&id=f890f89d9a80fffbfa7ca791b78927e5b8aba869)
Unfortunately, there is still an issue preventing boot, caused by commit 86588296acbf ("fdt: Properly handle "no-map" field in the memory region"). Reverting it allows booting. I need to have a look into this issue.
Similar Projects
early stage kdump support by mbrugger
Project Description
When we experience an early boot crash, we are not able to analyze the kernel dump, as user-space wasn't able to load the crash system. The idea is to compile the crash system into the host kernel (think of initramfs) so that we can create a kernel dump really early in the boot process.
Goal for the Hackweeks
- Investigate if this is possible and the implications it would have (done in HW21)
- Hack up a PoC (done in HW22 and HW23)
- Prepare RFC series (given it's only one week, we are entering wishful thinking territory here).
update HW23
- I was able to include the crash kernel into the kernel Image.
- I'll need to find a way to load that from init/main.c:start_kernel(), probably after kcsan_init()
- A workaround for a smoke test was to hack the kexec_file_load() system call, which has two problems:
  - My initramfs in the production kernel does not have a new enough kexec version; that's not a blocker, but it's where the week ended
  - As the crash kernel is part of init.data, it will already be stale once I can call kexec_file_load() from user-space.
The solution is probably to rewrite the PoC so that the invocation can be done from init.text (that's my theory), but I'm not sure if I can reuse the kexec infrastructure in the kernel from there, which I rely on heavily.
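For context, this is roughly how user-space asks the kernel to load a crash kernel through kexec_file_load(). It is a minimal sketch, not the PoC's code: the function name load_crash_kernel() is hypothetical, and the flag values are copied from the kernel UAPI header <linux/kexec.h> as fallbacks in case that header is not installed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallback flag values, taken from the kernel UAPI <linux/kexec.h>. */
#ifndef KEXEC_FILE_ON_CRASH
#define KEXEC_FILE_ON_CRASH 0x00000002
#endif
#ifndef KEXEC_FILE_NO_INITRAMFS
#define KEXEC_FILE_NO_INITRAMFS 0x00000004
#endif

/*
 * Hypothetical helper: load image_path as the crash (panic) kernel
 * without an initramfs. Returns 0 on success or a negative errno value.
 */
int load_crash_kernel(const char *image_path, const char *cmdline)
{
    int err;
    int fd = open(image_path, O_RDONLY | O_CLOEXEC);

    if (fd < 0)
        return -errno;

#ifdef SYS_kexec_file_load
    /* cmdline_len must include the terminating NUL byte. */
    long ret = syscall(SYS_kexec_file_load, fd, -1,
                       (unsigned long)strlen(cmdline) + 1, cmdline,
                       (unsigned long)(KEXEC_FILE_ON_CRASH |
                                       KEXEC_FILE_NO_INITRAMFS));
    err = ret < 0 ? -errno : 0;
#else
    err = -ENOSYS; /* this libc/arch does not expose the syscall number */
#endif
    close(fd);
    return err;
}
```

A call such as load_crash_kernel("/boot/Image", "console=ttyAMA0") (path and command line are placeholders) needs CAP_SYS_BOOT and memory reserved with crashkernel= to succeed. By the time user-space can run this, the copy of the crash kernel embedded in init.data is already gone, which is the second problem listed above.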
update HW24
- Day 1
  - rebased on v6.12 with no problems other than me breaking the config
  - setting up a new compilation and qemu/virtme env
  - getting desperate as nothing works that used to work
- Day 2
  - getting to call the invocation of loading the early kernel from __init after kcsan_init()
- Day 3
  - fix problem of memdup not being able to alloc so much memory... use 64K page sizes for now
  - code refactoring
  - I'm now able to load the crash kernel
  - When using virtme I can boot into the crash kernel, although it doesn't boot completely (major milestone!), crash in elfcorehdr_read_notes()
- Day 4
  - crash system crashes (no pun intended) in copy_oldmem_page() (link); will need to understand elfcorehdr...
    - call path: vmcore_init() -> parse_crash_elf_headers() -> elfcorehdr_read() -> read_from_oldmem() -> copy_oldmem_page() -> copy_to_iter()
- Day 5
  - hacking arch/arm64/kernel/crash_dump.c:copy_oldmem_page() to see if crash system really starts. It does.
  - fun fact: retested with more reserved memory and with UEFI FW, host kernel crashes in init but directly starts the crash kernel, so it works (somehow) \o/
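A possible reading of the Day 3 allocation problem: a kmalloc-backed memdup cannot hand out more than one maximum-order block of pages, so the ceiling scales with the page size. The sketch below only does that arithmetic. The parameter values used in the note after it (maximum order 10 with 4K pages, 13 with 64K pages on arm64) are my assumptions about the kernel configuration, not something stated in the notes, and max_contig_bytes() is a made-up name.

```c
#include <stdint.h>

/*
 * Hypothetical helper: size in bytes of the largest physically
 * contiguous block, i.e. 2^max_order pages of 2^page_shift bytes each.
 */
uint64_t max_contig_bytes(unsigned int page_shift, unsigned int max_order)
{
    return 1ULL << (page_shift + max_order);
}
```

Under those assumptions, max_contig_bytes(12, 10) is 4 MiB and max_contig_bytes(16, 13) is 512 MiB. A kernel Image larger than 4 MiB would then fit only with 64K pages, which would match the workaround used for now.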
TODOs
- fix elfcorehdr so that we actually can make use of all this...
- test where in the boot __init() chain we can/should call kexec_early_dump()
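To make the elfcorehdr read path from Day 4 easier to follow, here is a simplified user-space model of its last two steps: a read from old memory is split into chunks that never cross a page boundary, and each chunk is copied by a per-page helper. This is a sketch under stated assumptions, not kernel code. The names read_oldmem_model(), copy_page_model(), fill_old_mem() and the old_mem array are all invented, and the real read_from_oldmem() and copy_oldmem_page() work on an iov_iter rather than a plain buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096UL

/* Stand-in for the crashed kernel's memory: 4 pages (hypothetical). */
static unsigned char old_mem[4 * MODEL_PAGE_SIZE];

/* Fill the fake old memory with a recognisable byte pattern. */
void fill_old_mem(void)
{
    for (size_t i = 0; i < sizeof(old_mem); i++)
        old_mem[i] = (unsigned char)i;
}

/* Model of copy_oldmem_page(): copy csize bytes from within page pfn. */
static long copy_page_model(unsigned long pfn, unsigned char *buf,
                            size_t csize, unsigned long offset)
{
    if (offset + csize > MODEL_PAGE_SIZE ||
        (pfn + 1) * MODEL_PAGE_SIZE > sizeof(old_mem))
        return -1; /* chunk crosses a page or page is out of range */
    memcpy(buf, old_mem + pfn * MODEL_PAGE_SIZE + offset, csize);
    return (long)csize;
}

/* Model of read_from_oldmem(): split the range into per-page copies. */
long read_oldmem_model(unsigned char *buf, size_t count, uint64_t pos)
{
    long done = 0;

    while (count > 0) {
        unsigned long pfn = (unsigned long)(pos / MODEL_PAGE_SIZE);
        unsigned long offset = (unsigned long)(pos % MODEL_PAGE_SIZE);
        size_t chunk = MODEL_PAGE_SIZE - offset;
        long n;

        if (chunk > count)
            chunk = count;
        n = copy_page_model(pfn, buf, chunk, offset);
        if (n < 0)
            return -1;
        buf += n;
        pos += (uint64_t)n;
        count -= (size_t)n;
        done += n;
    }
    return done;
}
```

In this model a failure in the per-page copy propagates straight up to the caller. That is the same shape as the crash seen in the call path above, where the error surfaces while the ELF core header is being read.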