There seems to be an overall consensus that the ioctl interface used by ethtool is a poor design as it's inflexible, error prone and notoriously hard to extend. It should clearly be replaced by netlink and obsoleted. Unfortunately not much actual work has been done in that direction until this project started.

The project started in Hackweek 16 (fall 2017) and has been worked on since, both in Hackweek 17-19 and outside. First two parts of kernel implementation are in mainline since 5.6-rc1, first part of userspace implementation (ethtool utility) has been submitted to upstream at the end of Hackweek 19 (2020-02-16).

Links:

Looking for hackers with the skills:

kernel networking netlink c

This project is part of:

Hack Week 16 Hack Week 17 Hack Week 18 Hack Week 19

Activity

  • about 5 years ago: jiriwiesner left this project.
  • almost 7 years ago: dsterba liked this project.
  • almost 7 years ago: pvorel liked this project.
  • over 7 years ago: a_faerber liked this project.
  • over 7 years ago: jiriwiesner joined this project.
  • over 7 years ago: jiriwiesner liked this project.
  • over 7 years ago: mkubecek added keyword "kernel" to this project.
  • over 7 years ago: mkubecek added keyword "networking" to this project.
  • over 7 years ago: mkubecek added keyword "netlink" to this project.
  • over 7 years ago: mkubecek added keyword "c" to this project.
  • over 7 years ago: mkubecek started this project.
  • over 7 years ago: mkubecek originated this project.

  • Comments

    • mkubecek
      over 7 years ago by mkubecek | Reply

      I played a bit already, here is a small demo.

    • mkubecek
      over 7 years ago by mkubecek | Reply

      End-of-hackweek status: disappointing, to be honest. Because of a series of P0/P1 bugs, not much time was left to work on the project. The plan was to have 8-10 most common subcommands implemented on both kernel and ethtool side, ready for an RFC submission, and maybe also some work on wireshark dissector. Reality is much less impressive:

      • kernel: basic infrastructure, GET_DRVINFO, GET_SETTINGS and half of SET_SETTINGS
      • ethtool: basic infrastructure, ethtool -i

      Current code can be found in mkubece/ethnl on Github and home:mkubecek:ethnl OBS project. Let's hope we can get to an RFC submission by the end of this week (I would like to have ethtool ( | -s | -i ) working on both sides for that).

    • mkubecek
      almost 7 years ago by mkubecek | Reply

      Marked the project for Hackweek 17 but I won't actually have Hackweek 17 in the usual sense due to Netdev conference. Instead, the plan is to dedicate five days (probably five Fridays in near future) to the project.

      Current state: kernel supports GET_DRVINFO, {G,S}ET_SETTINGS and {G,S}ET_PARAMS, dumps for all supported requests and notifications for SETTINGS and PARAMS (DRVINFO is read only by nature so there shouldn't be anything to notify). All of these requests are also supported by ethtool; the code for dumps (use* as device name) and monitoring (use --monitor as first parameter) also exists but the parser needs a bit more work to be presentable.

    • mkubecek
      almost 6 years ago by mkubecek | Reply

      Status update just before Hackweek 18.

      Kernel series is almost ready for upstream submission of first part. The main missing part is unified request header and unified processing of it. There may also be some minor issues found in v5 review. Submitting v6 is the primary goal for Hackween 18 but it should only take part of the week (the smaller the better).

      More work needs to be done on userspace ethtool utility which is lagging behind a bit:

      • without EVENT notifications (rejected by upstream), monitor will need "lazy load" of string sets
      • some commands will require rtnetlink or devlink; rethink the context structure and socket handling
      • monitor needs to also listen to and react to rtnetlink and devlink notifications
      • syntax extensions need to be documented in man page
      • new debugging options showing structure of sent/received messages
      • use the same parser to interpret bad attribute offset from extack error/warning messages

    • mkubecek
      over 5 years ago by mkubecek | Reply

      Status update before Hackweek 19

      After few more review rounds and rewrites, initial part of the kernel series was accepted to net-next tree and reached mainline as part of the 5.6 merge window (so that it's going to appear in v5.6-rc1. What is in mainline at the moment allows netlink based implementation of

      • ethtool
      • ethtool s ...
      • ethtool -P (using rtnetlink)

      and corresponding notifications.

      The userspace ethtool counterpart, however, will need a lot of work to make it ready for upstream submission. As it is highly desirable to have it in ethtool 5.6 release, this is going to be the primary focus at least for the first part of Hackweek 19.

    • mkubecek
      over 5 years ago by mkubecek | Reply

      Status update after Hackweek 19

      TL;DR: The primary goal was achieved (12 minutes before the end of the week add-emoji ).

      Summary of the Hackweek 19 progress:

      • abstractions for netlink socket and message buffer separated from struct nl_context
      • rework the code generating multiple messages from one set of parameters (for ethtool -s)
      • rework "char bitset" parser (for ethtool -s wol ...)
      • drop bitfield32 helpers and other code no longer used
      • compatibility with existing test suite (make check succeeds now)
      • split some too long patches
      • some bugs fixed
      • lot of style cleanups
      • more comments and better commit messages
      • cover letter
      • current work in progress tracked on github again

      The resulting series has been submitted to upstream.

    Similar Projects

    early stage kdump support by mbrugger

    Project Description

    When we experience a early boot crash, we are not able to analyze the kernel dump, as user-space wasn't able to load the crash system. The idea is to make the crash system compiled into the host kernel (think of initramfs) so that we can create a kernel dump really early in the boot process.

    Goal for the Hackweeks

    1. Investigate if this is possible and the implications it would have (done in HW21)
    2. Hack up a PoC (done in HW22 and HW23)
    3. Prepare RFC series (giving it's only one week, we are entering wishful thinking territory here).

    update HW23

    • I was able to include the crash kernel into the kernel Image.
    • I'll need to find a way to load that from init/main.c:start_kernel() probably after kcsan_init()
    • I workaround for a smoke test was to hack kexec_file_load() systemcall which has two problems:
      1. My initramfs in the porduction kernel does not have a new enough kexec version, that's not a blocker but where the week ended
      2. As the crash kernel is part of init.data it will be already stale once I can call kexec_file_load() from user-space.

    The solution is probably to rewrite the POC so that the invocation can be done from init.text (that's my theory) but I'm not sure if I can reuse the kexec infrastructure in the kernel from there, which I rely on heavily.

    update HW24

    • Day1
      • rebased on v6.12 with no problems others then me breaking the config
      • setting up a new compilation and qemu/virtme env
      • getting desperate as nothing works that used to work
    • Day 2
      • getting to call the invocation of loading the early kernel from __init after kcsan_init()
    • Day 3

      • fix problem of memdup not being able to alloc so much memory... use 64K page sizes for now
      • code refactoring
      • I'm now able to load the crash kernel
      • When using virtme I can boot into the crash kernel, also it doesn't boot completely (major milestone!), crash in elfcorehdr_read_notes()
    • Day 4

      • crash systems crashes (no pun intended) in copy_old_mempage() link; will need to understand elfcorehdr...
      • call path vmcore_init() -> parse_crash_elf_headers() -> elfcorehdr_read() -> read_from_oldmem() -> copy_oldmem_page() -> copy_to_iter()
    • Day 5

      • hacking arch/arm64/kernel/crash_dump.c:copy_old_mempage() to see if crash system really starts. It does.
      • fun fact: retested with more reserved memory and with UEFI FW, host kernel crashes in init but directly starts the crash kernel, so it works (somehow) \o/
    • TODOs

      • fix elfcorehdr so that we actually can make use of all this...
      • test where in the boot __init() chain we can/should call kexec_early_dump()