Project Description

BPF verifier plays a crucial role in securing the system (though less so now that unprivileged BPF is disabled by default in both upstream and SLES), and bugs in the verifier has lead to privilege escalation vulnerabilities in the past (e.g. CVE-2021-3490).

One way to check whether the verifer has bugs to use model checking (a formal verification technique), in other words, build a abstract model of how the verifier operates, and then see if certain condition can occur (e.g. incorrect calculation during value tracking of registers) by giving both the model and condition to a solver.

For the solver I will be using the Z3 SMT solver to do the checking since it provide a Python binding that's relatively easy to use.

Goal for this Hackweek

Learn how to use the Z3 Python binding (i.e. Z3Py) to build a model of (part of) the BPF verifier, probably the part that's related to value tracking using tristate numbers (aka tnum), and then check that the algorithm work as intended.

Resources

Looking for hackers with the skills:

bpf ebpf formalverification modelchecking kernel security

This project is part of:

Hack Week 21 Hack Week 23 Hack Week 24

Activity

  • 28 days ago: janvhs liked this project.
  • 28 days ago: flonnegren liked this project.
  • about 1 month ago: vliaskovitis liked this project.
  • about 2 months ago: r1chard-lyu liked this project.
  • about 1 year ago: moio liked this project.
  • over 2 years ago: dsterba liked this project.
  • over 2 years ago: mbrugger liked this project.
  • over 2 years ago: jzerebecki left this project.
  • over 2 years ago: jzerebecki added keyword "security" to this project.
  • over 2 years ago: jzerebecki joined this project.
  • over 2 years ago: jzerebecki liked this project.
  • over 2 years ago: ailiopoulos liked this project.
  • over 2 years ago: shunghsiyu started this project.
  • over 2 years ago: shunghsiyu added keyword "kernel" to this project.
  • over 2 years ago: shunghsiyu added keyword "bpf" to this project.
  • over 2 years ago: shunghsiyu added keyword "ebpf" to this project.
  • over 2 years ago: shunghsiyu added keyword "formalverification" to this project.
  • over 2 years ago: shunghsiyu added keyword "modelchecking" to this project.
  • over 2 years ago: shunghsiyu originated this project.

  • Comments

    • shunghsiyu
      over 2 years ago by shunghsiyu | Reply

      I've uploaded the jupyter notebook on GitHub that contains a minimal model of tnum along with tnum_add(), as well as the prove that it works.

    • shunghsiyu
      over 2 years ago by shunghsiyu | Reply

      The slides used for lightning talk can be found here

      While I'd like to achieve much more, I think what I've done during hack week is suffice to be called a complete project, so I'm marking this as complete

    • shunghsiyu
      about 1 year ago by shunghsiyu | Reply

      I'm restarting this project to model check the range tracking (minimal and maximal value possible in s32/u32/u64/s64 range) done in BPF verifier.

      In addition to that I hope to unify signed and unsigned range tracking based on previously posted idea and the "Interval Analysis and Machine Arithmetic: Why Signedness Ignorance Is Bliss" paper for simpler range tracking code.

    Similar Projects

    Tracing system calls with eBPF by doreilly

    Description

    Many security tools need to record system calls like execve. Using the Linux audit system for this can have a detrimental performance impact in some cases.

    Goals

    The goal is to investigate eBPF as an alternative and do some benchmarking to see the impact and how it compares to using the audit subsystem.

    Progress

    BPF done - traceexec

    Benchmark report

    Resources

    eBPF doc

    libbpf

    libMicro benchmark tool


    Tracing system calls with eBPF by doreilly

    Description

    Many security tools need to record system calls like execve. Using the Linux audit system for this can have a detrimental performance impact in some cases.

    Goals

    The goal is to investigate eBPF as an alternative and do some benchmarking to see the impact and how it compares to using the audit subsystem.

    Progress

    BPF done - traceexec

    Benchmark report

    Resources

    eBPF doc

    libbpf

    libMicro benchmark tool


    early stage kdump support by mbrugger

    Project Description

    When we experience a early boot crash, we are not able to analyze the kernel dump, as user-space wasn't able to load the crash system. The idea is to make the crash system compiled into the host kernel (think of initramfs) so that we can create a kernel dump really early in the boot process.

    Goal for the Hackweeks

    1. Investigate if this is possible and the implications it would have (done in HW21)
    2. Hack up a PoC (done in HW22 and HW23)
    3. Prepare RFC series (giving it's only one week, we are entering wishful thinking territory here).

    update HW23

    • I was able to include the crash kernel into the kernel Image.
    • I'll need to find a way to load that from init/main.c:start_kernel() probably after kcsan_init()
    • I workaround for a smoke test was to hack kexec_file_load() systemcall which has two problems:
      1. My initramfs in the porduction kernel does not have a new enough kexec version, that's not a blocker but where the week ended
      2. As the crash kernel is part of init.data it will be already stale once I can call kexec_file_load() from user-space.

    The solution is probably to rewrite the POC so that the invocation can be done from init.text (that's my theory) but I'm not sure if I can reuse the kexec infrastructure in the kernel from there, which I rely on heavily.

    update HW24

    • Day1
      • rebased on v6.12 with no problems others then me breaking the config
      • setting up a new compilation and qemu/virtme env
      • getting desperate as nothing works that used to work
    • Day 2
      • getting to call the invocation of loading the early kernel from __init after kcsan_init()
    • Day 3

      • fix problem of memdup not being able to alloc so much memory... use 64K page sizes for now
      • code refactoring
      • I'm now able to load the crash kernel
      • When using virtme I can boot into the crash kernel, also it doesn't boot completely (major milestone!), crash in elfcorehdr_read_notes()
    • Day 4

      • crash systems crashes (no pun intended) in copy_old_mempage() link; will need to understand elfcorehdr...
      • call path vmcore_init() -> parse_crash_elf_headers() -> elfcorehdr_read() -> read_from_oldmem() -> copy_oldmem_page() -> copy_to_iter()
    • Day 5

      • hacking arch/arm64/kernel/crash_dump.c:copy_old_mempage() to see if crash system really starts. It does.
      • fun fact: retested with more reserved memory and with UEFI FW, host kernel crashes in init but directly starts the crash kernel, so it works (somehow) \o/
    • TODOs

      • fix elfcorehdr so that we actually can make use of all this...
      • test where in the boot __init() chain we can/should call kexec_early_dump()


    Hacking on sched_ext by flonnegren

    Description

    Sched_ext upstream has some interesting issues open for grabs:

    Goals

    Send patches to sched_ext upstream

    Also set up perfetto to trace some of the example schedulers.

    Resources

    https://github.com/sched-ext/scx


    Improve UML page fault handler by ptesarik

    Description

    Improve UML handling of segmentation faults in kernel mode. Although such page faults are generally caused by a kernel bug, it is annoying if they cause an infinite loop, or panic the kernel. More importantly, a robust implementation allows to write KUnit tests for various guard pages, preventing potential kernel self-protection regressions.

    Goals

    Convert the UML page fault handler to use oops_* helpers, go through a few review rounds and finally get my patch series merged in 6.14.

    Resources

    Wrong initial attempt: https://lore.kernel.org/lkml/20231215121431.680-1-petrtesarik@huaweicloud.com/T/


    Modernize ocfs2 by goldwynr

    Ocfs2 has gone into a stage of neglect and disrepair. Modernize the code to generate enough interest.

    Goals: * Change the mount sequence to use fscontext * Move from using bufferhead to bio/folios * Use iomap * Run it through xfstests


    RISC-V emulator in GLSL capable of running Linux by favogt

    Description

    There are already numerous ways to run Linux and some programs through emulation in a web browser (e.g. x86 and riscv64 on https://bellard.org/jslinux/), but none use WebGL/WebGPU to run the emulation on the GPU.

    I already made a PoC of an AArch64 (64-bit Arm) emulator in OpenCL which is unfortunately hindered by a multitude of OpenCL compiler bugs on all platforms (Intel with beignet or the new compute runtime and AMD with Mesa Clover and rusticl). With more widespread and thus less broken GLSL vs. OpenCL and the less complex implementation requirements for RV32 (especially 32bit integers instead of 64bit), that should not be a major problem anymore.

    Goals

    Write an RISC-V system emulator in GLSL that is capable of booting Linux and run some userspace programs interactively. Ideally it is small enough to work on online test platforms like Shaderoo with a custom texture that contains bootstrap code, kernel and initrd.

    Minimum:

    riscv32 without FPU (RV32 IMA) and MMU (µClinux), running Linux in M-mode and userspace in U-mode.

    Stretch goals:

    FPU support, S-Mode support with MMU, SMP. Custom web frontend with more possibilities for I/O (disk image, network?).

    Resources

    RISC-V ISA Specifications
    Shaderoo
    OpenGL 4.5 Quick Reference Card

    Result as of Hackweek 2024

    WebGL turned out to be insufficient, it only supports OpenGL ES 3.0 but imageLoad/imageStore needs ES 3.1. So we switched directions and had to write a native C++ host for the shaders.

    As of Hackweek Friday, the kernel attempts to boot and outputs messages, but panics due to missing memory regions.

    Since then, some bugs were fixed and enough hardware emulation implemented, so that now Linux boots with framebuffer support and it's possible to log in and run programs!

    The repo with a demo video is available at https://github.com/Vogtinator/risky-v


    CVE portal for SUSE Rancher products by gmacedo

    Description

    Currently it's a bit difficult for users to quickly see the list of CVEs affecting images in Rancher, RKE2, Harvester and Longhorn releases. Users need to individually look for each CVE in the SUSE CVE database page - https://www.suse.com/security/cve/ . This is not optimal, because those CVE pages are a bit hard to read and contain data for all SLE and BCI products too, making it difficult to easily see only the CVEs affecting the latest release of Rancher, for example. We understand that certain costumers are only looking for CVE data for Rancher and not SLE or BCI.

    Goals

    The objective is to create a simple to read and navigate page that contains only CVE data related to Rancher, RKE2, Harvester and Longhorn, where it's easy to search by a CVE ID, an image name or a release version. The page should also provide the raw data as an exportable CSV file.

    It must be an MVP with the minimal amount of effort/time invested, but still providing great value to our users and saving the wasted time that the Rancher Security team needs to spend by manually sharing such data. It might not be long lived, as it can be replaced in 2-3 years with a better SUSE wide solution.

    Resources

    • The page must be simple and easy to read.
    • The UI/UX must be as straightforward as possible with minimal visual noise.
    • The content must be created automatically from the raw data that we already have internally.
    • It must be updated automatically on a daily basis and on ad-hoc runs (when needed).
    • The CVE status must be aligned with VEX.
    • The raw data must be exportable as CSV file.
    • Ideally it will be written in Go or pure Shell script with basic HTML and no external dependencies in CSS or JS.


    VulnHeap by r1chard-lyu

    Description

    The VulnHeap project is dedicated to the in-depth analysis and exploitation of vulnerabilities within heap memory management. It focuses on understanding the intricate workflow of heap allocation, chunk structures, and bin management, which are essential to identifying and mitigating security risks.

    Goals

    • Familiarize with heap
      • Heap workflow
      • Chunk and bin structure
      • Vulnerabilities
    • Vulnerability
      • Use after free (UAF)
      • Heap overflow
      • Double free
    • Use Docker to create a vulnerable environment and apply techniques to exploit it

    Resources

    • https://heap-exploitation.dhavalkapil.com/divingintoglibc_heap
    • https://raw.githubusercontent.com/cloudburst/libheap/master/heap.png
    • https://github.com/shellphish/how2heap?tab=readme-ov-file


    Migrate from Docker to Podman by tjyrinki_suse

    Description

    I'd like to continue my former work on containerization of several domains on a single server by changing from Docker containers to Podman containers. That will need an OS upgrade as well as Podman is not available in that old server version.

    Goals

    • Update OS.
    • Migrate from Docker to Podman.
    • Keep everything functional, including the existing "meanwhile done" additional Docker container that is actually being used already.
    • Keep everything at least as secure as currently. One of the reasons of having the containers is to isolate risks related to services open to public Internet.
    • Try to enable the Podman use in production.
    • At minimum, learn about all of these topics.
    • Optionally, improve Ansible side of things as well...

    Resources

    A search engine is one's friend. Migrating from Docker to Podman, and from docker-compose to podman-compose.


    Contributing to Linux Kernel security by pperego

    Description

    A couple of weeks ago, I found this blog post by Gustavo Silva, a Linux Kernel contributor.

    I always strived to start again into hacking the Linux Kernel, so I asked Coverity scan dashboard access and I want to contribute to Linux Kernel by fixing some minor issues.

    I want also to create a Linux Kernel fuzzing lab using qemu and syzkaller

    Goals

    1. Fix at least 2 security bugs
    2. Create the fuzzing lab and having it running

    The story so far

    • Day 1: setting up a virtual machine for kernel development using Tumbleweed. Reading a lot of documentation, taking confidence with Coverity dashboard and with procedures to submit a kernel patch
    • Day 2: I read really a lot of documentation and I triaged some findings on Coverity SAST dashboard. I have to confirm that SAST tool are great false positives generator, even for low hanging fruits.
    • Day 3: Working on trivial changes after I read this blog post: https://www.toblux.com/posts/2024/02/linux-kernel-patches.html. I have to take confidence with the patch preparation and submit process yet.
      • First trivial patch sent: using strtruefalse() macro instead of hard-coded strings in a staging driver for a lcd display
      • Fix for a dereference before null check issue discovered by Coverity (CID 1601566) https://scan7.scan.coverity.com/#/project-view/52110/11354?selectedIssue=1601566
    • Day 4: Triaging more issues found by Coverity.
      • The patch for CID 1601566 was refused. The check against the NULL pointer was pointless so I prepared a version 2 of the patch removing the check.
      • Fixed another dereference before NULL check in iwlmvmparsewowlaninfo_notif() routine (CID 1601547). This one was already submitted by another kernel hacker :(
    • Day 5: Wrapping up. I had to do some minor rework on patch for CID 1601566. I found a stalker bothering me in private emails and people I interacted with me, advised he is a well known bothering person. Markus Elfring for the record.
    • Wrapping up: being back doing kernel hacking is amazing and I don't want to stop it. My battery pack is completely drained but changing the scope gave me a great twist and I really want to feel this energy not doing a single task for months.

      I failed in setting up a fuzzing lab but I was too optimistic for the patch submission process.

    The patches

    1


    Bot to identify reserved data leak in local files or when publishing on remote repository by mdati

    Description

    Scope here is to prevent reserved data or generally "unwanted", to be pushed and saved on a public repository, i.e. on Github, causing disclosure or leaking of reserved informations.

    The above definition of reserved or "unwanted" may vary, depending on the context: sometime secret keys or password are stored in data or configuration files or hardcoded in source code and depending on the scope of the archive or the level of security, it can be either wanted, permitted or not at all.

    As main target here, secrets will be registration keys or passwords, to be detected and managed locally or in a C.I. pipeline.

    Goals

    • Detection:

      • Local detection: detect secret words present in local files;
      • Remote detection: detect secrets in files, in pipelines, going to be transferred on a remote repository, i.e. via git push;
    • Reporting:

      • report the result of detection on stderr and/or log files, noticed excluding the secret values.
    • Acton:

      • Manage the detection, by either deleting or masking the impacted code or deleting/moving the file itself or simply notify it.

    Resources

    • Project repository, published on Github (link): m-dati/hkwk24;
    • Reference folder: hkwk24/chksecret;
    • First pull request (link): PR#1;
    • Second PR, for improvements: PR#2;
    • README.md and TESTS.md documentation files available in the repo root;
    • Test subproject repository, for testing CI on push [TBD].

    Notes

    We use here some examples of secret words, that still can be improved.
    The various patterns to match desired reserved words are written in a separated module, to be on demand updated or customized.

    [Legend: TBD = to be done]