Project Description

Everything we do at SUSE requires a certain amount of energy. This energy has a cost and also causes a certain amount of CO2 emissions. In particular, as the Kernel QA team, we run kernel tests quite often, and this consumes energy that could be saved by introducing optimizations into LTP testing.

In this project we use a new parallel execution implementation to examine how the software creation process can save energy and reduce CO2 emissions inside a software company.

Goal for this Hackweek

We want to answer the following questions:

  • How many tests can run in parallel?
  • How much energy do we save per LTP execution in a virtualized system such as openQA?
  • Can we improve the parallelization model to save more energy?

Resources

Jan 31

I had some issues with the runltp-ng parallel execution, caused by the choice of moving the UI thread into the coroutines thread. With the previous code, tests took 30% more time to complete, but the UI thread is now working again. I also created a script to check how many tests can run in parallel in each testing suite.

Suite: can
Total tests: 3
Parallelizable tests: 2

Suite: cap_bounds
Total tests: 1
Parallelizable tests: 0

Suite: commands
Total tests: 37
Parallelizable tests: 0

Suite: connectors
Total tests: 1
Parallelizable tests: 0

Suite: containers
Total tests: 86
Parallelizable tests: 0

Suite: controllers
Total tests: 346
Parallelizable tests: 1

Suite: cpuhotplug
Total tests: 6
Parallelizable tests: 0

Suite: crashme
Total tests: 4
Parallelizable tests: 0

Suite: crypto
Total tests: 10
Parallelizable tests: 6

Suite: cve
Total tests: 77
Parallelizable tests: 5

Suite: dio
Total tests: 30
Parallelizable tests: 0

Suite: dma_thread_diotest
Total tests: 7
Parallelizable tests: 0

Suite: fcntl-locktests
Total tests: 1
Parallelizable tests: 0

Suite: filecaps
Total tests: 1
Parallelizable tests: 0

Suite: fs
Total tests: 68
Parallelizable tests: 0

Suite: fs_bind
Total tests: 95
Parallelizable tests: 0

Suite: fs_perms_simple
Total tests: 18
Parallelizable tests: 0

Suite: fs_readonly
Total tests: 55
Parallelizable tests: 0

Suite: fsx
Total tests: 1
Parallelizable tests: 0

Suite: hugetlb
Total tests: 50
Parallelizable tests: 0

Suite: hyperthreading
Total tests: 2
Parallelizable tests: 0

Suite: ima
Total tests: 9
Parallelizable tests: 0

Suite: input
Total tests: 6
Parallelizable tests: 0

Suite: io
Total tests: 2
Parallelizable tests: 1

Suite: ipc
Total tests: 8
Parallelizable tests: 0

Suite: irq
Total tests: 1
Parallelizable tests: 1

Suite: kernel_misc
Total tests: 16
Parallelizable tests: 0

Suite: kvm
Total tests: 1
Parallelizable tests: 0

Suite: ltp-aio-stress
Total tests: 54
Parallelizable tests: 0

Suite: ltp-aiodio.part1
Total tests: 140
Parallelizable tests: 0

Suite: ltp-aiodio.part2
Total tests: 83
Parallelizable tests: 0

Suite: ltp-aiodio.part3
Total tests: 48
Parallelizable tests: 0

Suite: ltp-aiodio.part4
Total tests: 57
Parallelizable tests: 0

Suite: math
Total tests: 10
Parallelizable tests: 0

Suite: mm
Total tests: 75
Parallelizable tests: 2

Suite: net.features
Total tests: 62
Parallelizable tests: 0

Suite: net.ipv6
Total tests: 11
Parallelizable tests: 0

Suite: net.ipv6_lib
Total tests: 6
Parallelizable tests: 2

Suite: net.multicast
Total tests: 4
Parallelizable tests: 0

Suite: net.nfs
Total tests: 84
Parallelizable tests: 0

Suite: net.rpc_tests
Total tests: 51
Parallelizable tests: 0

Suite: net.sctp
Total tests: 41
Parallelizable tests: 0

Suite: net.tcp_cmds
Total tests: 21
Parallelizable tests: 0

Suite: net.tirpc_tests
Total tests: 41
Parallelizable tests: 0

Suite: net_stress.appl
Total tests: 10
Parallelizable tests: 0

Suite: net_stress.broken_ip
Total tests: 11
Parallelizable tests: 0

Suite: net_stress.interface
Total tests: 25
Parallelizable tests: 0

Suite: net_stress.ipsec_dccp
Total tests: 104
Parallelizable tests: 0

Suite: net_stress.ipsec_icmp
Total tests: 86
Parallelizable tests: 0

Suite: net_stress.ipsec_sctp
Total tests: 104
Parallelizable tests: 0

Suite: net_stress.ipsec_tcp
Total tests: 104
Parallelizable tests: 0

Suite: net_stress.ipsec_udp
Total tests: 106
Parallelizable tests: 0

Suite: net_stress.multicast
Total tests: 24
Parallelizable tests: 0

Suite: net_stress.route
Total tests: 14
Parallelizable tests: 0

Suite: nptl
Total tests: 1
Parallelizable tests: 0

Suite: numa
Total tests: 20
Parallelizable tests: 2

Suite: power_management_tests
Total tests: 5
Parallelizable tests: 0

Suite: power_management_tests_exclusive
Total tests: 5
Parallelizable tests: 0

Suite: pty
Total tests: 9
Parallelizable tests: 1

Suite: s390x_tests
Total tests: 1
Parallelizable tests: 0

Suite: sched
Total tests: 11
Parallelizable tests: 0

Suite: scsi_debug.part1
Total tests: 140
Parallelizable tests: 0

Suite: securebits
Total tests: 3
Parallelizable tests: 0

Suite: smack
Total tests: 10
Parallelizable tests: 0

Suite: smoketest
Total tests: 15
Parallelizable tests: 5

Suite: staging
Total tests: 1
Parallelizable tests: 0

Suite: syscalls
Total tests: 1384
Parallelizable tests: 526

Suite: syscalls-ipc
Total tests: 61
Parallelizable tests: 26

Suite: tpm_tools
Total tests: 12
Parallelizable tests: 0

Suite: tracing
Total tests: 9
Parallelizable tests: 0

Suite: uevent
Total tests: 3
Parallelizable tests: 0

Suite: watchqueue
Total tests: 9
Parallelizable tests: 9

-------------------------------
Total tests: 4017
Parallelizable tests: 589

14.66% of the tests are parallelizable
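
The totals above can be recomputed from a report in the same format. The following is a minimal sketch, not the script used for this project: the function name `summarize` and the `SAMPLE` excerpt (three suites copied from the listing) are illustrative assumptions.

```python
import re


def summarize(report: str) -> dict:
    """Aggregate per-suite counters and compute the parallelizable share."""
    suites = {}
    current = None
    for raw in report.splitlines():
        line = raw.strip()
        if line.startswith("---"):
            # Separator: what follows is the grand total, not a suite.
            current = None
        elif m := re.fullmatch(r"Suite: (\S+)", line):
            current = m.group(1)
            suites[current] = {"total": 0, "parallel": 0}
        elif current and (m := re.fullmatch(r"Total tests: (\d+)", line)):
            suites[current]["total"] = int(m.group(1))
        elif current and (m := re.fullmatch(r"Parallelizable tests: (\d+)", line)):
            suites[current]["parallel"] = int(m.group(1))
    total = sum(s["total"] for s in suites.values())
    parallel = sum(s["parallel"] for s in suites.values())
    percent = 100 * parallel / total if total else 0.0
    return {"total": total, "parallel": parallel, "percent": percent}


# Hypothetical excerpt: three suites taken from the listing above.
SAMPLE = """\
Suite: can
Total tests: 3
Parallelizable tests: 2

Suite: crypto
Total tests: 10
Parallelizable tests: 6

Suite: watchqueue
Total tests: 9
Parallelizable tests: 9
"""

if __name__ == "__main__":
    result = summarize(SAMPLE)
    print(f"{result['parallel']}/{result['total']} parallelizable "
          f"({result['percent']:.2f}%)")
```

Feeding the full listing to the same function should reproduce the 589 out of 4017 figure.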

Feb 1

I added a new runltp-ng option, --force-parallel, which forces parallelization even when tests don't enable it. However, using it causes application crashes, especially in the more important suites such as syscalls and syscalls-ipc, so it is not a good idea to use it. I then ran a few suites and collected the time needed to complete them. It seems that the current rule for selecting tests for parallel execution is not smart enough: most of the selected tests finish in a second or less. This is reflected in the timing results, where important testing suites such as syscalls finish just a few minutes earlier than with normal execution. We can probably do better by optimizing the rule, which is currently implemented here.

Qemu:
    Distro: Tumbleweed
    Kernel: 6.1.8-1-default
    SMP:    16
    RAM:    2GB

syscalls:
    tests:    1384
    parallel: 526 (38% of the tests)

    16 workers: 31m 54s
    1 worker:   36m 18s

syscalls-ipc:
    tests:    61
    parallel: 26 (42.62% of the tests)

    16 workers: 2m 4s
    1 worker:   2m 7s

mm:
    tests:    75
    parallel: 2 (2.67% of the tests)

    16 workers: 8m 2s
    1 worker:   8m 10s

cve:
    tests:    77
    parallel: 5 (6.49% of the tests)

    16 workers: 29m 53s
    1 worker:   29m 57s
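
To compare the runs, the timings above can be converted to seconds and expressed as time saved. This is a small sketch; the helper names `to_seconds` and `saving` are illustrative and not part of runltp-ng.

```python
import re


def to_seconds(text: str) -> int:
    """Convert a duration such as '31m 54s' into seconds."""
    m = re.fullmatch(r"(?:(\d+)m)?\s*(?:(\d+)s)?", text.strip())
    if not m or not any(m.groups()):
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return minutes * 60 + seconds


def saving(parallel: str, serial: str) -> tuple:
    """Return (seconds saved, percentage saved) of the parallel run."""
    p, s = to_seconds(parallel), to_seconds(serial)
    return s - p, 100 * (s - p) / s


# Timings copied from the measurements above: (16 workers, 1 worker).
TIMINGS = {
    "syscalls": ("31m 54s", "36m 18s"),
    "syscalls-ipc": ("2m 4s", "2m 7s"),
    "mm": ("8m 2s", "8m 10s"),
    "cve": ("29m 53s", "29m 57s"),
}

if __name__ == "__main__":
    for suite, (parallel, serial) in TIMINGS.items():
        saved, percent = saving(parallel, serial)
        print(f"{suite}: {saved}s saved ({percent:.2f}%)")
```

Only syscalls shows a saving above a few percent, which matches the observation that most of the selected tests are very short.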

Feb 2-3

I focused on the syscalls testing suite, since it's the most important suite that can be easily parallelized. All power consumption measurements were taken with the powerstat -a -R -d 0 1 3600 command, collecting data from the start of the testing suite execution until its end. All stats were taken on my own laptop, since I wasn't able to access the openQA workers physically. An external device for measuring power consumption would improve the measurements. All tests ran inside a Qemu instance. According to openQA stats, syscalls was executed 35 times in the last month (Jan 2023), so we take this value into account.
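
Power readings taken at a fixed interval can be turned into an energy figure by summing power times interval. The sketch below assumes the readings are already available as a list of watt values; it does not parse powerstat output, and the sample values are made up for illustration.

```python
def energy_wh(power_samples_w, interval_s=1.0):
    """Approximate energy in Wh from power samples taken every interval_s seconds."""
    # Each sample is treated as constant over its interval (rectangle rule),
    # giving watt-seconds; dividing by 3600 converts to watt-hours.
    return sum(power_samples_w) * interval_s / 3600.0


if __name__ == "__main__":
    # Hypothetical readings in watts, one per second.
    samples = [10.0, 12.0, 14.0]
    print(f"{energy_wh(samples):.4f} Wh")
```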

Environment

    Laptop:
        Model:      Lenovo T14s Gen 1
        CPU:        AMD Ryzen 7 PRO 4750U
        Memory:     16GB DDR4
        Hard disk:  NVMe SSD

    Qemu:
        CPUs: 16
        RAM:  4096MB

Data

    CO2 emission per kWh    -> W = 0.244 kg CO2/kWh (5% uncertainty)
    Avg idle consumption    -> I = 2.50 W
    Energy cost in Germany  -> P = 0.534 $/kWh
    syscalls exec per month -> R = 35

Normal execution

    execution time:      T1 = 38m 57s = 2337s
    energy consumption:  E1 = 9 Wh
    monthly consumption: C1 = 35 * 9 Wh = 315 Wh = 0.315 kWh

Parallel execution (16 workers)

    execution time:      T2 = 35m 22s = 2122s -> about 9% less
    energy consumption:  E2 = 10 Wh
    monthly consumption: C2 = 35 * 10 Wh = 350 Wh = 0.350 kWh

Results

As we can see, parallel execution consumes slightly more energy than normal execution, but the difference is so small that it won't noticeably affect CO2 emissions or costs. In particular, over one year we have:

    diff:        D = (C2 - C1) * 12 = (0.350 - 0.315) * 12 = +0.42 kWh
    cost:        C = D * P = 0.42 * 0.534 = +0.224 $
    emissions: CO2 = D * W = 0.42 * 0.244 = +0.102 kg
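
The yearly figures can be recomputed from the data above. This is a short sketch using the measured values; the function name `yearly_overhead` is illustrative, and a positive result means that parallel execution consumes more than normal execution.

```python
CO2_KG_PER_KWH = 0.244   # W, from the data above
PRICE_PER_KWH = 0.534    # P, in $/kWh
RUNS_PER_MONTH = 35      # R


def yearly_overhead(serial_wh, parallel_wh, runs_per_month=RUNS_PER_MONTH):
    """Return (extra kWh, extra cost in $, extra kg CO2) per year."""
    extra_kwh = (parallel_wh - serial_wh) * runs_per_month * 12 / 1000
    return extra_kwh, extra_kwh * PRICE_PER_KWH, extra_kwh * CO2_KG_PER_KWH


if __name__ == "__main__":
    # E1 = 9 Wh (normal execution), E2 = 10 Wh (16 workers).
    kwh, cost, co2 = yearly_overhead(9, 10)
    print(f"extra energy: {kwh:+.2f} kWh")
    print(f"extra cost:   {cost:+.3f} $")
    print(f"extra CO2:    {co2:+.3f} kg")
```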

Servers might consume somewhat more energy during execution, so the real values might be bigger, but they would still be pretty small. Parallel execution consumes more energy despite being shorter because running many tests at once draws more power.
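
The higher power draw can be estimated by dividing each energy figure by its execution time. Since E1 and E2 are rounded to whole Wh, the result is only a rough approximation; the function name is illustrative.

```python
def average_power_w(energy_wh, duration_s):
    """Average power in watts for a given energy (Wh) and duration (s)."""
    return energy_wh * 3600.0 / duration_s


if __name__ == "__main__":
    normal = average_power_w(9, 2337)     # E1 over T1
    parallel = average_power_w(10, 2122)  # E2 over T2
    print(f"normal:   {normal:.2f} W")
    print(f"parallel: {parallel:.2f} W")
```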

Optimizations

In the end, in terms of costs and emissions we don't have a big impact, but in terms of time we can still have a significant impact over one year. We can release openQA workers sooner and also complete other jobs a bit faster, which will of course have an impact on production, energy consumption and emissions. Based on our data, in one year we will save:

    (T1 - T2) * R * 12 = (2337 - 2122) * 35 * 12 = 90300s ~ 25 hours
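
The same arithmetic as a small sketch, using the measured execution times and the number of monthly runs from the data above:

```python
from datetime import timedelta

T1_S = 2337          # normal execution time, seconds
T2_S = 2122          # parallel execution time (16 workers), seconds
RUNS_PER_MONTH = 35  # R

saved_seconds = (T1_S - T2_S) * RUNS_PER_MONTH * 12
saved_hours = saved_seconds / 3600

if __name__ == "__main__":
    print(f"{saved_seconds}s = {timedelta(seconds=saved_seconds)} "
          f"(~{saved_hours:.1f} hours per year)")
```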

If we are able to introduce a smarter rule for selecting tests that can run in parallel, the amount of time saved per year might increase significantly. Also, we still have 332 syscalls tests (about 24%) using the old API, which currently can't run in parallel.

Looking for hackers with the skills:

optimization energy kernel ltp runltp co2 testing

This project is part of:

Hack Week 22

Activity

  • about 1 year ago: acervesato started this project.
  • about 1 year ago: acervesato originated this project.
