There is always something to do if you run the infrastructure for a project as big as openSUSE.

Our Admin wiki currently lists over 80 machines - and while we already "salted" some of them, there is always room for improvement and room to learn something new just by getting your hands dirty and diving into the administrator role for a machine.

During this project, I would like to work on services running on old machines and try to bring them back into a maintainable state.

Looking for hackers with the skills:

opensuse infrastructure

This project is part of:

Hack Week 19 Hack Week 20

Activity

  • over 3 years ago: lrupp liked this project.
  • over 3 years ago: asmorodskyi joined this project.
  • over 3 years ago: danrodriguez liked this project.
  • over 3 years ago: SLindoMansilla joined this project.
  • over 3 years ago: SLindoMansilla liked this project.
  • over 3 years ago: dgedon liked this project.
  • over 3 years ago: ories liked this project.
  • over 3 years ago: hennevogel liked this project.
  • over 4 years ago: keichwa liked this project.
  • over 4 years ago: bmwiedemann liked this project.
  • over 4 years ago: lrupp added keyword "opensuse" to this project.
  • over 4 years ago: lrupp added keyword "infrastructure" to this project.
  • over 4 years ago: lrupp started this project.
  • over 4 years ago: lrupp originated this project.

  • Comments

    • lrupp
      over 3 years ago by lrupp

      Results of day 1:

      Working on backup

      Created a new machine to host generic backups in the future. At the moment, the openSUSE infrastructure does not really have any backup, so it might be a good idea to start with it. This machine is running in the heroes infrastructure now and providing 3TB backup space for machines.

      Speeding up GitLab

      While installing the latest GitLab updates, we noticed the following error in unicorn.stderr.log:

      Unicorn::HttpServer:0x000055cbe4834bf8>: worker (pid: 5662) exceeds memory limit (702132224.0 bytes > 485455550 bytes)

      So the Unicorn worker does not have enough memory to run properly. Let's check the documentation...

      ...and adjust the environment variables via systemd: systemctl edit gitlab-ce-unicorn.service

      [Service]
      Environment="GITLAB_UNICORN_MEMORY_MAX=1073741824" "GITLAB_UNICORN_MEMORY_MIN=805306368"

      This should be enough to get a more responsive WebUI. But we did not stop there and started to work on our CI pipelines.
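
      As a quick sanity check on the Unicorn limits set above, the following minimal sketch converts the byte values into MiB and compares them with the usage reported in the log. It assumes that the worker's limit is picked somewhere between the MIN and MAX values, which is how we read the documentation; the helper name parse_memory_error is made up for this illustration.

```python
import re

MIB = 1024 * 1024

# Values from the systemd override above (bytes).
MEMORY_MIN = 805306368
MEMORY_MAX = 1073741824

# The message from unicorn.stderr.log, shortened to the part that matters here.
LOG_LINE = "worker (pid: 5662) exceeds memory limit (702132224.0 bytes > 485455550 bytes)"


def parse_memory_error(line):
    """Return (used_bytes, limit_bytes) from an 'exceeds memory limit' message."""
    match = re.search(r"\(([\d.]+) bytes > ([\d.]+) bytes\)", line)
    if match is None:
        raise ValueError("not a memory limit message")
    return float(match.group(1)), float(match.group(2))


used, old_limit = parse_memory_error(LOG_LINE)
print(f"worker used {used / MIB:.0f} MiB, old limit was {old_limit / MIB:.0f} MiB")
print(f"new range: {MEMORY_MIN // MIB} MiB to {MEMORY_MAX // MIB} MiB")
# Assumption: the effective limit never drops below MEMORY_MIN.
print("usage from the log fits under the new minimum:", used < MEMORY_MIN)
```

      The new values are 768 MiB and 1 GiB, so a worker with the memory usage from the log would no longer be killed.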

      In our Salt-CI, we use Docker containers to run some test scripts. Let's have a look at their test runtimes:

      • validate 1:50
      • upstream formulas show highstate 3:17
      • show highstate 3:35
      • test nginx 4:04
      • test sudo 4:38

      So the total runtime for all tests is 17:24 min. A very long time!

      One culprit was that our CI did not define a specialized container. So it fell back to the standard container, which is defined in the Gitlab-Runner config: registry.opensuse.org/opensuse/leap:15.2. As this base image includes neither the packages needed for testing nor the additional repositories, our script to prepare the test environment had to add these repositories and install the missing packages (~85 RPMs) again and again for each test run.

      With the help of Darix, we were able to set up two new container images in our openSUSE:infrastructure repository:

      • container-heroes-base - we based this on the openSUSE Leap 15.2 container from above, but already added our own repository and the package with our internal certificates. This image can be the base for any further images we want to run in our infrastructure.
      • container-heroes-salt-testing - This container uses the latest heroes-base container and installs the packages that were installed by the preparation script before.

      Together with the following change in our ~/.gitlab-ci.yml: image: registry.opensuse.org/opensuse/infrastructure/containers/heroes-salt-testing:latest we were able to reduce the test runtime to:

      • validate 0:50
      • upstream formulas show highstate 2:06
      • show highstate 1:43
      • test nginx 2:49
      • test sudo 3:34

      This brought the total runtime down to 11:02 min, which is ~63% of the original runtime (a saving of about 37%).
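
      The totals can be reproduced from the per-job numbers listed above. This is just a small sketch of the arithmetic; the helper names to_seconds and fmt are made up for this illustration.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def fmt(seconds):
    """Format seconds as 'minutes:seconds'."""
    return f"{seconds // 60}:{seconds % 60:02d}"


# Job runtimes with the generic Leap 15.2 image.
BEFORE = {
    "validate": "1:50",
    "upstream formulas show highstate": "3:17",
    "show highstate": "3:35",
    "test nginx": "4:04",
    "test sudo": "4:38",
}

# Job runtimes with the heroes-salt-testing image.
AFTER = {
    "validate": "0:50",
    "upstream formulas show highstate": "2:06",
    "show highstate": "1:43",
    "test nginx": "2:49",
    "test sudo": "3:34",
}

total_before = sum(to_seconds(t) for t in BEFORE.values())
total_after = sum(to_seconds(t) for t in AFTER.values())
ratio = total_after / total_before

for job in BEFORE:
    saved = to_seconds(BEFORE[job]) - to_seconds(AFTER[job])
    print(f"{job}: saved {fmt(saved)}")

print(f"total: {fmt(total_before)} -> {fmt(total_after)} ({ratio:.0%} of the original)")
```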

      With more specialized Docker images for the nginx and sudo tests, we should be able to reduce the time even more (below 10min?), but the next bigger step might be to distribute as many tests as possible and run them in parallel.
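
      To get a feeling for what parallel runs could bring, here is a rough estimate based on the job runtimes measured with the new image. It is only a sketch under optimistic assumptions: the jobs are independent, runners are always free, and there is no queueing or startup overhead. The function makespan and its simple greedy scheduling (longest job first) are our own illustration, not something the GitLab Runner does.

```python
import heapq

# Job runtimes in seconds with the heroes-salt-testing image:
# 0:50, 2:06, 1:43, 2:49 and 3:34.
DURATIONS = [50, 126, 103, 169, 214]


def makespan(durations, runners):
    """Estimate the wall-clock time when jobs are spread over several runners.

    Greedy heuristic: hand the longest remaining job to the runner that is
    currently least loaded. This is an approximation, not an optimal plan.
    """
    if runners < 1:
        raise ValueError("need at least one runner")
    loads = [0] * runners  # a list of equal values is already a valid heap
    for duration in sorted(durations, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + duration)
    return max(loads)


for runners in (1, 2, 5):
    seconds = makespan(DURATIONS, runners)
    print(f"{runners} runner(s): {seconds // 60}:{seconds % 60:02d}")
```

      With one runner per job, the slowest job (test sudo) defines the total time, so speeding up that job would be the next thing to look at.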

      Last but not least, defining:

      variables:
        DOCKER_DRIVER: overlay2

      in our ~/.gitlab-ci.yml saved a few more seconds, but this gain is probably not that relevant.

    • lrupp
      over 3 years ago by lrupp

      Results of day 2:

      15.3 upgrades

      Did you know that we already reached the Beta phase of our glorious, "conservative" part of the openSUSE distribution?

      15.3 is knocking at the door and some of the openSUSE infrastructure is already using it!

      Our status page, some of our static pages, one DNS server and some more machines have already been migrated from 15.2 to 15.3.

      Workflow (if the 'baseurl' in your '*.repo' files below /etc/zypp/repos.d/ contains '$releasever' in the URL):

       zypper clean -a
       zypper --releasever 15.3 ref
       zypper --releasever 15.3 dup --allow-vendor-change
       cd /etc/products.d/ && ln -sf Leap.prod baseproduct
       zypper --non-interactive purge-kernels && grub2-mkconfig -o /boot/grub2/grub.cfg
       reboot
      

      Some of the steps above are needed because there are still some rough edges in the Beta version, but in general everything worked and runs smoothly.
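
      The workflow only works if the repository URLs really contain '$releasever'. A minimal sketch to check this before starting the upgrade could look like the following. It assumes that the '*.repo' files can be read as INI files with Python's configparser; the function name repos_without_releasever is made up for this illustration.

```python
import configparser
from pathlib import Path


def repos_without_releasever(repo_dir):
    """Return (file name, repo alias) pairs whose baseurl hardcodes the release."""
    findings = []
    for repo_file in sorted(Path(repo_dir).glob("*.repo")):
        # interpolation=None keeps '$' and '%' in URLs untouched,
        # strict=False tolerates repeated keys in a file.
        parser = configparser.ConfigParser(interpolation=None, strict=False)
        parser.read(repo_file, encoding="utf-8")
        for alias in parser.sections():
            baseurl = parser.get(alias, "baseurl", fallback="")
            if not baseurl:
                continue  # nothing to check, e.g. no baseurl defined
            if "$releasever" not in baseurl and "${releasever}" not in baseurl:
                findings.append((repo_file.name, alias))
    return findings


if __name__ == "__main__":
    for file_name, alias in repos_without_releasever("/etc/zypp/repos.d"):
        print(f"{file_name}: [{alias}] has a fixed release in its baseurl")
```

      Every repository reported here needs its URL adjusted by hand (or switched to '$releasever') before running zypper with --releasever 15.3.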

      DNSSec

      ~10 years ago (WOW!), we decided to use FreeIPA to manage the openSUSE hero admin accounts as well as the DNS domains. Today, the openSUSE heroes (meanwhile ~80 of them) manage 31 different DNS domains. Some of them are internal, but many of them are public domains, somehow related to openSUSE.

      As we want to enable DNSSec, it is time to review the setup and check what needs to be done:

      • split out the DNS management from the old FreeIPA installation (go with a "keep it simple" approach)
      • our internal, hidden master should learn to manage DNSSec - best case with little to no effort for us
      • provide a nice and shiny WebUI for those who don't like vi but need to update DNS records

      We searched around and came to the conclusion that we will keep our internal DNS master running on PowerDNS, but extend it with PowerDNS-Admin, a nice WebUI for managing the PowerDNS server and its zones. As the WebUI currently requires newer Python packages than those available on Leap 15.3, we decided to create a dedicated repository in OBS for it.

      Thanks to pdnsutil and Darix' console Kung-Fu, all DNS zones were migrated away from FreeIPA in nearly no time.

      So it looks like we are just one click away from enabling DNSSec for the opensuse.org DNS zone on the PowerDNS side. Sadly, we ran out of time this week to work further on this. But we will start the first DNSSec tests in the next days with some of the domains which are not used at the moment. Once this looks fine, we will try to utilize the API of our Registrar to exchange the signing keys automatically. So there might be a bumpy road ahead (depending on the permission to access the API), but we are a big step closer to getting DNSSec implemented this year.
