Since SUSE Manager doesn't scale out and stacking it into another pyramid of susemanagers won't help here, the real architectural changes needs to be done to achieve true scale-out of this product. This hackweek project is about how to Turn SUSE Manager into a cluster.

Areas to be tackled:

  1. Distributed FS for storage
  2. Distributed messaging bus
  3. Distributed KV for metadata
  4. Control Node prototype (this component turns a SUSE Manager into a stateless-ish event-driven node, where SUSE Manager in a future is deployed into a container -> Kubernetes)
  5. Cluster Director prototype (this component orchestrates the entire cluster via events)
  6. API Gateway, providing 100% compatible API across all SUSE Manager nodes (this is done, demoed at SUMA Winter Summit 2020)
  7. Slightly modified SUSE Manager peripherals (reposync, DB etc).
  8. Client daemon, which is used to bind a single registered system to a cluster node according to Cluster Director service (on shrink, grow, rebalance and disaster recovery events).

The idea is to solve a set of core problems how to turn technologically outdated Uyuni Server / SUMA into a modern cloud-native cluster node, so with after a reasonable time this could be turned into a real product for SUSE Manager.


Progress

Day One (Monday, 10 Feb)

Storm seems over. Realised this website doesn't allow editing comments :astonished:.

Got working initial Cluster Daemon that runs REST API and talks to the distributed KV database.

Got working Client Daemon for every Client System. So far it can:

  • Pool the Client System to the cluster for staging
  • Talk to Cluster Director (CD) and get the status
  • Switch/reconfigure Salt Minion according to the CD's directives

Got initial tool (Python) that calls Cluster API. So far it is too simple to describe. add-emoji

Got initial Cluster Director with OpenAPI spec (Swagger) running.

TODO:

  • [x] Running distributed K/V database
    • [x] Verify it is buildable/package-able
  • [x] Running distributed message bus
    • [x] Verify it is buildable/package-able
  • [x] Manage cluster Zones
  • [x] mgr-clbd-admin tool
    • [x] List nodes
    • [x] List zones
    • [x] Format JSON input
    • [x] Call API via JSON input
  • [x] Add OpenAPI to Cluster Director daemon
    • [x] Swagger UI running
    • [x] APIs are automatically generated (updated Makefile)
    • [x] I can try that in browser
  • [x] Describe all the APIs I've got so far

The day is over. Did many refactorings, found that Gin doesn't do form parsing from the body on DELETE method but solved it, not yet haven't finished Zones management (which is damn easy now, but argghh still!).


Day Two (Tuesday, 11 Feb)

Turned mgr-clbd-admin into a repo subproject of the Cluster Director daemon. There should be a set of common Jinja-based formatters for those common return types, but that's "bells-n-whistles" I will take care later. Right now a raw JSON dump is good 'nuff.

Update (15:20): Group photo taken, many refactorings, Zones management done. Time to write Node Controller.

Done (partially):

  • [*] Node Controller (initial)
    • [x] "Wrap around head" overall code design
    • [x] SSH communication over RSA keypair to the staging Cluster Node
    • [x] Bi-directional pub/sub (initial)
    • [*] Configuration file
    • [x] SSH check remote host
    • [x] SSH disable host verification option
    • [*] Events are emitted from an arbitrary SUSE Manager to the Cluster bus via Node Controller (simple-n-stupid PoC ATM)
    • [*] Commands from Cluster Directory received and mapped to the emitter facility (write one for XML-RPC APIs)
    • [*] Cluster Node staging
      • [x] Initial overall PoC code
      • [x] Execute nanostates[1]
        • [x] SSH Runner runs a nanostate on a remote machine
        • [x] Refactor SSH Runner output type before it is too late. It is complex enough to be very bad as map[interface{}]interface{}.
        • [x] Implement local runner (for a client on localhost)
      • [*] Write few nanostate scenarios to complete Node staging:
        • [*] Reset/prepare PostgreSQL database
        • [*] Prepare/mount distributed File System

NodeController is about to execute nanostates. It is like "nano-ansible" in the pocket, a fusion of Salt and Ansible ideas into a small package, which is not intended to be as broad CMS as those two. Essentially, it just runs a series of commands over SSH to a specific release of supported Cluster Node (AKA SUSE Manager) and does some rudimentary "things" on it, once it was installed and setup. These are like getting machine-id, backing up some configuration files, making standard operations with the database, start/stop/restart some services etc: all what can be done just via plain command line and mostly used for informational purposes.

This alone removes any kind of need of internal configuration management system to just stage a Cluster Node and add it to the swarm.


Day Three (Wednesday, 12 Feb)

An Unwanted Accident... (almost)

While working on a Cluster Node code for staging and playing with SSH sessions and channels, I just accidentally "rewrote" Salt, combining both best practices of Salt and Ansible. Of course, it is far-far-far-far away from what Salt can do today.

Or... is it? add-emoji Let's see.

So the main plan was to manage cluster nodes and their components and nothing else. For that nobody needs a full-blown configuration management infrastructure, right? And so it happens that Cluster since has a Cluster Director (that supposed to scale-out on its own), but it is essentially just like that woulld be a Salt Master. And, consequently, since it is hooked up to an abstract Message Bus (you write an adapter connector for Kafka if you need many millions, but so far I am sure NATS will do for millions of #susemanagers), it talks to a Node Controllers on each #susemanager node, which is... right, an analogy to a Salt Minion. And here you go: bi-directional pub/sub that can perform something when a particular message arrives.

Security? Everything is happening over TLS anyway. But then another layer is based on pure OpenSSL: whatever Cluster Director needs to pass secretly to a specific Cluster Node, it sends to a channel, where every message is also encrypted on its own. Each Cluster Node is subscribed to a TWO channels:

  1. General public
  2. Private (by key fingerprint)

Cluster Director sends everything plain text to a public channel, but secret communication is running over private channel, encrypted with the public key of the recepient. Complexity of returners, pillars etc is no longer needed in this way.

It didn't took me long to craft a simple architecture to foresee even embedding Lua or Starlight into a state system. In fact, it would be even better than Salt, because one wouldn't have if/else imperative clutter in the declarative state (!). How a state looks like? Currently so (again, it is a hackweek of one solo "cowboy", not even pre-alpha):

id: some-test
description: This is a test state
state:
  gather-machine-summary:
    - shell:
        - get-id: "cat /etc/machine-id"
        - uptime: "uptime"
        - hostname: "hostname"

The shell is a module I just wrote. It takes series of the commands and runs it, returning me something like this:

{
    "get-id": "12e43783e54f25bb3f505cfeeff94045",
    "upteime": "13:54:07 up 18:59,  1 user,  load average: 0,10, 0,18, 0,17",
    "hostname": "rabbit"
}

It also happens that all the above can be ran locally or remotely. Or one-to-many remotely over SSH.

But... Wait a sec. Only $SHELL? Can it do a bit more than this? Wait-wait. So if I can run arbitrary stuff already (which Ansible and Salt are anyway), then what stops me to call pure Ansible binary modules, and just access all that pile of crazy modules they've got already working? Nothing! Just scp them there and stockpile on the client. In fact, just install the entire Ansible and run it there as is. It is as same as Saltsible works, after all.

Basically, I ended up with a message-driven cluster architecture that happens to be compatible with Ansible modules. Kind of. Not yet completely, but to bring that to 100% compatibility is no brainer, just need few more extra days to get done. Which not my goal and priority at the moment anyway.

So and then imagine if having Lua or Starlight (Python dialect) embedded, you could do the above another way:

import:
  - ssh_functions
id: some-test
description: This is a test state
state:
  gather-machine-summary:
    - shell: 
      - {get_id()}
      - {get_uptime()}
      - {get_hostname()}

These would be a functions somewhere in a file ssh_functions.lua. The same functions could do also this:

import:
  - ssh_functions
id: some-test
description: This is a test state
state:
  create-user:
    - shell: 
      - {add_user(uid=get_salt_pillar("uid"))}

Well, you've got the idea. Anyway, I will focus on unchecked boxes above from yesterday and will finish this working at least anything. And thus won't extrapolate this up to eleven. At least not at the moment.

Update (17:40) Nanostates are happily running passed on scenarios on remote machines add-emoji Sort of declarative orchestration. Not yet asynchronous. Few minutes left, maybe I will implement local runner?.. add-emoji add-emoji

Update (somewhere evening) Refactored runners and implemented SSH runner as well as local runner. Nah, but the possibility of running Ansible modules FOR FREE is still bugging me! add-emoji Instead of reimplementing PostgreSQL start/stop and prepare/mount distributed file system in a shell command line, how much time it will take to hook the whole Ansible modular system and use it from the nanostates? It is a Hackweek, after all! add-emoji


Day Four (Thursday, 13 Feb)

What one can do in basically four days, having almost nothing? A lot! add-emoji So far, my leftovers from yesterday:

TODO:

  • [x] Integrate Ansible
    • [x] Runs binary modules
    • [x] Runs Python modules
  • [ ] Write few nanostate scenarios to complete Node staging:
    • [ ] Reset/prepare PostgreSQL database
    • [ ] Prepare/mount distributed File System
  • [ ] Cluster Node staging
    • [ ] Integrate staging part together with the Node Controller
  • [ ] Node Controller (initial)
    • [x] Configuration file
    • [x] PostgreSQL event emitter
    • [x] Events are emitted from an arbitrary SUSE Manager to the Cluster bus via Node Controller (simple-n-stupid PoC ATM)
    • [ ] Commands from Cluster Directory received and mapped to the emitter facility (write one for XML-RPC APIs)

OK, well... Ansible would be certainly a right next step to look at, but ATM I'd rather save time and focus on emitting messages from the PostreSQL database, which is deep inside SUSE Manager's guts. So toss-in few basic shell commands for Node staging and that's it.

Update (12:00) SCNR...

Took this official Ansible module. Then wrote a nanostate snippet:

- ansible.helloworld:
    name: "Cluster"

Result:

    {
      "Module": "ansible.helloworld",
      "Errcode": 0,
      "Errmsg": "",
      "Response": [
        {
          "Host": "localhost",
          "Response": {
            "ansible.helloworld": {
              "Stdout": "",
              "Stderr": "",
              "Errmsg": "",
              "Errcode": 0,
              "Json": {
                "changed": false,
                "failed": false,
                "msg": "Hello, Cluster!"
              }
            }
          }
        }
      ]
    }

Of course, this inherited Ansible's main illness: dont_run_this_twice.yaml. Calling nanostates nanostates is too loud at the moment: they won't check the state, but just fire whatever in them "into the woods". But the goal of the project nor to write another Configuration Management, n̶e̶i̶t̶h̶e̶r̶ ̶s̶c̶a̶l̶e̶-̶o̶u̶t̶ ̶A̶n̶s̶i̶b̶l̶e̶ (oops, that just happened unplanned), neither to fix Ansible imperative behaviour and build around it declarative runners (which is not really a problem, BTW).

Oh well. Fun. Now messaging bus story: Postgres, here I come!

Update (somewhere evening) PostgreSQL happily spitting out every changes to its tables through whatever way in Uyuni Server. The XML-RPC APIs are very slow, on the other hand. I was exploring ways how to implement plugins in Go, so then I don't have to bundle everything into one binary. The gRPC way is the only reliable and nicely decouple-able. The "native" Go plugins are an interesting tech preview, working nice (as long as the same $GOPATH and the same compiler is used) but sadly they seems still quite far away from production status. Plugins supposed to be written by different vendors, which is not the case right now seems to be supported.

I am right now solving problems how the Node Controller will reconcile network transaction across the entire cluster, making 100% sure all-or-none nodes has been updated. As always, there are several ways of doing it, but I have to find out which one suits best.


The Last Day of The Hackweek (Friday, 14 Feb)

At least it isn't Friday 13. Starting from touch main.go, so far what I've got per these days:

Someone Did It

But I chose it and put it together. I chose that, because I can also support it and bugfix it.

  • Running equivalent to etcd, which scales out way better than etcd. Check out TiKV. If you know Rust, you will have lots of fun.
  • Running MySQL compatibility layer on top of it. Performance is about 10x times slower than MySQL's InnoDB, but in this case performance isn't an issue at all. Important that this thing scales out infinitely, just a bit of space hungry. Check out TiDB
  • Running distributed storage and mountable filesystem. If SES guys will one day support "SUSE Manager on Ceph nodes", it will be just fantastic. Until then — other solutions. Check out SeaweedFS and IPFS. The IPFS is running Tumbleweed repo at SUSE.
  • Running message bus that supposed to scale out same as Apache Kafka does. The reason not to use Apache Kafka is very trivial: its infrastructure is much harder to maintain. But this is not a reason and so hard infrastructure maintenance on its own does not rules Kafka out! You want it? No problemo: — just add another adapter to Apache Kafka and replace with currently used NATS. In fact, NATS perfectly co-exists with Kafka in some infrastructures. Check out NATS

I Did It

  • Running Client System Daemon (i.e. "runs on registered client system") which main basic role to ask Cluster what node to use, automatically reconfigure Salt Minion and other configuration and then re-point client system to a new Cluster Node (AKA Uyuni Server), if that is needed. It also recovers client system back to the cluster, once Cluster Node puffed in smokes.
  • "one to many and many to one" API Gateway, which allows spacecmd and similar tools to "just work" across multiple nodes. Granted, it wasn't written during this Hackweek and is 99.999% compatible (I was too lazy to get back and implement overloaded XML-RPC signatures for REST, as well I am returning nil instead of an empty dictionary — probably a bug, but... meh... later). This thing also runs Swagger UI for OpenAPI specs against all XML-RPC API for SUSE Manager. add-emoji
  • Very basic Cluster Director that can manage zones in and add cluster nodes. It as well runs OpenAPI/Swagger UI. Very basic, because it has no features yet. But doesn't mean it doesn't have more-less solid architecture.
  • Library that runs Ansible in Salt fashion (via bi-directional pub/sub, which rules out returners/pillars as unnecessary). I am going to use that internally instead of both Salt and Ansible on their own. Again, it is simpler to call Ansible module reusing existing scaled out infrastructure, rather then run-and-take-care-of yet another components. And I don't have to maintain Ansible: it is perfectly tested anyway.
  • Library that resembles SaltSSH by running Ansible modules (both Python and binary). I am considering I've done it, because I could (doesn't mean I should).
  • Very unfinished Initial Node Controller Daemon, which listens to Uyuni Server events and emits messages to the bus for further operations.

Phew. Not bad as for basically four days, I'd say! All that stuff I wrote in Go. I'd say it does make sense to use that language, if you don't want to write Java or Python.

What are my nearest plans?

  • Finish the "loop" and have all components running, talking to each other, client nodes are transfered seamlessly.
  • Achieve network transaction on updating Cluster Nodes.
  • Write some Ansible modules, likely in plain old C and Rust, add their caching on the client so it will perform well, add seamless module updates. Generic Ansible doesn't hurt, but I don't need it for Cluster needs.
  • Modular/pluggable system in Go, so this whole project can be adaptable to other products, not just SUSE Manager.

Presentation Slides

I've put an outline together all that into my Google Drive. Enjoy.

...and stay tuned add-emoji

Looking for hackers with the skills:

distributedsystems cluster cloud kubernetes golang go rust

This project is part of:

Hack Week 19

Activity

  • almost 5 years ago: keichwa liked this project.
  • almost 5 years ago: ktsamis liked this project.
  • almost 5 years ago: bmaryniuk added keyword "rust" to this project.
  • almost 5 years ago: bmaryniuk added keyword "golang" to this project.
  • almost 5 years ago: bmaryniuk added keyword "go" to this project.
  • almost 5 years ago: pagarcia liked this project.
  • almost 5 years ago: bmaryniuk added keyword "kubernetes" to this project.
  • almost 5 years ago: bmaryniuk added keyword "distributedsystems" to this project.
  • almost 5 years ago: bmaryniuk added keyword "cluster" to this project.
  • almost 5 years ago: bmaryniuk added keyword "cloud" to this project.
  • almost 5 years ago: bmaryniuk started this project.
  • almost 5 years ago: bmaryniuk originated this project.

  • Comments

    • bmaryniuk
      almost 5 years ago by bmaryniuk | Reply

      Day One

      TODO:

      • [x] Running distributed K/V database
        • [x] Verify it is buildable/package-able
      • [x] Running distributed message bus
        • [x] Verify it is buildable/package-able
      • [ ] Manage cluster Zones
      • [ ]

      Summary

      Got running Client Daemon. It can:

      • Talk to Cluster Director (CD) and ask for status
      • Switch/reconfigure Salt Minion according to the CD's directives

    • keichwa
      almost 5 years ago by keichwa | Reply

      Yes, and presentation slides!

    Similar Projects

    Save pytorch models in OCI registries by jguilhermevanz

    Description

    A prerequisite for running applications in a cloud environment is the presence of a container registry. Another common scenario is users performing machine learning workloads in such environments. However, these types of workloads require dedicated infrastructure to run properly. We can leverage these two facts to help users save resources by storing their machine learning models in OCI registries, similar to how we handle some WebAssembly modules. This approach will save users the resources typically required for a machine learning model repository for the applications they need to run.

    Goals

    Allow PyTorch users to save and load machine learning models in OCI registries.

    Resources


    Mortgage Plan Analyzer by RMestre

    https://github.com/rjpmestre/mortgage-plan-analyzer

    Project Description

    Many people face challenges when trying to renegotiate their mortgages with different banks. They receive offers from multiple lenders and struggle to compare them effectively. Each proposal may have slightly different terms and data presentation, making it hard to make informed decisions. Additionally, understanding the impact of various taxes and variables can be complex. The Mortgage Plan Analyzer project aims to address these issues.

    Project Overview:

    The Mortgage Plan Analyzer is a web-based tool built using PHP, Laravel, Livewire, and AdminLTE/bootstrap. It provides a user-friendly platform for individuals to input basic specifications about their mortgage, adjust taxes and variables, and obtain short-term projections for each proposal. Users can also compare multiple mortgage offers side by side, enabling them to make informed decisions about their mortgage renegotiation.

    Why Start This Project:

    I found myself in this position and most tools I found around are either for marketing/selling purposes or not flexible enough. As i was starting getting lost in a jungle of spreadsheets i thought I could just create a tool to help me and others that may be experiencing the same struggles to provide clarity and transparency in the decision-making process.

    Hackweek 25 ideas (to refine still :) )

    • Euribor Trends in Projections
    • - Use historical Euribor data to model optimistic and pessimistic scenarios for variable-rate loans.
    • Use the annual summaries (installments, amortizations, etc) and run some analysis to highlight key differences, like short-term savings vs. long-term costs
    • Financial plan can be hard/boring to follow. Create a simple viewing mode that summarizes monthly values and their annual sums.

    Hackweek 24 update

    • Improved summaries graphs by adding:
    • - Line graph;
    • - Accumulated line graph;
    • - Set the range to short/mid/long term;
    • - Highlight best simulation and value per year;
    • Improve the general behaviour of the forms:
    • - Simulations name setting;
    • - Cloning simulations;
    • - Adjust update timing on input changes;
    • Show/Hide big tables;
    • Support multi languages (added english);
    • Added examples;
    • Adjustments to fonts and sizes;
    • Fixed loading screen;
    • Dependencies adjustments;

    Hackweek 23 initial release

    • Developed a base site that:
    • - Allows adding up to 3 simulations;
    • - Create financial plans;
    • - Simulations comparison graph for the first 4 years;
    • Created Github project @ https://github.com/rjpmestre/mortgage-plan-analyzer ;
    • Launched a demo instance using Oracle Cloud Free Tier currently @ http://138.3.251.182/

    Resources

    • Banco de Portugal: Main simulator all portuguese banks have to follow ( https://clientebancario.bportugal.pt/credito-habitacao )
    • Laravel: A PHP web application framework for building robust and scalable applications. ( https://laravel.com/ )
    • Livewire: A Laravel library for building dynamic interfaces without writing JavaScript. ( https://livewire.laravel.com/ )
    • AdminLTE: A responsive admin dashboard template for creating a visually appealing interface. ( https://adminlte.io/ )


    Install Uyuni on Kubernetes in cloud-native way by cbosdonnat

    Description

    For now installing Uyuni on Kubernetes requires running mgradm on a cluster node... which is not what users would do in the Kubernetes world. The idea is to implement an installation based only on helm charts and probably an operator.

    Goals

    Install Uyuni from Rancher UI.

    Resources


    Learn enough Golang and hack on CoreDNS by jkuzilek

    Description

    I'm implementing a split-horizon DNS for my home Kubernetes cluster to be able to access my internal (and external) services over the local network through public domains. I managed to make a PoC with the k8s_gateway plugin for CoreDNS. However, I soon found out it responds with IPs for all Gateways assigned to HTTPRoutes, publishing public IPs as well as the internal Loadbalancer ones.

    To remedy this issue, a simple filtering mechanism has to be implemented.

    Goals

    • Learn an acceptable amount of Golang
    • Implement GatewayClass (and IngressClass) filtering for k8s_gateway
    • Deploy on homelab cluster
    • Profit?

    Resources

    EDIT: Feature mostly complete. An unfinished PR lies here. Successfully tested working on homelab cluster.


    kubectl clone: Seamlessly Clone Kubernetes Resources Across Multiple Rancher Clusters and Projects by dpunia

    Description

    kubectl clone is a kubectl plugin that empowers users to clone Kubernetes resources across multiple clusters and projects managed by Rancher. It simplifies the process of duplicating resources from one cluster to another or within different namespaces and projects, with optional on-the-fly modifications. This tool enhances multi-cluster resource management, making it invaluable for environments where Rancher orchestrates numerous Kubernetes clusters.

    Goals

    1. Seamless Multi-Cluster Cloning
      • Clone Kubernetes resources across clusters/projects with one command.
      • Simplifies management, reduces operational effort.

    Resources

    1. Rancher & Kubernetes Docs

      • Rancher API, Cluster Management, Kubernetes client libraries.
    2. Development Tools

      • Kubectl plugin docs, Go programming resources.

    Building and Installing the Plugin

    1. Set Environment Variables: Export the Rancher URL and API token:
    • export RANCHER_URL="https://rancher.example.com"
    • export RANCHER_TOKEN="token-xxxxx:xxxxxxxxxxxxxxxxxxxx"
    1. Build the Plugin: Compile the Go program:
    • go build -o kubectl-clone ./pkg/
    1. Install the Plugin: Move the executable to a directory in your PATH:
    • mv kubectl-clone /usr/local/bin/

    Ensure the file is executable:

    • chmod +x /usr/local/bin/kubectl-clone
    1. Verify the Plugin Installation: Test the plugin by running:
    • kubectl clone --help

    You should see the usage information for the kubectl-clone plugin.

    Usage Examples

    1. Clone a Deployment from One Cluster to Another:
    • kubectl clone --source-cluster c-abc123 --type deployment --name nginx-deployment --target-cluster c-def456 --new-name nginx-deployment-clone
    1. Clone a Service into Another Namespace and Modify Labels:


    Harvester Packer Plugin by mrohrich

    Description

    Hashicorp Packer is an automation tool that allows automatic customized VM image builds - assuming the user has a virtualization tool at their disposal. To make use of Harvester as such a virtualization tool a plugin for Packer needs to be written. With this plugin users could make use of their Harvester cluster to build customized VM images, something they likely want to do if they have a Harvester cluster.

    Goals

    Write a Packer plugin bridging the gap between Harvester and Packer. Users should be able to create customized VM images using Packer and Harvester with no need to utilize another virtualization platform.

    Resources

    Hashicorp documentation for building custom plugins for Packer https://developer.hashicorp.com/packer/docs/plugins/creation/custom-builders

    Source repository of the Harvester Packer plugin https://github.com/m-ildefons/harvester-packer-plugin


    Mammuthus - The NFS-Ganesha inside Kubernetes controller by vcheng

    Description

    As the user-space NFS provider, the NFS-Ganesha is wieldy use with serval projects. e.g. Longhorn/Rook. We want to create the Kubernetes Controller to make configuring NFS-Ganesha easy. This controller will let users configure NFS-Ganesha through different backends like VFS/CephFS.

    Goals

    1. Create NFS-Ganesha Package on OBS: nfs-ganesha5, nfs-ganesha6
    2. Create NFS-Ganesha Container Image on OBS: Image
    3. Create a Kubernetes controller for NFS-Ganesha and support the VFS configuration on demand. Mammuthus

    Resources

    NFS-Ganesha


    OpenQA Golang api client by hilchev

    Description

    I would like to make a simple cli tool to communicate with the OpenQA API

    Goals

    • OpenQA has a ton of information that is hard to get via the UI. A tool like this would make my life easier :)
    • Would potentially make it easier in the future to make UI changes without Perl.
    • Improve my Golang skills

    Resources

    • https://go.dev/doc/
    • https://openqa.opensuse.org/api


    kubectl clone: Seamlessly Clone Kubernetes Resources Across Multiple Rancher Clusters and Projects by dpunia

    Description

    kubectl clone is a kubectl plugin that empowers users to clone Kubernetes resources across multiple clusters and projects managed by Rancher. It simplifies the process of duplicating resources from one cluster to another or within different namespaces and projects, with optional on-the-fly modifications. This tool enhances multi-cluster resource management, making it invaluable for environments where Rancher orchestrates numerous Kubernetes clusters.

    Goals

    1. Seamless Multi-Cluster Cloning
      • Clone Kubernetes resources across clusters/projects with one command.
      • Simplifies management, reduces operational effort.

    Resources

    1. Rancher & Kubernetes Docs

      • Rancher API, Cluster Management, Kubernetes client libraries.
    2. Development Tools

      • Kubectl plugin docs, Go programming resources.

    Building and Installing the Plugin

    1. Set Environment Variables: Export the Rancher URL and API token:
    • export RANCHER_URL="https://rancher.example.com"
    • export RANCHER_TOKEN="token-xxxxx:xxxxxxxxxxxxxxxxxxxx"
    1. Build the Plugin: Compile the Go program:
    • go build -o kubectl-clone ./pkg/
    1. Install the Plugin: Move the executable to a directory in your PATH:
    • mv kubectl-clone /usr/local/bin/

    Ensure the file is executable:

    • chmod +x /usr/local/bin/kubectl-clone
    1. Verify the Plugin Installation: Test the plugin by running:
    • kubectl clone --help

    You should see the usage information for the kubectl-clone plugin.

    Usage Examples

    1. Clone a Deployment from One Cluster to Another:
    • kubectl clone --source-cluster c-abc123 --type deployment --name nginx-deployment --target-cluster c-def456 --new-name nginx-deployment-clone
    1. Clone a Service into Another Namespace and Modify Labels:


    Learn enough Golang and hack on CoreDNS by jkuzilek

    Description

    I'm implementing a split-horizon DNS for my home Kubernetes cluster to be able to access my internal (and external) services over the local network through public domains. I managed to make a PoC with the k8s_gateway plugin for CoreDNS. However, I soon found out it responds with IPs for all Gateways assigned to HTTPRoutes, publishing public IPs as well as the internal Loadbalancer ones.

    To remedy this issue, a simple filtering mechanism has to be implemented.

    Goals

    • Learn an acceptable amount of Golang
    • Implement GatewayClass (and IngressClass) filtering for k8s_gateway
    • Deploy on homelab cluster
    • Profit?

    Resources

    EDIT: Feature mostly complete. An unfinished PR lies here. Successfully tested working on homelab cluster.


    suse-rancher-supportconfig by eminguez

    Description

    Update: Live at https://github.com/e-minguez/suse-rancher-supportconfig I finally didn't used golang but used gum instead add-emoji

    SUSE's supportconfig support tool collects data from the SUSE Operating system. Rancher's rancher2_logs_collector.sh support tool does the same for RKE2/K3s.

    Wouldn't be nice to have a way to run both and collect all data for SUSE based RKE2/K3s clusters? Wouldn't be even better with a fancy TUI tool like bubbletea?

    Ideally the output should be an html page where you can see the logs/data directly from the browser.

    Goals

    • Familiarize myself with both supportconfig and rancher2_logs_collector.sh tools
    • Refresh my golang knowledge
    • Have something that works at the end of the hackweek ("works" may vary add-emoji )
    • Be better in naming things

    Resources

    All links provided above as well as huh


    toptop - a top clone written in Go by dshah

    Description

    toptop is a clone of Linux's top CLI tool, but written in Go.

    Goals

    Learn more about Go (mainly bubbletea) and Linux

    Resources

    GitHub


    Cluster API Provider for Harvester by rcase

    Project Description

    The Cluster API "infrastructure provider" for Harvester, also named CAPHV, makes it possible to use Harvester with Cluster API. This enables people and organisations to create Kubernetes clusters running on VMs created by Harvester using a declarative spec.

    The project has been bootstrapped in HackWeek 23, and its code is available here.

    Work done in HackWeek 2023

    • Have a early working version of the provider available on Rancher Sandbox : *DONE *
    • Demonstrated the created cluster can be imported using Rancher Turtles: DONE
    • Stretch goal - demonstrate using the new provider with CAPRKE2: DONE and the templates are available on the repo

    Goals for HackWeek 2024

    • Add support for ClusterClass
    • Add e2e testing
    • Add more Unit Tests
    • Improve Status Conditions to reflect current state of Infrastructure
    • Improve CI (some bugs for release creation)
    • Testing with newer Harvester version (v1.3.X and v1.4.X)
    • Due to the length and complexity of the templates, maybe package some of them as Helm Charts.
    • Other improvement suggestions are welcome!

    DONE in HackWeek 24:

    Thanks to @isim and Dominic Giebert for their contributions!

    Resources

    Looking for help from anyone interested in Cluster API (CAPI) or who wants to learn more about Harvester.

    This will be an infrastructure provider for Cluster API. Some background reading for the CAPI aspect:


    A CLI for Harvester by mohamed.belgaied

    [comment]: # Harvester does not officially come with a CLI tool, the user is supposed to interact with Harvester mostly through the UI [comment]: # Though it is theoretically possible to use kubectl to interact with Harvester, the manipulation of Kubevirt YAML objects is absolutely not user friendly. [comment]: # Inspired by tools like multipass from Canonical to easily and rapidly create one of multiple VMs, I began the development of Harvester CLI. Currently, it works but Harvester CLI needs some love to be up-to-date with Harvester v1.0.2 and needs some bug fixes and improvements as well.

    Project Description

    Harvester CLI is a command line interface tool written in Go, designed to simplify interfacing with a Harvester cluster as a user. It is especially useful for testing purposes as you can easily and rapidly create VMs in Harvester by providing a simple command such as: harvester vm create my-vm --count 5 to create 5 VMs named my-vm-01 to my-vm-05.

    asciicast

    Harvester CLI is functional but needs a number of improvements: up-to-date functionality with Harvester v1.0.2 (some minor issues right now), modifying the default behaviour to create an opensuse VM instead of an ubuntu VM, solve some bugs, etc.

    Github Repo for Harvester CLI: https://github.com/belgaied2/harvester-cli

    Done in previous Hackweeks

    • Create a Github actions pipeline to automatically integrate Harvester CLI to Homebrew repositories: DONE
    • Automatically package Harvester CLI for OpenSUSE / Redhat RPMs or DEBs: DONE

    Goal for this Hackweek

    The goal for this Hackweek is to bring Harvester CLI up-to-speed with latest Harvester versions (v1.3.X and v1.4.X), and improve the code quality as well as implement some simple features and bug fixes.

    Some nice additions might be: * Improve handling of namespaced objects * Add features, such as network management or Load Balancer creation ? * Add more unit tests and, why not, e2e tests * Improve CI * Improve the overall code quality * Test the program and create issues for it

    Issue list is here: https://github.com/belgaied2/harvester-cli/issues

    Resources

    The project is written in Go, and using client-go the Kubernetes Go Client libraries to communicate with the Harvester API (which is Kubernetes in fact). Welcome contributions are:

    • Testing it and creating issues
    • Documentation
    • Go code improvement

    What you might learn

    Harvester CLI might be interesting to you if you want to learn more about:

    • GitHub Actions
    • Harvester as a SUSE Product
    • Go programming language
    • Kubernetes API


    Contribute to terraform-provider-libvirt by pinvernizzi

    Description

    The SUSE Manager (SUMA) teams' main tool for infrastructure automation, Sumaform, largely relies on terraform-provider-libvirt. That provider is also widely used by other teams, both inside and outside SUSE.

    It would be good to help the maintainers of this project and give back to the community around it, after all the amazing work that has been already done.

    If you're interested in any of infrastructure automation, Terraform, virtualization, tooling development, Go (...) it is also a good chance to learn a bit about them all by putting your hands on an interesting, real-use-case and complex project.

    Goals

    • Get more familiar with Terraform provider development and libvirt bindings in Go
    • Solve some issues and/or implement some features
    • Get in touch with the community around the project

    Resources


    ddflare: (Dynamic)DNS management via Cloudflare API in Kubernetes by fgiudici

    Description

    ddflare is a project started a couple of weeks ago to provide DDNS management using v4 Cloudflare APIs: Cloudflare offers management via APIs and access tokens, so it is possible to register a domain and implement a DynDNS client without any other external service but their API.

    Since ddflare allows to set any IP to any domain name, one could manage multiple A and ALIAS domain records. Wouldn't be cool to allow full DNS control from the project and integrate it with your Kubernetes cluster?

    Goals

    Main goals are:

    1. add containerized image for ddflare
    2. extend ddflare to be able to add and remove DNS records (and not just update existing ones)
    3. add documentation, covering also a sample pod deployment for Kubernetes
    4. write a ddflare Kubernetes operator to enable domain management via Kubernetes resources (using kubebuilder)

    Available tasks and improvements tracked on ddflare github.

    Resources

    • https://github.com/fgiudici/ddflare
    • https://developers.cloudflare.com/api/
    • https://book.kubebuilder.io


    Hack on rich terminal user interfaces by amanzini

    Description

    TUIs (Textual User Interface) are a big classic of our daily workflow. Many linux users 'live' in the terminal and modern implementations have a lot to offer : unicode fonts, 24 bit colors etc.

    Goals

    • Explore the current available solution on modern languages and implement a PoC , for example a small maze generator, porting of a classic game or just display the HackWeek cute logo.
    • Practice some Go / Rust coding and programming patterns
    • Fiddle around, hack, learn, have fun
    • keep a development diary, practice on project documentation

    Follow this link for source code repository

    Some ideas for inspiration:

    Related projects:

    Resources


    SMB3 Server written entirely in Rust by dmulder

    Description

    Given the number of bugs frequently discovered in the Samba code caused by memory issues, it makes sense to re-write the smbd service purely in Rust code. Meanwhile, it would be wise to abandon backwards compatibility here with insecure protocol versions, and simply implement the SMB3 spec.

    Goals

    Get a simple server up and running and get it merged into upstream Samba (which now has Rust build support).

    Resources


    Write an url shortener in Rust (And learn in the way) by szarate

    So I have 469.icu :), it's currently doing nothing... (and for sale) but in the meantime, I'd like to write an url shortener from scratch and deploy it on my own server

    https://github.com/foursixnine/url-manager-rs/tree/main


    Kanidm: A safe and modern IDM system by firstyear

    Kanidm is an IDM system written in Rust for modern systems authentication. The github repo has a detailed "getting started" on the readme.

    Kanidm Github

    In addition Kanidm has spawn a number of adjacent projects in the Rust ecosystem such as LDAP, Kerberos, Webauthn, and cryptography libraries.

    In this hack week, we'll be working on Quokca, a certificate authority that supports PKCS11/TPM storage of keys, issuance of PIV certificates, and ACME without the feature gatekeeping implemented by other CA's like smallstep.

    For anyone who wants to participate in Kanidm, we have documentation and developer guides which can help.

    I'm happy to help and share more, so please get in touch!


    Hacking on sched_ext by flonnegren

    Description

    Sched_ext upstream has some interesting issues open for grabs:

    Goals

    Send patches to sched_ext upstream

    Also set up perfetto to trace some of the example schedulers.

    Resources

    https://github.com/sched-ext/scx


    Agama installer on-line demo by lslezak

    Description

    The Agama installer provides a quite complex user interface. We have some screenshots on the web page but as it is basically a web application it would be nice to have some on-line demo where users could click and check it live.

    The problem is that the Agama server directly accesses the hardware (storage probing) and loads installation repositories. We cannot easily mock this in the on-line demo so the easiest way is to have just a read-only demo. You could explore the configuration options but you could not change anything, all changes would be ignored.

    The read-only demo would be a bit limited but I still think it would be useful for potential users get the feeling of the new Agama installer and get familiar with it before using in a real installation.

    As a proof of concept I already created this on-line demo.

    The implementation basically builds Agama in two modes - recording mode where it saves all REST API responses and replay mode where it for the REST API requests returns the previously recorded responses. Recording in the browser is inconvenient and error prone, there should be some scripting instead (see below).

    Goals

    • Create an Agama on-line demo which can be easily tested by users
    • The Agama installer is still in alpha phase and in active development, the online demo needs to be easily rebuilt with the latest Agama version
    • Ideally there should be some automation so the demo page is rebuilt automatically without any developer interactions (once a day or week?)

    TODO

    • Use OpenAPI to get all Agama REST API endpoints, write a script which queries all the endpoints automatically and saves the collected data to a file (see this related PR).
    • Write a script for starting an Agama VM (use libvirt/qemu?), the script should ensure we always use the same virtual HW so if we need to dump the latest REST API state we get the same (or very similar data). This should ensure the demo page does not change much regarding the storage proposal etc...
    • Fix changing the product, currently it gets stuck after clicking the "Select" button.
    • Move the mocking data (the recorded REST API responses) outside the Agama sources, it's too big and will be probably often updated. To avoid messing the history keep it in a separate GitHub repository
    • Allow changing the UI language
    • Display some note (watermark) in the page so it is clear it is a read-only demo (probably with some version or build date to know how old it is)
    • Automation for building new demo page from the latest sources. There should be some check which ensures the recorded data still matches the OpenAPI specification.

    Changing the UI language

    This will be quite tricky because selecting the proper translation file is done on the server side. We would probably need to completely re-implement the logic in the browser side and adapt the server for that.

    Also some REST API responses contain translated texts (storage proposal, pattern names in software). We would need to query the respective endpoints in all supported languages and return the correct response in runtime according to the currently selected language.

    Resources