When we started brainstorming a project for Hack Week, one of the ideas floated was to remake the 1983 film WarGames, and for lack of available space, a local lot with storage units was proposed. Over the course of the following years, while we planned, we realized that this idea would not be the most feasible, but it still felt like we were onto something.
Eventually, we settled on keeping the name and changing the scope.
Instead, we propose to create a virtual environment in which war games will be played, with a focus on storage, and in particular on Ceph and its ecosystem.
What are war games?
Wikipedia has a few entries on the concept of war games, pertaining to military exercises and simulations. We are not going to go into much detail about the concept, for two reasons: 1) getting definitions 100% right is a deep rabbit hole that would consume our whole week, and 2) "war game" was chosen mostly because it's a cool name.
In essence, though, the concept relies on simulating adverse conditions in order to develop, trial and refine possible solutions, without actually exposing the participants to real-life scenarios where failure would (potentially) be catastrophic.
In the context of software-defined storage, and Ceph in particular, we are looking to leverage this concept to help participants develop their ability to recover from cluster failures, as well as to understand how failures are caused and how to prevent them.
General Overview
The ten-thousand-foot view involves two teams: Red and Blue.
The Red team takes the adversarial position: its aim is to cause as much trouble for the Blue team as possible, while observing whatever constraints are established for the duration of the exercise. The Blue team's objective is to keep a healthy, functioning cluster, thwarting Red's attempts at mayhem.
Each exercise will be bound by a set of constraints, defined before the exercise begins and observed by both teams. For instance, if a constraint is "no data shall be deleted", then the Red team shall not delete the data on the disks. Remember, this is meant for people to learn, whether by causing the problems or by fixing them.
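To make the idea of constraints a bit more concrete, here is a minimal sketch of how a constraint such as "no data shall be deleted" could be written down and checked against a command before it is run. Everything here is an assumption made for illustration: the Constraint class, the violations helper and the regular expressions are hypothetical, and the patterns cover only a few well-known data-removing commands, not every way of deleting data in Ceph.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """A rule agreed on by both teams before the exercise starts (hypothetical)."""
    description: str
    forbidden_patterns: tuple  # regular expressions matched against a command line

    def violated_by(self, command: str) -> bool:
        # A command violates the constraint if any forbidden pattern matches it.
        return any(re.search(p, command) for p in self.forbidden_patterns)


# Illustrative patterns only: a handful of commands that remove data.
# This list is an assumption for the sketch and is far from exhaustive.
NO_DATA_DELETED = Constraint(
    description="no data shall be deleted",
    forbidden_patterns=(
        r"\bceph\s+osd\s+pool\s+(delete|rm)\b",
        r"\brados\b.*\brm\b",
        r"\brbd\s+(rm|remove)\b",
    ),
)


def violations(command: str, constraints) -> list:
    """Return the descriptions of all constraints the command would break."""
    return [c.description for c in constraints if c.violated_by(command)]
```

A check like this could only ever be a first line of defence; pattern matching on command lines is easy to sidestep, so the constraints remain primarily an agreement between the teams.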
The exercises will take place on virtual machines: a healthy cluster will be set up, with all the services that are meant to be running for a given exercise, along with a login node that is accessible to both teams. Teams will log into this node and issue their actions against the cluster from it.
Each team will have a predefined, non-overlapping time window to perform their actions on the cluster. Once the window closes, the team will no longer be able to log in, and any open connections will be booted off the node.
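The time-window rule can be sketched in a few lines. The following is only an illustration under assumptions: the Window class, the function names and the example schedule (including its arbitrary date) are hypothetical, and actually enforcing logins or dropping connections on the login node is out of scope here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Window:
    """A team's access window on the login node (hypothetical structure)."""
    team: str
    start: datetime  # inclusive
    end: datetime    # exclusive

    def contains(self, moment: datetime) -> bool:
        return self.start <= moment < self.end


def check_non_overlapping(windows) -> None:
    """Raise ValueError if a window is empty or any two windows overlap."""
    ordered = sorted(windows, key=lambda w: w.start)
    for w in ordered:
        if w.start >= w.end:
            raise ValueError(f"window for {w.team} is empty or reversed")
    for earlier, later in zip(ordered, ordered[1:]):
        if later.start < earlier.end:
            raise ValueError(f"{earlier.team} and {later.team} windows overlap")


def team_allowed(windows, moment: datetime):
    """Return the team whose window is open at `moment`, or None."""
    for w in windows:
        if w.contains(moment):
            return w.team
    return None


# Example schedule with an arbitrary date: Red goes first, Blue follows.
SCHEDULE = [
    Window("red", datetime(2000, 1, 1, 9, 0), datetime(2000, 1, 1, 11, 0)),
    Window("blue", datetime(2000, 1, 1, 11, 0), datetime(2000, 1, 1, 13, 0)),
]
check_non_overlapping(SCHEDULE)
```

Treating the end of a window as exclusive means back-to-back windows never share an instant, so at any moment at most one team is allowed in.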
All actions and commands will be logged to a remote node, alongside cluster health and other relevant information, for further analysis, postmortems, etc.
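One possible shape for such a log is one JSON object per line, pairing each command with the team that issued it and the cluster health at that moment. The sketch below is an assumption, not a settled design: record_action and its field names are hypothetical, the health value is passed in by the caller (for example, the status string reported by the cluster), and the stream stands in for whatever transport eventually ships entries to the remote node.

```python
import io
import json
from datetime import datetime, timezone


def record_action(stream, team, command, health, now=None):
    """Append one JSON line describing an action to a text stream.

    `stream` is any writable text object; in a real setup it would be
    backed by a connection to the remote logging node (an assumption).
    `now` can be supplied for reproducible timestamps; it defaults to
    the current UTC time.
    """
    entry = {
        "time": (now or datetime.now(timezone.utc)).isoformat(),
        "team": team,
        "command": command,
        "health": health,
    }
    stream.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


# Usage: log a Red team action into an in-memory buffer.
buffer = io.StringIO()
record_action(buffer, "red", "ceph osd out 3", "HEALTH_WARN")
```

Line-oriented JSON keeps each entry independent, which makes the log easy to append to during the exercise and easy to filter by team or time during the postmortem.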
Hack Week's Objective
Getting this working. Some of it? All of it? Finding out how much we will be diverging from the initial objective by week's end. :)
This project is part of:
Hack Week 17
Similar Projects
Q2Boot - A handy QEMU VM launcher by amanzini
Description
Q2Boot (Qemu Quick Boot) is a command-line tool that wraps QEMU to provide a streamlined experience for launching virtual machines. It automatically configures common settings like KVM acceleration, virtio drivers, and networking while allowing customization through both configuration files and command-line options.
The project was originally a personal utility written in D and has recently been rewritten in idiomatic Go. It lives at https://github.com/ilmanzo/q2boot
Goals
Improve the project by testing it in different scenarios, addressing issues and proposing new features. It will benefit from some basic integration testing based on small sample disk images.
Resources
Extracting, converting and importing VMs from Nutanix into SUSE Virtualization by emendonca
Description
The idea is to understand how Nutanix AHV stores and runs VMs internally, and how to extract them in an automated way for import into a KVM-compatible hypervisor such as SUSE Virtualization/Harvester. The final product will be not only documentation, but also a working prototype that can be used to automate the process.
Goals
1) Document how to create a simple lab with Nutanix AHV Community Edition
2) Determine the basic elements we need to interact with
3) Determine the best paths to grab the images through, balancing speed and complexity
4) Document possible issues and create a roadmap for tackling them
5) Decide whether to adapt an existing solution or implement a new one
6) Implement the solution!
Resources
- Similar project I created: https://github.com/doccaz/vm-import-ui
- Nutanix AHV forums
- Nutanix technical bulletins
SUSE KVM Best Practices by roseswe
Description
SUSE Best Practices around KVM, especially for SAP workloads. An early Google presentation has already been put together from various customer projects and SUSE sources.
Goals
A complete presentation that we can reuse in SUSE Consulting projects
Resources
KVM (virt-manager) images
SUSE/SAP/KVM Best Practices
- https://documentation.suse.com/en-us/sles/15-SP6/single-html/SLES-virtualization/
- SAP Note 1522993 - "Linux: SAP on SUSE KVM - Kernel-based Virtual Machine"
- SAP Note 2284516 - "SAP HANA virtualized on SUSE Linux Enterprise hypervisors" - https://me.sap.com/notes/2284516
- SUSECon24: [TUTORIAL-1253] Virtualizing SAP workloads with SUSE KVM || https://youtu.be/PTkpRVpX2PM
- SUSE Best Practices for SAP HANA on KVM - https://documentation.suse.com/sbp/sap-15/html/SBP-SLES4SAP-HANAonKVM-SLES15SP4/index.html