Project Description

MONAI Deploy aims to become the de-facto standard for developing packaging, testing, deploying, and running medical AI applications in clinical production. MONAI Deploy creates a set of intermediate steps where researchers and physicians can build confidence in the techniques and approaches used with AI — allowing for an iterative workflow.

Contributors to MONAI include Nvidia, Mayo Clinic, King's College London, NHS, Standford University, ... and many others.

The core piece of MONAI Deploy is the Monai Application Package (MAP), which contains the ML model in a "runnable state". It is implemented as a container, and built with docker-nvidia. This is relevant because you need a GPU to build the MAP. You'll see later why this is relevant. This container can be later run on a k8s cluster.

Before you create a MAP, you need to train the ML models. They are trained with MONAI Core, another framework which is a piece of the whole MONAI "puzzle". Those models can be published in the MONAI Model Zoo. They are published using a very specific format, which is called "a bundle".

In the monai deploy app sdk project in github, you can see several examples on how to package a "model bundle" into a MAP. Plus in the documentation you can find a step by step guide on how to build them, meaning how to create the MAP (the container). Examples, both code and documentation, use the ML models in the MONAI model Zoo.

MONAI model Zoo is free, you can search for models and use them for your research. However, there is not such a repository for MAPs, even the docs and examples show how to build those models into MAPS.

And this is the motivation of the project, to create this "link", and release into a registry, at least one MAP based on a model in MONAI Model Zoo.

Goal for this Hackweek

The specific goal is to implement a Continuous Integration workflow that builds a MAP (Monai Application Package), based on the example in code and documentation. Specifically, it is to implement a github action workflow that releases it into github container registry.

Implementation

The github action workflow will be added to a fork of the monai-deploy-app-sdk project, given we will be using the code that is already in the examples directory. Later a Pull Request can be created to the upstream project.

A limitation of this project is that we need to run the github action in a GPU node. ASFAIK github does not support that, so we need to run this on an external runner. For that I will be using MS Azure cloud to host a vm with GPU. For 3 reasons: first, it should be faster to clone from github from azure; second, I will try to use the free 90 days; third, I want to get familiar with Azure.

Finally, most probably I will use terraform to deploy the node in Azure.

This way, every time we want to release a new model in the MAP format, we will deploy a vm in Azure, do the build with the GPU, release into the github container registry, and remove the vm.

Resources

https://monai.io/ https://monai.io/model-zoo.html https://docs.monai.io/projects/monai-deploy-app-sdk/en/latest/gettingstarted/tutorials/monaibundle_app.html https://github.com/Project-MONAI/monai-deploy-app-sdk/tree/main/examples/apps

This project is part of:

Hack Week 23

Activity

  • about 2 years ago: jordimassaguerpla joined this project.
  • about 2 years ago: vliaskovitis liked this project.
  • about 2 years ago: rtsvetkov started this project.
  • about 2 years ago: jordimassaguerpla added keyword "github_actions" to this project.
  • about 2 years ago: jordimassaguerpla added keyword "github-ci" to this project.
  • about 2 years ago: jordimassaguerpla added keyword "ci" to this project.
  • about 2 years ago: jordimassaguerpla added keyword "gpu" to this project.
  • about 2 years ago: jordimassaguerpla added keyword "azure" to this project.
  • about 2 years ago: jordimassaguerpla added keyword "cloud" to this project.
  • about 2 years ago: jordimassaguerpla added keyword "monai" to this project.
  • about 2 years ago: jordimassaguerpla added keyword "medical" to this project.
  • about 2 years ago: jordimassaguerpla added keyword "containers" to this project.
  • about 2 years ago: jordimassaguerpla added keyword "ml" to this project.
  • about 2 years ago: jordimassaguerpla added keyword "mlops" to this project.
  • about 2 years ago: jordimassaguerpla added keyword "ai" to this project.
  • about 2 years ago: jordimassaguerpla added keyword "artificial-intelligence" to this project.
  • about 2 years ago: jordimassaguerpla originated this project.

  • Comments

    • jordimassaguerpla
      about 2 years ago by jordimassaguerpla | Reply

      I was able to create a terraform file and a workflow file but then I was not able to make the build work.

      Here the terraform file:

      https://github.com/jordimassaguerpla/monai-deploy-app-sdk/blob/main/main.tf

      Here the workflow file:

      https://github.com/jordimassaguerpla/monai-deploy-app-sdk/blob/main/.github/workflows/buildandpush_models.yml

    • jordimassaguerpla
      about 2 years ago by jordimassaguerpla | Reply

      I think the issue is that it tries to load the container, but I had not installed nvidia-docker2, and thus it can't load the container.

    • jordimassaguerpla
      about 2 years ago by jordimassaguerpla | Reply

      Here the fix for libseccomp, so nvidia-container-toolkit can be installed: https://build.opensuse.org/request/show/1128309 Here the fix for nvidia-holoscan: https://github.com/nvidia-holoscan/holoscan-sdk/pull/14

      With these 2 fixes and by increasing the Disc in Azure to 64GB, I was able to build the ML model as a container :)

    • jordimassaguerpla
      about 2 years ago by jordimassaguerpla | Reply

      And voilà, here the MONAI Application Package ready to be used:

      https://github.com/jordimassaguerpla/monai-deploy-app-sdk/pkgs/container/monai-deploy-app-sdk%2Fsimple_app-x64-workstation-dgpu-linux-amd64

    Similar Projects

    Kubernetes-Based ML Lifecycle Automation by lmiranda

    Description

    This project aims to build a complete end-to-end Machine Learning pipeline running entirely on Kubernetes, using Go, and containerized ML components.

    The pipeline will automate the lifecycle of a machine learning model, including:

    • Data ingestion/collection
    • Model training as a Kubernetes Job
    • Model artifact storage in an S3-compatible registry (e.g. Minio)
    • A Go-based deployment controller that automatically deploys new model versions to Kubernetes using Rancher
    • A lightweight inference service that loads and serves the latest model
    • Monitoring of model performance and service health through Prometheus/Grafana

    The outcome is a working prototype of an MLOps workflow that demonstrates how AI workloads can be trained, versioned, deployed, and monitored using the Kubernetes ecosystem.

    Goals

    By the end of Hack Week, the project should:

    1. Produce a fully functional ML pipeline running on Kubernetes with:

      • Data collection job
      • Training job container
      • Storage and versioning of trained models
      • Automated deployment of new model versions
      • Model inference API service
      • Basic monitoring dashboards
    2. Showcase a Go-based deployment automation component, which scans the model registry and automatically generates & applies Kubernetes manifests for new model versions.

    3. Enable continuous improvement by making the system modular and extensible (e.g., additional models, metrics, autoscaling, or drift detection can be added later).

    4. Prepare a short demo explaining the end-to-end process and how new models flow through the system.

    Resources

    Project Repository


    Kubernetes-Based ML Lifecycle Automation by lmiranda

    Description

    This project aims to build a complete end-to-end Machine Learning pipeline running entirely on Kubernetes, using Go, and containerized ML components.

    The pipeline will automate the lifecycle of a machine learning model, including:

    • Data ingestion/collection
    • Model training as a Kubernetes Job
    • Model artifact storage in an S3-compatible registry (e.g. Minio)
    • A Go-based deployment controller that automatically deploys new model versions to Kubernetes using Rancher
    • A lightweight inference service that loads and serves the latest model
    • Monitoring of model performance and service health through Prometheus/Grafana

    The outcome is a working prototype of an MLOps workflow that demonstrates how AI workloads can be trained, versioned, deployed, and monitored using the Kubernetes ecosystem.

    Goals

    By the end of Hack Week, the project should:

    1. Produce a fully functional ML pipeline running on Kubernetes with:

      • Data collection job
      • Training job container
      • Storage and versioning of trained models
      • Automated deployment of new model versions
      • Model inference API service
      • Basic monitoring dashboards
    2. Showcase a Go-based deployment automation component, which scans the model registry and automatically generates & applies Kubernetes manifests for new model versions.

    3. Enable continuous improvement by making the system modular and extensible (e.g., additional models, metrics, autoscaling, or drift detection can be added later).

    4. Prepare a short demo explaining the end-to-end process and how new models flow through the system.

    Resources

    Project Repository


    SUSE Edge Image Builder MCP by eminguez

    Description

    Based on my other hackweek project, SUSE Edge Image Builder's Json Schema I would like to build also a MCP to be able to generate EIB config files the AI way.

    Realistically I don't think I'll be able to have something consumable at the end of this hackweek but at least I would like to start exploring MCPs, the difference between an API and MCP, etc.

    Goals

    • Familiarize myself with MCPs
    • Unrealistic: Have an MCP that can generate an EIB config file

    Resources


    Gemini-Powered Systemic Bug Evaluation and Management Assistant by rtsvetkov

    Description

    To build a tool or system that takes a raw bug report (including error messages and context) and uses a large language model (LLM) to generate a series of structured, Socratic-style or Systemic questions designed to guide a the integration and development toward the root cause, rather than just providing a direct, potentially incorrect fix.

    Goals

    Set up a Python environment

    Set the environment and get a Gemini API key. 2. Collect 5-10 realistic bug reports (from open-source projects, personal projects, or public forums like Stack Overflow—include the error message and the initial context).

    Build the Dialogue Loop

    1. Write a basic Python script using the Gemini API.
    2. Implement a simple conversational loop: User Input (Bug) -> AI Output (Question) -> User Input (Answer to AI's question) -> AI Output (Next Question). Code Implementation

    Socratic Strategy Implementation

    1. Refine the logic to ensure the questions follow a Socratic path (e.g., from symptom-> context -> assumptions -> root cause).
    2. Implement Function Calling (an advanced feature of the Gemini API) to suggest specific actions to the user, like "Run a ping test" or "Check the database logs."

    Resources

    What are Systemic Questions?

    Systemic questions explore the relationships, patterns, and interactions within a system rather than focusing on isolated elements.
    In IT, they help uncover hidden dependencies, feedback loops, assumptions, and side-effects during debugging or architecture analysis.


    Background Coding Agent by mmanno

    Description

    I had only bad experiences with AI one-shots. However, monitoring agent work closely and interfering often did result in productivity gains.

    Now, other companies are using agents in pipelines. That makes sense to me, just like CI, we want to offload work to pipelines: Our engineering teams are consistently slowed down by "toil": low-impact, repetitive maintenance tasks. A simple linter rule change, a dependency bump, rebasing patch-sets on top of newer releases or API deprecation requires dozens of manual PRs, draining time from feature development.

    So far we have been writing deterministic, script-based automation for these tasks. And it turns out to be a common trap. These scripts are brittle, complex, and become a massive maintenance burden themselves.

    Can we make prompts and workflows smart enough to succeed at background coding?

    Goals

    We will build a platform that allows engineers to execute complex code transformations using prompts.

    By automating this toil, we accelerate large-scale migrations and allow teams to focus on high-value work.

    Our platform will consist of three main components:

    • "Change" Definition: Engineers will define a transformation as a simple, declarative manifest:
      • The target repositories.
      • A wrapper to run a "coding agent", e.g., "gemini-cli".
      • The task as a natural language prompt.
    • "Change" Management Service: A central service that orchestrates the jobs. It will receive Change definitions and be responsible for the job lifecycle.
    • Execution Runners: We could use existing sandboxed CI runners (like GitHub/GitLab runners) to execute each job or spawn a container.

    MVP

    • Define the Change manifest format.
    • Build the core Management Service that can accept and queue a Change.
    • Connect management service and runners, dynamically dispatch jobs to runners.
    • Create a basic runner script that can run a hard-coded prompt against a test repo and open a PR.

    Stretch Goals:

    • Multi-layered approach, Workflow Agents trigger Coding Agents:
      1. Workflow Agent: Gather information about the task interactively from the user.
      2. Coding Agent: Once the interactive agent has refined the task into a clear prompt, it hands this prompt off to the "coding agent." This background agent is responsible for executing the task and producing the actual pull request.
    • Use MCP:
      1. Workflow Agent gathers context information from Slack, Github, etc.
      2. Workflow Agent triggers a Coding Agent.
    • Create a "Standard Task" library with reliable prompts.
      1. Rebasing rancher-monitoring to a new version of kube-prom-stack
      2. Update charts to use new images
      3. Apply changes to comply with a new linter
      4. Bump complex Go dependencies, like k8s modules
      5. Backport pull requests to other branches
    • Add “review agents” that review the generated PR.

    See also


    MCP Server for SCC by digitaltomm

    Description

    Provide an MCP Server implementation for customers to access data on scc.suse.com via MCP protocol. Similar to the organization APIs, this can expose to customers data about their subscriptions, orders, systems and products. Authentication should be done by organization credentials, similar to what needs to be provided to RMT/MLM. Customers can connect to the SCC MCP server from their own MCP-compatible client and Large Language Model (LLM), so no third party is involved.

    Schema

    Goals

    We want to demonstrate a proof of concept to connect to the SCC MCP server with any AI agent, for example gemini-cli or codex. Enabling the user to ask questions regarding their SCC inventory.

    For this Hackweek, we target that users get proper responses to these example questions:

    • Which of my currently active systems are running products that are out of support?
    • Do I have unused registration codes for SLES?
    • What are the latest released patches for SLES 15SP7?
    • Which versions of kernel-default are available on SLES 16.0?

    Milestones

    [x] Basic MCP API setup
      MCP endpoints
      [ ] Products / Repositories
      [x] Subscriptions / Orders 
      [x] Systems
      [x] Packages
    [ ] Document usage with Gemini CLI, Codex
    

    Resources

    Gemini CLI setup:

    ~/.gemini/settings.json:

    {  
     "mcpServers": {  
       "SCC local http": {  
         "httpUrl": "http://localhost:3000/mcp",  
         "note": "MCP server for SUSE Customer Center"  
       }  
    


    Multi-agent AI assistant for Linux troubleshooting by doreilly

    Description

    Explore multi-agent architecture as a way to avoid MCP context rot.

    Having one agent with many tools bloats the context with low-level details about tool descriptions, parameter schemas etc which hurts LLM performance. Instead have many specialised agents, each with just the tools it needs for its role. A top level supervisor agent takes the user prompt and delegates to appropriate sub-agents.

    Goals

    Create an AI assistant with some sub-agents that are specialists at troubleshooting Linux subsystems, e.g. systemd, selinux, firewalld etc. The agents can get information from the system by implementing their own tools with simple function calls, or use tools from MCP servers, e.g. a systemd-agent can use tools from systemd-mcp.

    Example prompts/responses:

    user$ the system seems slow
    assistant$ process foo with pid 12345 is using 1000% cpu ...
    
    user$ I can't connect to the apache webserver
    assistant$ the firewall is blocking http ... you can open the port with firewall-cmd --add-port ...
    

    Resources

    Language Python. The Python ADK is more mature than Golang.

    https://google.github.io/adk-docs/

    https://github.com/djoreilly/linux-helper


    SUSE Observability MCP server by drutigliano

    Description

    The idea is to implement the SUSE Observability Model Context Protocol (MCP) Server as a specialized, middle-tier API designed to translate the complex, high-cardinality observability data from StackState (topology, metrics, and events) into highly structured, contextually rich, and LLM-ready snippets.

    This MCP Server abstract the StackState APIs. Its primary function is to serve as a Tool/Function Calling target for AI agents. When an AI receives an alert or a user query (e.g., "What caused the outage?"), the AI calls an MCP Server endpoint. The server then fetches the relevant operational facts, summarizes them, normalizes technical identifiers (like URNs and raw metric names) into natural language concepts, and returns a concise JSON or YAML payload. This payload is then injected directly into the LLM's prompt, ensuring the final diagnosis or action is grounded in real-time, accurate SUSE Observability data, effectively minimizing hallucinations.

    Goals

    • Grounding AI Responses: Ensure that all AI diagnoses, root cause analyses, and action recommendations are strictly based on verifiable, real-time data retrieved from the SUSE Observability StackState platform.
    • Simplifying Data Access: Abstract the complexity of StackState's native APIs (e.g., Time Travel, 4T Data Model) into simple, semantic functions that can be easily invoked by LLM tool-calling mechanisms.
    • Data Normalization: Convert complex, technical identifiers (like component URNs, raw metric names, and proprietary health states) into standardized, natural language terms that an LLM can easily reason over.
    • Enabling Automated Remediation: Define clear, action-oriented MCP endpoints (e.g., execute_runbook) that allow the AI agent to initiate automated operational workflows (e.g., restarts, scaling) after a diagnosis, closing the loop on observability.

     Hackweek STEP

    • Create a functional MCP endpoint exposing one (or more) tool(s) to answer queries like "What is the health of service X?") by fetching, normalizing, and returning live StackState data in an LLM-ready format.

     Scope

    • Implement read-only MCP server that can:
      • Connect to a live SUSE Observability instance and authenticate (with API token)
      • Use tools to fetch data for a specific component URN (e.g., current health state, metrics, possibly topology neighbors, ...).
      • Normalize response fields (e.g., URN to "Service Name," health state DEVIATING to "Unhealthy", raw metrics).
      • Return the data as a structured JSON payload compliant with the MCP specification.

    Deliverables

    • MCP Server v0.1 A running Python web server (e.g., using FastAPI) with at least one tool.
    • A README.md and a test script (e.g., curl commands or a simple notebook) showing how an AI agent would call the endpoint and the resulting JSON payload.

    Outcome A functional and testable API endpoint that proves the core concept: translating complex StackState data into a simple, LLM-ready format. This provides the foundation for developing AI-driven diagnostics and automated remediation.

    Resources

    • https://www.honeycomb.io/blog/its-the-end-of-observability-as-we-know-it-and-i-feel-fine
    • https://www.datadoghq.com/blog/datadog-remote-mcp-server
    • https://modelcontextprotocol.io/specification/2025-06-18/index
    • https://modelcontextprotocol.io/docs/develop/build-server

     Basic implementation

    • https://github.com/drutigliano19/suse-observability-mcp-server


    Provide personal account sign-in for Himmelblau (Azure Entra Id) by dmulder

    Description

    Himmelblau currently does not support personal account sign-in, but only sign-in for business/school accounts. Adding personal account sign-in would broaden the userbase, and potentially attract more users to Linux from Windows (since they could easily migrate their existing Windows account, etc).

    Goals

    Implement personal account sign-in for libhimmelblau.

    Resources

    gitlab.com/samba-team/libhimmelblau


    Create a Cloud-Native policy engine with notifying capabilities to optimize resource usage by gbazzotti

    Description

    The goal of this project is to begin the initial phase of development of an all-in-one Cloud-Native Policy Engine that notifies resource owners when their resources infringe predetermined policies. This was inspired by a current issue in the CES-SRE Team where other solutions seemed to not exactly correspond to the needs of the specific workloads running on the Public Cloud Team space.

    The initial architecture can be checked out on the Repository listed under Resources.

    Among the features that will differ this project from other monitoring/notification systems:

    • Pre-defined sensible policies written at the software-level, avoiding a learning curve by requiring users to write their own policies
    • All-in-one functionality: logging, mailing and all other actions are not required to install any additional plugins/packages
    • Easy account management, being able to parse all required configuration by a single JSON file
    • Eliminate integrations by not requiring metrics to go through a data-agreggator

    Goals

    • Create a minimal working prototype following the workflow specified on the documentation
    • Provide instructions on installation/usage
    • Work on email notifying capabilities

    Resources


    Technical talks at universities by agamez

    Description

    This project aims to empower the next generation of tech professionals by offering hands-on workshops on containerization and Kubernetes, with a strong focus on open-source technologies. By providing practical experience with these cutting-edge tools and fostering a deep understanding of open-source principles, we aim to bridge the gap between academia and industry.

    For now, the scope is limited to Spanish universities, since we already have the contacts and have started some conversations.

    Goals

    • Technical Skill Development: equip students with the fundamental knowledge and skills to build, deploy, and manage containerized applications using open-source tools like Kubernetes.
    • Open-Source Mindset: foster a passion for open-source software, encouraging students to contribute to open-source projects and collaborate with the global developer community.
    • Career Readiness: prepare students for industry-relevant roles by exposing them to real-world use cases, best practices, and open-source in companies.

    Resources

    • Instructors: experienced open-source professionals with deep knowledge of containerization and Kubernetes.
    • SUSE Expertise: leverage SUSE's expertise in open-source technologies to provide insights into industry trends and best practices.


    Rewrite Distrobox in go (POC) by fabriziosestito

    Description

    Rewriting Distrobox in Go.

    Main benefits:

    • Easier to maintain and to test
    • Adapter pattern for different container backends (LXC, systemd-nspawn, etc.)

    Goals

    • Build a minimal starting point with core commands
    • Keep the CLI interface compatible: existing users shouldn't notice any difference
    • Use a clean Go architecture with adapters for different container backends
    • Keep dependencies minimal and binary size small
    • Benchmark against the original shell script

    Resources

    • Upstream project: https://github.com/89luca89/distrobox/
    • Distrobox site: https://distrobox.it/
    • ArchWiki: https://wiki.archlinux.org/title/Distrobox


    Help Create A Chat Control Resistant Turnkey Chatmail/Deltachat Relay Stack - Rootless Podman Compose, OpenSUSE BCI, Hardened, & SELinux by 3nd5h1771fy

    Description

    The Mission: Decentralized & Sovereign Messaging

    FYI: If you have never heard of "Chatmail", you can visit their site here, but simply put it can be thought of as the underlying protocol/platform decentralized messengers like DeltaChat use for their communications. Do not confuse it with the honeypot looking non-opensource paid for prodect with better seo that directs you to chatmailsecure(dot)com

    In an era of increasing centralized surveillance by unaccountable bad actors (aka BigTech), "Chat Control," and the erosion of digital privacy, the need for sovereign communication infrastructure is critical. Chatmail is a pioneering initiative that bridges the gap between classic email and modern instant messaging, offering metadata-minimized, end-to-end encrypted (E2EE) communication that is interoperable and open.

    However, unless you are a seasoned sysadmin, the current recommended deployment method of a Chatmail relay is rigid, fragile, difficult to properly secure, and effectively takes over the entire host the "relay" is deployed on.

    Why This Matters

    A simple, host agnostic, reproducible deployment lowers the entry cost for anyone wanting to run a privacy‑preserving, decentralized messaging relay. In an era of perpetually resurrected chat‑control legislation threats, EU digital‑sovereignty drives, and many dangers of using big‑tech messaging platforms (Apple iMessage, WhatsApp, FB Messenger, Instagram, SMS, Google Messages, etc...) for any type of communication, providing an easy‑to‑use alternative empowers:

    • Censorship resistance - No single entity controls the relay; operators can spin up new nodes quickly.
    • Surveillance mitigation - End‑to‑end OpenPGP encryption ensures relay operators never see plaintext.
    • Digital sovereignty - Communities can host their own infrastructure under local jurisdiction, aligning with national data‑policy goals.

    By turning the Chatmail relay into a plug‑and‑play container stack, we enable broader adoption, foster a resilient messaging fabric, and give developers, activists, and hobbyists a concrete tool to defend privacy online.

    Goals

    As I indicated earlier, this project aims to drastically simplify the deployment of Chatmail relay. By converting this architecture into a portable, containerized stack using Podman and OpenSUSE base container images, we can allow anyone to deploy their own censorship-resistant, privacy-preserving communications node in minutes.

    Our goal for Hack Week: package every component into containers built on openSUSE/MicroOS base images, initially orchestrated with a single container-compose.yml (podman-compose compatible). The stack will:

    • Run on any host that supports Podman (including optimizations and enhancements for SELinux‑enabled systems).
    • Allow network decoupling by refactoring configurations to move from file-system constrained Unix sockets to internal TCP networking, allowing containers achieve stricter isolation.
    • Utilize Enhanced Security with SELinux by using purpose built utilities such as udica we can quickly generate custom SELinux policies for the container stack, ensuring strict confinement superior to standard/typical Docker deployments.
    • Allow the use of bind or remote mounted volumes for shared data (/var/vmail, DKIM keys, TLS certs, etc.).
    • Replace the local DNS server requirement with a remote DNS‑provider API for DKIM/TXT record publishing.

    By delivering a turnkey, host agnostic, reproducible deployment, we lower the barrier for individuals and small communities to launch their own chatmail relays, fostering a decentralized, censorship‑resistant messaging ecosystem that can serve DeltaChat users and/or future services adopting this protocol

    Resources


    DNS management with DNSControl by itorres

    Description

    We use several systems to manage DNS at SUSE and openSUSE: BIND, external providers, PowerDNS... each of them is managed in a different way either with raw zones (BIND) or Terraform (external providers).

    DNSControl is an opinionated tool to manage DNS as code while being provider agnostic. It's developed and used by StackExchange, was spearheaded by Tom Limoncelly and is already being used to manage DNS for openSUSE.

    Implementing DNSControl should allow us to have a single DNS operations interface that end users can leverage.

    This would reduce complexity for end users as they can use a single simplified ECMAScript based DSL instead of BIND zones for internal and HCL config for external.

    Operations for our IT organization would be greatly reduced. DNSControl itself has several internal checks that reduce our need to do linting and we can concentrate on implementing logical checks based on ownership.

    This simplifies reviews a lot and the integration with BIND and providers allows our IT organization to implement an apply on merge.

    At an organizational level it will separate our DNS tasks from other IT operations, speeding up DNS changes and allowing us to delegate DNS reviews to service desk or even customer teams through CODEOWNERS.

    Goals

    • Create a test subdomain in one of our internal BIND servers to be managed with DNSControl.
    • Create an internal DNSControl repository to implement gitops for DNS.
    • Deploy DNS changes strictly through gitops.

    Extended goals

    • Implement CODEOWNERS.
    • Replicate main goals for external DNS.

    Resources