Project Description

So you have an idea for a machine learning project for HackWeek. Have you thought about what tools you'll be using? Choosing the right set of machine learning tools and making them work together can be time consuming, not to mention the unavoidable learning curve. Perhaps you could use some help with that.

The SUSE AI/ML team has the answer: FuseML - an open source machine learning DevOps orchestrator that can get your machine learning projects up and running as easy as lighting a fuse.

FuseML started as a spin off project Carrier. Think "Carrier for Machine Learning": you write your ML application using one of the popular machine learning libraries (e.g. scikit-learn, TensorFlow, PyTorch, XGBoost) and FuseML takes care of all operations necessary to get your machine learning models in action, so you can concentrate on your code.

FuseML workflow

The catch: FuseML is still in a pre-alpha state, although it can already be used to showcase basic features. While using it, you may run into some corner cases we haven't covered yet, but you'll not be alone: we're here to help.

The rewards: access to expert knowledge in AI/ML and a chance to have your ML project published into the FuseML gallery of sample applications.

What you'll need: to install and use FuseML, you'll need a kubernetes cluster. If you don't already have one handy, or if you're low on hardware resources, you can install minikube, kind or k3s on your machine.

Goal for this Hackweek

  • discover new use cases and AI/ML tools to be enabled for FuseML
  • offer assistance and guidelines on AI/ML best practices and tools in the context of FuseML
  • pimp up FuseML's gallery of sample applications

Resources

This project is part of:

Hack Week 20

Activity

  • over 3 years ago: acho liked this project.
  • over 3 years ago: ories liked this project.
  • over 3 years ago: afesta liked this project.
  • over 3 years ago: jsuchome joined this project.
  • over 3 years ago: flaviosr liked this project.
  • over 3 years ago: flaviosr joined this project.
  • over 3 years ago: stefannica started this project.
  • over 3 years ago: stefannica added keyword "#fuseml" to this project.
  • over 3 years ago: stefannica added keyword "#ai" to this project.
  • over 3 years ago: stefannica added keyword "#machinelearning" to this project.
  • over 3 years ago: stefannica added keyword "#kubernetes" to this project.
  • over 3 years ago: stefannica added keyword "#artificial-intelligence" to this project.
  • over 3 years ago: stefannica added keyword "#mlops" to this project.
  • over 3 years ago: stefannica added keyword "#mlflow" to this project.
  • over 3 years ago: stefannica added keyword "#sklearn" to this project.
  • over 3 years ago: stefannica added keyword "#pytorch" to this project.
  • over 3 years ago: stefannica added keyword "#ternsorflow" to this project.
  • over 3 years ago: stefannica originated this project.

  • Comments

    Be the first to comment!

    Similar Projects

    Research how LLMs could help to Linux developers and/or users by anicka

    Description

    Large language models like ChatGPT have demonstrated remarkable capabilities across a variety of applications. However, their potential for enhancing the Linux development and user ecosystem remains largely unexplored. This project seeks to bridge that gap by researching practical applications of LLMs to improve workflows in areas such as backporting, packaging, log analysis, system migration, and more. By identifying patterns that LLMs can leverage, we aim to uncover new efficiencies and automation strategies that can benefit developers, maintainers, and end users alike.

    Goals

    • Evaluate Existing LLM Capabilities: Research and document the current state of LLM usage in open-source and Linux development projects, noting successes and limitations.
    • Prototype Tools and Scripts: Develop proof-of-concept scripts or tools that leverage LLMs to perform specific tasks like automated log analysis, assisting with backporting patches, or generating packaging metadata.
    • Assess Performance and Reliability: Test the tools' effectiveness on real-world Linux data and analyze their accuracy, speed, and reliability.
    • Identify Best Use Cases: Pinpoint which tasks are most suitable for LLM support, distinguishing between high-impact and impractical applications.
    • Document Findings and Recommendations: Summarize results with clear documentation and suggest next steps for potential integration or further development.

    Resources

    • Local LLM Implementations: Access to locally hosted LLMs such as LLaMA, GPT-J, or similar open-source models that can be run and fine-tuned on local hardware.
    • Computing Resources: Workstations or servers capable of running LLMs locally, equipped with sufficient GPU power for training and inference.
    • Sample Data: Logs, source code, patches, and packaging data from openSUSE or SUSE repositories for model training and testing.
    • Public LLMs for Benchmarking: Access to APIs from platforms like OpenAI or Hugging Face for comparative testing and performance assessment.
    • Existing NLP Tools: Libraries such as spaCy, Hugging Face Transformers, and PyTorch for building and interacting with local LLMs.
    • Technical Documentation: Tutorials and resources focused on setting up and optimizing local LLMs for tasks relevant to Linux development.
    • Collaboration: Engagement with community experts and teams experienced in AI and Linux for feedback and joint exploration.


    Run local LLMs with Ollama and explore possible integrations with Uyuni by PSuarezHernandez

    Description

    Using Ollama you can easily run different LLM models in your local computer. This project is about exploring Ollama, testing different LLMs and try to fine tune them. Also, explore potential ways of integration with Uyuni.

    Goals

    • Explore Ollama
    • Test different models
    • Fine tuning
    • Explore possible integration in Uyuni

    Resources

    • https://ollama.com/
    • https://huggingface.co/
    • https://apeatling.com/articles/part-2-building-your-training-data-for-fine-tuning/


    Use AI tools to convert legacy perl scripts to bash by nadvornik

    Description

    Use AI tools to convert legacy perl scripts to bash

    Goals

    Uyuni project contains legacy perl scripts used for setup. The perl dependency could be removed, to reduce the container size. The goal of this project is to research use of AI tools for this task.

    Resources

    Aider

    Results:

    Aider is not the right tool for this. It works ok for small changes, but not for complete rewrite from one language to another.

    I got better results with direct API use from script.


    Gen-AI chatbots and test-automation of generated responses by mdati

    Description

    Start experimenting the generative SUSE-AI chat bot, asking questions on different areas of knowledge or science and possibly analyze the quality of the LLM model response, specific and comparative, checking the answers provided by different LLM models to a same query, using proper quality metrics or tools or methodologies.

    Try to define basic guidelines and requirements for quality test automation of AI-generated responses.

    First approach of investigation can be based on manual testing: methodologies, findings and data can be useful then to organize valid automated testing.

    Goals

    • Identify criteria and measuring scales for assessment of a text content.
    • Define quality of an answer/text based on defined criteria .
    • Identify some knowledge sectors and a proper list of problems/questions per sector.
    • Manually run query session and apply evaluation criteria to answers.
    • Draft requirements for test automation of AI answers.

    Resources

    • Announcement of SUSE-AI for Hack Week in Slack
    • Openplatform and related 3 LLM models gemma:2b, llama3.1:8b, qwen2.5-coder:3b.

    Notes

    • Foundation models (FMs):
      are large deep learning neural networks, trained on massive datasets, that have changed the way data scientists approach machine learning (ML). Rather than develop artificial intelligence (AI) from scratch, data scientists use a foundation model as a starting point to develop ML models that power new applications more quickly and cost-effectively.

    • Large language models (LLMs):
      are a category of foundation models pre-trained on immense amounts of data acquiring abilities by learning statistical relationships from vast amounts of text during a self- and semi-supervised training process, making them capable of understanding and generating natural language and other types of content , to perform a wide range of tasks.
      LLMs can be used for generative AI (artificial intelligence) to produce content based on input prompts in human language.

    Validation of a AI-generated answer is not an easy task to perform, as manually as automated.
    An LLM answer text shall contain a given level of informations: correcness, completeness, reasoning description etc.
    We shall rely in properly applicable and measurable criteria of validation to get an assessment in a limited amount of time and resources.


    Automated Test Report reviewer by oscar-barrios

    Description

    In SUMA/Uyuni team we spend a lot of time reviewing test reports, analyzing each of the test cases failing, checking if the test is a flaky test, checking logs, etc.

    Goals

    Speed up the review by automating some parts through AI, in a way that we can consume some summary of that report that could be meaningful for the reviewer.

    Resources

    No idea about the resources yet, but we will make use of:

    • HTML/JSON Report (text + screenshots)
    • The Test Suite Status GithHub board (via API)
    • The environment tested (via SSH)
    • The test framework code (via files)


    FamilyTrip Planner: A Personalized Travel Planning Platform for Families by pherranz

    Description

    FamilyTrip Planner is an innovative travel planning application designed to optimize travel experiences for families with children. By integrating APIs for flights, accommodations, and local activities, the app generates complete itineraries tailored to each family’s unique interests and needs. Recommendations are based on customizable parameters such as destination, trip duration, children’s ages, and personal preferences. FamilyTrip Planner not only simplifies the travel planning process but also offers a comprehensive, personalized experience for families.

    Goals

    This project aims to: - Create a user-friendly platform that assists families in planning complete trips, from flight and accommodation options to recommended family-friendly activities. - Provide intelligent, personalized travel itineraries using artificial intelligence to enhance travel enjoyment and minimize time and cost. - Serve as an educational project for exploring Go programming and artificial intelligence, with the goal of building proficiency in both.

    Resources

    To develop FamilyTrip Planner, the project will leverage: - APIs such as Skyscanner, Google Places, and TripAdvisor to source real-time information on flights, accommodations, and activities. - Go programming language to manage data integration, API connections, and backend development. - Basic machine learning libraries to implement AI-driven itinerary suggestions tailored to family needs and preferences.


    Learn enough Golang and hack on CoreDNS by jkuzilek

    Description

    I'm implementing a split-horizon DNS for my home Kubernetes cluster to be able to access my internal (and external) services over the local network through public domains. I managed to make a PoC with the k8s_gateway plugin for CoreDNS. However, I soon found out it responds with IPs for all Gateways assigned to HTTPRoutes, publishing public IPs as well as the internal Loadbalancer ones.

    To remedy this issue, a simple filtering mechanism has to be implemented.

    Goals

    • Learn an acceptable amount of Golang
    • Implement GatewayClass (and IngressClass) filtering for k8s_gateway
    • Deploy on homelab cluster
    • Profit?

    Resources

    EDIT: Feature mostly complete. An unfinished PR lies here. Successfully tested working on homelab cluster.


    Small healthcheck tool for Longhorn by mbrookhuis

    Project Description

    We have often problems (e.g. pods not starting) that are related to PVCs not running, cluster (nodes) not all up or deployments not running or completely running. This all prevents administration activities. Having something that can regular be run to validate the status of the cluster would be helpful, and not as of today do a lot of manual tasks.

    As addition (read enough time), we could add changing reservation, adding new disks, etc. --> This didn't made it. But the scripts can easily be adopted.

    This tool would decrease troubleshooting time, giving admins rights to the rancher GUI and could be used in automation.

    Goal for this Hackweek

    At the end we should have a small python tool that is doing a (very) basic health check on nodes, deployments and PVCs. First attempt was to make it in golang, but that was taking to much time.

    Overview

    This tool will run a simple healthcheck on a kubernetes cluster. It will perform the following actions:

    • node check: This will check all nodes, and display the status and the k3s version. If the status of the nodes is not "Ready" (this should be only reported), the cluster will be reported as having problems

    • deployment check: This check will list all deployments, and display the number of expected replicas and the used replica. If there are unused replicas this will be displayed. The cluster will be reported as having problems.

    • pvc check: This check will list of all pvc's, and display the status and the robustness. If the robustness is not "Healthy", the cluster will be reported as having problems.

    If there is a problem registered in the checks, there will be a warning that the cluster is not healthy and the program will exit with 1.

    The script has 1 mandatory parameter and that is the kubeconf of the cluster or of a node off the cluster.

    The code is writen for Python 3.11, but will also work on 3.6 (the default with SLES15.x). There is a venv present that will contain all needed packages. Also, the script can be run on the cluster itself or any other linux server.

    Installation

    To install this project, perform the following steps:

    • Create the directory /opt/k8s-check

    mkdir /opt/k8s-check

    • Copy all the file to this directory and make the following changes:

    chmod +x k8s-check.py


    Setup Kanidm as OIDC provider on Kubernetes by jkuzilek

    Description

    I am planning to upgrade my homelab Kubernetes cluster to the next level and need an OIDC provider for my services, including K8s itself.

    Goals

    • Successfully configure and deploy Kanidm on homelab cluster
    • Integrate with K8s auth
    • Integrate with other services (Envoy Gateway, Container Registry, future deployment of Forgejo?)

    Resources


    Install Uyuni on Kubernetes in cloud-native way by cbosdonnat

    Description

    For now installing Uyuni on Kubernetes requires running mgradm on a cluster node... which is not what users would do in the Kubernetes world. The idea is to implement an installation based only on helm charts and probably an operator.

    Goals

    Install Uyuni from Rancher UI.

    Resources


    ddflare: (Dynamic)DNS management via Cloudflare API in Kubernetes by fgiudici

    Description

    ddflare is a project started a couple of weeks ago to provide DDNS management using v4 Cloudflare APIs: Cloudflare offers management via APIs and access tokens, so it is possible to register a domain and implement a DynDNS client without any other external service but their API.

    Since ddflare allows to set any IP to any domain name, one could manage multiple A and ALIAS domain records. Wouldn't be cool to allow full DNS control from the project and integrate it with your Kubernetes cluster?

    Goals

    Main goals are:

    1. add containerized image for ddflare
    2. extend ddflare to be able to add and remove DNS records (and not just update existing ones)
    3. add documentation, covering also a sample pod deployment for Kubernetes
    4. write a ddflare Kubernetes operator to enable domain management via Kubernetes resources (using kubebuilder)

    Available tasks and improvements tracked on ddflare github.

    Resources

    • https://github.com/fgiudici/ddflare
    • https://developers.cloudflare.com/api/
    • https://book.kubebuilder.io


    Save pytorch models in OCI registries by jguilhermevanz

    Description

    A prerequisite for running applications in a cloud environment is the presence of a container registry. Another common scenario is users performing machine learning workloads in such environments. However, these types of workloads require dedicated infrastructure to run properly. We can leverage these two facts to help users save resources by storing their machine learning models in OCI registries, similar to how we handle some WebAssembly modules. This approach will save users the resources typically required for a machine learning model repository for the applications they need to run.

    Goals

    Allow PyTorch users to save and load machine learning models in OCI registries.

    Resources


    Save pytorch models in OCI registries by jguilhermevanz

    Description

    A prerequisite for running applications in a cloud environment is the presence of a container registry. Another common scenario is users performing machine learning workloads in such environments. However, these types of workloads require dedicated infrastructure to run properly. We can leverage these two facts to help users save resources by storing their machine learning models in OCI registries, similar to how we handle some WebAssembly modules. This approach will save users the resources typically required for a machine learning model repository for the applications they need to run.

    Goals

    Allow PyTorch users to save and load machine learning models in OCI registries.

    Resources


    Make more sense of openQA test results using AI by livdywan

    Description

    AI has the potential to help with something many of us spend a lot of time doing which is making sense of openQA logs when a job fails.

    User Story

    Allison Average has a puzzled look on their face while staring at log files that seem to make little sense. Is this a known issue, something completely new or maybe related to infrastructure changes?

    Goals

    • Leverage a chat interface to help Allison
    • Create a model from scratch based on data from openQA
    • Proof of concept for automated analysis of openQA test results

    Bonus

    • Use AI to suggest solutions to merge conflicts
      • This would need a merge conflict editor that can suggest solving the conflict
    • Use image recognition for needles

    Resources

    Timeline

    Day 1

    • Conversing with open-webui to teach me how to create a model based on openQA test results

    Day 2

    Highlights

    • I briefly tested compared models to see if they would make me more productive. Between llama, gemma and mistral there was no amazing difference in the results for my case.
    • Convincing the chat interface to produce code specific to my use case required very explicit instructions.
    • Asking for advice on how to use open-webui itself better was frustratingly unfruitful both in trivial and more advanced regards.
    • Documentation on source materials used by LLM's and tools for this purpose seems virtually non-existent - specifically if a logo can be generated based on particular licenses

    Outcomes

    • Chat interface-supported development is providing good starting points and open-webui being open source is more flexible than Gemini. Although currently some fancy features such as grounding and generated podcasts are missing.
    • Allison still has to be very experienced with openQA to use a chat interface for test review. Publicly available system prompts would make that easier, though.