Project Description
Generate a personalized avatar artwork images by fine-tuning stable diffusion on personal pictures
Goal for this Hackweek
Get a new fancy and unique avatar!
Resources
- https://huggingface.co/docs/diffusers/using-diffusers/sdxl
- https://huggingface.co/docs/diffusers/training/dreambooth
- https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/README_sdxl.md
- https://civitai.com/models/133005/juggernaut-xl?modelVersionId=198530
Looking for hackers with the skills:
This project is part of:
Hack Week 23
Activity
Comments
-
about 1 year ago by STorresi | Reply
These are generated after a bespoke LoRA training using DreamBooth over the JuggernautXL model, which in turn is based on SDXL 1.0.
As you can see, hands are still tricky (a known issue of diffusion models, apparently), but I didn't try inpainting and img2img fine-tuning, which are supposed to be the go-to way to solve small issues like that. I must say the overall experience was quite painful due to the hardware requirements of SDXL and the amount of memory leaks in pytorch. A high-end consumer grade GPU like an NVIDIA 4080 with 16GB of VRAM often wasn't enough and ran OOM.
Similar Projects
Save pytorch models in OCI registries by jguilhermevanz
Description
A prerequisite for running applications in a cloud environment is the presence of a container registry. Another common scenario is users performing machine learning workloads in such environments. However, these types of workloads require dedicated infrastructure to run properly. We can leverage these two facts to help users save resources by storing their machine learning models in OCI registries, similar to how we handle some WebAssembly modules. This approach will save users the resources typically required for a machine learning model repository for the applications they need to run.
Goals
Allow PyTorch users to save and load machine learning models in OCI registries.
Resources
Make more sense of openQA test results using AI by livdywan
Description
AI has the potential to help with something many of us spend a lot of time doing which is making sense of openQA logs when a job fails.
User Story
Allison Average has a puzzled look on their face while staring at log files that seem to make little sense. Is this a known issue, something completely new or maybe related to infrastructure changes?
Goals
- Leverage a chat interface to help Allison
- Create a model from scratch based on data from openQA
- Proof of concept for automated analysis of openQA test results
Bonus
- Use AI to suggest solutions to merge conflicts
- This would need a merge conflict editor that can suggest solving the conflict
- Use image recognition for needles
Resources
Timeline
Day 1
- Conversing with open-webui to teach me how to create a model based on openQA test results
- Asking for example code using TensorFlow in Python
- Discussing log files to explore what to analyze
- Drafting a new project called Testimony (based on Implementing a containerized Python action) - the project name was also suggested by the assistant
Day 2
- Using NotebookLLM (Gemini) to produce conversational versions of blog posts
- Researching the possibility of creating a project logo with AI
- Asking open-webui, persons with prior experience and conducting a web search for advice
Highlights
- I briefly tested compared models to see if they would make me more productive. Between llama, gemma and mistral there was no amazing difference in the results for my case.
- Convincing the chat interface to produce code specific to my use case required very explicit instructions.
- Asking for advice on how to use open-webui itself better was frustratingly unfruitful both in trivial and more advanced regards.
- Documentation on source materials used by LLM's and tools for this purpose seems virtually non-existent - specifically if a logo can be generated based on particular licenses
Outcomes
- Chat interface-supported development is providing good starting points and open-webui being open source is more flexible than Gemini. Although currently some fancy features such as grounding and generated podcasts are missing.
- Allison still has to be very experienced with openQA to use a chat interface for test review. Publicly available system prompts would make that easier, though.
SUSE AI Meets the Game Board by moio
Use tabletopgames.ai’s open source TAG and PyTAG frameworks to apply Statistical Forward Planning and Deep Reinforcement Learning to two board games of our own design. On an all-green, all-open source, all-AWS stack!
Results: Infrastructure Achievements
We successfully built and automated a containerized stack to support our AI experiments. This included:
- a Fully-Automated, One-Command, GPU-accelerated Kubernetes setup: we created an OpenTofu based script, tofu-tag, to deploy SUSE's RKE2 Kubernetes running on CUDA-enabled nodes in AWS, powered by openSUSE with GPU drivers and gpu-operator
- Containerization of the TAG and PyTAG frameworks: TAG (Tabletop AI Games) and PyTAG were patched for seamless deployment in containerized environments. We automated the container image creation process with GitHub Actions. Our forks (PRs upstream upcoming):
./deploy.sh
and voilà - Kubernetes running PyTAG (k9s
, above) with GPU acceleration (nvtop
, below)
Results: Game Design Insights
Our project focused on modeling and analyzing two card games of our own design within the TAG framework:
- Game Modeling: We implemented models for Dario's "Bamboo" and Silvio's "Totoro" and "R3" games, enabling AI agents to play thousands of games ...in minutes!
- AI-driven optimization: By analyzing statistical data on moves, strategies, and outcomes, we iteratively tweaked the game mechanics and rules to achieve better balance and player engagement.
- Advanced analytics: Leveraging AI agents with Monte Carlo Tree Search (MCTS) and random action selection, we compared performance metrics to identify optimal strategies and uncover opportunities for game refinement .
- more about Bamboo on Dario's site
- more about R3 on Silvio's site (italian, translation coming)
- more about Totoro on Silvio's site
A family picture of our card games in progress. From the top: Bamboo, Totoro, R3
Results: Learning, Collaboration, and Innovation
Beyond technical accomplishments, the project showcased innovative approaches to coding, learning, and teamwork:
- "Trio programming" with AI assistance: Our "trio programming" approach—two developers and GitHub Copilot—was a standout success, especially in handling slightly-repetitive but not-quite-exactly-copypaste tasks. Java as a language tends to be verbose and we found it to be fitting particularly well.
- AI tools for reporting and documentation: We extensively used AI chatbots to streamline writing and reporting. (Including writing this report! ...but this note was added manually during edit!)
- GPU compute expertise: Overcoming challenges with CUDA drivers and cloud infrastructure deepened our understanding of GPU-accelerated workloads in the open-source ecosystem.
- Game design as a learning platform: By blending AI techniques with creative game design, we learned not only about AI strategies but also about making games fun, engaging, and balanced.
Last but not least we had a lot of fun! ...and this was definitely not a chatbot generated line!
The Context: AI + Board Games
ghostwrAIter - a local AI assisted tool for helping with support cases by paolodepa
Description
This project is meant to fight the loneliness of the support team members, providing them an AI assistant (hopefully) capable of scraping supportconfigs in a RAG fashion, trying to answer specific questions.
Goals
- Setup an Ollama backend, spinning one (or more??) code-focused LLMs selected by license, performance and quality of the results between:
- deepseek-coder-v2
- dolphin-mistral
- starcoder2
- (...others??)
- Setup a Web UI for it, choosing an easily extensible and customizable option between:
- Extend the solution in order to be able to:
- Add ZIU/Concord shared folders to its RAG context
- Add BZ cases, splitted in comments to its RAG context
- A plus would be to login using the IDP portal to ghostwrAIter itself and use the same credentials to query BZ
- Add specific packages picking them from IBS repos
- A plus would be to login using the IDP portal to ghostwrAIter itself and use the same credentials to query IBS
- A plus would be to desume the packages of interest and the right channel and version to be picked from the added BZ cases
Use local/private LLM for semantic knowledge search by digitaltomm
Description
Use a local LLM, based on SUSE AI (ollama, openwebui) to power geeko search (public instance: https://geeko.port0.org/).
Goals
Build a SUSE internal instance of https://geeko.port0.org/ that can operate on internal resources, crawling confluence.suse.com, gitlab.suse.de, etc.
Resources
Repo: https://github.com/digitaltom/semantic-knowledge-search
Public instance: https://geeko.port0.org/
Results
Internal instance:
I have an internal test instance running which has indexed a couple of internal wiki pages from the SCC team. It's using the ollama (llama3.1:8b
) backend of suse-ai.openplatform.suse.com to create embedding vectors for indexed resources and to create a chat response. The semantic search for documents is done with a vector search inside of sqlite, using sqlite-vec.