Project Description
Open-source, voice-controlled AI drawing interface for young kids
web voice recorder -> speech2text -> translator -> parser -> Midjourney/Stable Diffusion -> web picture presentation
The goal is to create a simple-to-use, web-based interface that would allow small kids to use Midjourney AI for painting images even if they can't write or speak English yet. I have two kids, 3 and 6 years old, who speak only Czech, have creative ideas worth supporting, and like to play with technology; currently Midjourney is unusable for them. I think this solution could be popular around the world, could bring attention to SUSE and open source as a side effect, and could be used in various schools and creative workshops.
The idea is to chain the APIs of existing services and make this project easily reproducible for others who would like to run their own instance with their own account or server.
The primary user group of our particular instance would be SUSE employees and their kids. The concept can then be used to promote open source and raise interest in technology among kids and their parents at various public occasions.
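A minimal sketch of how such a chain could be wired together, assuming each stage is a plain function that can be swapped for a real service later. All names here (`DrawingPipeline`, `build_prompt`, the `fake_*` stubs, and the style suffix) are hypothetical and only illustrate the recorder -> speech2text -> translator -> parser -> image generator flow; no real API is called.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical names throughout: each stage stands in for a real service.
@dataclass
class DrawingPipeline:
    speech_to_text: Callable[[bytes], str]   # audio -> text in the child's language
    translate: Callable[[str], str]          # source-language text -> English text
    generate_image: Callable[[str], bytes]   # English prompt -> image bytes

    def build_prompt(self, english_text: str) -> str:
        """Parser stage: tidy the translated sentence into an image prompt."""
        # Collapse whitespace and drop trailing sentence punctuation.
        cleaned = " ".join(english_text.split()).rstrip(".!?")
        # The style suffix is an assumption, not something the project prescribes.
        return f"{cleaned}, children's book illustration"

    def run(self, audio: bytes) -> bytes:
        text = self.speech_to_text(audio)
        english = self.translate(text)
        prompt = self.build_prompt(english)
        return self.generate_image(prompt)


# Stub stages so the sketch runs without any external service.
def fake_speech_to_text(audio: bytes) -> str:
    return "Drak jí zmrzlinu."


def fake_translate(text: str) -> str:
    return {"Drak jí zmrzlinu.": "A dragon is eating ice cream."}.get(text, text)


def fake_generate_image(prompt: str) -> bytes:
    return f"IMAGE[{prompt}]".encode("utf-8")


pipeline = DrawingPipeline(fake_speech_to_text, fake_translate, fake_generate_image)
result = pipeline.run(b"raw-audio-bytes")
print(result.decode("utf-8"))
```

Passing the stages in as functions keeps the chain independent of any one provider, so someone running their own instance could plug in a hosted API or a local model without touching the rest of the flow.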
Goal for this Hackweek
Design a proof-of-concept solution. Implement a minimum viable product. Test the user interface with a group of kids from 3 to 6 years old.
There are existing solutions that can be further improved to achieve the project goals: https://medium.com/geekculture/voice-assisted-image-generation-with-stable-diffusion-66b7facd8fc4
Resources
For this project I am looking for contributors: frontend developer, backend developer, DevOps engineer, architect of general functionality, UX designer.
Keywords: AI, Midjourney, drawing, painting, young, kids, integration, imagination, fun
This project is part of:
Hack Week 23
Comments
-
about 1 year ago by mpiala
@jstehlik we have access to AWS Bedrock. If you are interested, let me know and we can provide you with access. We also plan some other AI projects during Hack Week, and we should have some support from AWS on hand as well.
-
about 1 year ago by jstehlik
A proof of concept was created using AWS services, which proved to be relatively easy but costly (roughly 0.2 USD for a good picture). I therefore installed Stable Diffusion and the Whisper translator on a local machine. That runs about 10 times slower on an NVIDIA GTX 1660 GPU, but it better serves the purpose of a semi-automatic kiosk and gives more freedom in configuration at a lower price.