For far too long, openQA instances have been crippled by insufficient I/O throughput under heavy load. This results in incomplete and aborted tests, which prolongs the testing of new snapshots and adds confusion to the release process.
openQA itself doesn't require any particular technology for its remote workers, but so far we have relied on NFS. This was fine for a few workers, but with 50+ of them it is no longer a viable route. There are some custom-made mitigations, such as rsyncing tests, assets and needles before the actual test job runs, but this feels like reinventing the wheel to me.
During this project I aim to get familiar with various distributed file systems, with emphasis on how well they work in an I/O-heavy environment, starting with our own Ceph-based storage solution. The goal is to document the findings so that they can later be used to fix our production setup, or to rule out options that should not be used in it.
This project is part of:
Hack Week 15