The network connection to ibs from Beijing is quite slow. Also, ibs doesn't offer a way to search the tags in spec files, such as versions, requirements, or patches.
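To make "tags in spec files" a bit more concrete, here is a minimal sketch of pulling a few preamble tags out of spec text with only the Go standard library. The `SpecTags` type, the `ParseSpecTags` function and the choice of tags are assumptions for illustration, not a decided design. It does not expand RPM macros and would also pick up matching "Tag: value" lines outside the preamble.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// SpecTags holds the tags we might want to make searchable.
// The field selection is hypothetical; a real index may need more.
type SpecTags struct {
	Name     string
	Version  string
	Requires []string
	Patches  []string
}

// ParseSpecTags scans spec text line by line and collects simple
// "Tag: value" pairs. Macros such as %{version} are kept unexpanded.
func ParseSpecTags(spec string) SpecTags {
	var tags SpecTags
	sc := bufio.NewScanner(strings.NewReader(spec))
	for sc.Scan() {
		parts := strings.SplitN(sc.Text(), ":", 2)
		if len(parts) != 2 {
			continue // not a "Tag: value" line
		}
		key := strings.ToLower(strings.TrimSpace(parts[0]))
		value := strings.TrimSpace(parts[1])
		switch {
		case key == "name":
			tags.Name = value
		case key == "version":
			tags.Version = value
		case key == "requires":
			tags.Requires = append(tags.Requires, value)
		case strings.HasPrefix(key, "patch"):
			// Matches Patch, Patch0, Patch1, ... but not "# Patch1".
			tags.Patches = append(tags.Patches, value)
		}
	}
	return tags
}

func main() {
	// Made-up spec fragment, only for demonstration.
	spec := `Name:     demo
Version:  1.0
Requires: bash
Patch0:   fix-build.patch
`
	fmt.Printf("%+v\n", ParseSpecTags(spec))
}
```

The resulting struct could then be stored under a project/package key in whichever database gets picked later.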
Tasks to complete the project:
- Spider the specs in ibs.
  - I have already implemented a spider for another website in Python, so I plan to use golang without any 3rd party framework for fetching. The aim is to get familiar with goroutines for parallel work, though threads may be more effective since the network should be the bottleneck so far...
- Save the results into a database with tags like project, package name, etc.
  - I have not decided yet which database to pick, or even whether to go with a SQL or NoSQL model... The current idea is mysql/memcached or redis, depending on how the tables/keys of the database are composed.
- Analyse and polish the code to make it faster.
  - Try to turn the local engine into a high availability cluster with low latency and higher throughput. Galera or Pacemaker for availability? nginx or haproxy for load balancing? An in-memory database to accelerate reads? No idea at the moment; it should depend on the bottleneck and the needs. A simple engine for internal use won't require much, but I may learn more while improving it and thinking about these questions.
- Check and compare performance, e.g. TPS...
- A nice web page to show the results?
  - This is not my 1st priority... let's see in future:)
The goals of the project are:
- Learn and get familiar with golang. Currently I have nearly 0 experience with golang in a real project:)
- Learn more while constantly questioning the system.
- Have fun.
I don't think the project can be finished within 1 week. Hopefully I can at least get something working by then and continue the task when I have spare time. First get started and have fun:)
This project is part of:
Hack Week 17