SUSE Hack Week: Detect type of change in a project analyzing the log history

Use machine learning and natural language processing techniques to analyze the changes made in a project, and classify them by type of change.

For this project I will:

Generate a basic corpus of labeled data from a set of different projects related to openSUSE
Evaluate the best features for classification: n-grams, PoS tags, TF-IDF (with and without a stemmer)
Evaluate and measure the best classification model: Naive Bayes, Linear SVM, Max Entropy, ...
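
The steps above can be sketched roughly as follows. This is only a minimal illustration, not the project's actual code: the tiny corpus, the labels "fix" and "feature", and the names `ngrams`, `NaiveBayes` and `leave_one_out_accuracy` are all assumptions made up for the example. It uses word n-grams as features and a hand-written multinomial Naive Bayes, with only the Python standard library; PoS tags, TF-IDF weighting, stemming, Linear SVM and Max Entropy are not shown and would normally come from an NLP or machine learning library.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labeled commit messages. The labels "fix" and "feature"
# are assumptions for illustration; the real label set is not given here.
CORPUS = [
    ("fix crash when parsing empty spec file", "fix"),
    ("fix regression in repository refresh", "fix"),
    ("fix typo that broke the build", "fix"),
    ("correct wrong return value in solver", "fix"),
    ("add support for new package format", "feature"),
    ("add command line option for verbose output", "feature"),
    ("implement new dashboard view", "feature"),
    ("introduce plugin interface for backends", "feature"),
]


def ngrams(message, n_max=2):
    """Return word n-grams (n = 1..n_max) of a lowercased message."""
    tokens = re.findall(r"[a-z0-9]+", message.lower())
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples):
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocabulary = set()
        for message, label in samples:
            self.label_counts[label] += 1
            for gram in ngrams(message):
                self.feature_counts[label][gram] += 1
                self.vocabulary.add(gram)
        return self

    def predict(self, message):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each known n-gram.
            score = math.log(count / total)
            denom = sum(self.feature_counts[label].values()) + len(self.vocabulary)
            for gram in ngrams(message):
                if gram in self.vocabulary:  # skip n-grams never seen in training
                    score += math.log((self.feature_counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def leave_one_out_accuracy(samples):
    """Train on all samples but one, test on the held-out one, and average."""
    hits = 0
    for i, (message, label) in enumerate(samples):
        model = NaiveBayes().fit(samples[:i] + samples[i + 1:])
        hits += int(model.predict(message) == label)
    return hits / len(samples)


model = NaiveBayes().fit(CORPUS)
print(model.predict("fix crash in dashboard"))
accuracy = leave_one_out_accuracy(CORPUS)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

With a corpus this small the accuracy figure means very little; the point is only the shape of the pipeline: labeled messages, feature extraction, a classifier, and a held-out measurement that lets different features and models be compared.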

Hack Week 10 Hack Week 11 Hack Week 12

almost 11 years ago: froh joined this project.

almost 12 years ago: aplanas started this project.

almost 12 years ago: aplanas added keyword "nlp" to this project.

almost 12 years ago: aplanas added keyword "machinelearning" to this project.

almost 12 years ago: aplanas added keyword "git" to this project.

almost 12 years ago: aplanas added keyword "github" to this project.

almost 12 years ago: aplanas originated this project.

almost 11 years ago by aplanas

Yeah. Hackweek 10 collided with openSUSE 13.1, so I will try to work on this during this new Hackweek instance : )

almost 11 years ago by froh

Would it be hard to train for regression fix vs new feature, based on the comment? I'd be curious how much energy projects have to put into regression fixes vs feature additions.

over 10 years ago by osynge

Have you considered looking at ELK and integrating this work into the ELK stack?
