SUSE Hack Week: Can we (machine) learn from bug reports?

Bug reports can be a great source of information, but finding that information usually means reading through all of the discussions and understanding their details.

Could machine learning be used to extract meaningful information from bug reports? That's what this project is about. The idea is to explore a few different methods and see what the results are.

Here are some rough ideas on what to try:

clustering
sentiment analysis
filtering
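To make the clustering idea a bit more concrete, here is a minimal sketch that groups bug summaries by text similarity. It is only an illustration under assumptions: the project does not say which method will be used, so this uses TF-IDF weights with a greedy cosine-similarity threshold (not k-means or any particular library), and the function names, the threshold and the sample summaries are all made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        # Smoothed IDF, so terms that occur everywhere keep a small weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_clusters(docs, threshold=0.25):
    """Put each document into the first cluster whose seed (first member)
    is at least `threshold` similar; otherwise start a new cluster.
    The threshold is an arbitrary illustrative value."""
    vectors = tfidf_vectors(docs)
    clusters = []  # each cluster is a list of document indices
    for index, vector in enumerate(vectors):
        for cluster in clusters:
            if cosine(vector, vectors[cluster[0]]) >= threshold:
                cluster.append(index)
                break
        else:
            clusters.append([index])
    return clusters


# Hypothetical bug summaries, invented for this example.
summaries = [
    "kernel panic on boot after update",
    "system panic during boot with new kernel",
    "zypper refresh fails with repository error",
    "zypper cannot refresh repository metadata",
]
print(greedy_clusters(summaries))
```

On real Bugzilla data the full comment threads would probably need more preprocessing (quoted text, log excerpts, signatures) before a similarity measure like this becomes useful.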

As a dataset, the plan is to collect SLE and openSUSE bugs from our very own Bugzilla and use this data to train and validate some models.
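One possible way to collect and split the data could look like the sketch below. It assumes the Bugzilla instance exposes the standard Bugzilla REST search endpoint (GET /rest/bug), which is not confirmed here; the host name, the product name, the helper names and the "every fifth bug ID goes to validation" rule are all hypothetical choices for illustration.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, product, limit=100):
    """Build a Bugzilla REST search URL for one product.
    Assumes the instance has the REST API enabled."""
    params = {
        "product": product,
        "limit": limit,
        "include_fields": "id,summary,status",
    }
    return f"{base_url.rstrip('/')}/rest/bug?{urlencode(params)}"


def parse_bugs(payload):
    """Extract (id, summary) pairs from a REST search response body."""
    return [(bug["id"], bug["summary"]) for bug in json.loads(payload)["bugs"]]


def split_by_id(bugs, validation_every=5):
    """Deterministic split: bugs whose ID is divisible by
    `validation_every` go to validation, the rest to training."""
    train = [bug for bug in bugs if bug[0] % validation_every != 0]
    validation = [bug for bug in bugs if bug[0] % validation_every == 0]
    return train, validation


# Placeholder host and product name, not the real instance.
url = build_query_url("https://bugzilla.example.org/", "openSUSE Tumbleweed", limit=2)

# Hand-written sample in the shape of a REST search response.
sample_payload = '{"bugs": [{"id": 10, "summary": "a"}, {"id": 11, "summary": "b"}]}'
train, validation = split_by_id(parse_bugs(sample_payload))
print(url)
print(train, validation)
```

Splitting on the bug ID rather than at random keeps the split reproducible when the dataset is fetched again later.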


This project is part of:

Hack Week 20

Activity

about 1 year ago: bfilho liked this project.

almost 5 years ago: ONalmpantis liked this project.

almost 5 years ago: acho liked this project.

almost 5 years ago: llansky3 liked this project.

almost 5 years ago: jufa liked this project.

almost 5 years ago: ories liked this project.

almost 5 years ago: mlnoga liked this project.

almost 5 years ago: j_renner liked this project.

almost 5 years ago: moio liked this project.

almost 5 years ago: hennevogel disliked this project.

almost 5 years ago: hennevogel liked this project.

almost 5 years ago: hennevogel disliked this project.

almost 5 years ago: hennevogel liked this project.

almost 5 years ago: hennevogel disliked this project.

almost 5 years ago: hennevogel liked this project.

almost 5 years ago: gboiko started this project.

almost 5 years ago: gboiko originated this project.

Comments

almost 5 years ago by alnovak

I see two large sources of data that would be useful to include:
- supportconfigs: these are either attached to Bugzilla or available (short-term) on a filesystem, and they give a great overview of our customers' environments
- L3 metadata: for L3 bugs (~3000 per year), there is data that may be highly relevant for the clustering as well, among others:
  - customer identification
  - which PTFs (fixed packages) were delivered in the case, and what the feedback on them was
- almost 5 years ago by gboiko
  
  Hi @alnovak
  
  Thank you for your feedback! I will try to include those as well in the analysis.
  
  I already had L3 metadata in mind, but I hadn't thought of supportconfigs. Nice hint, thank you!

almost 5 years ago by mslacken

I had the same idea last year, but did not really succeed. You might want to have a look at: https://github.com/mslacken/ml-bugs
I also gave a talk at SC19 (Supercomputing 2019): https://gitlab.suse.de/mslacken/sc-2019
Feel free to ping me if you need any additional information.
- almost 5 years ago by gboiko
  
  Hi @mslacken
  
  Thank you for the pointers. I will take a quick look at them and then I will certainly ping you about it.
  
  Enjoy hackweek :)

