It is well-known that two git commits within a single repo can be independent from each other, by changing separate files to each other, or changing separate parts of the same file(s). Conversely when a commit changes a line, it is "dependent" on not only the commit which last changed that line, but also any commits which were responsible for providing the surrounding lines of context, because without those previous versions of the line and its context, the commit's diff would not cleanly apply.
As with most dependency relationships, these form a directed acyclic graph. Sometimes it is useful to understand the nature of parts of this graph; for example when porting a commit "A" between git branches via git cherry-pick, it can be useful to programmatically determine in advance the minimum number of other dependent commits which would also need to be cherry-picked to provide the context for commit "A" to cleanly apply.
Another use case might be to better understand levels of specialism / cross-functionality within an agile team. If I author a commit which modifies (say) lines 34-37 and 102-109 of a file, the authors of the dependent commits forms a list which indicates the group of people I should potentially consider asking to review my commit, since I'm effectively changing "their" code. Monitoring those relationships over time might shed some light on how agile teams should best coordinate efforts on shared code bases.
I'm sure there are other use cases I haven't yet thought of.
I have written a tool called git deps which automatically walks this graph. Currently the output is text only, but it would be cool to visualise the results, e.g. by generating a static HTML page which uses d3.js to provide a force-directed, zoomable layout of the graph where individual nodes can be hovered over or clicked to expand to see the commit details. It would be nice to colour the nodes according to the commit author. There is a decent amount of prior art for visualizing dependency graphs in this way, e.g.
- http://marvl.infotech.monash.edu/webcola/
- http://kenneth.io/blog/2013/04/01/visualize-your-javaScript-dependencies-with-dependo/
- http://bl.ocks.org/mbostock/1153292
Also the tool should make better use of pygit2 since blame support was not complete when it was originally written.
(BTW the dependency graph is likely to be semantically incomplete; for example it would not auto-detect dependencies between a commit which changes code and another commit which changes documentation or tests to reflect this code. (Incidentally this is one reason why it is usually a very good idea to logically group such changes together in a single commit.)
Looking for hackers with the skills:
This project is part of:
Hack Week 11
Activity
Comments
-
-
about 11 years ago by vitezslav_cizek | Reply
Not just the kernel team, but essentially any packager who needs to backport a fix, which happens quite often (eg. security bugs). I miss a tool like this and i'm pleased to learn about your git-deps.
-
Similar Projects
Mail client with mailing list workflow support in Rust by acervesato
Description
To create a mail user interface using Rust programming language, supporting mailing list patches workflow. I know, aerc is already there, but I would like to create something simpler, without integrated protocols. Just a plain user interface that is using some crates to read and create emails which are fetched and sent via external tools.
I already know Rust, but not the async support, which is needed in this case in order to handle events inside the mail folder and to send notifications.
Goals
- simple user interface in the style of
aerc, with some vim keybindings for motions and search - automatic run of external tools (like
mbsync) for checking emails - automatic run commands for notifications
- apply patch set from ML
- tree-sitter support with styles
Resources
- ratatui: user interface (https://ratatui.rs/)
- notify: folder watcher (https://docs.rs/notify/latest/notify/)
- mail-parser: parser for emails (https://crates.io/crates/mail-parser)
- mail-builder: create emails in proper format (https://docs.rs/mail-builder/latest/mail_builder/)
- gitpatch: ML support (https://crates.io/crates/gitpatch)
- tree-sitter-rust: support for mail format (https://crates.io/crates/tree-sitter)
Create a page with all devel:languages:perl packages and their versions by tinita
Description
Perl projects now live in git: https://src.opensuse.org/perl
It would be useful to have an easy way to check which version of which perl module is in devel:languages:perl. Also we have meta overrides and patches for various modules, and it would be good to have them at a central place, so it is easier to lookup, and we can share with other vendors.
I did some initial data dump here a while ago: https://github.com/perlpunk/cpan-meta
But I never had the time to automate this.
I can also use the data to check if there are necessary updates (currently it uses data from download.opensuse.org, so there is some delay and it depends on building).
Goals
- Have a script that updates a central repository (e.g.
https://src.opensuse.org/perl/_metadata) with metadata by looking at https://src.opensuse.org/perl/_ObsPrj (check if there are any changes from the last run) - Create a HTML page with the list of packages (use Javascript and some table library to make it easily searchable)
Resources
Results
Day 1
- First part of the code which retrieves data from https://src.opensuse.org/perl/_ObsPrj with submodules and creates a YAML and a JSON file.
- Repo: https://github.com/perlpunk/opensuse-perl-meta
- Also a first version of the HTML is live: https://perlpunk.github.io/opensuse-perl-meta/
Day 2
- HTML Page has now links to src.opensuse.org and the date of the last update, plus a short info at the top
- Code is now 100% covered by tests: https://app.codecov.io/gh/perlpunk/opensuse-perl-meta
- I used the modern perl
classfeature, which makes perl classes even nicer and shorter. See example - Tests
- I tried out the mocking feature of the modern Test2::V0 library which provides call tracking. See example
- I tried out comparing data structures with the new Test2::V0 library. It let's you compare parts of the structure with the
likefunction, which only compares the date that is mentioned in the expected data. example
Day 3
- Added various things to the table
- Dependencies column
- Show popup with info for cpanspec, patches and dependencies
- Added last date / commit to the data export.
Plan: With the added date / commit we can now daily check _ObsPrj for changes and only fetch the data for changed packages.
Day 4
go-git: unlocking SHA256-based repository cloning ahead of git v3 by pgomes
Description
The go-git library implements the git internals in pure Go, so that any Go application can handle not only Git repositories, but also lower-level primitives (e.g. packfiles, idxfiles, etc) without needing to shell out to the git binary.
The focus for this Hackweek is to fast track key improvements for the project ahead of the upstream release of Git V3, which may take place at some point next year.
Goals
- Add support for cloning SHA256 repositories.
- Decrease memory churn for very large repositories (e.g. Linux Kernel repository).
- Cut the first alpha version for
go-git/v6.
Stretch goals
- Review and update the official documentation.
- Optimise use of go-git in Fleet.
- Create RFC/example for go-git plugins to improve extensibility.
- Investigate performance bottlenecks for Blame and Status.
Resources
- https://github.com/go-git/go-git/
- https://go-git.github.io/docs/
git-fs: file system representation of a git repository by fgonzalez
Description
This project aims to create a Linux equivalent to the git/fs concept from git9. Now, I'm aware that git provides worktrees, but they are not enough for many use cases. Having a read-only representation of the whole repository simplifies scripting by quite a bit and, most importantly, reduces disk space usage. For instance, during kernel livepatching development, we need to process and analyze the source code of hundreds of kernel versions simultaneously.This is rather painful with git-worktrees, as each kernel branch requires no less than 1G of disk space.
As for the technical details, I'll implement the file system using FUSE. The project itself should not take much time to complete, but let's see where it takes me.
I'll try to keep the same design as git9, so the file system will look something like:
/mnt/git
+-- ctl
+-- HEAD
| +-- tree
| | +--files
| | +--in
| | +--head
| |
| +-- hash
| +-- msg
| +-- parent
|
+-- branch
| |
| +-- heads
| | +-- master
| | +-- [commit files, see HEAD]
| +-- remotes
| +-- origin
| +-- master
| +-- [commit files, see HEAD]
+-- object
+-- 00051fd3f066e8c05ae7d3cf61ee363073b9535f # blob contents
+-- 00051fd3f066e8c05ae7d3cf61ee363073b9535c
+-- [tree contents, see HEAD/tree]
+-- 3f5dbc97ae6caba9928843ec65fb3089b96c9283
+-- [commit files, see HEAD]
So, if you wanted to look at the commit message of the current branch, you could simply do:
cat /mnt/git/HEAD/msg
No collaboration needed. This is a solo project.
Goals
Implement a working prototype.
Measure and improve the performance if possible. This step will be the most crucial one. User space filesystems are slower by nature.
Resources
https://docs.kernel.org/filesystems/fuse/fuse.html
Enhance git-sha-verify: A tool to checkout validated git hashes by gpathak
Description
git-sha-verify is a simple shell utility to verify and checkout trusted git commits signed using GPG key. This tool helps ensure that only authorized or validated commit hashes are checked out from a git repository, supporting better code integrity and security within the workflow.
Supports:
- Verifying commit authenticity signed using gpg key
- Checking out trusted commits
Ideal for teams and projects where the integrity of git history is crucial.
Goals
A minimal python code of the shell script exists as a pull request.
The goal of this hackweek is to:
- DONE: Add more unit tests
- New and more tests can be added later
- New and more tests can be added later
- Partially DONE: Make the python code modular
- DONE: Add code coverage if possible
Resources
- Link to GitHub Repository: https://github.com/openSUSE/git-sha-verify
Improvements to osc (especially with regards to the Git workflow) by mcepl
Description
There is plenty of hacking on osc, where we could spent some fun time. I would like to see a solution for https://github.com/openSUSE/osc/issues/2006 (which is sufficiently non-serious, that it could be part of HackWeek project).
Improve chore and screen time doc generator script `wochenplaner` by gniebler
Description
I wrote a little Python script to generate PDF docs, which can be used to track daily chore completion and screen time usage for several people, with one page per person/week.
I named this script wochenplaner and have been using it for a few months now.
It needs some improvements and adjustments in how the screen time should be tracked and how chores are displayed.
Goals
- Fix chore field separation lines
- Change screen time tracking logic from "global" (week-long) to daily subtraction and weekly addition of remainders (more intuitive than current "weekly time budget method)
- Add logic to fill in chore fields/lines, ideally with pictures, falling back to text.
Resources
tbd (Gitlab repo)
Collection and organisation of information about Bulgarian schools by iivanov
Description
To achieve this it will be necessary:
- Collect/download raw data from various government and non-governmental organizations
- Clean up raw data and organise it in some kind database.
- Create tool to make queries easy.
- Or perhaps dump all data into AI and ask questions in natural language.
Goals
By selecting particular school information like this will be provided:
- School scores on national exams.
- School scores from the external evaluations exams.
- School town, municipality and region.
- Employment rate in a town or municipality.
- Average health of the population in the region.
Resources
Some of these are available only in bulgarian.
- https://danybon.com/klasazia
- https://nvoresults.com/index.html
- https://ri.mon.bg/active-institutions
- https://www.nsi.bg/nrnm/ekatte/archive
Results
- Not fully ready.
- Information about all Bulgarian schools with their scores during recent years cleaned and organised into SQL tables
- Information about all Bulgarian villages, cities, municipalities and districts cleaned and organised into SQL tables
- Information about all Bulgarian villages and cities census since beginning of this century cleaned and organised into SQL tables.
- Information about all Bulgarian municipalities about religion, ethnicity cleaned and organised into SQL tables.
- Data successfully loaded to locally running Ollama with help to Vanna.AI
TODO
- Seems that tables columns have to have better description to allow LLM to do its magic
- Add comments to column names. This will require switch from SQLite to ... PostgreSQL which supports COMMENTS
- Add more statistical information about municipalities and ....
Code and data
Song Search with CLAP by gcolangiuli
Description
Contrastive Language-Audio Pretraining (CLAP) is an open-source library that enables the training of a neural network on both Audio and Text descriptions, making it possible to search for Audio using a Text input. Several pre-trained models for song search are already available on huggingface
Goals
Evaluate how CLAP can be used for song searching and determine which types of queries yield the best results by developing a Minimum Viable Product (MVP) in Python. Based on the results of this MVP, future steps could include:
- Music Tagging;
- Free text search;
- Integration with an LLM (for example, with MCP or the OpenAI API) for music suggestions based on your own library.
The code for this project will be entirely written using AI to better explore and demonstrate AI capabilities.
Result
In this MVP we implemented:
- Async Song Analysis with Clap model
- Free Text Search of the songs
- Similar song search based on vector representation
- Containerised version with web interface
We also documented what went well and what can be improved in the use of AI.
You can have a look at the result here:
Future implementation can be related to performance improvement and stability of the analysis.
References
- CLAP: The main model being researched;
- huggingface: Pre-trained models for CLAP;
- Free Music Archive: Creative Commons songs that can be used for testing;
Kudos aka openSUSE Recognition Platform by lkocman
Description
Relevant blog post at news-o-o
I started the Kudos application shortly after Leap 16.0 to create a simple, friendly way to recognize people for their work and contributions to openSUSE. There’s so much more to our community than just submitting requests in OBS or gitea we have translations (not only in Weblate), wiki edits, forum and social media moderation, infrastructure maintenance, booth participation, talks, manual testing, openQA test suites, and more!
Goals
Kudos under github.com/openSUSE/kudos with build previews aka netlify
Have a kudos.opensuse.org instance running in production
Build an easy-to-contribute recognition platform for the openSUSE community a place where everyone can send and receive appreciation for their work, across all areas of contribution.
In the future, we could even explore reward options such as vouchers for t-shirts or other community swag, small tokens of appreciation to make recognition more tangible.
Resources
(Do not create new badge requests during hackweek, unless you'll make the badge during hackweek)
- Source code: openSUSE/kudos
- Badges: openSUSE/kudos-badges
- Issue tracker: kudos/issues