Project Description
Currently openQA requires a reference image to be stored to do OCR based comparisons. It is not possible to pass a character string to openQA which should be compared to the text in the screenshot. This project is about allowing to just store character strings in the corresponding JSON file of the needle and to get rid of any reference images in case of OCR needles.
Status
Research about possible tools was done. The result was that the current implementation based on Tesseract appears to be too inaccurate on short character strings. The program GOCR seems to do more classical recognition by shape which seems to work reasonably accurate on well shaped characters. The accuracy of the matched strings could be calculated using the library perl-Text-Levenshtein.
Goal for this Hackweek
- Create draft implementation of OCR in os-autoinst.
- Optional: Create easy handling of text based OCR needles in openQA web frontend (e.g. providing live preview of recognized text)
Resources
- This project is tracked here: https://progress.opensuse.org/issues/121354
- openQA frontend repo: https://github.com/os-autoinst/openQA
- openQA backend repo: https://github.com/os-autoinst/os-autoinst
- GOCR: https://wasd.urz.uni-magdeburg.de/jschulen/ocr/
- Perl-Text-Levenshtein: https://github.com/neilb/Text-Levenshtein
Looking for hackers with the skills:
This project is part of:
Hack Week 22
Activity
Comments
Similar Projects
openQA log viewer by mpagot
Description
*** Warning: Are You at Risk for VOMIT? ***
Do you find yourself staring at a screen, your eyes glossing over as thousands of lines of text scroll by? Do you feel a wave of text-based nausea when someone asks you to "just check the logs"?
You may be suffering from VOMIT (Verbose Output Mental Irritation Toxicity).
This dangerous, work-induced ailment is triggered by exposure to an overwhelming quantity of log data, especially from parallel systems. The human brain, not designed to mentally process 12 simultaneous autoinst-log.txt files, enters a state of toxic shock. It rejects the "Verbose Output," making it impossible to find the one critical error line buried in a 50,000-line sea of "INFO: doing a thing."
Before you're forced to rm -rf /var/log in a fit of desperation, we present the digital antacid.
No panic: The openQA Log Visualizer (Also known as the "VOMIT-B-Gone 9000")
This is your web-based hazmat suit for handling toxic log environments. It bravely dives into the chaotic, multi-machine mess of your openQA test runs, finds all the related, verbose logs, and force-feeds them into a parser.
Goals
Work on the existing POC openqa-log-visualizer and change it to something usable
Resources
Create a page with all devel:languages:perl packages and their versions by tinita
Description
Perl projects now live in git: https://src.opensuse.org/perl
It would be useful to have an easy way to check which version of which perl module is in devel:languages:perl. Also we have meta overrides and patches for various modules, and it would be good to have them at a central place, so it is easier to lookup, and we can share with other vendors.
I did some initial data dump here a while ago: https://github.com/perlpunk/cpan-meta
But I never had the time to automate this.
I can also use the data to check if there are necessary updates (currently it uses data from download.opensuse.org, so there is some delay and it depends on building).
Goals
- Have a script that updates a central repository (e.g.
https://src.opensuse.org/perl/_metadata) with metadata by looking at https://src.opensuse.org/perl/_ObsPrj (check if there are any changes from the last run) - Create a HTML page with the list of packages (use Javascript and some table library to make it easily searchable)
Resources