Project Description

Currently openQA requires a stored reference image for OCR-based comparisons. It is not possible to pass openQA a character string to compare against the text in the screenshot. This project aims to allow storing just the character strings in the needle's JSON file, removing the need for reference images in OCR needles.
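
To make the idea concrete, here is a minimal sketch of what a text-only OCR needle could look like. It assumes the usual needle layout with "tags", "properties" and an "area" list. The key name "refstr" and the helper make_text_needle are hypothetical and only illustrate where the expected string might live, not what os-autoinst actually implements.

```python
import json


def make_text_needle(tags, text, xpos, ypos, width, height):
    """Build a needle dict whose single OCR area carries the expected text.

    Assumption: "refstr" is a hypothetical key name for the expected string.
    """
    return {
        "tags": list(tags),
        "properties": [],
        "area": [
            {
                "type": "ocr",
                "xpos": xpos,
                "ypos": ypos,
                "width": width,
                "height": height,
                "refstr": text,  # hypothetical: replaces the reference image
            }
        ],
    }


# Example values are made up for illustration.
needle = make_text_needle(["bootloader-menu"], "Installation", 100, 200, 300, 40)
needle_json = json.dumps(needle, indent=2)
print(needle_json)
```

With such a file, no PNG would need to be stored next to the JSON for needles that consist only of OCR areas.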

Status

Research into possible tools is done. The result is that the current Tesseract-based implementation appears to be too inaccurate on short character strings. GOCR seems to rely on more classical recognition by shape, which seems to work reasonably accurately on well-shaped characters. The accuracy of the matched strings could be calculated using the library perl-Text-Levenshtein.
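
As an illustration of the accuracy calculation, the following sketch computes a similarity score from the Levenshtein edit distance. It is written in Python rather than Perl and does not use perl-Text-Levenshtein. The function names, the normalization by the longer string and the 0.9 threshold are assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def similarity(expected: str, recognized: str) -> float:
    """Map the edit distance to a score between 0.0 and 1.0."""
    longest = max(len(expected), len(recognized))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein(expected, recognized) / longest


def text_matches(expected: str, recognized: str, threshold: float = 0.9) -> bool:
    """Assumed policy: accept the OCR result if the score reaches the threshold."""
    return similarity(expected, recognized) >= threshold
```

A single misread character in a twelve-character string, such as "Instal1ation" for "Installation", gives a score of about 0.92 and would still pass the assumed threshold, while an unrelated string would not.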

Goal for this Hackweek

  • Create a draft implementation of OCR in os-autoinst.
  • Optional: Make text-based OCR needles easy to handle in the openQA web frontend (e.g. by providing a live preview of the recognized text).

Resources

  • This project is tracked here: https://progress.opensuse.org/issues/121354
  • openQA frontend repo: https://github.com/os-autoinst/openQA
  • openQA backend repo: https://github.com/os-autoinst/os-autoinst
  • GOCR: https://wasd.urz.uni-magdeburg.de/jschulen/ocr/
  • Perl-Text-Levenshtein: https://github.com/neilb/Text-Levenshtein

Looking for hackers with the skills:

openqa mojolicious perl ocr os-autoinst

This project is part of:

Hack Week 22

Activity

  • almost 3 years ago: clanig started this project.
  • almost 3 years ago: clanig originated this project.

  • Comments

    • okurz
      almost 3 years ago by okurz

      There is very basic support for OCR in os-autoinst in https://github.com/os-autoinst/os-autoinst/blob/master/ocr.pm, which might give you some good ideas and a starting point. https://github.com/os-autoinst/os-autoinst/blob/master/t/02-test_ocr.t shows its usage.

    • clanig
      over 2 years ago by clanig

      Created draft PR: https://github.com/os-autoinst/os-autoinst/pull/2276

    Similar Projects

    openQA log viewer by mpagot

    Description

    *** Warning: Are You at Risk for VOMIT? ***

    Do you find yourself staring at a screen, your eyes glazing over as thousands of lines of text scroll by? Do you feel a wave of text-based nausea when someone asks you to "just check the logs"?

    You may be suffering from VOMIT (Verbose Output Mental Irritation Toxicity).

    This dangerous, work-induced ailment is triggered by exposure to an overwhelming quantity of log data, especially from parallel systems. The human brain, not designed to mentally process 12 simultaneous autoinst-log.txt files, enters a state of toxic shock. It rejects the "Verbose Output," making it impossible to find the one critical error line buried in a 50,000-line sea of "INFO: doing a thing."

    Before you're forced to rm -rf /var/log in a fit of desperation, we present the digital antacid.

    No panic: The openQA Log Visualizer (Also known as the "VOMIT-B-Gone 9000")

    This is your web-based hazmat suit for handling toxic log environments. It bravely dives into the chaotic, multi-machine mess of your openQA test runs, finds all the related, verbose logs, and force-feeds them into a parser.

    Goals

    Work on the existing openqa-log-visualizer proof of concept and turn it into something usable.

    Resources

    openqa-log-visualizer


    Create a page with all devel:languages:perl packages and their versions by tinita

    Description

    Perl projects now live in git: https://src.opensuse.org/perl

    It would be useful to have an easy way to check which version of which Perl module is in devel:languages:perl. We also have meta overrides and patches for various modules, and it would be good to keep them in a central place so they are easier to look up and can be shared with other vendors.

    I did an initial data dump here a while ago: https://github.com/perlpunk/cpan-meta

    But I never had the time to automate this.

    I can also use the data to check if there are necessary updates (currently it uses data from download.opensuse.org, so there is some delay and it depends on building).

    Goals

    • Have a script that updates a central repository (e.g. https://src.opensuse.org/perl/_metadata) with metadata by looking at https://src.opensuse.org/perl/_ObsPrj (check if there are any changes from the last run)
    • Create an HTML page with the list of packages (use JavaScript and a table library to make it easily searchable)

    Resources