Project Description

Currently, openQA requires a stored reference image for OCR-based comparisons. It is not possible to pass openQA a character string that should be compared to the text in a screenshot. This project aims to allow storing plain character strings in the needle's JSON file, so that OCR needles no longer need reference images.
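To illustrate the idea, here is a minimal Python sketch of what a text-only OCR needle could look like and how it might be read. This is only an illustration under assumptions: the field name "refstr", the tag and the coordinates are hypothetical and not taken from the project, and the actual implementation would live in os-autoinst's Perl code.

```python
import json

# Hypothetical needle definition: the OCR area carries its expected text
# in a "refstr" field (an assumed name) instead of relying on a PNG.
NEEDLE_JSON = """
{
  "tags": ["bootloader-menu"],
  "area": [
    {"xpos": 100, "ypos": 200, "width": 300, "height": 40,
     "type": "ocr", "refstr": "Installation"}
  ]
}
"""


def ocr_reference_strings(needle_text):
    """Return the expected strings of all text-only OCR areas in a needle."""
    needle = json.loads(needle_text)
    return [
        area["refstr"]
        for area in needle.get("area", [])
        if area.get("type") == "ocr" and "refstr" in area
    ]


def needs_reference_image(needle_text):
    """True if at least one area is not a text-only OCR area."""
    needle = json.loads(needle_text)
    return any(
        area.get("type") != "ocr" or "refstr" not in area
        for area in needle.get("area", [])
    )


if __name__ == "__main__":
    print(ocr_reference_strings(NEEDLE_JSON))
    print(needs_reference_image(NEEDLE_JSON))
```

With such a layout, a needle consisting only of OCR areas could be stored without any image file, while needles that mix OCR areas with regular match areas would still need one.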

Status

Possible tools have been researched. The result is that the current implementation based on Tesseract appears to be too inaccurate on short character strings. The program GOCR seems to do a more classical recognition by shape, which seems to work reasonably accurately on well-shaped characters. The accuracy of a match could be calculated using the library perl-Text-Levenshtein, i.e. from the edit distance between the expected string and the recognized string.
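The Levenshtein-based accuracy idea can be sketched as follows. This is a plain Python illustration rather than the Perl library itself; the normalization by the longer string's length and the threshold of 0.9 are assumptions made for the example, not values from the project.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions and substitutions
    needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(expected, recognized):
    """Similarity in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(expected), len(recognized))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(expected, recognized) / longest


def text_matches(expected, recognized, threshold=0.9):
    """Assumed matching rule: accept if the similarity reaches the threshold."""
    return similarity(expected, recognized) >= threshold


if __name__ == "__main__":
    # A typical OCR confusion: the letter "l" read as the digit "1".
    print(similarity("Installation", "Instal1ation"))
    print(text_matches("Installation", "Instal1ation"))
```

A normalized score like this tolerates single misread characters in longer strings, while very short strings fail quickly, which fits the observation that short strings are the hard case.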

Goal for this Hackweek

  • Create a draft implementation of OCR in os-autoinst.
  • Optional: Make text-based OCR needles easy to handle in the openQA web frontend (e.g. by providing a live preview of the recognized text).

Resources

  • This project is tracked here: https://progress.opensuse.org/issues/121354
  • openQA frontend repo: https://github.com/os-autoinst/openQA
  • openQA backend repo: https://github.com/os-autoinst/os-autoinst
  • GOCR: https://wasd.urz.uni-magdeburg.de/jschulen/ocr/
  • Perl-Text-Levenshtein: https://github.com/neilb/Text-Levenshtein

Looking for hackers with the skills:

openqa mojolicious perl ocr os-autoinst

This project is part of:

Hack Week 22

Activity

  • over 1 year ago: clanig started this project.
  • over 1 year ago: clanig originated this project.

Comments

    • okurz
      over 1 year ago by okurz

      There is very basic support for OCR in os-autoinst in https://github.com/os-autoinst/os-autoinst/blob/master/ocr.pm, which might give you some good ideas and a starting point. https://github.com/os-autoinst/os-autoinst/blob/master/t/02-test_ocr.t shows its usage.

    • clanig
      over 1 year ago by clanig

      Created draft PR: https://github.com/os-autoinst/os-autoinst/pull/2276
