I was talking with a friend the other day who is blind. He briefly explained to me how he reads books (the regular, paper-printed ones): he takes a photo of each page, passes it to an OCR program to extract the text in digital form, then feeds that text to a text-to-speech engine that reads it out loud.
One of the problems he faces is that the OCR program he purchased doesn't handle accented Unicode letters very well (e.g. the "ή" in "Δημήτρης" would not be recognized most of the time). The other problem is how manual the whole process is. Nowadays there are open source OCR solutions that do a wonderful job with Unicode, and there are also APIs that can be used (e.g. https://cloud.google.com/vision/docs/ocr). The whole process can be automated to a large degree, too. Maybe there are tools out there that already do this, but they are probably not free.
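Part of why accented letters trip up older OCR and text-matching tools is that the same visible character can be encoded in more than one way. A minimal illustration using only Python's standard library (the specific strings here are just examples):

```python
import unicodedata

# "ή" can appear either precomposed (U+03AE) or as a base letter plus a
# combining accent (U+03B7 followed by U+0301). Both render identically,
# but a byte-for-byte comparison treats them as different strings.
precomposed = "\u03ae"        # ή as a single code point
decomposed = "\u03b7\u0301"   # η + combining acute accent

print(precomposed == decomposed)          # False: same glyph, different encoding
print(len(precomposed), len(decomposed))  # 1 vs. 2 code points

# Normalizing to NFC makes downstream matching and
# text-to-speech input consistent.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True

# A whole OCR'd word can be normalized the same way:
word = unicodedata.normalize("NFC", "Δημ\u03b7\u0301τρης")
print(word == "Δημήτρης")  # True
```

Normalizing whatever the OCR stage emits would be a cheap safeguard before feeding the text to later stages.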
I feel there is a motivation problem in developing software for visually impaired people. The people who care most about it (blind people) are usually not the ones who can write it, and the people who can write it (sighted people) don't need it themselves. This gap can be closed if the people who can produce such software do it for the people who can't.
Goal for this Hackweek
So here is (roughly) what I had in mind:
You have a Raspberry Pi with a camera attached to it and a hardware button connected to a GPIO pin. You put the book in front of the camera (maybe on a permanent stand). Every time the user presses the button, a program automatically:
- detects the book's page, aligns it, and crops away the non-paper part (optional; alternatively, the user aligns it properly once)
- runs some filters on the image to make it more readable
- passes the image through OCR (open source or some remote API)
- makes guesses and corrections on the text (optional, not sure if there are free tools for this)
- passes the text to a text-to-speech engine (or an API: https://cloud.google.com/text-to-speech)
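The core of the per-press loop could be sketched roughly like this. This is only a sketch under assumptions: it presumes the open-source tesseract CLI (with the Greek "ell" language data) and the espeak-ng speech synthesizer are installed, and it only hints at GPIO wiring and camera capture in a comment:

```python
import subprocess


def tesseract_cmd(image_path: str, lang: str = "ell") -> list:
    """Build a tesseract invocation that OCRs the image and prints
    the recognized text to stdout. "ell" is the traineddata name
    for modern Greek."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def espeak_cmd(text: str, voice: str = "el") -> list:
    """Build an espeak-ng invocation that speaks the text aloud."""
    return ["espeak-ng", "-v", voice, text]


def read_page_aloud(image_path: str) -> None:
    """One button press: OCR the captured page image, then speak it."""
    text = subprocess.run(
        tesseract_cmd(image_path), capture_output=True, text=True, check=True
    ).stdout
    subprocess.run(espeak_cmd(text), check=True)


# On the Raspberry Pi this would be wired to the hardware button, e.g.
# with the gpiozero library:
#     Button(17).when_pressed = lambda: read_page_aloud(capture())
# where capture() (hypothetical) grabs a frame from the camera
# (libcamera-still, picamera2, ...) and returns the image path.
```

Swapping either subprocess call for a remote API (Google Cloud Vision for OCR, Google Cloud Text-to-Speech for speech) would keep the same structure; only the two helper functions change.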
It could also support translating the text in the future (through some online API).
This is meant to be a proof of concept and should be doable in one week (Hackweek). One could easily see how a mobile application that does the same job might be easier to distribute and use, but it would also be more complicated to write. If the proof of concept proves useful and there is interest, this project could expand its goals.
This project is part of:
Hack Week 21