Feature request: Use natural language processing to parse opening hours #531

Closed
RayBB opened this issue Jan 27, 2023 · 2 comments

RayBB commented Jan 27, 2023

Hello,

I'll keep this short and sweet.
Would you be open to using NLP (https://wiki.openstreetmap.org/wiki/Natural_language_processing) to try to parse opening hours and convert them to the correct format for OSM?
Something that's pretty much like this? https://www.webmapping.cyou/WebToOSMOH/

I think it would be really handy as I find entering hours (especially more complex ones) to be one of the tasks I dread most even though I think you've made the UI great.

If you're not considering supporting it, I might try to make a PR to the repo of the above example to add mobile support so I can use it on my phone. OSM-de/WebToOSMOH#12

I think the ideal workflow would be something like:

  1. I take a photo
  2. I copy/paste the text from the photo using OCR (or Every Door does it)
  3. Every Door parses the language into the format for OSM
  4. I verify/adjust the opening hours
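
To make step 3 concrete, here is a minimal rule-based sketch in Python. It is not how Every Door or WebToOSMOH works; the function names (`day_abbr`, `to_24h`, `parse_line`, `parse_hours`) and the regular expression are made up for illustration. It only handles simple English lines such as "Mon-Fri 9am-5pm" and gives up (returns `None`) on anything else.

```python
import re

# Full English day names; the OSM abbreviation is the first two letters.
FULL_DAYS = ["monday", "tuesday", "wednesday", "thursday",
             "friday", "saturday", "sunday"]

# One time such as "9", "9am", "09:30" or "5 pm" (three capture groups).
TIME = r"(\d{1,2})(?::(\d{2}))?\s*(am|pm)?"
# Hypothetical pattern: day or day range, then a time range.
RULE = re.compile(
    rf"([a-z]+)(?:\s*(?:-|to)\s*([a-z]+))?\s*:?\s*{TIME}\s*(?:-|to)\s*{TIME}",
    re.IGNORECASE,
)

def day_abbr(word):
    """Map 'Mon', 'monday', 'Thurs' ... to 'Mo', 'Mo', 'Th'; None if unknown."""
    w = word.lower()
    for name in FULL_DAYS:
        if len(w) >= 3 and name.startswith(w):
            return name[:2].capitalize()
    return None

def to_24h(hour, minute, meridiem):
    """Convert an hour/minute/am-pm triple to 'HH:MM'."""
    h = int(hour)
    if meridiem:
        h = h % 12 + (12 if meridiem.lower() == "pm" else 0)
    return f"{h:02d}:{minute or '00'}"

def parse_line(text):
    """Parse one line; return an opening_hours rule or None if not understood."""
    m = RULE.fullmatch(text.strip())
    if not m:
        return None
    d1, d2, h1, m1, ap1, h2, m2, ap2 = m.groups()
    days = [day_abbr(d) for d in (d1, d2) if d]
    if None in days:
        return None
    return f"{'-'.join(days)} {to_24h(h1, m1, ap1)}-{to_24h(h2, m2, ap2)}"

def parse_hours(text):
    """Parse several lines; refuse the whole input if any line is unclear."""
    rules = [parse_line(line) for line in text.splitlines() if line.strip()]
    if not rules or None in rules:
        return None
    return "; ".join(rules)

print(parse_hours("Mon-Fri 9am-5pm\nSaturday 10:00 to 14:00"))
```

Returning `None` instead of guessing keeps step 4 honest: anything the sketch does not understand goes back to manual entry.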

Cheers and thanks so much for your hard work on this app, I really love it :)

Contributor

mnalis commented Jan 27, 2023

Discussions in similar projects that you may want to read, about the scope of the problem: streetcomplete/StreetComplete#4222, streetcomplete/StreetComplete#1186, or bryceco/GoMap#227

In short, it is likely to be incredibly complex:

  • firstly you need to have very good OCR, capable of handling skewing, rotation, glass reflections, blurring, tons of different fonts, handwriting, etc. Preferably offline on a phone. We do not seem to be anywhere near that technologically. There are a lot of pictures online, so feel free to try for yourself how your favorite OCR handles them.
  • assuming that such OCR had at least a 95%+ success rate (although to be usable it would have to be at least 99.9% accurate, or you'll be spending more time verifying and fixing it than typing it in from scratch), you then need an AI which will be able to parse the formatting: how the table is laid out vertically / horizontally, which data fits which (often invisible!) columns, which text is unrelated, like a phone number, etc.
  • then when you have all the correct text (step 1) in the correct order (step 2), you have to have something which will parse and understand it (like the https://www.webmapping.cyou/WebToOSMOH/ which you mention, which has been shown (see thread linked) to be woefully inadequate, even in very simple cases, much less complex ones!). Kudos to the programmer, but it is an incredibly complex task in itself. There is literally one correct answer in millions of possible combinations.
  • then you have to program how to map that knowledge from the previous step to the opening_hours format (which might turn out to be impossible anyway, even in very simple cases like "even dates 08:00-15:00, odd dates 16:00-20:00")
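
To put rough numbers on the accuracy point above, here is a back-of-the-envelope sketch in Python. It rests on two assumptions that are mine, not measured: that the success rate is per character, and that errors are independent. The helper name `whole_text_accuracy` and the 40-character sign length are made up for illustration.

```python
def whole_text_accuracy(per_char_accuracy: float, n_chars: int) -> float:
    """Chance that every one of n_chars is read correctly,
    assuming independent per-character errors (a simplification)."""
    return per_char_accuracy ** n_chars

# A hypothetical 40-character opening hours sign.
SIGN_LENGTH = 40

for accuracy in (0.95, 0.999):
    chance = whole_text_accuracy(accuracy, SIGN_LENGTH)
    print(f"{accuracy:.1%} per character -> {chance:.0%} of signs fully correct")
```

Under those assumptions, 95% per-character accuracy leaves only about 13% of such signs error-free, while 99.9% gets to about 96%, which is why the verification step dominates the time spent.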

But, if someone wants to try doing it, by all means, go for it! It would be quite interesting to see the results and limitations, even if it turns out to be too unreliable for actually making inputting the data easier.

Owner

Zverik commented Jan 27, 2023

Thank you for the suggestion, and thanks Matija for the explanations. I'm against using NLP for the simple reason that it would be much, much slower than what we have now. Instead of pressing 6-10 buttons on screen, you would need to:

  1. Launch phone camera (which also drains the battery)
  2. Take a good, legible photo (hard, given that opening hours are often printed on transparent doors with dim indoor lighting, or in the sun, which is also bad)
  3. Feed the photo to the parser (which also would have trouble with multiple languages and so on, but let's say it works most of the time)
  4. Edit the result, because it will be imperfect most of the time (off-by-one errors, missing letters and such).

As we know from editing geometry in OSM, changing something that's there is much harder and takes more time than drawing something from scratch.

Each of these steps would take much more time than I would like. The goal is to spend at most 20 seconds on each POI, and toying with the parser would definitely take more than a minute just for opening hours.

Zverik closed this as not planned Jan 27, 2023