Fixed bugs
- hocr-split: Duplicate content in `<html>` #58
- hocr-pdf: `ocr_line` does not have to be a `span` (e.g. a `div` is also possible) #57
- hocr-check: Fix containment checks and metadata checks, add tests #52 #61 #62
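
The idea behind the hocr-pdf fix can be illustrated with a minimal sketch: select line elements by their `ocr_line` class rather than by tag name. This is not the code used in hocr-pdf. The sample markup and the helper `find_lines` are hypothetical, and the standard-library XML parser is an assumption made to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical hOCR fragment: one line is a <span>, the other a <div>.
SAMPLE = """<div class='ocr_page'>
  <span class='ocr_line' title='bbox 10 10 200 40'>first line</span>
  <div class='ocr_line' title='bbox 10 50 200 80'>second line</div>
  <p class='ocr_par'>not a line</p>
</div>"""


def find_lines(root):
    """Return every element whose class list contains ocr_line, whatever its tag."""
    return [el for el in root.iter()
            if "ocr_line" in el.get("class", "").split()]


root = ET.fromstring(SAMPLE)
lines = find_lines(root)
tags = [el.tag for el in lines]
print(tags)
```

Matching on the class attribute means both the `span` and the `div` line are found, while a lookup restricted to `span` elements would miss the second one.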
Ongoing work
- Check handling of non-ASCII characters in hOCR files #53
- Make hocr-tools fit for Python 3 #37
See details: v1.0.0...v1.0.1