Hello, and first of all thank you very much for your work on hOCR! I'm part of an organization that gets hOCR from Google Books, and I'm quite new to the specification. Something that caught my eye is the reference to ISO 639-1 for language codes. Since ISO 639-1 doesn't cover all languages, I think referring to BCP 47 would be more generic and future-proof. What do you think? It's a backward-compatible change, since ISO 639-1 codes are valid BCP 47 tags (at least to a first approximation).
I don't feel strongly either way, but it might be a good opportunity to align with how ALTO and PAGE handle language/script.
In ALTO we decided on using what xsd:language expects, i.e. RFC 1766, which in turn references ISO 639-1. IIUC this might not be expressive enough for your purposes?
My understanding is that the latest XSD spec requires BCP 47 language tags, while the 1.0 spec does indeed refer to RFC 1766. I can't think of a reason to recommend RFC 1766 over BCP 47, but perhaps there is one?
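To illustrate the backward-compatibility point, here is a minimal sketch of a well-formedness check. It assumes a simplified subset of the BCP 47 grammar (language, optional script, optional region, optional variants) and deliberately leaves out extlang, extension, private-use, and grandfathered tags. The names `LANGTAG` and `is_simple_langtag` are made up for this example, and the check does not consult the IANA subtag registry, so it only tests the shape of a tag, not whether its subtags are registered.

```python
import re

# Simplified subset of the BCP 47 "langtag" grammar (assumption: extlang,
# extensions, private-use and grandfathered tags are intentionally omitted).
LANGTAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"                   # 2-letter or 3-letter primary language
    r"(?:-(?P<script>[A-Za-z]{4}))?"                 # optional 4-letter script, e.g. Hant
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"        # optional region, e.g. CH or 419
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)"  # optional variants
)


def is_simple_langtag(tag: str) -> bool:
    """Return True if `tag` has the shape of a simple BCP 47 language tag."""
    return LANGTAG.fullmatch(tag) is not None


if __name__ == "__main__":
    # Two-letter codes (the ISO 639-1 shape) pass, and so do tags that
    # two-letter codes alone cannot express.
    for tag in ["en", "de", "grc", "zh-Hant", "de-CH-1901", "english", "en_US"]:
        print(tag, is_simple_langtag(tag))
```

Any two-letter code passes this check unchanged, while tags such as `grc` or `zh-Hant` show what a BCP 47 reference would additionally allow.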