- Status: Proposed
- Type: Specific
- Work Package: WP6
- Research Coordinators: Nicoline van der Sijs
- Coordinators for CLARIAH: Katrien Depuydt, Jesse de Does
- Participating Institutes: INT, HuC
- End-users: Historical linguists, other humanities researchers
- Developers: INT team consisting of Mathieu Fannee, Henk van der Pol, Katrien Depuydt, Jesse de Does; HuC team led by Hennie Brugman
- Interest Groups: Text
- Task IDs: WP3/WP6 Workflow for Digitization and Conversion
Develop a suitable workflow for WP6 use case 2: digitization of historical newspapers (corpus curation part)
- Data have been transcribed and uploaded to a relational database, but the metadata and segmentation require extensive curation
- Data have to be converted to a format suitable for corpus exploitation
- Curation environment (based on Lex'it platform)
- KB 17th century newspaper corpus
- Lex'it
- Conversion from the database to a suitable XML format
- Historical linguistic annotation tools
- BlackLab
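
To make the database-to-XML conversion step more concrete, the following is a minimal sketch only, not the project's actual converter. It assumes a hypothetical `articles` table with columns `id`, `newspaper`, `issue_date` and `body`, and the XML element names (`corpus`, `article`, `metadata`, `text`) are placeholders rather than the target format, which is not specified here. An in-memory SQLite database stands in for the real relational database.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn):
    """Serialize every row of the hypothetical `articles` table as XML."""
    root = ET.Element("corpus")  # placeholder root element
    cursor = conn.execute(
        "SELECT id, newspaper, issue_date, body FROM articles ORDER BY id"
    )
    for art_id, newspaper, issue_date, body in cursor:
        # One <article> per row, with metadata kept apart from the text.
        article = ET.SubElement(root, "article", id=str(art_id))
        meta = ET.SubElement(article, "metadata")
        ET.SubElement(meta, "newspaper").text = newspaper
        ET.SubElement(meta, "date").text = issue_date
        ET.SubElement(article, "text").text = body
    # ElementTree escapes characters such as '&' when serializing.
    return ET.tostring(root, encoding="unicode")


def demo():
    """Build a one-row stand-in database and convert it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles ("
        "id INTEGER PRIMARY KEY, newspaper TEXT, issue_date TEXT, body TEXT)"
    )
    # Invented placeholder row, not real corpus data.
    conn.execute(
        "INSERT INTO articles VALUES (?, ?, ?, ?)",
        (1, "Example Courant", "1650-01-01", "Placeholder text & more."),
    )
    return rows_to_xml(conn)


if __name__ == "__main__":
    print(demo())
```

Keeping metadata in its own element is a design choice made for this sketch: it would let curated metadata be regenerated from the database without touching the transcribed text.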
References to related resources and publications and especially links to related use-cases: