Curation of a transcribed historical newspaper corpus (WP6 use case 2)

Metadata

  • Status: Proposed
  • Type: Specific
  • Work Package: WP6
  • Research Coordinators: Nicoline van der Sijs
  • Coordinators for CLARIAH: Katrien Depuydt, Jesse de Does
  • Participating Institutes: INT, HuC
  • End-users: Historical linguists, other humanities researchers
  • Developers: INT team consisting of Mathieu Fannee, Henk van der Pol, Katrien Depuydt, Jesse de Does; HuC team led by Hennie Brugman
  • Interest Groups: Text
  • Task IDs: WP3/WP6 Workflow for Digitization and Conversion

Description

Develop a suitable workflow for WP6 use case 2: digitization of historical newspapers (corpus curation part).

What is the research about?

What problem is hindering the research?

  • The data have been transcribed and uploaded to a relational database, but the metadata and segmentation require extensive curation
  • The data have to be converted to a format suitable for corpus exploitation

What is needed to do the research?

  • Curation environment (based on Lex'it platform)

Data

  • KB 17th century newspaper corpus

Tools

  • Lex'it
  • Conversion from the relational database to a suitable XML format
  • Historical linguistic annotation tools
  • BlackLab
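
The database-to-XML conversion step listed above could look roughly like the following. This is only a minimal sketch under stated assumptions: the `articles` table, its columns, and the `corpus`/`article` element names are hypothetical, since neither the actual database schema nor the target XML format is specified here, and the sample row is made-up placeholder data.

```python
import sqlite3
import xml.etree.ElementTree as ET


def articles_to_xml(conn: sqlite3.Connection) -> ET.Element:
    """Build an XML tree from a hypothetical `articles` table.

    The table and element names are illustrative assumptions, not the
    schema or the XML format used by the project.
    """
    root = ET.Element("corpus")
    rows = conn.execute(
        "SELECT id, newspaper, issue_date, text FROM articles ORDER BY id"
    )
    for art_id, newspaper, issue_date, text in rows:
        # One <article> element per database row, metadata as child elements.
        article = ET.SubElement(root, "article", id=str(art_id))
        ET.SubElement(article, "newspaper").text = newspaper
        ET.SubElement(article, "date").text = issue_date
        ET.SubElement(article, "text").text = text
    return root


def demo() -> str:
    """Run the conversion on an in-memory database with placeholder data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles ("
        "id INTEGER PRIMARY KEY, newspaper TEXT, issue_date TEXT, text TEXT)"
    )
    # Made-up placeholder row, not real corpus content.
    conn.execute(
        "INSERT INTO articles VALUES (?, ?, ?, ?)",
        (1, "Example Courant", "1650-01-01", "Placeholder text"),
    )
    root = articles_to_xml(conn)
    conn.close()
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(demo())
```

In practice the target would more likely be an established format that the annotation tools and BlackLab can ingest, so the element names here would be replaced by that format's vocabulary.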

What software and services are involved?

(if known, what existing software and services are involved, which need to be developed? Please link to the tools if possible and specify whether it can be used as is, needs extra work, needs to be developed from scratch etc.)

How to evaluate this?

References

References to related resources and publications and especially links to related use-cases: