WP6 use case VOC

Metadata

Status: In Progress
Type: Specific
Work Package: WP6
Research Coordinators: Lodewijk Petram
Coordinators for CLARIAH: Katrien Depuydt, Jesse de Does
Participating Institutes: HuC, VU, KB, INT, DANS
End-users: Linguists, historians
Developers: Sophie Arnoult, Dirk Roorda, Jesse de Does, ....
Interest Groups: Text
Task IDs: WP3/6 task Infrastructure for Historical Dutch

Description

In the CLARIAH+ WP6 meeting on Tuesday 16 April 2019, it was decided to further develop an idea for a joint use case. The aim of this use case is to align the tools and methods of the different partners in WP6. The VOC was chosen as the topic, on the one hand because of the rich and versatile source material available about this company and its activities, and on the other hand because of the challenging and relevant historical research questions that can be answered with this use case.

What is the research about?

How can CLARIAH text processing tools contribute to historical research questions such as:

What shifts took place in the VOC's presence in the East Indies and in the Company's interaction with local rulers and their subjects (1600-1800)?
How did the networks of VOC employees in Asia develop?
How did the way official VOC documents described the local East Indies population, and the interaction between VOC employees and the local population, develop?
How did the way secondary literature described the VOC's presence in the East Indies, the local East Indies population, and the interaction between VOC employees and the local population develop?
How did the way newspapers, popular magazines and pamphlets described the VOC's presence in the East Indies, the local East Indies population, and the interaction between VOC personnel and the local population develop?

What problem is hindering the research?

Suitable tools for historical text processing were lacking, among others for:

Named entity recognition and resolution
Basic linguistic annotation (lemma, PoS)

What is needed to do the research?

Train better named entity recognition (NER) models
Apply state-of-the-art tools for historical enrichment, mediated by the WP3/6 task Infrastructure for Historical Dutch, which may include tools like PIE and Deepfrog
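
Whether a retrained NER model is "better" has to be measured somehow. A minimal sketch of one common option, entity-level precision, recall and F1 against a hand-annotated gold standard, is given here. The `(start, end, label)` entity representation, the function name `score_entities` and the sample data are assumptions made for illustration, not part of the project's tooling.

```python
from typing import Dict, Set, Tuple

# Assumed representation: (start token index, end token index, entity label).
Entity = Tuple[int, int, str]


def score_entities(gold: Set[Entity], predicted: Set[Entity]) -> Dict[str, float]:
    """Score exact-match entities: both the span and the label must agree."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical annotations for a single document.
gold = {(0, 2, "PER"), (5, 6, "LOC"), (9, 10, "ORG")}
predicted = {(0, 2, "PER"), (5, 6, "ORG"), (9, 10, "ORG"), (12, 13, "LOC")}

scores = score_entities(gold, predicted)
print(scores)
```

Exact matching is strict: a prediction with the right span but the wrong label counts as both a false positive and a false negative. For noisy historical text, a more lenient partial-match scheme may be worth considering.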

Data

Among others:

Generale Missiven (TEI data converted from ABBYY XML, also available in NAF and Text-Fabric representations)
Pieter van Dam, Beschryvinge van de Oostindische Compagnie
Dagh-register gehouden int Casteel Batavia vant passerende daer ter plaetse als over geheel Nederlandts-India (1624-1682, published 1887-1931)
De dagregisters van het kasteel Zeelandia, Taiwan (1629-1662, published 1986-2000)
Several relevant books, newspaper articles and periodicals
A range of available, relevant structured data sources

Tools

Currently deployed:

Conversion pipeline ABBYY XML --> TEI --> NAF/XMI (Sophie Arnoult and Jesse de Does)
Named entity recognition (developed by Sophie Arnoult)
Text-Fabric analysis tools
Others to be determined by lead researchers
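
To illustrate the kind of input that downstream tools receive from the TEI stage of the pipeline, here is a minimal sketch that pulls plain paragraph text out of a TEI document using only the Python standard library. The sample document and the function name `extract_paragraphs` are hypothetical. The actual TEI files are likely to have a richer structure (notes, page breaks, tables) that this sketch does not handle.

```python
import xml.etree.ElementTree as ET
from typing import List

# The standard TEI P5 namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample document, for illustration only.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div>
        <head>Sample letter</head>
        <p>First   paragraph
           of text.</p>
        <p>Second paragraph.</p>
      </div>
    </body>
  </text>
</TEI>"""


def extract_paragraphs(tei_xml: str) -> List[str]:
    """Return the text of every <p> element with whitespace normalised."""
    root = ET.fromstring(tei_xml)
    paragraphs = []
    for p in root.iterfind(".//tei:p", TEI_NS):
        # itertext() also collects text nested in child elements.
        text = " ".join("".join(p.itertext()).split())
        if text:
            paragraphs.append(text)
    return paragraphs


paragraphs = extract_paragraphs(SAMPLE_TEI)
for paragraph in paragraphs:
    print(paragraph)
```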

What software and services are involved?

(if known, what existing software and services are involved, which need to be developed? Please link to the tools if possible and specify whether it can be used as is, needs extra work, needs to be developed from scratch etc.)

How to evaluate this?

Evaluation by researcher.

References

References to related resources and publications and especially links to related use-cases:

CLARIAH
