- Status: Proposed (NB: still has to be discussed with relevant researchers)
- Type: Specific
- Work Package: WP3
- Research Coordinators: Time in Translation group
- Coordinators for CLARIAH: Jesse de Does, Vincent Vandeghinste
- Participating Institutes: INT, UU
- End-users: Time in Translation group
- Developers: (Who is involved in implementing this use-case (if any)? Try to mention name, institute, role/responsibility)
- Interest Groups: (a list of CLARIAH interest groups, such as Text and DevOps, for which this use case may be relevant. See the list of IGs at: https://github.com/clariah/ig/.)
- Task IDs: WP3 search engine extensions: parallel corpora; treebanks
Progress in studying verbal tense and aspect semantics can be made by applying quantitative corpus methods in the field of semantic micro-typology, in particular by exploiting the possibilities of translation corpora.
Tense-aspect categories found across languages.
There is currently no flexible, open-source and user-friendly environment for exploring the corpus data.
We propose extensions to BlackLab, BlackLab Server and AutoSearch that support:
- parallel concordancing
- extraction of relevant statistics
- upload of parallel data created by researchers into AutoSearch
- exploitation of existing parallel corpora
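
To make the first two items concrete, here is a minimal sketch of what parallel concordancing and a simple tense-mapping statistic could look like. It does not use BlackLab or AutoSearch: the in-memory data, the tense labels and the function names `parallel_concordance` and `tense_mapping_counts` are all hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy data: sentence-aligned English/Dutch pairs with a
# hand-assigned tense label for the main verb on each side.
# These are illustrative examples, not taken from any real corpus.
ALIGNED = [
    {"en": "I have seen the film .", "nl": "Ik heb de film gezien .",
     "en_tense": "present perfect", "nl_tense": "perfect"},
    {"en": "I saw the film yesterday .", "nl": "Ik heb de film gisteren gezien .",
     "en_tense": "simple past", "nl_tense": "perfect"},
    {"en": "She has lived here for years .", "nl": "Ze woont hier al jaren .",
     "en_tense": "present perfect", "nl_tense": "present"},
]


def parallel_concordance(pairs, keyword, src="en", tgt="nl"):
    """Return (source sentence, aligned target sentence) for every pair
    whose source side contains `keyword` as a whitespace-separated token."""
    keyword = keyword.lower()
    return [
        (pair[src], pair[tgt])
        for pair in pairs
        if keyword in pair[src].lower().split()
    ]


def tense_mapping_counts(pairs, src="en", tgt="nl"):
    """Count how often each source tense label is aligned with each
    target tense label."""
    return Counter((pair[f"{src}_tense"], pair[f"{tgt}_tense"]) for pair in pairs)


if __name__ == "__main__":
    for source, target in parallel_concordance(ALIGNED, "seen"):
        print(f"{source}  |||  {target}")
    for (src_tense, tgt_tense), n in tense_mapping_counts(ALIGNED).items():
        print(f"{src_tense} -> {tgt_tense}: {n}")
```

In the proposed setting the hits would come from a corpus query over indexed, aligned data rather than from a Python list, and the tense labels would come from annotation rather than being assigned by hand.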
Parallel corpora enriched with Universal Dependencies (UD) annotation (tagging, lemmatization, dependency syntax):
- created by researchers
- existing corpora (OPUS, etc)
- Extended version of BlackLab/AutoSearch
- Visualization and analysis tools developed by the Time in Translation group
- Is the researcher satisfied?
References to related resources and publications and especially links to related use-cases: