The Parsed Historical Corpus of the Welsh Language

The Parsed Historical Corpus of the Welsh Language (PARSHCWL) is a project to create an annotated corpus of Middle and Early Modern Welsh texts. In the course of the project, the texts will be made available on this website in various formats (plain text files, part-of-speech tagged files, and parsed files). Detailed annotation manuals and guidelines will also be made available here, so that any researcher working with (historical) Welsh texts can add morphosyntactic information to their own texts and contribute to a growing corpus of searchable historical Welsh materials.

If you're interested in using the corpus for your research, we have a lay introduction and, for researchers in similar areas, a more technical introduction to how the corpus was compiled and annotated.


New texts will be added to the Texts section of this website in the course of the project.

The first texts that will appear here are the Middle Welsh tales of the Mabinogi and the fourteenth-century texts from Llyfr yr Ancr with corrected annotations by Elena Parina and Raphael Sackmann.

If you use our corpus, please cite us.