3.1 Historical reference corpora


3.1.1 The Helsinki Corpus of English Texts (HC)

3.1.2 Penn Parsed Corpora of Historical English

3.1.3 A Representative Corpus of Historical English Registers (ARCHER)


The website http://icame.uib.no/hc/ offers a “Manual to the Diachronic Part of the Helsinki Corpus of English Texts”, written in 1996 by Merja Kytö of the University of Helsinki. The preface gives basic details about the project: who was involved in it, which period the corpus covers and what purposes it serves. The manual itself is primarily meant to explain the coding conventions to researchers working with the diachronic part of the HC. It also presents a compact overview of the structure of the corpus, with information on the periodization, the number of words in each period, the texts and genres included and their file names. In addition, the author explains how to use the HC with the concordancing programs “Oxford Concordance Program (OCP)” and “WordCruncher”.


The websites for the Penn Parsed Corpus of Modern British English (PPCMBE), the Penn-Helsinki Parsed Corpus of Middle English (PPCME2) and the Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME) are structured identically. Visitors are first presented with a form for ordering copies of the three corpora. A description of the particular corpus follows, giving information about the time span covered, periodization, genres and number of words. Like the guide to the HC, the website explains file names and general coding conventions. In addition, under the heading "Philological information", it lists the number of words and additional heading information, such as the author’s birthdate, for each text individually. A larger section then serves as a manual for the annotation, namely the POS tags and the syntactic parsing.


The website of the School of Languages, Linguistics and Cultures of the University of Manchester offers an overview of the ARCHER project. It provides general information about the consortium of participants and the different versions of ARCHER, and it describes the periodization of the corpus as well as file names, genres and word counts. The website also contains lists of publications on ARCHER and of recent publications using ARCHER, among them the present book chapter.
