For links to official websites of the corpora and software quoted in the chapters see Catalogue of corpora and software.
FREE ONLINE CORPORA
- The American National Corpus (ANC) project is currently creating a collection of American English from 1990 onwards. The goal is to collect a corpus comparable to the BNC. There is an "open" portion of the full ANC consisting of approximately 15 million words, which is freely available for download.
http://www.americannationalcorpus.org/
- The British Academic Spoken English (BASE) corpus consists of 160 lectures and 39 seminars recorded in a variety of university departments.
http://www.coventry.ac.uk/researchnet/d/503
- The British Academic Written English (BAWE) corpus contains just under 3000 good-standard student assignments (6,506,995 words).
http://www.coventry.ac.uk/researchnet/d/911
- The Scottish Corpus of Texts and Speech (SCOTS) contains 4 million words of text in Scottish English.
http://www.scottishcorpus.ac.uk/
- The Collins WordbanksOnline English Corpus sampler is composed of 56 million words of contemporary written and spoken text.
http://www.collins.co.uk/Corpus/CorpusSearch.aspx
- The Michigan Corpus of Academic Spoken English (MICASE) is a collection of nearly 1.8 million words of transcribed speech available online
http://micase.elicorpora.info/
- A collection of online corpora from Brigham Young University offers seven corpora, among them the Corpus of Contemporary American English (COCA), the Corpus of Historical American English (COHA), BYU-BNC: British National Corpus, the TIME Corpus of American English, the Corpus del Español and the Corpus do Português.
http://corpus.byu.edu/
- The Leeds corpora website offers a collection of English corpora (BNC, BROWN corpus, Reuters, British News) as well as a free search tool with concordance and collocation options
http://corpus.leeds.ac.uk/protected/
ORGANIZATIONS
- International Computer Archive of Modern and Medieval English (ICAME) is an international organization of linguists and information scientists working with English machine-readable texts. The organization’s website contains the ICAME collection of corpora, a catalogue of manuals, the ICAME journal and a small sample of the Brown corpus.
http://icame.uib.no/
ICAME Corpus Manuals: http://129.177.24.52/icame/manuals/
Introduction to BNCweb
http://corpora.lancs.ac.uk/BNCweb/home.html#features
- University Centre for Computer Corpus Research on Language (UCREL)
http://ucrel.lancs.ac.uk/
http://ucrel.lancs.ac.uk/claws/ - CLAWS part-of-speech tagger for English
http://ucrel.lancs.ac.uk/annotation.html#POS - introduction to POS-tagging
http://ucrel.lancs.ac.uk/corpora.html - a list of corpora
PERSONAL AND OTHER WEBSITES
- The Corpus Resource Database (CoRD) is an open-access online resource at The Research Unit for Variation, Contacts and Change in English (VARIENG) in Finland. It collects basic information about corpora provided by the compilers and thorough descriptions of 28 corpora, and provides the user with a Corpus Finder tool.
http://www.helsinki.fi/varieng/CoRD/index.html
- The Centre for English Corpus Linguistics at the Université catholique de Louvain (Belgium) provides useful references on corpora and corpus tools, together with a bibliography.
http://juppiter.fltr.ucl.ac.be/FLTR/GERM/ETAN/CECL/references.html
- Corpus Linguistics by Tony McEnery and Andrew Wilson - this website was created as a companion web page to the authors' book “Corpus Linguistics” (see Further reading) and supplements the first chapters of the book. It is easy to read and very useful with simple navigation and much valuable information.
http://www.lancs.ac.uk/fss/courses/ling/corpus/
- The website athel.com provides ample information on corpora, corpus software, and additional resources.
http://www.athel.com/corpus.html
- Corpora4Learning.net is a website which provides links and references for the use of corpora, corpus linguistics and corpus analysis in the context of language learning and teaching, but the collections of English corpora, software and bibliography can be used by everyone who is interested in English corpus linguistics.
http://www.corpora4learning.net/
A website intended to become an annotated guide to relevant resources available online. It contains sections on corpora, software and a bibliography.
- Collection of Corpus and Concordance tools
http://courses.washington.edu/englhtml/engl560/corplingresources.htm#tools
- Bookmarks for corpus-based linguists:
catalogue of free concordancers, search engines, text-analysis tools
http://personal.cityu.edu.hk/~davidlee/devotedtocorpora/software.htm
- An introduction to the BNC and COCA corpora with exercises
http://kielikompassi.jyu.fi/kookit06/corpus/view/bncbasics1.html
- Numerous links to corpus linguistics and related websites: corpora, software, magazines, tutorials, glossaries, bibliographies, etc.
http://www.staff.amu.edu.pl/~przemka/corplink.html