Note that using this online version may require consulting the official BNC website (http://www.natcorp.ox.ac.uk) for additional information, since not all the relevant instructions are available on the BYU website. The BYU interface also offers some combined automatic options that may appear confusing.
CREATING A POS-TAG QUERY
Exercise 1: Create a POS-tag-based query for the split infinitive with one adverb between the infinitive marker to and the bare infinitive of a verb (to+adverb+infinitive).
- Go to http://corpus.byu.edu/bnc/
- In a new tab, open the query syntax guidelines from the introduction section displayed on the right-hand side of the screen. There are two options for making a query by using a combination of POS-tags:
Variant 1.
Select tags from a drop-down list (POS LIST) in the appropriate order. Note, however, that this list contains simplified tags only and does not provide the same indications as the standard BNC C5 POS-tag list. For example, there is no POS-tag for “infinitive particle 'to'”, though it is included in the BNC tagset.
Type in “to” as a simple word and then choose the tags “adv.ALL” and “verb.INF”; the following query will automatically be reflected in the search window: “to [av*] [v?i*]”.
Variant 2.
Click on the question mark next to POS LIST and read the instructions on how to enter POS-tags in the query window (POS-tags must be enclosed in square brackets and separated by a blank space).
Go to “Click here for a list of these part of speech tags” and select the appropriate POS-tags. Arrange the tags in the word order of the search structure and enter them into the query window. Your query will look like this: “[TO0] [AV0] [VVI]”.
- Choose the KWIC mode from the Display menu. It will present the results as a key-word-in-context concordance. (The List option shows the frequency breakdown: a listing of all individual strings that match the query, with a frequency indication for each string. The Chart mode shows the distribution of the target construction across the respective corpus sections. The Compare mode allows results to be compared.)
- Click Search
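What the tag sequence “[TO0] [AV0] [VVI]” selects can be illustrated offline with a minimal sketch. It assumes text in the word_TAG notation used later in this tutorial; the sample sentence, its tags and the function name find_sequence are hypothetical and serve only as an illustration, not as a description of how the BYU-BNC engine works.

```python
# Hypothetical sample in word_TAG form; the tags are illustrative, not taken from the corpus.
tagged = "She_PNP decided_VVD to_TO0 boldly_AV0 go_VVI and_CJC to_TO0 leave_VVI"

def find_sequence(tagged_text, tags):
    """Return the word sequences whose tags match `tags` exactly and in order."""
    tokens = [tok.rsplit("_", 1) for tok in tagged_text.split()]
    hits = []
    for i in range(len(tokens) - len(tags) + 1):
        window = tokens[i:i + len(tags)]
        if all(tag == wanted for (_, tag), wanted in zip(window, tags)):
            hits.append(" ".join(word for word, _ in window))
    return hits

# Equivalent of the query "[TO0] [AV0] [VVI]" on the sample sentence.
print(find_sequence(tagged, ["TO0", "AV0", "VVI"]))  # ['to boldly go']
```

The unsplit infinitive “to leave” is not returned, because no adverb stands between the two tags.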
Exercise 2: Think of other modifiers that can appear between “to” and the infinitive. Create queries to check if your assumptions are relevant.
You may check the following options: split infinitive with ordinal numeral (ORD), wh-adverb (AVQ), negative particle not (XX0).
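The query strings for these options can be generated rather than typed by hand. The sketch below assumes the “[TO0] [TAG] [VVI]” pattern from Exercise 1; the helper name split_infinitive_query is hypothetical.

```python
# Modifier tags suggested in the exercise (BNC C5 tagset).
modifier_tags = ["ORD", "AVQ", "XX0"]

def split_infinitive_query(modifier_tag):
    """Build a query of the form '[TO0] [TAG] [VVI]'."""
    return " ".join(f"[{tag}]" for tag in ("TO0", modifier_tag, "VVI"))

for tag in modifier_tags:
    print(split_infinitive_query(tag))
```

Each printed line can then be pasted into the query window and run separately.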
Exercise 3: Make a query for the split infinitive with the option of either an adverb or the negative particle not placed in between the infinitive marker to and the bare infinitive of a verb.
The query syntax does not appear to provide a way of searching for options such as “either of two POS-tags” or “a POS-tag or a word”. All the query variants tried, based on the BNC C5 tagset and the common BNC syntax, either produced an error or retrieved only one of the options.
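One possible workaround is to run the two queries separately and add up the frequencies. Where tagged text is available offline, the “either tag” search can also be sketched directly. The example below is a minimal illustration under that assumption: the sample string, its tags and the function name find_with_alternatives are hypothetical, and nothing here is a feature of the BYU-BNC interface.

```python
# Hypothetical tagged sample in word_TAG form (tags chosen for illustration only).
tagged = ("to_TO0 not_XX0 go_VVI or_CJC to_TO0 really_AV0 try_VVI "
          "or_CJC to_TO0 stay_VVI")

def find_with_alternatives(tagged_text, slots):
    """Each slot is a set of acceptable tags; return the matching word sequences."""
    tokens = [tok.rsplit("_", 1) for tok in tagged_text.split()]
    hits = []
    for i in range(len(tokens) - len(slots) + 1):
        window = tokens[i:i + len(slots)]
        if all(tag in allowed for (_, tag), allowed in zip(window, slots)):
            hits.append(" ".join(word for word, _ in window))
    return hits

# The middle slot accepts either an adverb (AV0) or the negative particle (XX0).
hits = find_with_alternatives(tagged, [{"TO0"}, {"AV0", "XX0"}, {"VVI"}])
print(hits)  # ['to not go', 'to really try']
```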
INVESTIGATING POS-TAGGING ERRORS
Exercise 4: Make a POS-tag-query for “to more than double”.
- Using the link to access a list of POS tags, try to define POS-tags for all the elements of the phrase. For example, how can more be tagged? As a general adverb (AV0) or as another part of speech?
- Using the POS LIST and instructions on the query syntax, create a query for a combination of words and POS-tags: type “to” as a word, followed by “more” as, for example, a “general adverb” (“AV0”), followed by “than” as a “subordinating conjunction” (“CJS”) and “double” as “the infinitive form of a lexical verb” (“VVI”). In order to search for a word as a certain part of speech, it should be presented as “[word].[POStag]”. The query should look like this: “to [more].[av*] [than].[cj*] [double].[vv*]”
- Run the query. Does it retrieve the construction needed? Unfortunately, the BYU-BNC either does not allow checking how a word is tagged in the corpus, or this function is not easily found. The phrase “to more than double” is tagged in the BNC in the following way: “to_PRP more_DT0 than_CJS double_AV0”. According to the query syntax used in the BYU-BNC, the correct query to retrieve the phrase should be: “to [more].[d*] [than].[cj*] [double].[av*]”.
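Why the first query fails and the second succeeds can be checked with a small sketch that models the wildcard tag matching. It is a simplified model of the “[word].[POStag]” syntax, not the interface's actual implementation: it handles only bare words and “[word].[tag]” items, and the function names parse_query and matches are hypothetical.

```python
from fnmatch import fnmatchcase

# Tagging of the phrase as reported in the text above.
phrase = [("to", "PRP"), ("more", "DT0"), ("than", "CJS"), ("double", "AV0")]

def parse_query(query):
    """Split a query such as 'to [more].[d*]' into (word, tag_pattern) pairs."""
    parsed = []
    for item in query.split():
        if "].[" in item:
            word, tag = item[1:-1].split("].[")
        else:
            word, tag = item, "*"  # bare word: any tag is accepted
        parsed.append((word, tag.upper()))
    return parsed

def matches(query, tokens):
    """True if every query item matches the token in the same position."""
    parsed = parse_query(query)
    return len(parsed) == len(tokens) and all(
        word == tok_word and fnmatchcase(tok_tag, tag)
        for (word, tag), (tok_word, tok_tag) in zip(parsed, tokens)
    )

print(matches("to [more].[av*] [than].[cj*] [double].[vv*]", phrase))  # False
print(matches("to [more].[d*] [than].[cj*] [double].[av*]", phrase))   # True
```

The first query fails at “more”, because DT0 does not match the pattern av*.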
RESTRICTING THE QUERY TO SELECTED SECTIONS OF THE BNC
Exercise 5: Compare the frequency of the split infinitive in the written and spoken parts of the corpus by means of separate queries.
- Create the query for the split infinitive.
- Variant 1. Choose the Chart mode from the Display menu – this option will show the distribution and normalized frequency of the retrieved construction for the respective sections. While the Spoken section is presented as an undivided single category, the Written section is presented as consisting of the following sub-categories: “fiction”, “magazine”, “newspaper”, “non-acad”, “academic” and “misc”.
- Variant 2. In the Sections menu, click Show and choose the restriction in the left column: the drop-down menu offers one section for “spoken” texts and several subsections for “written” texts, such as academic, fiction, etc. Choose “Spoken” in the left column, and all the other sections, namely “fiction”, “magazine”, “newspaper”, “non-acad”, “academic” and “misc”, in the right column. Select the List display mode to get the frequency distribution.
- Run your query.
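Raw counts from sections of different sizes are only comparable once they are normalized. The sketch below shows the usual per-million-words calculation. All numbers in it are placeholders, not corpus results, and must be replaced with the counts and section sizes reported by the interface; the function name per_million is hypothetical.

```python
def per_million(hits, section_size):
    """Normalized frequency: occurrences per one million words."""
    return hits * 1_000_000 / section_size

# Placeholder values only; substitute the figures shown by the interface.
sections = {
    "spoken": {"hits": 500, "size": 10_000_000},
    "written": {"hits": 1800, "size": 90_000_000},
}

for name, data in sections.items():
    print(name, per_million(data["hits"], data["size"]))
```

With these placeholder figures the written section has more hits in absolute terms but a lower normalized frequency, which is why the Chart mode reports both values.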
Exercise 6: Compare the frequency of the split infinitive in academic writing and in the spoken demographic sections.
- Create the query for the split infinitive.
- Choose the Chart mode from the display menu – this will show the distribution and normalized frequency of the retrieved construction for the respective sections.
- Variant 1. Explore the sub-categories in the Spoken section: since there is no category entitled “spoken demographic”, the “conversation” sub-section can be chosen, if appropriate.
- Variant 2. In the Sections menu, click Show and choose the restriction “academic” in the left column to search for split infinitives in academic writing. Since no category entitled “spoken demographic” is available, “spoken conversation” can be chosen instead, if appropriate, as the closest match according to the basic criteria of BNC sampling.
- Run your query.