CREATING A POS-TAG QUERY
Exercise 1: Create a POS-tag-based query for the split infinitive with one adverb between the infinitive marker to and the bare infinitive of a verb (to+adverb+infinitive).
- Open Query Builder – the option used for queries consisting of a combination of POS-tags and/or words. You can do this by either clicking on the Query Builder icon on the top menu bar or by selecting New Query → Query Builder from the File Menu.
- The right part of the query builder marked in red is the area where a new query is created. Each node contains one query element; the vertical order represents the sequence of elements in the query; nodes placed side by side horizontally represent optional (alternative) elements for the same position.
- Click on the empty node, go to Edit → Addkey and enter the Additional key query.
- Under Key, choose c5 for the full set of POS-tags (pos stands for the simplified word-class annotation, which is not suitable for our query).
- Tick the box in front of Any and click the Refresh button to see the whole list of POS-tags.
- Find the POS-tag for “infinitive marker to” (TO0) in the list. You may also look up relevant POS-tags in advance in the BNC tagset at http://www.natcorp.ox.ac.uk/docs/c5spec.html.
- Click the OK button to enter the POS-tag in the query.
- Click on the vertical line below the first node to enter the second node for the next element (for our query it is an adverb). Repeat the procedure of selecting and entering a POS-tag for “general adverb” (AV0).
- Add the third vertical node and enter the POS-tag for “infinitive form of lexical verb” (VVI)
- In order to search for these three POS-tags placed one after another, define the type of link between the query elements: click on the arrow linking the first two nodes, select Link Type and then Next. Repeat this procedure for the link between the second and the third node.
- The query is now ready. Click OK to run the query.
The first result window, entitled Too many solutions, shows the overall number of retrieved hits (4115) and offers the choice of downloading the first 100 hits, 100 random hits, or all the retrieved instances.
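The logic of this query (three POS-tags linked by Next, i.e. immediately adjacent) can be illustrated outside the Query Builder. The following is a minimal Python sketch, not part of the corpus software: the function name find_tag_sequence and the toy sentence with its tags are assumptions made up for illustration.

```python
# Toy tagged text as (word, C5 tag) pairs. The sentence and its tags are
# invented for illustration and are not copied from the BNC.
tokens = [
    ("she", "PNP"), ("wanted", "VVD"),
    ("to", "TO0"), ("quickly", "AV0"), ("leave", "VVI"),
    ("and", "CJC"),
    ("to", "TO0"), ("rest", "VVI"),
]


def find_tag_sequence(tokens, pattern):
    """Return every run of adjacent tokens whose tags equal `pattern`."""
    size = len(pattern)
    hits = []
    for start in range(len(tokens) - size + 1):
        window = tokens[start:start + size]
        if [tag for _, tag in window] == list(pattern):
            hits.append(" ".join(word for word, _ in window))
    return hits


# to + general adverb + infinitive of a lexical verb
split_infinitives = find_tag_sequence(tokens, ["TO0", "AV0", "VVI"])
print(split_infinitives)
```

Only "to quickly leave" is matched here; "to rest" is skipped because no adverb stands between the infinitive marker and the verb.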
Exercise 2: Think of other modifiers that can appear between “to” and the infinitive. Create queries to check if your assumptions are relevant.
You may check the following options: split infinitive with ordinal numeral (ORD), wh-adverb (AVQ), negative particle not (XX0).
Exercise 3: Make a query for the split infinitive with the option of either an adverb or the negative particle not placed in between the infinitive marker and the bare infinitive:
- Define the structure you need to enter: to+adverb/not +infinitive
- In the Query Builder, enter the POS-tag “TO0” for the “infinitive marker to” in the first node (in the same way as in the previous exercise)
- Enter the POS-tag “AV0” for “general adverb” in the second node
- Click on the horizontal arrow link to the right side of the second node and enter the POS-tag for the “negative particle” not (XX0) (optional variant for “AV0” tag)
- Add one more node in vertical sequence below the “adverb node” and enter the POS-tag “VVI” for “the infinitive form of lexical verbs”.
- Define links between the vertical nodes in the same way as in the previous exercise.
- Run the query.
- Create the same query, entering “not” as a word.
- Repeat the same steps up to the point where you create a new horizontal optional node to the right of the second (“general adverb”) node.
- Click on this node and choose Word → Edit.
- In the upper window enter “not” and click Lookup. The program will show a list of the possible variants of your query in the window below.
- From this list, select the first variant of “not” as a single word and click Query. The selected element will be added to the node in the query builder.
- Add the rest of the query elements needed (as in the previous exercises), define the link type and run the query.
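The two versions of this query (not entered as a POS-tag and not entered as a word) can be sketched in Python as a pattern in which each position accepts a set of alternatives. This is only an illustration under stated assumptions: the helper names slot_matches and find_hits, the ("tag", ...) / ("word", ...) notation and the toy tokens are hypothetical and are not part of the corpus software.

```python
# Toy tagged text; the words and tags are invented for illustration.
tokens = [
    ("to", "TO0"), ("not", "XX0"), ("go", "VVI"),
    ("or", "CJC"),
    ("to", "TO0"), ("really", "AV0"), ("try", "VVI"),
    ("to", "TO0"), ("go", "VVI"),
]


def slot_matches(token, alternatives):
    """True if the token satisfies at least one alternative of a slot."""
    word, tag = token
    for kind, value in alternatives:
        if kind == "tag" and tag == value:
            return True
        if kind == "word" and word.lower() == value:
            return True
    return False


def find_hits(tokens, slots):
    """Return adjacent token runs in which every slot is satisfied in order."""
    size = len(slots)
    hits = []
    for start in range(len(tokens) - size + 1):
        window = tokens[start:start + size]
        if all(slot_matches(tok, slot) for tok, slot in zip(window, slots)):
            hits.append(" ".join(word for word, _ in window))
    return hits


# Version 1: the middle position is either AV0 or XX0 (both as POS-tags).
by_tag = find_hits(tokens, [[("tag", "TO0")],
                            [("tag", "AV0"), ("tag", "XX0")],
                            [("tag", "VVI")]])

# Version 2: the alternative to AV0 is the word "not" itself.
by_word = find_hits(tokens, [[("tag", "TO0")],
                             [("tag", "AV0"), ("word", "not")],
                             [("tag", "VVI")]])
print(by_tag)
print(by_word)
```

On this toy data both versions return the same two hits. In a real corpus they may differ wherever not carries a tag other than XX0.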
INVESTIGATING POS-TAGGING ERRORS
Exercise 4: Make a POS-tag-query for “to more than double”.
- Using the BNC tagset at http://www.natcorp.ox.ac.uk/docs/c5spec.html or the list of POS-tags in the query builder, try to define probable POS-tags for all the elements of the phrase. For example, how can more be tagged? As a general adverb (AV0) or as another part of speech?
- Create a query for a combination of words with assigned POS-tags: to as an infinitive particle followed by more as, for example, a general adverb, followed by than as a conjunction and double as the infinitive of a lexical verb.
- Open the Query builder and create the query in the following way:
- Click on the empty node and go to Edit → Addkey to enter Additional key query
- Type to in the upper box and click Refresh. A list of all the tags assigned to to will appear in the box below. Select the tag “TO0” for “infinitive marker to” and click OK. To as TO0 will appear in the first query node.
- Repeating the same procedure, enter more as a “general adverb” (“AV0”) in the second node, than as a “subordinating conjunction” (“CJS”) in the third node, and double as “the infinitive form of a lexical verb” (“VVI”) in the last node.
- Define links between the nodes and run the query. Does it retrieve the construction needed?
- You can try to create another query selecting other assigned POS-tags or selecting more than one tag (with the help of “<CTRL>+click”) for each word of the phrase. But it is likely that this task will be time-consuming as all the possible combinations of POS-tags need to be checked. While it is not a big problem to check variants of POS-tags for more and than, double has more than 10 variants of POS-tags assigned.
- To check how the elements of the expression to more than double are tagged, run a simple phrase query to more than double, select Page mode and then XML instead of Plain mode in order to see the corpus annotation. Can you find any reasons for such tagging?
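To see why checking every tag combination by hand quickly becomes time-consuming, the number of combinations can be counted. The sketch below is illustrative only: the candidate tags listed for to, more and than are assumptions rather than a complete list from the corpus, and the eleven entries for double are placeholders standing in for the "more than 10" variants mentioned above.

```python
from itertools import product

# Candidate tags per word. These lists are assumptions for illustration;
# the real lists are the ones shown in the Query Builder for each word.
candidate_tags = {
    "to": ["TO0", "PRP"],
    "more": ["AV0", "DT0"],
    "than": ["CJS", "PRP"],
    # Placeholders for the "more than 10" tag variants of "double".
    "double": [f"tag{i}" for i in range(1, 12)],
}

# Every way of choosing one tag per word, in phrase order.
combinations = list(product(*candidate_tags.values()))
print(len(combinations))
print(combinations[0])
```

Each additional candidate tag for any one word multiplies the total, which is why inspecting the annotation of the phrase directly is the quicker route.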
RESTRICTING THE QUERY TO SELECTED PORTIONS OF THE BNC
Exercise 5: Compare the frequency of the split infinitive in the written and spoken parts of the corpus by means of separate queries.
- To restrict your search to a particular section of the corpus, you can use the Partitions function. Before creating a query, activate this function: Go to View → Toolbars → Partitions. Three partition boxes will appear on the left side of the main toolbar.
- From the drop-down list in the first partition box choose Text mode. From the drop-down menu in the next partition box choose Written.
- Create your query for the split infinitive using the Query builder.
- Run the query. The server will find 2982 solutions in 1119 texts.
- In the query result window click the Analysis icon on the top menu to check the frequency per million words.
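The frequency per million words reported by the Analysis option is a normalized frequency: the number of hits divided by the size of the corpus section, multiplied by one million. A minimal Python sketch of the calculation follows; the function name per_million is hypothetical, and the section size used is an assumed round figure for illustration, not a value taken from the corpus documentation.

```python
def per_million(hits, section_size_in_words):
    """Normalize a raw hit count to a frequency per million words."""
    if section_size_in_words <= 0:
        raise ValueError("section size must be positive")
    return hits / section_size_in_words * 1_000_000


written_hits = 2982               # number of solutions reported above
assumed_written_size = 88_000_000  # ASSUMED section size, for illustration only

written_frequency = per_million(written_hits, assumed_written_size)
print(round(written_frequency, 1))
```

Normalized figures make sections of different sizes comparable, which raw hit counts do not.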
Exercise 6: Compare the frequency of the split infinitive in the academic writing and spoken demographic sections
Variant 1.
- Enable the Partitions function: Go to View → Toolbars → Partitions.
- From the drop-down list in the first Partition box choose Text class. From the drop-down menu in the next partition box choose Academic prose.
- Create your query for the split infinitive using Query builder.
- Run the query. Using the Analysis option, check the normalized frequency of the split infinitive in academic writing.
- Repeat the same procedure for the spoken demographic section.
- Compare the results.
Variant 2.
- Run the query for split infinitives without any partition.
- Download all the instances.
- Click on the Analysis icon on the toolbar to open the window with statistical options.
- Select the category you wish to use as the basis for the analysis from the Partition drop-down box.
- Select the type of chart at the bottom of the Analysis window: Pie chart or Bar chart. The chart shows the distribution of hits, with calculated frequencies, according to the chosen category: text mode and type, register or spoken context, or sex of the author or of the audience.
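The distribution shown in the chart is, in essence, a count of hits per category of the chosen partition. A minimal Python sketch of that tally, using an invented list of category labels (one label per downloaded hit), is given below; the data and variable names are hypothetical.

```python
from collections import Counter

# One category label per downloaded hit. The labels are invented for
# illustration; in practice they come from the chosen partition.
hit_categories = ["Written", "Written", "Spoken", "Written", "Spoken", "Written"]

counts = Counter(hit_categories)
total = sum(counts.values())

# Share of all hits that falls into each category.
shares = {category: count / total for category, count in counts.items()}

for category, count in counts.most_common():
    print(f"{category}: {count} hits ({shares[category]:.0%})")
```

Raw shares such as these depend on the size of each section, so they are best read together with the normalized frequencies.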