3.1.2.4 Exercises for BNCweb





CATEGORIZING THE TARGET STRUCTURE (SPLIT INFINITIVE)


First, create a POS-tag-based query that is likely to retrieve many instances of the split infinitive, such as "_TO0 _AV0 _VVI" (infinitive marker "to" + adverb + infinitive).
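To make clear what such a tag sequence matches, here is a minimal Python sketch that slides a three-token window over a tagged sentence. It only illustrates the idea and is not how BNCweb itself processes queries. The function name find_tag_sequence and the example sentence (including its tags) are made up for this illustration and are not taken from the BNC.

```python
# Toy illustration of what a tag sequence such as "_TO0 _AV0 _VVI" matches.
# The tagged sentence below is hand-made for this sketch, not BNC data.
PATTERN = ("TO0", "AV0", "VVI")


def find_tag_sequence(tagged_tokens, pattern=PATTERN):
    """Return the word sequences whose tags match `pattern` in order."""
    hits = []
    n = len(pattern)
    for i in range(len(tagged_tokens) - n + 1):
        window = tagged_tokens[i:i + n]
        # Compare only the tags; the words themselves are unconstrained.
        if tuple(tag for _, tag in window) == tuple(pattern):
            hits.append(" ".join(word for word, _ in window))
    return hits


sentence = [
    ("She", "PNP"), ("decided", "VVD"), ("to", "TO0"),
    ("quickly", "AV0"), ("leave", "VVI"), (".", "PUN"),
]
print(find_tag_sequence(sentence))
```

Because the pattern constrains only the tags, any hit is only as reliable as the automatic tagging behind it, which is why the hits need to be checked by hand in the steps that follow.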

Then enter the search pattern into the query window and click Start Query.



Now imagine you want to categorize each hit according to whether it is a genuine instance of a split infinitive. Examining all 4097 hits would be extremely painstaking and time-consuming, so for the purpose of this exercise we will focus on only the first 50 hits. As a side note, checking the validity of retrieved instances is particularly important when you are trying to refine a search pattern so that it strikes a good balance between precision and recall.

To define your categories, go to the drop-down menu in the right-hand corner, which shows New Query, and click on the downward-pointing arrow. You will be presented with a number of post-query options. Choose Categorize hits... from the list and press Go!.



Now, define the category-set and the respective categories. Be sure to read the instructions below.

Then click Submit.



Your BNCweb Query result window should now display a further column next to your results, entitled Categories. This column lets you annotate your results according to your predefined categories. If you are uncertain which category a particular hit belongs to, or if it falls under a category that you have not defined, BNCweb offers the option of labeling it as either unclear or other.

Now, check the first 50 hits, determine whether or not they represent instances of split infinitives and label them accordingly.



After checking the first 50 hits, you should arrive at the following result:

Of the first 50 hits, 49 are instances of split infinitives and 1 is not.

# 47, Filename: A1U 185 "But one cannot have it both ways: for UK wastes (the import of wastes for landfill is now outlawed) the options are landfill or incineration, and people local to either hate the one near them."

BNCweb has tagged "to either hate" as an infinitive marker followed by an adverb and the infinitive form of a verb, i.e. a split infinitive. A closer look, however, shows that "to" is in fact a preposition, "either" is a pronoun referring to "landfill or incineration", and "hate" is a finite verb whose subject is the third person plural noun phrase "people local to either": "people local to either (S) hate (V) the one near them (O)". Hit # 47 therefore does not exhibit our target structure and needs to be labeled "no_split_infinitive".
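The outcome of this manual check can also be expressed as a precision figure for the query. The following Python sketch tallies the labels described above and computes the share of valid hits in the sample. The category name "split_infinitive" is an assumption made for this sketch (the exercise only names "no_split_infinitive"). Note that the figure applies only to the 50 hits checked; it says nothing about recall and may not be representative of all 4097 hits.

```python
from collections import Counter

# Labels for the first 50 hits as described above: hit 47 is the only
# non-example. "split_infinitive" is an assumed category name.
labels = {n: "split_infinitive" for n in range(1, 51)}
labels[47] = "no_split_infinitive"

# Count how often each category was assigned.
counts = Counter(labels.values())

# Precision in the sample: valid hits divided by hits checked.
precision = counts["split_infinitive"] / len(labels)

print(counts["split_infinitive"], counts["no_split_infinitive"])
print(f"precision in sample: {precision:.2f}")
```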



Finally, go to Save categorization values for this page! and click Go!. If you want to edit your categorized query results at a later stage, go to Categorized queries under User-specific functions in the standard query window and simply click on the respective query.

