3.1.5 A WebCorp example search



The aim of this example search was to determine how frequently lest is used with the subjunctive, the indicative and a modal periphrasis in different varieties of English. The four varieties under investigation were British, American, New Zealand and Australian English. WebCorp’s advanced search was used for the query because it allows the user to restrict the search to a site domain. The settings were as follows: Google was used as the search engine, the case option was set to case sensitive and the number of pages to retrieve was set to 500. The domains were .uk for British English, .us for American English, .nz for New Zealand English and .au for Australian English. Unfortunately, restricting the search to .us limits the number of hits for American English, since domains such as .org or .mil also contain American English. This is, however, a general problem of the Internet and cannot be solved satisfactorily.
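The domain restriction described above can be sketched in a few lines of Python. This is only an illustration of the mapping and of the .us limitation, not part of WebCorp itself; the names VARIETY_BY_DOMAIN and variety_for, as well as the example URLs, are hypothetical.

```python
from urllib.parse import urlsplit

# Mapping of top-level domains to varieties, as used in the example search.
VARIETY_BY_DOMAIN = {
    "uk": "British English",
    "us": "American English",
    "nz": "New Zealand English",
    "au": "Australian English",
}

def variety_for(url):
    """Return the variety assigned to a URL's top-level domain, or None."""
    host = urlsplit(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    return VARIETY_BY_DOMAIN.get(tld)

# Hypothetical URLs: a .org page cannot be assigned to any variety,
# even if it is in fact written in American English.
print(variety_for("http://www.example.co.uk/page"))
print(variety_for("http://www.example.org/page"))
```

The second call returns None, which mirrors the limitation noted above: pages outside the four country domains are simply not covered by the search.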

The initial search gave us the total numbers of concordance entries for the respective search terms. The results are illustrated in Table 5.



These numbers can be used to calculate the percentage distribution of the three items in each variety (cf. Table 6).
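The calculation behind such a percentage table is simple: each count is divided by the total for that variety. A minimal sketch follows; the function name and the counts are hypothetical placeholders and are not the figures from Table 5.

```python
def percentage_distribution(counts):
    """Convert raw concordance counts into percentages of their total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no hits to distribute")
    return {label: 100.0 * n / total for label, n in counts.items()}

# Hypothetical counts for one variety (NOT the values from Table 5).
example_counts = {"subjunctive": 60, "indicative": 25, "modal periphrasis": 15}

shares = percentage_distribution(example_counts)
for label, share in shares.items():
    print(f"{label}: {share:.1f}%")
```

Because the percentages are computed per variety, varieties with very different numbers of hits can still be compared directly.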



Table 6 shows that American and New Zealand English both clearly prefer the subjunctive after lest to the indicative or a modal periphrasis. These numbers differ from the results of a search with Google, which is most likely due to the selection process of WebCorp and the comparatively small set of hits.

Since WebCorp has to rely on commercial crawlers, it inherits their problems. It is practically impossible to obtain the same search result twice. The results often differ from one day to the next, or even from one hour to the next. Moreover, the results contain duplicate documents, so they have to be post-edited manually.
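Part of this post-editing could in principle be supported by a script. The following sketch assumes that the concordance has been exported as pairs of URL and concordance line, which is a hypothetical format and not a documented WebCorp output. It only removes obvious duplicates and does not replace a manual check.

```python
from urllib.parse import urlsplit

def normalize_url(url):
    """Reduce a URL to host and path, ignoring 'www.' and a trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")

def drop_duplicates(hits):
    """Keep the first occurrence of each (URL, concordance line) pair."""
    seen = set()
    unique = []
    for url, line in hits:
        # Collapse whitespace and ignore case when comparing lines.
        key = (normalize_url(url), " ".join(line.split()).lower())
        if key not in seen:
            seen.add(key)
            unique.append((url, line))
    return unique

# Hypothetical hits: the first two point to the same page and line.
hits = [
    ("http://www.example.org/a/", "lest he  forget"),
    ("http://example.org/a", "Lest he forget"),
    ("http://example.org/b", "lest he should forget"),
]
unique_hits = drop_duplicates(hits)
print(len(unique_hits))
```

Identical texts hosted under different addresses are not detected by this approach, which is one reason why manual post-editing remains necessary.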

In this example, only the findings for British English were post-edited, which in itself was very time-consuming. The final results are listed in Table 7.



The reduced number of hits results from discarding several duplicate documents and the fact that some sites either could not be accessed or produced errors. However, the distribution does not differ much from that in Table 6.

After an example search conducted with Google, the question arises whether the results change when a different search engine is used. Table 8 addresses this question. It shows the results of a search conducted with the same parameters as the first example search, but only for British English within the .uk domain. All six engines were used one after the other so that the results could be compared.



Although the numbers of hits retrieved with the different search engines differ greatly, the general tendency stays more or less the same, as can be seen from the percentages given in Table 9. The only outlier is the distribution found when using Metacrawler.
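One way to make such a comparison explicit is to measure how far each engine's percentage distribution lies from the median across all engines. The sketch below is only one possible approach and was not used for this example search; the engine names and percentages are hypothetical and are not the values from Table 9.

```python
from statistics import median

def deviation_from_median(distributions):
    """Largest gap (in percentage points) between each engine and the median."""
    labels = list(next(iter(distributions.values())).keys())
    med = {lab: median(d[lab] for d in distributions.values()) for lab in labels}
    return {
        engine: max(abs(d[lab] - med[lab]) for lab in labels)
        for engine, d in distributions.items()
    }

# Hypothetical percentage distributions (NOT the values from Table 9).
distributions = {
    "engine_a": {"subjunctive": 60.0, "indicative": 25.0, "modal": 15.0},
    "engine_b": {"subjunctive": 58.0, "indicative": 27.0, "modal": 15.0},
    "engine_c": {"subjunctive": 62.0, "indicative": 24.0, "modal": 14.0},
    "engine_d": {"subjunctive": 40.0, "indicative": 35.0, "modal": 25.0},
}

deviations = deviation_from_median(distributions)
most_deviant = max(deviations, key=deviations.get)
print(most_deviant, deviations[most_deviant])
```

The median is used instead of the mean so that a single deviant engine does not shift the reference distribution towards itself.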



