3 Glossary


Aggregate data perspective. Aggregate data analysis […] is concerned […] with the joint analysis of multiple characteristics. […] [A]ggregate analysis is appropriate whenever the analyst's attention is turned to the wood, not the trees; this is […] the AGGREGATE PERSPECTIVE. Woods, along these lines, may be languages, regional language varieties, stylistic language varieties, or any other multidimensional object.


Feature-centered perspective. Feature-centered studies are concerned with the “distribution of individual features, properties, or measurements”. Feature-centered studies are appropriate whenever the analyst's attention is turned to the trees, not the woods: “If it is the individual trees (i.e. linguistic phenomena) that matter, the FEATURE-CENTERED PERSPECTIVE is called for.”


Overt Grammatical Analyticity. Grammatical information is conveyed by free grammatical markers such as determiners (e.g. who), pronouns (e.g. he), prepositions (e.g. in), conjunctions (e.g. and), infinitive markers (e.g. to), primary verbs (be, have, do), modal verbs (e.g. can), and negators (e.g. not). All these markers belong to a closed class and carry no lexical meaning, i.e. they are function words (cf. Kortmann and Szmrecsanyi 2011: 280).

For more information on this concept please visit http://philpapers.org/browse/the-analytic-synthetic-distinction.


Overt Grammatical Syntheticity. Grammatical information is conveyed by bound grammatical markers such as “verbal, nominal, and adjectival inflectional affixes (e.g. past tense –ed, plural –s, comparative –er, and so on), the Saxon genitive (e.g. Tom’s house) as a clitic, as well as allomorphies such as ablaut phenomena (e.g. past tense sang), i-mutation (e.g. plural men), and other non-regular yet clearly bound grammatical markers” (Kortmann and Szmrecsanyi 2011: 280).

For more information on this concept please visit http://philpapers.org/browse/the-analytic-synthetic-distinction.


Euclidean Distance Measure/As-The-Crow-Flies Distance. The Euclidean metric (or Pythagorean distance, as-the-crow-flies distance, beeline distance) $d_E$ is the metric on $\mathbb{R}^n$, defined by

$$d_E(x, y) = \sqrt{(x_1 - y_1)^2 + \dots + (x_n - y_n)^2}.$$

It is the ordinary distance between two points that one would measure with a ruler, and is given by the Pythagorean formula (cf. Deza and Deza 2009: 94).
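As a minimal sketch, the Pythagorean formula can be written out in a few lines of Python. The function name `euclidean_distance` and the sample points are illustrative assumptions and do not come from the cited source.

```python
import math


def euclidean_distance(x, y):
    """Return the straight-line distance between two points of equal dimension."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of coordinates")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# A 3-4-5 right triangle: the distance is the length of the hypotenuse.
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```

The same function works for any number of dimensions, so it applies equally to map coordinates and to vectors of feature frequencies.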

For a discussion of different distance measures in time series analysis (cf. http://www.statsoft.com/textbook/time-series-analysis/), go to http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.58.5139&rep=rep1&type=pdf.


Least-Cost-Travel Time. Least-cost travel time denotes the travel time along the cheapest route between two places, as opposed to the straight-line distance between them.


Linguistic Gravity (Trudgill’s Gravity Model). The term “Linguistic Gravity” is strongly related to Trudgill’s Gravity Model. According to this model, linguistic innovations spread from a larger population center to a smaller population center and so on. The gravity model differs from the so-called wave theory: where the latter postulates that linguistic innovations radiate outward from a center, the former assumes that they start at larger centers and then make their way towards successively smaller ones (cf. Nerbonne, van Gemert and Heeringa 2005: 2f.).

For more information visit http://www.let.rug.nl/nerbonne/papers/gravity2004.pdf or see Trudgill, Peter 1974. "Linguistic Change and Diffusion. Description and Explanation in Sociolinguistic Dialect Geography", Language in Society 3: 215-246.
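The entry above does not give a formula. The sketch below uses the form commonly attributed to Trudgill (1974), I_ij = S · (P_i · P_j / d_ij²) · (P_i / (P_i + P_j)), where I_ij is the influence of center i on center j. Treat this as an assumption to be checked against the original paper; the function name, parameter names, and sample values are purely illustrative.

```python
def influence(pop_i, pop_j, distance, similarity=1.0):
    """Influence of centre i on centre j under the assumed gravity formula.

    pop_i, pop_j -- population sizes of the two centres
    distance     -- distance between them (must be positive)
    similarity   -- prior linguistic similarity S (hypothetical default of 1.0)
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    # Interaction term: product of populations over squared distance.
    interaction = similarity * (pop_i * pop_j) / distance ** 2
    # Asymmetry term: the larger centre exerts the stronger influence.
    return interaction * pop_i / (pop_i + pop_j)


# Toy values: a centre four times the size of its neighbour, two units apart.
print(influence(4, 1, 2))  # influence of the larger centre on the smaller one
print(influence(1, 4, 2))  # influence in the opposite direction
```

Under this formula the larger center always influences the smaller one more strongly than the reverse, which matches the direction of spread described in the entry.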


English-based pidgin languages, creole languages. “[…] creoles and pidgins are new language varieties which developed out of contacts between colonial non-standard varieties of a European language and several non-European languages around the Atlantic and in the Indian and Pacific Oceans during the sixteenth to nineteenth centuries. Pidgins typically emerged in trade colonies which developed around trade forts or along trade routes, such as on the coast of West Africa. They are reduced in structures and specialized in functions (typically trade), and initially served as non-native lingua francas to users who preserved their native vernaculars for their day-to-day interactions.” (Mufwene 2008: 544)

For a compact overview of creole languages and the ways of acquiring them see Wekker, Herman 1996. Creole languages and language acquisition. Berlin: Mouton de Gruyter.

For more information on the history of creole languages see Byrne, Francis 1991. Development and structures of Creole languages. Essays in honor of Derek Bickerton. Amsterdam: John Benjamins.


Multidimensional Scaling. “MDS is a data visualisation technique for exploring dissimilarities in data. The basic algorithm starts with a symmetric matrix of dissimilarities between items. It assigns a location to each item in a space of an appropriate dimension. The procedure finds a monotonic relationship between the items in the matrix and the Euclidean distance between them. The relationship is typically found using isotonic regression.” (Cox 2009: 384)

For an explanation of the most important terms and ideas behind MDS please visit http://faculty.chass.ncsu.edu/garson/PA765/mds.htm.
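To make the idea concrete, here is a minimal sketch of metric MDS in plain Python. It is not the non-metric variant with isotonic regression described in the quotation: it repeatedly applies the Guttman transform (the core step of the SMACOF algorithm with unit weights), which moves the points so that their Euclidean distances match the given dissimilarities more closely. All function names and the toy dissimilarity matrix are assumptions made for illustration.

```python
import math
import random


def pairwise(points):
    """Euclidean distance matrix for a list of coordinate lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def stress(points, target):
    """Raw stress: squared mismatch between layout distances and dissimilarities."""
    d = pairwise(points)
    n = len(points)
    return sum((d[i][j] - target[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


def mds(target, dims=2, iterations=200, seed=0):
    """Place items in `dims` dimensions by repeated Guttman transforms."""
    n = len(target)
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    x = [[rng.random() for _ in range(dims)] for _ in range(n)]
    for _ in range(iterations):
        d = pairwise(x)
        new_x = []
        for i in range(n):
            row = [0.0] * dims
            for j in range(n):
                if i == j or d[i][j] == 0.0:
                    continue  # coincident points contribute nothing
                ratio = target[i][j] / d[i][j]
                for k in range(dims):
                    row[k] += ratio * (x[i][k] - x[j][k])
            new_x.append([v / n for v in row])
        x = new_x
    return x


# Toy symmetric dissimilarity matrix: the corners of a unit square.
target = pairwise([[0, 0], [1, 0], [1, 1], [0, 1]])
layout = mds(target)
print("stress of random start:", stress(mds(target, iterations=0), target))
print("stress after MDS:      ", stress(layout, target))
```

The recovered layout is only determined up to rotation, reflection, and translation, so it is the distances between points, not their coordinates, that should be compared with the input.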


Perl. Perl is an all-purpose programming language. Its “process, file, and text manipulation facilities make it particularly well-suited for tasks involving quick prototyping, system utilities, software tools, system management tasks, database access, graphical programming, networking, and web programming” (http://learn.perl.org/faq/perlfaq1.html).

For a free online introduction go to http://www.cs.cmu.edu/cgi-bin/perl-man.

For a linguistically oriented and thorough introduction, see Nugues, Pierre M. 2006. An Introduction to Language Processing with Perl and Prolog. An Outline of Theories, Implementation, and Application with Special Consideration of English, French, and German. Heidelberg and Berlin: Springer. http://www.springerlink.com/content/m34655/.

For a more general introduction that concentrates on the programming language itself, see Wainwright, Peter 2005. Pro Perl. A comprehensive guide for developers who want to master the Perl programming language. New York: Springer. http://www.springerlink.com/content/nwj323/.


SPSS. SPSS stands for Statistical Package for the Social Sciences. This multi-purpose statistical package was originally designed to process data in social science studies but is now used in a wide variety of areas. It performs complex statistical calculations and is used for survey authoring and deployment, data mining, text analytics, statistical analysis, and collaboration and deployment.

For a German introduction to data analysis with SPSS go to http://www.home.uni-osnabrueck.de/elsner/Skripte/spss.pdf.

For a more detailed handbook see Janssen, Jürgen and Laatz, Wilfried 2010. Statistische Datenanalyse mit SPSS. Eine anwendungsorientierte Einführung in das Basissystem und das Modul Exakte Tests. Berlin and Heidelberg: Springer.


R. “R is a free, cooperatively developed, open-source implementation of S, a powerful and flexible statistical programming language and computing environment that has become the effective standard among statisticians.” (Fox and Andersen 2005: 1)

The program can be downloaded free of charge at http://www.r-project.org/.


Dialectometry. Dialectometry is a branch of geolinguistics. Dialectometrical analysis focuses on aggregate distances between dialects: it investigates a large number of dialect features and renders the resulting dialect distances with cartographic visualization techniques, thus projecting similarities onto geography (cf. Szmrecsanyi 2009).

For a short abstract by Benedikt Szmrecsanyi with pointers to further reading on corpus-based dialectometry, please visit http://www.ids-mannheim.de/aktuell/kolloquien/gac2009/abstract/szmrecsanyi.pdf.
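The aggregation step can be sketched as follows: each variety is represented by a vector of feature frequencies, and the distance between two varieties is computed over all features at once. The variety names, the frequencies, and the choice of Euclidean distance are hypothetical and serve only as an illustration; the cited work may proceed differently.

```python
import math

# Hypothetical feature frequencies (one value per feature) for three varieties.
frequencies = {
    "Variety A": [12.0, 3.0, 7.0],
    "Variety B": [10.0, 4.0, 7.0],
    "Variety C": [2.0, 9.0, 1.0],
}


def distance_matrix(profiles):
    """Aggregate distance between every pair of varieties across all features."""
    names = sorted(profiles)
    return {(a, b): math.dist(profiles[a], profiles[b])
            for a in names for b in names}


matrix = distance_matrix(frequencies)
for (a, b), d in matrix.items():
    if a < b:  # print each unordered pair once
        print(f"{a} - {b}: {d:.2f}")
```

A matrix of this kind is the usual input for techniques such as multidimensional scaling, and its entries can then be projected onto a map.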
