6. Closing remarks


Statistics are vital for understanding the quantitative structure of a linguistic data set. R, a well-known programming language, offers several advantages for this task: speed, control over the settings, no cost, broad functionality, and extensive visualization capabilities. In the main part of this companion website, I explained the importance of graphical analysis, the basics of R, and the different classifications of variables, and I demonstrated how to use R to visualize data depending on the number of variables and their scale type. Finally, I gave an evaluation of the books used for this website.

In addition to the basic functions presented on this website, there are many arguments that help to customize graphs. For example, headings and labels might need to be redefined, colors set, ranges changed, and a grid added. One might also want to combine several visualizations in one display, for example six bar plots in one illustration. Finally, at the end of a project, the graphical data representations need to be exported from R.
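As a minimal sketch of these customizations, the following base R code draws six bar plots in one display, sets headings, axis labels, colors and the y-axis range, adds a grid, and writes the result to a PDF file. The data, the labels and the output file name are invented for illustration and do not come from the examples on this website.

```r
# Hypothetical data: invented token counts for six word classes in two registers
counts <- matrix(
  c(12, 30, 18, 25,  9, 14,
    20, 22, 11, 31, 15,  8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("spoken", "written"),
                  c("noun", "verb", "adj", "adv", "prep", "conj"))
)

# Open a PDF device so that the finished display is saved to a file
# (the file name is an assumption; any writable path works)
out_file <- file.path(tempdir(), "barplots.pdf")
pdf(out_file, width = 9, height = 6)

# Arrange the plotting area as 2 rows x 3 columns = 6 panels
old_par <- par(mfrow = c(2, 3))

for (wc in colnames(counts)) {
  barplot(counts[, wc],
          main = paste("Word class:", wc),     # heading
          xlab = "Register",                   # x-axis label
          ylab = "Frequency",                  # y-axis label
          col  = c("steelblue", "darkorange"), # bar colors
          ylim = c(0, 35))                     # y-axis range
  grid(nx = NA, ny = NULL)                     # horizontal grid lines only
}

par(old_par)  # restore the previous graphics settings
dev.off()     # close the device; this writes the file
```

Replacing pdf() with png() or another graphics device produces other file formats; the remaining calls stay the same.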

To become familiar with all this and more, it is necessary to continue reading about R, and about statistics in general, in order to gain more background knowledge. With the help of this website and some of the recommended books, I hope that readers will take the next step from studying the basics of R graphics to employing R for their own research questions. However, books are sometimes not enough. Speaking from my own experience, I can only encourage every student of linguistics to work actively with the program while going through the steps in a book, and to ask professional statisticians for help if a problem arises.

For beginners in particular, graphical data exploration is worthwhile because it makes statistical patterns in the data easier to recognize. Graphics help to reveal patterns that would otherwise have been overlooked, to make first guesses, and to form hypotheses. However, assessing data only by looking at visualizations, without consulting any statistical test, easily leads to unfounded generalizations and should be avoided. In principle, any interpretation of data is nothing more than a guess, but statistical tests at least give us information about how reliable these guesses are. Once the corresponding tests have been incorporated into one's work and the results have turned out to be (highly) significant, graphical visualizations can be used again to present and discuss the findings. Gries puts it well: “[T]here is no meaning in corpora, (…) and it is up to the researcher to interpret frequencies of occurrence and co-occurrence in meaningful or functional terms” (Gries 2009a: 11).
