7. Glossary



Implementing the four methods described on this site requires some basic statistical knowledge. A few fundamental terms are explained below:


Mode:

xmod := The most frequently occurring number in a list is called the mode. For the list (1, 2, 2, 3, 3, 3, 4), xmod = 3. The mode is not necessarily unique: the list (1, 2, 2, 3, 3, 5) has two modes, 2 and 3.
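The two example lists can be checked with a minimal Python sketch. It assumes Python 3.8 or later, where the standard library function statistics.multimode is available; the variable names are chosen here for illustration only.

```python
from statistics import multimode

# The two example lists from the definition of the mode.
single_mode_list = [1, 2, 2, 3, 3, 3, 4]
two_mode_list = [1, 2, 2, 3, 3, 5]

# multimode returns every value tied for the highest count,
# in the order in which the values first appear in the list.
modes_single = multimode(single_mode_list)
modes_double = multimode(two_mode_list)

print(modes_single)
print(modes_double)
```

Returning a list of modes rather than a single number keeps the case of several equally frequent values visible instead of silently picking one of them.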


Median:

xmed := The median is the numerical value that separates the lower half of a sample or population from its upper half. Take, for example, the following sorted finite list of numbers: 1, 2, 3, 4, 5. Here, 3 is the median, as it separates 1 and 2 from 4 and 5. If the list has an even number of values, there are two middle values, and the median is their arithmetic mean. The median of the list 1, 2, 3, 4, 5, 6 is therefore 3.5.
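The rule for odd and even list lengths can be sketched in a few lines of Python. The helper name median_by_hand is hypothetical and used only for this illustration; the standard library function statistics.median gives the same results.

```python
from statistics import median

def median_by_hand(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)          # the median is defined on the sorted list
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd length: exactly one middle value.
        return ordered[middle]
    # Even length: arithmetic mean of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_list = [1, 2, 3, 4, 5]
even_list = [1, 2, 3, 4, 5, 6]

print(median_by_hand(odd_list), median(odd_list))
print(median_by_hand(even_list), median(even_list))
```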


Average:

Often referred to as the arithmetic mean. Implementation: add up the occurrences counted for each individual and divide this total by the number of informants (→ cf. Pooling).
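As a minimal sketch of this calculation in Python: the counts below are hypothetical and stand for the number of occurrences recorded for each of four informants.

```python
# Hypothetical data: occurrences counted for each of four informants.
occurrences_per_informant = [4, 7, 1, 8]

# Add up all occurrences, then divide by the number of informants.
total_occurrences = sum(occurrences_per_informant)
number_of_informants = len(occurrences_per_informant)
average = total_occurrences / number_of_informants

print(average)
```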


Point Estimate:

The calculation of a single value that estimates a population (or sample) frequency; the goal is to form a “best guess”.
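One common point estimate is the relative frequency observed in a sample. The following Python sketch assumes a hypothetical sample of ten informants, where True marks an informant who produced the variant of interest; the data and names are illustrative only.

```python
# Hypothetical sample: True means the variant of interest was observed.
sample = [True, False, True, True, False, False, True, False, True, True]

# The relative frequency in the sample serves as the single "best guess"
# for the corresponding frequency in the population.
observed = sum(sample)            # True counts as 1, False as 0
point_estimate = observed / len(sample)

print(point_estimate)
```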


…with replacement:

If a (bootstrap) sample is drawn from a population with replacement, the same individual can occur several times in the sample.
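The difference between drawing with and without replacement can be sketched with Python's standard random module. The population of eight labelled individuals and the fixed seed are assumptions made for this illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population of eight individuals.
population = ["A", "B", "C", "D", "E", "F", "G", "H"]

# With replacement: every draw is made from the full population,
# so the same individual may be picked more than once.
bootstrap_sample = rng.choices(population, k=len(population))

# Without replacement, for contrast: each individual appears at most once.
plain_sample = rng.sample(population, k=len(population))

print(bootstrap_sample)
print(plain_sample)
```

A bootstrap sample usually has the same size as the original data, which is why k is set to len(population) here.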


Confidence Interval:

A confidence interval is an interval estimate of a population parameter. Instead of estimating the parameter by a single value, as done when computing a point estimate, an interval which is likely to include the parameter is given. How likely the interval is to contain the parameter is determined by the confidence level (1-α). Most of the time the confidence level is either 90%, 95%, or 99%. Thus, one can speak of a 95% confidence interval.

Ex.: A 90% confidence interval of 35-45 means that, with a large number of repeated samples, 90% of the calculated confidence intervals would include the true (but unknown) value of the parameter. The probability that the parameter is inside one given interval (say, 35-45) is either 0 or 1, because the non-random unknown parameter is either in it or not. In summary, a confidence interval is one interval generated by a procedure that yields correct intervals 90% (or 95%, 99%) of the time.
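The repeated-sampling interpretation can be made concrete with a small simulation. The Python sketch below is an illustration under stated assumptions, not one of the methods of this site: the population is assumed to be normal with a mean of 40 and a known standard deviation of 10, and each interval is the textbook interval for a mean with known standard deviation. All constants are hypothetical.

```python
import random
from statistics import NormalDist, mean

rng = random.Random(0)  # fixed seed so the sketch is reproducible

TRUE_MEAN = 40.0     # the "unknown" parameter, known here only because we simulate
TRUE_SD = 10.0       # assumed known population standard deviation
SAMPLE_SIZE = 30
REPETITIONS = 2000
CONFIDENCE = 0.90    # confidence level 1 - alpha

# Two-sided critical value: alpha / 2 is left in each tail.
alpha = 1 - CONFIDENCE
z = NormalDist().inv_cdf(1 - alpha / 2)
half_width = z * TRUE_SD / SAMPLE_SIZE ** 0.5

hits = 0
for _ in range(REPETITIONS):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    centre = mean(sample)
    # Each single interval either contains the true mean or it does not.
    if centre - half_width <= TRUE_MEAN <= centre + half_width:
        hits += 1

# Share of intervals that contain the true mean; expected to be close to 0.90.
coverage = hits / REPETITIONS
print(f"{coverage:.3f}")
```

The 90% describes the long-run share of intervals that cover the true value, not the probability attached to any one computed interval.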


Posterior Interval (also known as credible interval):

Posterior intervals are used in Bayesian statistics (in contrast to frequentist statistics) and serve a purpose similar to that of confidence intervals. A statement such as "following the experiment, a 90% credible interval for the parameter t is 35-45" means that the posterior probability that t lies in the interval from 35 to 45 is 0.9.
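A 90% credible interval can be sketched in Python for a simple case. The example assumes a hypothetical experiment with 12 successes in 30 trials and a uniform Beta(1, 1) prior on the success probability, so that the posterior is a Beta distribution; the interval is approximated from random posterior draws rather than computed exactly.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical data: 12 successes observed in 30 trials.
successes, trials = 12, 30

# Assumed uniform prior Beta(1, 1); the posterior is then
# Beta(1 + successes, 1 + failures).
a = 1 + successes
b = 1 + (trials - successes)

# Approximate the posterior by sorted random draws from it.
draws = sorted(rng.betavariate(a, b) for _ in range(20000))

# Equal-tailed 90% interval: cut off 5% of the draws on each side.
lower = draws[int(0.05 * len(draws))]
upper = draws[int(0.95 * len(draws))]

print(f"{lower:.3f} to {upper:.3f}")
```

Under these assumptions, the posterior probability that the parameter lies between lower and upper is about 0.9, which is the kind of direct statement a confidence interval does not permit.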

Bayesian statistics uses prior knowledge of a situation to form estimates. For example, the famous statistician Pierre-Simon Laplace combined the available astronomical data to provide an estimate (with its uncertainty) of the mass of Saturn. He stated: "...it is a bet of 11,000 to 1 that the error in this result is not 1/100th of its value". (The modern estimate differs from Laplace's by 0.63%.)




