3.3.3 Bootstrapping

Parent Previous Next

While many specialized statistics packages offer bootstrapping, it can also be done with Excel. For this purpose, it is necessary to download and install an add-in called Data Analysis Toolpak. After installing, the Tool Pak can be accessed under Excel Data Data Analysis Sampling. It will look like this:

First, there will be a menu where you can select the function you want to use. In this case, it is sampling. Then the menu for sampling or bootstrapping is opened.

First, you have to select a range of cells that you want to have resampled. Then you have to pick random sampling and choose how big a random sample you want to have. Last, you can choose where you want the data to appear. For a detailed explanation on the possibilities of Data Toolpak, see Rochowicz (2010). He offers several techniques for data analysis, especially the application of confidence intervals and inferential statistics as well as a guide for Excel’s statistic functions. His paper can be accessed here: http://epublications.bond.edu.au/cgi/viewcontent.cgi?article=1080&context=ejsie.

For bootstrapping, the frequencies of yours between 1518 and 1600 will be resampled.  A resample size of 54 as in the original table is selected for the following analysis. Bootstrapping in Excel is done with replacement, which means that frequencies or values that already appeared can reappear in the resample. In this case, five resamples are made. The results are the following:


Original sample:

Resample 1

Resample 2

Resample 3

Resample 4

Resample 5

0

0.16

0.16

0.2

0.78

0.78

0

0.33

0

0.43

0.14

1

1

0

0.15

0.46

0

0.17

1

0.14

0.14

0

0.25

0

1

1

0

1

0

0

0.6

0

0.14

0.33

1

0

0.5

0.88

0.18

0.23

0.33

0.18

0.6

011

0.37

1

0.04

0.2

0.25

0.31

0

0.62

1

0.5

0.31

0.33

0

0.33

1

0

0

0

0.33

0

0.5

1

0

1

0.15

0.36

0

0.25

0.33

0.2

0.14

0.6

0.23

1

0

0.4

0.6

0.62

0.14

1

0.33

0.33

0.37

0

1

1

0.36

0

0.5

0.62

0

0.6

0.77

0.35

0.33

0.37

1

0.46

0.62

0.18

0.33

0.46

0

0.78

0.14

0

0.35

0

0.17

0.31

0.37

1

0

0.21

0.56

0.78

0.21

0.5

0.6

0

0.31

0.77

0.4

1

0.2

0.11

0.33

0.37

0.23

0.6

0

0

1

0.33

0

0.35

1

0.11

0

0

0

0

0.14

0.56

0.15

0.11

0

0

0.35

0.17

0

0.88

0.15

0.04

0.78

0

0

0.46

0.31

0.5

0.56

0

0

0

0.17

1

0.43

0.2

0.4

0.21

0.16

0.6

0.18

1

0.56

0.88

0.04

0.46

0.2

0.62

1

0.14

0

0

0.15

0.25

0.2

0.62

0.18

0.2

0.62

0.4

0.04

1

0

0

0.33

0.33

0.33

0.62

0.43

0.2

0.35

0.21

0.77

0.16

1

0

0

0.21

0

0.88

1

0.35

0.35

0

0.11

1

0.88

0.14

0.2

0.21

0.23

0.33

0.21

0

0.18

0

0.62

0

0.14

1

0.37

0.56

0

037

0.78

0.5

1

0

0.14

0.09

0.09

1

0.15

0

0.21

0.31

0.56

0.15

0

0.15

0.31

0.33

0.62

0.46

0.35

1

0

0.62

0

0.78

0.16

1

0

0.37

0.2

0

0.18

0.56

0.09

1

0.2

0.09

0.2

0.33

0.6

1

1

0

0.15

0

0

0.5

0.46

1

0.21

0.21

0

0.36

1

0.6

0.33

0.17

0.21

0.17

0.21

0.46

0.15

0.36

0.31

0.14

0.11

0.17

0

0.21

0

0.35

0.33

0

1

0.5

0

0.21

0.35

0.09

0.25

0.15

0.46

0.43

Total:






19.6

18.96

15.36

17.42

16.52

25.02

Frequency:






0.36

0.35

0.28

0.32

0.31

0.46


The frequency of yours varies from 0.28 to 0.46 in this table. This shows quite clearly the uncertainty of the estimates or frequencies given by pooling or averaging of averages. Of course, five resamples are not much, as usually at least the number of informants is used as number of resamples, which would have been 54 in this case. This, however, is the downside of using Excel for bootstrapping. While it is easy to conduct, each resample has to be entered manually. (See, however, Rochowicz 2010, who provides a macro for Excel with which resampling can be automated.) To obtain a confidence interval for the variance between the 54 works in this database, the standard deviation of the frequencies in the resamples needs to be calculated. For the five resamples given in the above table, the standard deviation is 0.069.

Another relatively simple way of implementing bootstrapping with the help of Excel’s Index function is demonstrated at http://www.youtube.com/watch?v=flGyCXJ0Snc&feature=youtu.be.

As this paper deals with methods of handling small sample sizes, depending on the respective situation, the manual resampling method can prove useful. For larger research with more aspects to analyze, professional statistical software like SPSS is recommended, although there are several free alternatives which can be found here among others: http://statistiksoftware.com/

This page offers a lot of free software for statistical purposes and, moreover, a large collection of links that introduce uses to a range of applications of statistics. The access to those links for those interested in acquiring a profound knowledge of statistics can be found here: http://freestatistics.altervista.org/en/index.php

Created with the Personal Edition of HelpNDoc: Create HTML Help, DOC, PDF and print manuals from 1 single source