Statistical computing software like R can generate (pseudo)random numbers, like this:
rnorm(5)
[1] -0.3932406 1.2852251 -0.5689218 -0.4123566 1.2358167
Or this:
rnorm(5)
[1] -1.9392724 -0.9563330 -1.3858488 -0.4422137 -0.4598596
Or this:
rnorm(5)
[1] -0.3931979 -0.8684721 -1.0429014 -1.6626016 1.3405364
These look pretty random to me. Random number generation lets us perform simulations (e.g., the bootstrap), which are an enormously important part of the modern statistician’s toolkit. That said, when you work with computer-generated random numbers, you want your work to be reproducible so that other people can check it. This means setting a random number seed before you run a simulation. The seed ensures that the stream of random numbers in your simulation is the same every time, so someone else can run your code and get exactly the same results that you did.
Setting a seed looks like this:
set.seed(8675309)
rnorm(5)
[1] -0.9965824 0.7218241 -0.6172088 2.0293916 1.0654161
Every time you run that code, you will get the same numbers:
set.seed(8675309)
rnorm(5)
[1] -0.9965824 0.7218241 -0.6172088 2.0293916 1.0654161
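To see why this matters for a simulation like the bootstrap, here is a minimal sketch. The helper `boot_mean`, its arguments, and the toy data `x` are made up for illustration and are not part of the course materials. It resets the seed before resampling, so two separate runs produce identical bootstrap distributions.

```r
# Hypothetical helper (not from the course): bootstrap the sample mean,
# resetting the seed first so every call draws the same resamples.
boot_mean <- function(x, n_boot = 1000, seed = 8675309) {
  set.seed(seed)
  replicate(n_boot, mean(sample(x, replace = TRUE)))
}

# Toy data, invented for this example
x <- c(2.1, 3.4, 1.9, 5.0, 4.2)

run1 <- boot_mean(x)
run2 <- boot_mean(x)

# Same seed, same stream of random numbers, same results
identical(run1, run2)
```

If you remove the `set.seed(seed)` line, the two runs will almost certainly differ, which is exactly the reproducibility problem a seed is meant to prevent.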
So, if you ever write a code chunk that generates random numbers (e.g., using the generate function), you should begin the chunk by setting a random number seed so that you get the same results every time you run it. The syntax, as you saw above, is set.seed(INTEGER). Sometimes we will tell you which number to use. Other times (and once you finish the course), you can use whatever integer you want. It doesn’t really matter. If you require inspiration, try these: