Robert Schlegel
We could extol the virtues of R all day, but let’s rather see for ourselves why R is one of the preferred tools for science.
In this exercise we will open a dataset in MS Excel and give ourselves 30 minutes to perform a basic analysis and visualisation.
Gasp! Yes, I know. After all of that, we are now using MS Excel? But trust us, there is method to this madness.
The dataset we are using comes from the NOAA OISST remotely sensed seawater temperature product. It contains time series for pixels from three different regions, each 39 years long.
Monthly climatology: the average temperature for a given month at a given place.
So in this instance, because we have three time series, we will want 36 values in total: the January to December monthly means for each of the three sites.
E.g. If a time series is 39 years long, then a climatological December will be the mean temperature of all of the 39 Decembers for which data are available.
Dot and line plot: A dot for each monthly climatology, connected by a line.
Once the monthly climatologies per site have been calculated, it should be relatively easy to visualise them as a dot and line plot.
NB: We want to create one dot and line plot for each site.
Your mission, should you choose to accept it: calculate the monthly climatologies and create the dot and line plots from sst_NOAA.csv in MS Excel.
You will have 30 minutes starting now…
Using Excel may allow one to make small changes quickly, but it rapidly becomes laborious when any sophistication is required.
Now that you’ve had some time to look at the data and work through the exercise in Excel, let’s see how to do it in R.
The few lines of code below make all of the calculations we need in order to produce the results we are after.
The first step in an R workflow (more on this later) should be to load your libraries and then your data.
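A minimal sketch of this first step, assuming the tidyverse and lubridate packages are installed and that sst_NOAA.csv sits in the working directory (the path is an assumption, so adjust it to your own setup):

```r
# Load the libraries first...
library(tidyverse)  # data wrangling (dplyr, readr) and plotting (ggplot2)
library(lubridate)  # helpers for working with dates

# ...and then the data.
# NB: this path is an assumption; point it at wherever your copy of the file lives
sst_path <- "sst_NOAA.csv"

if (file.exists(sst_path)) {
  sst_NOAA <- read_csv(sst_path)
  glimpse(sst_NOAA)  # quick look at the column names and types
} else {
  message("Could not find ", sst_path, "; check your working directory with getwd()")
}
```

Checking that the file exists before reading it is only there so the sketch fails gently; in your own script a plain call to read_csv() is usually enough.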
After which we perform our analyses.
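As a sketch of what the analysis step might look like, the code below calculates monthly climatologies with dplyr and lubridate. It runs on a small made-up data frame rather than the real file, and the column names site, t and temp are assumptions about how sst_NOAA.csv is laid out:

```r
library(dplyr)
library(lubridate)

# Toy stand-in for the real data: two hypothetical sites, two years of monthly values
# NB: the column names (site, t, temp) and all values are invented for illustration
dates <- seq(as.Date("2000-01-01"), by = "month", length.out = 24)
sst_toy <- data.frame(
  site = rep(c("site_1", "site_2"), each = length(dates)),
  t    = rep(dates, times = 2)
) %>%
  mutate(temp = ifelse(site == "site_1", 10, 20) + month(t) + 2 * (year(t) - 2000))

# Monthly climatology: the mean of all Januaries, all Februaries, etc. per site
sst_monthly <- sst_toy %>%
  mutate(month = month(t, label = TRUE)) %>%  # extract the month from each date
  group_by(site, month) %>%                   # one group per site and month
  summarise(temp = mean(temp, na.rm = TRUE), .groups = "drop")

sst_monthly
```

The toy data give 24 rows (2 sites x 12 months). The same pipeline applied to three sites would return the 36 values the exercise asks for, whether each time series is 2 or 39 years long.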
And finish up with visualisations.
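A sketch of the dot and line plot, assuming ggplot2 is installed. The climatology values here are invented so that the example runs on its own, and the column names site, month and temp are assumptions:

```r
library(ggplot2)

# Toy monthly climatologies for two hypothetical sites (values are made up)
sst_monthly <- data.frame(
  site  = rep(c("site_1", "site_2"), each = 12),
  month = factor(rep(month.abb, times = 2), levels = month.abb),
  temp  = c(15 + 5 * cos(2 * pi * (1:12 - 8) / 12),
            20 + 3 * cos(2 * pi * (1:12 - 2) / 12))
)

# A dot for each monthly climatology, connected by a line, one panel per site
clim_plot <- ggplot(sst_monthly, aes(x = month, y = temp, group = site)) +
  geom_point() +
  geom_line() +
  facet_wrap(~ site, ncol = 1) +
  labs(x = NULL, y = "Temperature (°C)")

clim_plot
```

facet_wrap() is one way to get a separate plot for each site; mapping colour to site in a single panel would be a reasonable alternative if you would rather compare the sites directly.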
Which may be iterated on (more on this later).