Drawing rarefaction curves with custom colours
16 April 2015 /posted in: R
I was sent an email this week by a vegan user who wanted to draw rarefaction curves using
rarecurve() but with different colours for each curve. The solution to this one is quite easy as
rarecurve() has argument
col so the user could supply the appropriate vector of colours to use when plotting. However, they wanted to distinguish all 26 of their samples, which is certainly stretching the limits of perception if we only used colour. Instead we can vary other parameters of the plotted curves to help with identifying individual samples.
At the frontiers of palaeoecology
31 March 2015 /posted in: Science
A couple of weeks ago, I had the pleasure of attending and participating in a symposium held to honour John Birks as he retires from the University of Bergen and becomes Professor Emeritus. The symposium, titled “At the Frontiers of Palaeoecology”, took place on 19–20th March in Bergen, Norway, and was a wonderful mix of colleagues old and new discussing John’s contributions to the field of palaeoecology and their collaborations with him. Alongside this reminiscing were several presentations describing new areas of research by colleagues and collaborators of John.
Harvesting Canadian climate data
14 January 2015 /posted in: R
In December I found myself helping one of our graduate students with a data problem; for one of their thesis chapters they needed a lot of hourly climate data for a handful of stations around Saksatchewan. All of this data was and is available for download from the Government of Canada’s website, but with one catch; you had to download the hourly data one month at a time, manually! There is no interface to allow a user of the website to specify the data range they want and download all the data from a single station. I figured there had to be a better way, using R to automate the downloading. Thinking the solution I came up with might help other researchers needing to grab data from the Government of Canada’s website save some time in the future, I wrote this post to document how we ended up doing it.
Analysing a randomised complete block design with vegan
03 November 2014 /posted in: R
It has been a long time coming. Vegan now has in-built, native ability to use restricted permutation designs when testing effects in constrained ordinations and in range of other methods. This new-found functionality comes courtesy of Jari (mainly) and my efforts to have vegan permutation routines use the permute package. Jari also cooked up a standard interface that we can use to drop this and some extra features neatly into any function we want; this allows us to have permutation tests run on many CPU cores in parallel, splitting the computational burden and reducing the run time of tests, and also a mechanism that allows users to pass a matrix of user-defined permutations to be used in tests. These new features are now fully working in the development version of vegan, which you can find on github, and which should be released to CRAN shortly. Ahead of the release, I’m preparing some examples to show off the new capabilities; first off I look at data from a randomized, complete block design experiment analysed using RDA & restricted permutations.
analogue 0.14-0 released
14 October 2014 /posted in: R
A couple of week’s ago I packaged up a new release of analogue, which is available from CRAN. Version 0.14-0 is a smaller update than the changes released in 0.12-0 and sees a continuation of the changes to dependencies to have packages in Imports rather than Depends. The main development of analogue now takes place on github and bugs and feature requests should be posted there. The Travis continuous integration system is used to automatically check the package as new code is checked in. There are several new functions and methods and a few bug fixes, the details of which are given below.
Simulating species abundance data with coenocliner
31 July 2014 /posted in: R
Coenoclines are, according to the Oxford Dictionary of Ecology (Allaby 1998), “gradients of communities (e.g. in a transect from the summit to the base of a hill), reflecting the changing importance, frequency, or other appropriate measure of different species populations”. In much ecological research, and that of related fields, data on these coenoclines are collected and analyzed in a variety of ways. When developing new statistical methods or when trying to understand the behaviour of existing methods, we often resort to simulating data with known pattern or structure and then torture whatever method is of interest with the simulated data to tease out how well methods work or where they breakdown. There’s a long history of using computers to simulate species abundance data along coenoclines but until recently no R packages were available that performed coenocline simulation. coenocliner was designed to fill this gap, and today, the package was released to CRAN.
Allaby M. et al. (1998) A Dictionary of Ecology, second. Oxford University Press.
Simultaneous confidence intervals for derivatives of splines in GAMs
16 June 2014 /posted in: R
Last time out I looked at one of the complications of time series modelling with smoothers; you have a non-linear trend which may be statistically significant but it may not be increasing or decreasing everywhere. How do we identify where in the series the data are changing? In that post I explained how we can use the first derivatives of the model splines for this purpose, and used the method of finite differences to estimate them. To assess statistical significance of the derivative (the rate of change) I relied upon asymptotic normality and the usual pointwise confidence interval. That interval is fine if looking at just one point on the spline (not of much practical use), but when considering more points at once we have a multiple comparisons issue. Instead, a simultaneous interval is required, and for that we need to revisit a technique I blogged about a few years ago; posterior simulation from the fitted GAM.
Identifying periods of change in time series with GAMs
15 May 2014 /posted in: R
In previous posts (here and here) I looked at how generalized additive models (GAMs) can be used to model non-linear trends in time series data. In my previous post I extended the modelling approach to deal with seasonal data where we model both the within year (seasonal) and between year (trend) variation with separate smooth functions. One of the complications of time series modelling with smoothers is how to summarize the fitted model; you have a non-linear trend which may be statistically significant but it may not be increasing or decreasing everywhere. How do we identify where in the series that the data are changing? That’s the topic of this post, in which I’ll use the method of finite differences to estimate the rate of change (slope) in the fitted smoother and, through some mgcv magic, use the information recorded in the fitted model to identify periods of statistically significant change in the time series.
Modelling seasonal data with GAMs
09 May 2014 /posted in: R
In previous posts (here and here) I have looked at how generalized additive models (GAMs) can be used to model non-linear trends in time series data. At the time a number of readers commented that they were interested in modelling data that had more than just a trend component; how do you model data collected throughout the year over many years with a GAM? In this post I will show one way that I have found particularly useful in my research.
File synchronisation with Unison
25 March 2014 /posted in: Computing
It’s becoming a fairly common experience to work on two or more computing devices; say a desktop/workstation in the office and a laptop when travelling or a home desktop. Which is great, but how do you keep all those machines in sync so that you have the latest versions of your files available no matter where you need to work?