# Embedding a time series with time delay in R

I’ve recently been looking at [Martin Trauth](http://www.geo.uni-potsdam.de/member-details/show/108.html "Martin Trauth's web page at The University of Potsdam Institute of Earth and Environmental Science")'s book *MATLAB® Recipes for Earth Sciences* to try to understand what some of my palaeoceanography colleagues are doing with their data analyses (lots of frequency domain time series techniques and a preponderance of filters). Whilst browsing, the recurrence plot section caught my eye as something to look into further, both for palaeo-based work and for work on ecological thresholds and tipping points.

In a recurrence plot, the recurrences of a phase space are plotted. As we tend not to have the phase space, just the time series of observations, we embed the observed series to produce the *m*-dimensional phase space. A key feature of the recurrence plot is the *time delay* included during embedding. There is an `embed()` function in R but it does not handle the time delay aspects that one needs for the recurrence plot, so I decided to write my own. The results are shown below in my function `Embed()`. It has been written to replicate the standard R `embed()` function where `d = 1` (i.e. no time delay), which is a useful check that it is doing the right thing.

The arguments are:

- `x`: the time series, observed at regular intervals.
- `m`: the number of dimensions to embed `x` into.
- `d`: the time delay.
- `as.embed`: logical; should we return the embedded time series in the order that `embed()` would?
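For readers who want something to run, here is a minimal sketch of a function taking these four arguments. It is not the original `Embed()` code: the name `delay_embed()`, the default values, and the index-matrix approach are all assumptions made for illustration.

```r
# Hypothetical sketch of a time-delay embedding; NOT the original Embed() code.
# The name delay_embed() and the defaults d = 1, as.embed = TRUE are assumptions.
delay_embed <- function(x, m, d = 1, as.embed = TRUE) {
    n <- length(x)
    n_rows <- n - (m - 1) * d          # complete rows left after lagging
    if (m < 1 || d < 1 || n_rows < 1) {
        stop("invalid 'm' or 'd', or 'x' is too short for them")
    }
    # Column j holds x shifted forward by (j - 1) * d time steps
    offsets <- seq(0, by = d, length.out = m)
    idx <- outer(seq_len(n_rows), offsets, "+")
    out <- matrix(x[idx], nrow = n_rows, ncol = m)
    # embed() puts the most recent value in the first column, so reverse
    if (as.embed) out <- out[, m:1, drop = FALSE]
    out
}

# With d = 1 the result should agree with base R's embed()
all.equal(delay_embed(1:10, m = 4), embed(1:10, 4))
```

Building a matrix of indices with `outer()` and subsetting once keeps the storage type of `x` intact, which makes the comparison with `embed()` straightforward.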

On a simple time series, this is what we get using `embed()` and `Embed()`:

And here we have the results of embedding the same simple time series into 4 dimensions with a time delay of 2:

So what does embedding do? Without additional time delay, `embed()` and `Embed()` produce a matrix with `m` columns containing the original time series and lagged versions of it, each column a lag 1 version of the previous column. Incomplete rows, which arise from lagging the series against itself, are discarded. You can see this in the identical calls to `embed()` and `Embed()` shown above. There were 10 observations in the series, and we asked for 4 lag 1 versions of this series. Hence each of the series in the embedded version contains just seven observations; we lose three observations because the 2nd, 3rd, and 4th columns are progressively shifted by 1 time unit relative to the original series.
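The row and column bookkeeping can be checked with base R alone. This short example uses only `embed()` from the stats package on the series `1:10`; the object names `x` and `e` are just illustrative.

```r
x <- 1:10
e <- embed(x, dimension = 4)

dim(e)     # 7 rows, 4 columns
e[1, ]     # 4 3 2 1: most recent value first, oldest last

# Each column is the series shifted by one time step relative to its neighbour
all(e[, 1] == x[4:10])
all(e[, 4] == x[1:7])

# Rows kept = length of the series minus the (m - 1) lags
nrow(e) == length(x) - (4 - 1)
```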

Time delay embedding allows for additional delay between the lagged versions of the original series. If `d = 2`, then each of the `m - 1` new series is lagged by 2 time intervals relative to the previous one. This is shown in the final example above, with `Embed(1:10, m = 4, d = 2)`, where the entries within the rows are offset by 2. However, the embedded series now contain just four observations: the three lagged columns cost 2 observations each, so 10 - 3 × 2 = 4 complete rows remain.
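The arithmetic for the delayed case can be worked through by hand in a few lines of base R. This is only a sketch: the names `offsets` and `delayed` are illustrative, and the column order (most recent value first) is assumed to follow the `embed()` convention.

```r
x <- 1:10
m <- 4
d <- 2

# Complete rows: 10 - (4 - 1) * 2 = 4
n_rows <- length(x) - (m - 1) * d

# Shifts applied to each column, largest first: 6, 4, 2, 0
offsets <- rev(seq(0, by = d, length.out = m))

# Each column is x shifted by one of the offsets
delayed <- sapply(offsets, function(off) x[off + seq_len(n_rows)])
delayed
#      [,1] [,2] [,3] [,4]
# [1,]    7    5    3    1
# [2,]    8    6    4    2
# [3,]    9    7    5    3
# [4,]   10    8    6    4
```

Within each row, neighbouring entries differ by `d = 2`, which is the offset described above.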

How we use this to produce a recurrence plot will be covered in a separate post.